Terraform-provider-aws: Terraform thinks user-data is changing when it isn't, resulting in unnecessary resource replacement

Created on 27 Jun 2018 · 21 comments · Source: hashicorp/terraform-provider-aws

_This issue was originally opened by @Lsquared13 as hashicorp/terraform#18343. It was migrated here as a result of the provider split. The original body of the issue is below._


Terraform Version

Terraform v0.11.7
+ provider.aws v1.25.0
+ provider.local v1.1.0
+ provider.null v1.0.0
+ provider.template v1.0.0
+ provider.tls v1.1.0

Terraform Configuration Files

module "vault_cluster" {
  source = "github.com/hashicorp/terraform-aws-vault.git//modules/vault-cluster?ref=v0.0.8"

  cluster_name  = "REDACTED"
  cluster_size  = "${var.vault_cluster_size}"
  instance_type = "${var.vault_instance_type}"

  ami_id    = "${var.vault_consul_ami}"
  user_data = "${data.template_file.user_data_vault_cluster.rendered}"

  s3_bucket_name          = "${aws_s3_bucket.REDACTED.id}"
  force_destroy_s3_bucket = "${var.force_destroy_s3_bucket}"

  vpc_id     = "${var.aws_vpc}"
  subnet_ids = "${aws_subnet.vault.*.id}"

  target_group_arns = ["${aws_lb_target_group.REDACTED.arn}"]

  allowed_ssh_cidr_blocks            = ["0.0.0.0/0"]
  allowed_inbound_cidr_blocks        = ["0.0.0.0/0"]
  allowed_inbound_security_group_ids = []
  ssh_key_name                       = "${aws_key_pair.auth.id}"
}

data "template_file" "user_data_vault_cluster" {
  template = "${file("${path.module}/user-data/user-data-vault.sh")}"

  vars {
    aws_region                = "${var.aws_region}"
    s3_bucket_name            = "${aws_s3_bucket.REDACTED.id}"
    consul_cluster_tag_key    = "${module.consul_cluster.cluster_tag_key}"
    consul_cluster_tag_value  = "${module.consul_cluster.cluster_tag_value}"
    vault_cert_bucket         = "${aws_s3_bucket.vault_certs.bucket}"
    REDACTED_role          = "${var.REDACTED_role}"
    REDACTED_role       = "${var.REDACTED_role}"
  }
}

Expected Behavior


I expect that since none of the user data variables has changed, my second time running terraform init proposes no changes to the infrastructure. The removal of the tag on the s3 bucket could be ignored though I still find it confusing. (See Actual Behavior for more info)

Actual Behavior


The second time I run terraform init it proposes the following plan. In particular, my issue is that the unexpected user data hash change is forcing a new launch configuration which is forcing a new autoscaling group. This makes multiple apply operations destructive.

Terraform will perform the following actions:

 <= module.REDACTED_vault.data.template_file.user_data_vault_cluster
      id:                                        <computed>
      rendered:                                  <computed>
      template:                                  "REDACTED"
      vars.%:                                    "7"
      vars.aws_region:                           "us-east-1"
      vars.consul_cluster_tag_key:               "consul-cluster"
      vars.consul_cluster_tag_value:             "REDACTED-consul"
      vars.REDACTED_role:                  "REDACTED-20180627163921901300000003"
      vars.s3_bucket_name:                       "REDACTED-2018062716392224750000000b"
      vars.REDACTED_role:                     "REDACTED-20180627163921914400000005"
      vars.vault_cert_bucket:                    "REDACTED-vault-certs-2018062716392226290000000c"

  ~ module.REDACTED_vault.aws_s3_bucket.REDACTED_vault
      tags.%:                                    "1" => "0"
      tags.Description:                          "Used for secret storage with Vault. DO NOT DELETE this Bucket unless you know what you are doing." => ""

  ~ module.REDACTED_vault.module.vault_cluster.aws_autoscaling_group.autoscaling_group
      launch_configuration:                      "REDACTED-vault-20180627164208768300000021" => "${aws_launch_configuration.launch_configuration.name}"

-/+ module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration (new resource required)
      id:                                        "REDACTED-vault-20180627164208768300000021" => <computed> (forces new resource)
      associate_public_ip_address:               "false" => "false"
      ebs_block_device.#:                        "0" => <computed>
      ebs_optimized:                             "false" => "false"
      enable_monitoring:                         "true" => "true"
      iam_instance_profile:                      "REDACTED-vault2018062716392283740000000f" => "REDACTED-vault2018062716392283740000000f"
      image_id:                                  "ami-REDACTED" => "ami-REDACTED"
      instance_type:                             "t2.medium" => "t2.medium"
      key_name:                                  "REDACTED-key-20180627163922247100000007" => "REDACTED-key-20180627163922247100000007"
      name:                                      "REDACTED-vault-20180627164208768300000021" => <computed>
      name_prefix:                               "REDACTED-vault-" => "REDACTED-vault-"
      placement_tenancy:                         "default" => "default"
      root_block_device.#:                       "1" => "1"
      root_block_device.0.delete_on_termination: "true" => "true"
      root_block_device.0.iops:                  "0" => <computed>
      root_block_device.0.volume_size:           "50" => "50"
      root_block_device.0.volume_type:           "standard" => "standard"
      security_groups.#:                         "1" => "1"
      security_groups.879695302:                 "sg-11f5815a" => "sg-11f5815a"
      user_data:                                 "9391db96cfba819eefef3353a3f01daf3a50b4ab" => "643cd0eab8f3d7def9cef600a65268cf39d49ecf" (forces new resource)

Steps to Reproduce

  1. terraform apply
  2. terraform apply

References

  • hashicorp/terraform#4197
    I thought this issue might be related but upgrading to aws provider version 1.25 did not help
Labels: bug, service/autoscaling

All 21 comments

Correction (I can't edit because technically the bot made this):

In the 'Expected Behavior' and 'Actual Behavior' sections I mention running terraform init. That is a mistake. I actually mean terraform apply.

Thank You

@Lsquared13 are you able to check what the user_data value of 9391db96cfba819eefef3353a3f01daf3a50b4ab (a SHA1 sum) corresponds to via the AWS CLI or similar, to determine what difference Terraform might be seeing? e.g.

aws autoscaling describe-launch-configurations --launch-configuration-name REDACTED-vault-20180627164208768300000021 --query 'LaunchConfigurations[0].UserData' --output text | base64 -D

You can also enable debug logging with Terraform, e.g. TF_LOG=debug to help determine the API response Terraform is receiving.

@bflad Thanks for the prompt reply. I did some testing around this and the results I got were... interesting...

I ran

aws autoscaling describe-launch-configurations --launch-configuration-name REDACTED-vault-20180627164208768300000021

Which gave me a base64 user-data that I could decode into exactly the script I expect. So far so good.

Then I run terraform apply and get the same plan I posted yesterday, claiming to make this change to user_data:

"9391db96cfba819eefef3353a3f01daf3a50b4ab" => "643cd0eab8f3d7def9cef600a65268cf39d49ecf" (forces new resource)

I went ahead and applied this plan, and the apply looked like this:

Plan: 1 to add, 2 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.REDACTED_vault.aws_s3_bucket.REDACTED_vault: Modifying... (ID: REDACTED-2018062716392224750000000b)
  tags.%:           "1" => "0"
  tags.Description: "Used for secret storage with Vault. DO NOT DELETE this Bucket unless you know what you are doing." => ""
module.REDACTED_vault.aws_s3_bucket.REDACTED_vault: Modifications complete after 2s (ID: REDACTED-2018062716392224750000000b)
module.REDACTED_vault.data.template_file.user_data_vault_cluster: Refreshing state...
module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration: Creating...
  associate_public_ip_address:               "" => "false"
  ebs_block_device.#:                        "" => "<computed>"
  ebs_optimized:                             "" => "false"
  enable_monitoring:                         "" => "true"
  iam_instance_profile:                      "" => "REDACTED-vault2018062716392283740000000f"
  image_id:                                  "" => "ami-0982dc76"
  instance_type:                             "" => "t2.medium"
  key_name:                                  "" => "REDACTED-key-20180627163922247100000007"
  name:                                      "" => "<computed>"
  name_prefix:                               "" => "REDACTED-vault-"
  placement_tenancy:                         "" => "default"
  root_block_device.#:                       "" => "1"
  root_block_device.0.delete_on_termination: "" => "true"
  root_block_device.0.iops:                  "" => "<computed>"
  root_block_device.0.volume_size:           "" => "50"
  root_block_device.0.volume_type:           "" => "standard"
  security_groups.#:                         "" => "1"
  security_groups.879695302:                 "" => "sg-11f5815a"
  user_data:                                 "" => "9391db96cfba819eefef3353a3f01daf3a50b4ab"
module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration: Creation complete after 2s (ID: REDACTED-vault-20180628163710785300000002)
module.REDACTED_vault.module.vault_cluster.aws_autoscaling_group.autoscaling_group: Modifying... (ID: tf-asg-20180627164209710400000022)
  launch_configuration: "REDACTED-vault-20180627164208768300000021" => "REDACTED-vault-20180628163710785300000002"
module.REDACTED_vault.module.vault_cluster.aws_autoscaling_group.autoscaling_group: Modifications complete after 1s (ID: tf-asg-20180627164209710400000022)
module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration.deposed: Destroying... (ID: REDACTED-vault-20180627164208768300000021)
module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration.deposed: Destruction complete after 0s

Apply complete! Resources: 1 added, 2 changed, 1 destroyed.

So the launch configuration was replaced with a new one just like the plan said: REDACTED-vault-20180628163710785300000002

I then retrieved the launch configuration from the API again:

aws autoscaling describe-launch-configurations --launch-configuration-name REDACTED-vault-20180628163710785300000002

The user data returned was exactly the same as that from the first launch configuration.

I then ran terraform apply again to see the plan that was produced. It looked the same as the last plan. In particular, it had this entry:

-/+ module.REDACTED_vault.module.vault_cluster.aws_launch_configuration.launch_configuration (new resource required)
      id:                                        "REDACTED-vault-20180628163710785300000002" => <computed> (forces new resource)
      associate_public_ip_address:               "false" => "false"
      ebs_block_device.#:                        "0" => <computed>
      ebs_optimized:                             "false" => "false"
      enable_monitoring:                         "true" => "true"
      iam_instance_profile:                      "REDACTED-vault2018062716392283740000000f" => "REDACTED-vault2018062716392283740000000f"
      image_id:                                  "ami-0982dc76" => "ami-0982dc76"
      instance_type:                             "t2.medium" => "t2.medium"
      key_name:                                  "REDACTED-key-20180627163922247100000007" => "REDACTED-key-20180627163922247100000007"
      name:                                      "REDACTED-vault-20180628163710785300000002" => <computed>
      name_prefix:                               "REDACTED-vault-" => "REDACTED-vault-"
      placement_tenancy:                         "default" => "default"
      root_block_device.#:                       "1" => "1"
      root_block_device.0.delete_on_termination: "true" => "true"
      root_block_device.0.iops:                  "0" => <computed>
      root_block_device.0.volume_size:           "50" => "50"
      root_block_device.0.volume_type:           "standard" => "standard"
      security_groups.#:                         "1" => "1"
      security_groups.879695302:                 "sg-11f5815a" => "sg-11f5815a"
      user_data:                                 "9391db96cfba819eefef3353a3f01daf3a50b4ab" => "643cd0eab8f3d7def9cef600a65268cf39d49ecf" (forces new resource)

Note the last line in particular. It's claiming the user_data hash should change in the exact same way. A user data with hash 643cd0eab8f3d7def9cef600a65268cf39d49ecf was never created, but the provider seems to think it needs to change it to something that matches that hash.

If I had to guess I'd say the interpolations aren't happening as expected during the planning phase, but you're the expert here.

Thanks again for the timely response!

That issue says the fix was released in aws provider version 1.25.0, which is the version I'm using.

Please correct me if I'm doing something wrong testing this, but I don't think I should be experiencing those issues on this version if #4991 was the fix.

Sorry, I'm no expert here. We're experiencing a similar issue (but haven't tried 1.25.0 yet) so I was just investigating things and trying to connect the dots.

One way this issue can be reproduced is by adding a depends_on attribute to a template_file data source that loads a user_data script.

Below is a simple example to reproduce the issue:

  • main.tf without depends_on in the data source
  • main.tf with depends_on in the data source
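
A condensed, illustrative sketch of the depends_on variant (not the original gist; names, AMI, and file paths are placeholders):

data "template_file" "user_data" {
  template = "${file("${path.module}/user-data.sh")}"

  # Adding this line is enough to trigger the perpetual diff on user_data.
  depends_on = ["aws_key_pair.mvdkeypair"]
}

resource "aws_key_pair" "mvdkeypair" {
  key_name   = "mvdkeypair"
  public_key = "${file("${path.module}/id_rsa.pub")}"
}

resource "aws_launch_configuration" "mvdserver_lc" {
  name_prefix   = "mvdserver-"
  image_id      = "ami-12345678" # placeholder
  instance_type = "t2.micro"
  user_data     = "${data.template_file.user_data.rendered}"

  lifecycle {
    create_before_destroy = true
  }
}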

Metadata:

$ terraform --version
Terraform v0.11.7
+ provider.aws v1.31.0
+ provider.template v1.0.0
$ terraform plan
(partial output)
  ~ module.aws_minimum_viable_deployment.aws_autoscaling_group.mvdserver_asg
      launch_configuration:        "mvdserver-20180809223007699000000001" => "${aws_launch_configuration.mvdserver_lc.name}"

-/+ module.aws_minimum_viable_deployment.aws_launch_configuration.mvdserver_lc (new resource required)
      id:                          "mvdserver-20180809223007699000000001" => <computed> (forces new resource)
...

      security_groups.1785063276:  "sg-0aa95d55107d75d0a" => "sg-0aa95d55107d75d0a"
      user_data:                   "f2289ceab41a69b5f99fb2978015e0769a8e5739" => "656bca31b5f7cea3181e684bfea743272dcaefca" (forces new resource)
...

This upstream Terraform core issue with depends_on and with the template_file data source might be relevant here: https://github.com/hashicorp/terraform/issues/11806

Something interesting here that is worth checking as well is whether the right-hand side of the user_data difference exactly matches the configured reference string in the Terraform configuration. e.g.

With this snippet of configuration:

data "template_file" "user_data" {
  template = "${file(var.user_data_file_path)}"

  depends_on = ["aws_key_pair.mvdkeypair"] # triggers this behavior
}

resource "aws_launch_configuration" "mvdserver_lc" {
  # ... other configuration ...
  user_data = "${data.template_file.user_data.rendered}" # StateFunc with SHA1 hashing
}

Receiving this plan output:

 <= data.template_file.user_data
      id:                          <computed>
      rendered:                    <computed>
      template:                    "#!/bin/bash\ncd /tmp\nsudo apt-get update -y\nsudo apt-get install wget git curl jq -y\ngit clone https://github.com/hashicorp/demo-terraform-101.git\necho \"This user-data was input by Terrafom by $(whoami) @ $(date)\" > /tmp/user_data.txt"

-/+ aws_launch_configuration.mvdserver_lc (new resource required)
...
      user_data:                   "f2289ceab41a69b5f99fb2978015e0769a8e5739" => "656bca31b5f7cea3181e684bfea743272dcaefca" (forces new resource)

In the above, the interesting tidbit is that the "new" user_data value is actually the SHA1 hash of the literal string ${data.template_file.user_data.rendered}. This occurs because user_data performs exactly this SHA1 hashing in its StateFunc, but it is receiving the uninterpolated configuration string from Terraform core because the template_file attribute is unknown at plan time:

echo -n '${data.template_file.user_data.rendered}' | shasum
656bca31b5f7cea3181e684bfea743272dcaefca  -

While this doesn't provide any clearer guidance on workarounds or potential code/configuration fixes, hopefully this helps explain some of the odd behavior here.
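
That said, a configuration-level workaround that is sometimes suggested for the upstream depends_on behavior is to drop depends_on from the data source and express the ordering implicitly through a template variable, so the data source can be read during plan once the referenced resource already exists. A minimal sketch, assuming 0.11-style syntax and illustrative names (the extra var only establishes the dependency; the template does not have to reference it):

data "template_file" "user_data" {
  template = "${file(var.user_data_file_path)}"

  vars {
    # Referencing the key pair here creates the same ordering dependency as
    # depends_on did, without deferring the data source read on every plan.
    ssh_key_name = "${aws_key_pair.mvdkeypair.key_name}"
  }
}

resource "aws_launch_configuration" "mvdserver_lc" {
  # ... other configuration ...
  user_data = "${data.template_file.user_data.rendered}"
}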

This bug remains on provider v2.54.0

I am also seeing this problem with v2.56.0. Any update on this issue?

Still seeing the problem on 2.61. I thought it might be due to having variables in the user-data, but even after replacing the variable with its literal value (in this case an IP address, verified by viewing the user-data directly in the AWS Console), Terraform still wants to make a change.

The issue still exists on 2.62.0

No changes were made to user_data, yet Terraform wants to recreate the aws_instance because of a supposed change in user_data (there is in fact no change). I hope someone will provide a fix for this issue.

I believe I'm hitting the same issue, but I was able to work around it by using ignore_changes with user_data: apply the code once, remove the ignore_changes part, and run apply again. Now the user_data no longer wants to recreate the resource.
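
For reference, the temporary lifecycle block described above would look roughly like this (0.12-style syntax; the resource name and data source reference are illustrative):

resource "aws_instance" "node" {
  # ... other configuration ...
  user_data = data.template_file.bootstrap.rendered

  # Temporary: apply once with this in place, then remove it and apply again.
  lifecycle {
    ignore_changes = [user_data]
  }
}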

Issue still exists in 3.0

Just ran into this with 2.70.0. Adding lifecycle { ignore_changes = ["user_data"] } prevents TF from wanting to redeploy the EC2 instances.

I'm applying user data via a template_file data source like so:

data "template_file" "bootstrap" {
template = file(${path.module}/path/to/shellscript.sh)
}

resource "aws_instance" "instances" {
count = var.count
<bunch of ec2 related stuff>
user_data = data.template_file.bootstrap.rendered

shellscript.sh has not changed since deployment, confirmed via the git commit history and by extracting the user data with the AWS CLI.

Any ideas? This is rather annoying.

I'm also currently hitting this issue on 2.70.0, though unfortunately I can't use any of the workarounds (e.g. ignoring user_data changes) because I do actually need to replace the instance on a user_data change, but nothing is changing!

I've somewhat worked around it by using a remote-exec provisioner that waits until the new, unnecessarily created host is up, replaces the current host with it, then nukes the old hosts via an AWS CLI command, but yeah, this is a terrible workaround. Thankfully these hosts aren't stateful, so them being arbitrarily replaced is fine, but it's certainly a waste of time and makes it impossible to use Terraform to validate the infrastructure it created.

I seem to be having a bit of luck in avoiding this bug by using the gzip option in the template_cloudinit_config data source. Terraform seems less likely to find imaginary differences between gzipped strings. If anybody is able to verify this, I'd be interested in hearing your results.
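
A rough sketch of the gzip approach, assuming the template provider's template_cloudinit_config data source and illustrative names; because the gzipped output is base64-encoded, it is passed to user_data_base64 rather than user_data:

data "template_cloudinit_config" "bootstrap" {
  gzip          = true
  base64_encode = true # must be true when gzip is enabled

  part {
    content_type = "text/x-shellscript"
    content      = file("${path.module}/path/to/shellscript.sh")
  }
}

resource "aws_instance" "instances" {
  # ... other configuration ...
  user_data_base64 = data.template_cloudinit_config.bootstrap.rendered
}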

thought I had a solution, but didn't (it's late)

This issue is still present in 3.7.0

I found that this problem is most likely to occur when you run Terraform on multiple platforms, i.e. Windows and Linux. I stumbled across it when applying through a CircleCI pipeline versus my local Windows repository. The problem was caused by Windows CRLF line endings versus the pipeline image (Linux) applying with LF only. Converting the user_data template file in my local repository to LF only, and configuring git to disable autocrlf as follows, solved it for me:

git config --global core.autocrlf false

Hopefully this helps if you are struggling with this issue...

Thanks, the solution fixed my issue.
