Terraform: Data resource forces update in modules with count

Created on 6 Aug 2020 · 12Comments · Source: hashicorp/terraform

Terraform Version

Terraform v0.13.0-rc1

Terraform Configuration Files

The submodule (./modules/eks_helm_components):

data "aws_region" "current" {}

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  namespace  = "kube-system"
  repository = "https://kubernetes-charts.storage.googleapis.com"
  chart      = "cluster-autoscaler"
  version    = lookup({ for k, v in var.helm_components : "chart_version" => v["chart_version"] if k == "cluster_autoscaler" }, "chart_version")
  values = [
    yamlencode(lookup({ for k, v in var.helm_components : "values_default" => v["values_default"] if k == "cluster_autoscaler" }, "values_default")),
    yamlencode(lookup({ for k, v in var.helm_components : "values" => v["values"] if k == "cluster_autoscaler" }, "values")),
    ### The below values can not be overridden by user
    yamlencode({
      awsRegion = data.aws_region.current.name
      autoDiscovery = {
        clusterName = var.cluster_name
      }
    })]
}

The root module:

module "eks_helm_components_2" {
  source                      = "./modules/eks_helm_components"
  count                       = var.cluster_2 != null ? 1 : 0
  cluster_name                = var.cluster_1.name
  providers = {
    helm = helm.cluster_1
  }
  helm_components = { for k, v in local.helm_components_properties : k => merge(v, lookup(var.cluster_2.base_components, k, {})) }
}

Debug Output

module.test.module.eks_helm_components_2[0].data.aws_region.current will be read during apply
  (config refers to values not yet known)
 <= data "aws_region" "current"  {
      + current     = (known after apply)
      ~ description = "Europe (Ireland)" -> (known after apply)
      ~ endpoint    = "ec2.eu-west-1.amazonaws.com" -> (known after apply)
      ~ id          = "eu-west-1" -> (known after apply)
      ~ name        = "eu-west-1" -> (known after apply)
    }

module.test.module.eks_helm_components_2[0].helm_release.cluster_autoscaler will be updated in-place
  ~ resource "helm_release" "cluster_autoscaler" {
        atomic                     = false
        chart                      = "cluster-autoscaler"
        cleanup_on_fail            = false
        create_namespace           = false
        dependency_update          = false
        disable_crd_hooks          = false
        disable_openapi_validation = false
        disable_webhooks           = false
        force_update               = false
        id                         = "cluster-autoscaler"
        lint                       = false
        max_history                = 0
        metadata                   = [
            {
                chart     = "cluster-autoscaler"
                name      = "cluster-autoscaler"
                namespace = "kube-system"
                revision  = 1
                values    = jsonencode(
                    {
                        autoDiscovery  = {
                            clusterName = "test"
                        }
                        awsRegion      = "eu-west-1"
                        cloudProvider  = "aws"
                        image          = {
                            tag = "v1.17.1"
                        }
                        nameOverride   = "cluster-autoscaler"
                    }
                )
                version   = "7.3.4"
            },
        ]
        name                       = "cluster-autoscaler"
        namespace                  = "kube-system"
        recreate_pods              = false
        render_subchart_notes      = true
        replace                    = false
        repository                 = "https://kubernetes-charts.storage.googleapis.com"
        reset_values               = false
        reuse_values               = false
        skip_crds                  = false
        status                     = "deployed"
        timeout                    = 300
      ~ values                     = [
          - <<~EOT
                "cloudProvider": "aws"
                "image":
                  "tag": "v1.17.1"
                "nameOverride": "cluster-autoscaler"
            EOT,
          - jsonencode({}),
          - <<~EOT
                "autoDiscovery":
                  "clusterName": "test"
                "awsRegion": "eu-west-1"
            EOT,
        ] -> (known after apply)
        verify                     = false
        version                    = "7.3.4"
        wait                       = true
    }

Crash Output

Expected Behavior

TF should not force Helm resource update on each plan unless data resource change.

Actual Behavior

TF forces the Helm resource to update on each plan.

Steps to Reproduce

terraform init
terraform apply
terraform plan

Additional Context

The state file has the "current": null, value. We run TF in TF Cloud.

"mode": "data",
      "type": "aws_region",
      "name": "current",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "current": null,
            "description": "Europe (Ireland)",
            "endpoint": "ec2.eu-west-1.amazonaws.com",
            "id": "eu-west-1",
            "name": "eu-west-1"
          }
        }
      ]

References

bug waiting for reproduction

Source

atimush

Most helpful comment

Hi @qwerty1979bg, I think the issue you are having is reported at https://github.com/terraform-providers/terraform-provider-aws/issues/14579.

remilapeyre on 16 Aug 2020

👍2

All 12 comments

@jbardin can you help me understand this? Is this a matter of the aws_region data data source being evaluated during application rather than during planning?

danieldreier on 13 Aug 2020

Yes, it appears that data.aws_region.current is being deferred to apply, but I'm not certain exactly what is causing this as it's clear there is no data source configuration to be "unknown". The only other thing that should cause it to be deferred is if there are any depends_on references with changes, but the only other resource shown is a direct dependent and there are no depends_on attributes in the sample config.

The "current": null field in the state is expected with aws provider versions <3.0.0, since that field has been deprecated but not yet removed.

If this is still happening with 0.13.0, I would need a trace log or reproduction to learn more.

jbardin on 13 Aug 2020

Thanks @jbardin! @atimush, do you think you can help generate a simplified reproduction case that we can run directly? The code you've provided has dependencies I can't see, so we'd need something we can copy-paste or git clone. Ideally, this would have a similar structure but leave out helm and eks, because the effort of setting up an EKS + helm infrastructure is pretty significant.

danieldreier on 13 Aug 2020

@danieldreier

Note: this might turn out to be an unrelated issue

It seems that at least this specific AWS data source (aws_vpcs) forces an update in v0.13.0. It does not have to be in a module or use count.

provider "aws" {
  version = "3.2.0"
  region  = "us-east-1"
}

data "aws_vpcs" "this" {
}

resource "null_resource" "test" {
}

This code applies only once, as expected, on TF v0.12.29.

With TF v0.13.0 it tries to read the data source during apply.

  # data.aws_vpcs.this will be read during apply
  # (config refers to values not yet known)

(Note: The AWS provider version is the same in both cases - 3.2.0)

qwerty1979bg on 16 Aug 2020

Hi @qwerty1979bg, I think the issue you are having is reported at https://github.com/terraform-providers/terraform-provider-aws/issues/14579.

remilapeyre on 16 Aug 2020

👍2

@remilapeyre, thanks ! This is an exact match ! :)

@atimush,
Your code uses data "aws_region" "current" {}, which is in the list of affected resources in https://github.com/terraform-providers/terraform-provider-aws/issues/14579

qwerty1979bg on 17 Aug 2020

Thanks for the pointer, @remilapeyre!

I believe the issue over in the AWS provider affects the aws_regions (plural) data source, but not the aws_region singular one, because the singular one sets its id to the name of the single region it looks up.

My sense is that this problem arose because a plural data source doesn't really have a single id string to use, and the Terraform SDK requires that every data source result include a non-empty id so the developers of these data sources used the current time as a placeholder ID value.

However, because aws_region _does_ converge on a stable result (it uses the current region name as the id), I think the behavior reported here is separate from terraform-providers/terraform-provider-aws#14579 and warrants some independent investigation.

With that said, as @danieldreier said the next step here would be to find a simplified configuration that we can run to reproduce the problem. Hopefully there's a smaller reproduction case that includes only aws_region and potentially an additional placeholder resource like null_resource to exercise its result, which would then allow us to observe it without needing a full Kubernetes cluster.

apparentlymart on 20 Aug 2020

To be clear, the non-deterministic ID issues described in terraform-provider-aws#14579 is only the symptom. There are 2 primary causes we have found: data sources that are altering non-computed attributes, and the SDK altering zero block values. Since those two situations cover all the known cases so far, we should probably just wait and confirm the status of this issue with the upcoming aws provider and core releases.

jbardin on 20 Aug 2020

Hello guys,
Thank you for your efforts and apologies for a late reply. I've been trying to provide a simple local execution to simulate the issue but all runs are perfectly fine :(

Still, the existing EKS modules are reporting the same issue. Noticed that the problem is not related with AWS region only data source but with any data resources (even with some of the AWS IAM Policy documents). I would assume could be something related to the providers:
providers = { helm = helm.cluster_1 }

Providing below some of the traces that might help:

2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": expanding dynamic subgraph
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ResourceCountTransformer
2020/08/23 19:37:07 [TRACE] ResourceCountTransformer: adding module.g8s.module.eks_helm_components_1[0].data.aws_region.current as *terraform.NodeRefreshableDataResourceInstance
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ResourceCountTransformer with new graph:
  module.g8s.module.eks_helm_components_1[0].data.aws_region.current - *terraform.NodeRefreshableDataResourceInstance  ------
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.OrphanResourceInstanceCountTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.OrphanResourceInstanceCountTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.AttachStateTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.AttachStateTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.TargetsTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.TargetsTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ReferenceTransformer
2020/08/23 19:37:07 [DEBUG] ReferenceTransformer: "module.g8s.module.eks_helm_components_1[0].data.aws_region.current" references: []
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ReferenceTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.RootTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.RootTransformer (no changes)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": entering dynamic subgraph
2020/08/23 19:37:07 [TRACE] dag/walk: visiting "module.g8s.module.eks_helm_components_1[0].data.aws_region.current"
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": starting visit (*terraform.NodeRefreshableDataResourceInstance)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": evaluating
2020/08/23 19:37:07 [TRACE] [walkRefresh] Entering eval tree: module.g8s.module.eks_helm_components_1[0].data.aws_region.current
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalSequence
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalGetProvider

Feel free to close the case if we can not find the root cause of this behaviour. Will try to update in the upcoming hours if able to re-produce it in a more simple way.

Thank you,
Andrei

atimush on 23 Aug 2020

@atimush

You mention that this happens with a module count in conjunction with the data sources. Without a repro we can't tell exactly what's going on, but some general debugging may help. I suspect the data source being shown in the plan is simply a side-effect of what's going on, and not the reason. From the partial config here, I would suspect it's being caused by something in the helm_components inputs being unknown.

First thing to check, is this still a problem with 0.13.1?
Does the situation change when hard-coding the count value?
Does hard-coding the awsRegion value change anything?

Reducing the problem to a single config change that triggers the diff could give us a clue as to what is going on.
The relevant section of the trace log would be from the plan creation, but the entire log would be more helpful if possible.

Thanks

jbardin on 1 Sep 2020

There hasn't been any new activity here, so I'm going to close this out.
Similar issues should be rectified in 0.14, as the merging of refresh+plan has allowed for more data resource lifecycle improvements.

Thanks!

jbardin on 13 Oct 2020

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.