Terraform v0.13.0-rc1
The submodule (./modules/eks_helm_components):
data "aws_region" "current" {}
resource "helm_release" "cluster_autoscaler" {
name = "cluster-autoscaler"
namespace = "kube-system"
repository = "https://kubernetes-charts.storage.googleapis.com"
chart = "cluster-autoscaler"
version = lookup({ for k, v in var.helm_components : "chart_version" => v["chart_version"] if k == "cluster_autoscaler" }, "chart_version")
values = [
yamlencode(lookup({ for k, v in var.helm_components : "values_default" => v["values_default"] if k == "cluster_autoscaler" }, "values_default")),
yamlencode(lookup({ for k, v in var.helm_components : "values" => v["values"] if k == "cluster_autoscaler" }, "values")),
### The below values can not be overridden by user
yamlencode({
awsRegion = data.aws_region.current.name
autoDiscovery = {
clusterName = var.cluster_name
}
})]
}
The root module:
module "eks_helm_components_2" {
source = "./modules/eks_helm_components"
count = var.cluster_2 != null ? 1 : 0
cluster_name = var.cluster_1.name
providers = {
helm = helm.cluster_1
}
helm_components = { for k, v in local.helm_components_properties : k => merge(v, lookup(var.cluster_2.base_components, k, {})) }
}
module.test.module.eks_helm_components_2[0].data.aws_region.current will be read during apply
(config refers to values not yet known)
<= data "aws_region" "current" {
+ current = (known after apply)
~ description = "Europe (Ireland)" -> (known after apply)
~ endpoint = "ec2.eu-west-1.amazonaws.com" -> (known after apply)
~ id = "eu-west-1" -> (known after apply)
~ name = "eu-west-1" -> (known after apply)
}
module.test.module.eks_helm_components_2[0].helm_release.cluster_autoscaler will be updated in-place
~ resource "helm_release" "cluster_autoscaler" {
atomic = false
chart = "cluster-autoscaler"
cleanup_on_fail = false
create_namespace = false
dependency_update = false
disable_crd_hooks = false
disable_openapi_validation = false
disable_webhooks = false
force_update = false
id = "cluster-autoscaler"
lint = false
max_history = 0
metadata = [
{
chart = "cluster-autoscaler"
name = "cluster-autoscaler"
namespace = "kube-system"
revision = 1
values = jsonencode(
{
autoDiscovery = {
clusterName = "test"
}
awsRegion = "eu-west-1"
cloudProvider = "aws"
image = {
tag = "v1.17.1"
}
nameOverride = "cluster-autoscaler"
}
)
version = "7.3.4"
},
]
name = "cluster-autoscaler"
namespace = "kube-system"
recreate_pods = false
render_subchart_notes = true
replace = false
repository = "https://kubernetes-charts.storage.googleapis.com"
reset_values = false
reuse_values = false
skip_crds = false
status = "deployed"
timeout = 300
~ values = [
- <<~EOT
"cloudProvider": "aws"
"image":
"tag": "v1.17.1"
"nameOverride": "cluster-autoscaler"
EOT,
- jsonencode({}),
- <<~EOT
"autoDiscovery":
"clusterName": "test"
"awsRegion": "eu-west-1"
EOT,
] -> (known after apply)
verify = false
version = "7.3.4"
wait = true
}
TF should not force Helm resource update on each plan unless data resource change.
TF forces the Helm resource to update on each plan.
terraform initterraform applyterraform planThe state file has the "current": null, value. We run TF in TF Cloud.
"mode": "data",
"type": "aws_region",
"name": "current",
"provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
"current": null,
"description": "Europe (Ireland)",
"endpoint": "ec2.eu-west-1.amazonaws.com",
"id": "eu-west-1",
"name": "eu-west-1"
}
}
]
@jbardin can you help me understand this? Is this a matter of the aws_region data data source being evaluated during application rather than during planning?
Yes, it appears that data.aws_region.current is being deferred to apply, but I'm not certain exactly what is causing this as it's clear there is no data source configuration to be "unknown". The only other thing that should cause it to be deferred is if there are any depends_on references with changes, but the only other resource shown is a direct dependent and there are no depends_on attributes in the sample config.
The "current": null field in the state is expected with aws provider versions <3.0.0, since that field has been deprecated but not yet removed.
If this is still happening with 0.13.0, I would need a trace log or reproduction to learn more.
Thanks @jbardin! @atimush, do you think you can help generate a simplified reproduction case that we can run directly? The code you've provided has dependencies I can't see, so we'd need something we can copy-paste or git clone. Ideally, this would have a similar structure but leave out helm and eks, because the effort of setting up an EKS + helm infrastructure is pretty significant.
@danieldreier
Note: this might turn out to be an unrelated issue
It seems that at least this specific AWS data source (aws_vpcs) forces an update in v0.13.0. It does not have to be in a module or use count.
provider "aws" {
version = "3.2.0"
region = "us-east-1"
}
data "aws_vpcs" "this" {
}
resource "null_resource" "test" {
}
This code applies only once, as expected, on TF v0.12.29.
With TF v0.13.0 it tries to read the data source during apply.
# data.aws_vpcs.this will be read during apply
# (config refers to values not yet known)
(Note: The AWS provider version is the same in both cases - 3.2.0)
Hi @qwerty1979bg, I think the issue you are having is reported at https://github.com/terraform-providers/terraform-provider-aws/issues/14579.
@remilapeyre, thanks ! This is an exact match ! :)
@atimush,
Your code uses data "aws_region" "current" {}, which is in the list of affected resources in https://github.com/terraform-providers/terraform-provider-aws/issues/14579
Thanks for the pointer, @remilapeyre!
I believe the issue over in the AWS provider affects the aws_regions (plural) data source, but not the aws_region singular one, because the singular one sets its id to the name of the single region it looks up.
My sense is that this problem arose because a plural data source doesn't really have a single id string to use, and the Terraform SDK requires that every data source result include a non-empty id so the developers of these data sources used the current time as a placeholder ID value.
However, because aws_region _does_ converge on a stable result (it uses the current region name as the id), I think the behavior reported here is separate from terraform-providers/terraform-provider-aws#14579 and warrants some independent investigation.
With that said, as @danieldreier said the next step here would be to find a simplified configuration that we can run to reproduce the problem. Hopefully there's a smaller reproduction case that includes only aws_region and potentially an additional placeholder resource like null_resource to exercise its result, which would then allow us to observe it without needing a full Kubernetes cluster.
To be clear, the non-deterministic ID issues described in terraform-provider-aws#14579 is only the symptom. There are 2 primary causes we have found: data sources that are altering non-computed attributes, and the SDK altering zero block values. Since those two situations cover all the known cases so far, we should probably just wait and confirm the status of this issue with the upcoming aws provider and core releases.
Hello guys,
Thank you for your efforts and apologies for a late reply. I've been trying to provide a simple local execution to simulate the issue but all runs are perfectly fine :(
Still, the existing EKS modules are reporting the same issue. Noticed that the problem is not related with AWS region only data source but with any data resources (even with some of the AWS IAM Policy documents). I would assume could be something related to the providers:
providers = {
helm = helm.cluster_1
}
Providing below some of the traces that might help:
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": expanding dynamic subgraph
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ResourceCountTransformer
2020/08/23 19:37:07 [TRACE] ResourceCountTransformer: adding module.g8s.module.eks_helm_components_1[0].data.aws_region.current as *terraform.NodeRefreshableDataResourceInstance
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ResourceCountTransformer with new graph:
module.g8s.module.eks_helm_components_1[0].data.aws_region.current - *terraform.NodeRefreshableDataResourceInstance ------
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.OrphanResourceInstanceCountTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.OrphanResourceInstanceCountTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.AttachStateTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.AttachStateTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.TargetsTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.TargetsTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.ReferenceTransformer
2020/08/23 19:37:07 [DEBUG] ReferenceTransformer: "module.g8s.module.eks_helm_components_1[0].data.aws_region.current" references: []
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.ReferenceTransformer (no changes)
2020/08/23 19:37:07 [TRACE] Executing graph transform *terraform.RootTransformer
2020/08/23 19:37:07 [TRACE] Completed graph transform *terraform.RootTransformer (no changes)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1.data.aws_region.current": entering dynamic subgraph
2020/08/23 19:37:07 [TRACE] dag/walk: visiting "module.g8s.module.eks_helm_components_1[0].data.aws_region.current"
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": starting visit (*terraform.NodeRefreshableDataResourceInstance)
2020/08/23 19:37:07 [TRACE] vertex "module.g8s.module.eks_helm_components_1[0].data.aws_region.current": evaluating
2020/08/23 19:37:07 [TRACE] [walkRefresh] Entering eval tree: module.g8s.module.eks_helm_components_1[0].data.aws_region.current
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalSequence
2020/08/23 19:37:07 [TRACE] eval: *terraform.EvalGetProvider
Feel free to close the case if we can not find the root cause of this behaviour. Will try to update in the upcoming hours if able to re-produce it in a more simple way.
Thank you,
Andrei
@atimush
You mention that this happens with a module count in conjunction with the data sources. Without a repro we can't tell exactly what's going on, but some general debugging may help. I suspect the data source being shown in the plan is simply a side-effect of what's going on, and not the reason. From the partial config here, I would suspect it's being caused by something in the helm_components inputs being unknown.
First thing to check, is this still a problem with 0.13.1?
Does the situation change when hard-coding the count value?
Does hard-coding the awsRegion value change anything?
Reducing the problem to a single config change that triggers the diff could give us a clue as to what is going on.
The relevant section of the trace log would be from the plan creation, but the entire log would be more helpful if possible.
Thanks
There hasn't been any new activity here, so I'm going to close this out.
Similar issues should be rectified in 0.14, as the merging of refresh+plan has allowed for more data resource lifecycle improvements.
Thanks!
I'm going to lock this issue because it has been closed for _30 days_ โณ. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Most helpful comment
Hi @qwerty1979bg, I think the issue you are having is reported at https://github.com/terraform-providers/terraform-provider-aws/issues/14579.