Error: error creating EKS Node Group ${EKS_CLUSTER}:${EKS_CLUSTER}-${NODE_NAME}-${RANDOM_PET}: ResourceInUseException: NodeGroup already exists with name ${EKS_CLUSTER}-${NODE_NAME}-${RANDOM_PET} and cluster name ${EKS_CLUSTER}
status code: 409, request id: 722abab9-21b6-418a-99c8-8c974adbf16a
on ../../terraform-aws-eks/node_groups.tf line 69, in resource "aws_eks_node_group" "workers":
69: resource "aws_eks_node_group" "workers" {
The module should replace managed node groups only when instance_type, ec2_ssh_key, source_security_group_ids, or node_group_name changes, but it currently attempts a replacement on any change. The replacement then fails with a name collision, something that random_pet should be preventing(?).
It is worth noting that if any of the keepers for random_pet do change, the expected create_before_destroy behavior is respected.
TF output (slightly redacted):
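To illustrate why the collision can happen, here is a minimal sketch of the random_pet plus create_before_destroy pattern. It is not the module's actual code: the resource name node_group, the three variables, and the keeper values are assumptions made up for the example. If a replacement is forced by something that is not listed in keepers, the pet name stays the same, so the new node group is created under the name of the one that still exists, which would produce a 409 like the one above.

```hcl
# Hypothetical sketch; names and values are illustrative, not the module's actual code.
variable "cluster_name" { type = string }
variable "node_role_arn" { type = string }
variable "subnet_ids" { type = list(string) }

resource "random_pet" "node_group" {
  separator = "-"
  length    = 2

  # A new pet name is generated only when one of these values changes.
  keepers = {
    instance_type = "m5a.large"
    ec2_ssh_key   = ""
  }
}

resource "aws_eks_node_group" "workers" {
  cluster_name    = var.cluster_name
  node_group_name = "${var.cluster_name}-example-${random_pet.node_group.id}"
  node_role_arn   = var.node_role_arn
  subnet_ids      = var.subnet_ids
  instance_types  = [random_pet.node_group.keepers.instance_type]

  scaling_config {
    desired_size = 1
    max_size     = 10
    min_size     = 1
  }

  lifecycle {
    # The replacement is created before the old group is destroyed,
    # so it needs a name that differs from the old one.
    create_before_destroy = true
  }
}
```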
Terraform will perform the following actions:
# module.${CLUSTER_NAME}-eks.aws_eks_node_group.workers["${NODE_NAME}"] must be replaced
+/- resource "aws_eks_node_group" "workers" {
~ ami_type = "AL2_x86_64" -> (known after apply)
~ arn = "arn:aws:eks:us-east-1:${AWS_ACCOUNT}:nodegroup/${CLUSTER_NAME}/${CLUSTER_NAME}-${NODE_NAME}-${RANDOM_PET}/a8b770ba-3a3b-bead-1e7d-868c50634d14" -> (known after apply)
cluster_name = "${CLUSTER_NAME}"
~ disk_size = 20 -> (known after apply)
~ id = "${CLUSTER_NAME}:${CLUSTER_NAME}-${NODE_NAME}-${RANDOM_PET}" -> (known after apply)
instance_types = [
"m5a.large",
]
labels = {
"NodeGroupType" = "Managed"
}
node_group_name = "${CLUSTER_NAME}-${NODE_NAME}-${RANDOM_PET}"
node_role_arn = "arn:aws:iam::${AWS_ACCOUNT}:role/${CLUSTER_NAME}-managed-node-groups"
~ release_version = "1.14.7-20190927" -> (known after apply)
~ resources = [
- {
- autoscaling_groups = [
- {
- name = "eks-REDACT"
},
]
- remote_access_security_group_id = ""
},
] -> (known after apply)
~ status = "ACTIVE" -> (known after apply)
subnet_ids = [
"subnet-REDACT",
"subnet-REDACT",
"subnet-REDACT",
]
tags = {
"NodeGroupType" = "Managed"
}
version = "1.14"
+ remote_access { # forces replacement
}
scaling_config {
desired_size = 1
max_size = 10
min_size = 1
}
}
So this is probably related to the note left in node_groups.tf:
# This sometimes breaks idempotency as described in https://github.com/terraform-providers/terraform-provider-aws/issues/11063
remote_access {
ec2_ssh_key = lookup(each.value, "key_name", "") != "" ? each.value["key_name"] : null
source_security_group_ids = lookup(each.value, "key_name", "") != "" ? lookup(each.value, "source_security_group_ids", []) : null
}
Once I set at least ec2_ssh_key, the expected behavior is maintained.
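For reference, a sketch of what that workaround might look like from the caller's side. The key_name key is the one read by the lookup above; the other node group keys, the group name example, and the key pair name are assumptions based on the module's managed node group example and may differ between module versions.

```hcl
module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # cluster_name, subnets, vpc_id, and the other required inputs are omitted here.

  node_groups = {
    example = {
      desired_capacity = 1
      max_capacity     = 10
      min_capacity     = 1
      instance_type    = "m5a.large"

      # Hypothetical key pair name. With a non-empty key_name the remote_access
      # block has a real value, which is what avoided the spurious diff here.
      key_name = "my-ec2-key-pair"
    }
  }
}
```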
I was having the same issue with the default configuration in the managed_node_group example. I think it can be fixed by making the remote_access block dynamic. Something like:
dynamic "remote_access" {
for_each = [for s in [{
ec2_ssh_key = lookup(each.value, "key_name", "") != "" ? each.value["key_name"] : null
source_security_group_ids = lookup(each.value, "key_name", "") != "" ? lookup(each.value, "source_security_group_ids", []) : null
}] : s if s["ec2_ssh_key"] != null || s["source_security_group_ids"] != null ]
content {
ec2_ssh_key = remote_access.value["ec2_ssh_key"]
source_security_group_ids = remote_access.value["source_security_group_ids"]
}
}
I'll open a PR with these changes if they resolve the issue.
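A possibly simpler variant of the same idea, offered as an untested sketch: in the snippet above both attributes are null exactly when key_name is empty, so the filter should reduce to a single condition on key_name. This assumes source_security_group_ids is never explicitly set to null by the caller, and like the original it is a fragment that belongs inside the aws_eks_node_group resource that uses for_each.

```hcl
# Fragment: place inside resource "aws_eks_node_group" "workers" { ... }
dynamic "remote_access" {
  # Emit the block once when a key name is set, and not at all otherwise.
  for_each = lookup(each.value, "key_name", "") != "" ? [1] : []

  content {
    ec2_ssh_key               = each.value["key_name"]
    source_security_group_ids = lookup(each.value, "source_security_group_ids", [])
  }
}
```

When key_name is empty no remote_access block is rendered at all, rather than an empty block that could show up in the plan as a change forcing replacement.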
@jeffmhastings I tried your snippet and it worked well for me
@kamirendawkins @jeffmhastings Hi guys, I still experience the same issue on the latest releases, v12.0.0 and v11.1.0, on EKS v1.6.
So does leaving key_name unset, or setting key_name = "", force the node group to be replaced every time?
The dynamic block workaround mentioned above didn't work for me.