Terraform-aws-eks: cluster_create_security_group is mapped to Additional security groups in EKS

Created on 14 Mar 2020  路  23Comments  路  Source: terraform-aws-modules/terraform-aws-eks

While creating the cluster I set the Cluster Control Plane security group by assigning cluster_create_security_group = "sg-xxxxxxx" . However once the cluster is created this will be associated as "Additional security groups" in the EKS not as primary "Cluster security group". (The wording is as per EKS aws console Networking block) . Also if I try to check the output of "cluster_create_security_group" it is still "Additional security groups".

I have some VPC endpoints and I have to associate this "Cluster Security group" & "Additional security groups" to that. However I cannot see any output for "Cluster Security group".

So is there anyway I can get, set or modify the primary "Cluster security group" ?

Most helpful comment

@ck3mp3r

How are you going to get the ID if you need the name to use that data resource? Not forgetting the name is still unknown until the run as completed...

Do you want the security group ID? i.e. sg-abcd123456? You don't need to use the name for this data resource. You can use tags or name to match a single SG, then get its SG ID.

Second option, the name is known but there is an output of aws_eks_node_group.workers from the module that should include some name details, like name of the ASG. Can the name of the SG be calculated from the name of the ASG?

Why has the option of being able to specify the cluster security group been removed?! It should be optional to be able to specify a self managed one. Using the one configured as cluster_security_group_id to additionally populate the security groups in the VPC settings and NOT using it for what it was intended is dangerous. Especially considering the misleading naming of the input variables if not used for exactly that purpose.

All 23 comments

https://github.com/terraform-aws-modules/terraform-aws-eks/pull/792

I already added a PR to fix the outputs and map the additional security group, The cluster primary security group is automatically created and you can not point it to any other security group, The only thing you can do it get the ID and modify it outside the module, which is what this PR provides.

this is from the outputs in the terraform eks cluster section.

cluster_security_group_id - The cluster security group that was created by Amazon EKS for the cluster.

once this PR gets merged into the module, you can add to it by doing something like this outside the module.

resource "aws_security_group_rule" "eks_cluster_add_access" {
  type        = "ingress"
  from_port   = 111
  to_port     = 111
  protocol    = "tcp"
  cidr_blocks = ["1.2.3.4/32"]

  security_group_id = module.eks.cluster_security_group_id
}

The variable naming in the module around security groups is historical.

The primary "Cluster security group" created by the EKS platform is a recent addition they added as part of supporting managed node groups. The module hasn't been updated yet to output the ID number of this group.

Previously you had to create a security group and associate it when creating the cluster or nothing would work. Now the console has renamed those groups to "Additional Security Groups".

You cannot set the primary security group. It's controlled by the EKS platform.

The cluster security groups are for allowing communication between the cluster control plane and the kubernetes worker nodes. I'm not sure what you're doing with VPC endpoints so that they need to communicate with the cluster control plane but can you not achieve it via the "additional" security groups?

You can see more about the cluster's security groups in the AWS documentation

I am using something lime slimm609 suggested now. By adding security group rule referring module.eks.cluster_security_group_id . However it goes to "Additional Security Groups" and I think this ok.
I was trying to move some of my eksctl stuff to terraform and in eksctl they create a ClusterSharedNodeSecurityGroup and add rule to it in the primary "Cluster security group". Also was trying to add the primary "Cluster security group" to one of the vpce sg's com.amazonaws..ecr.api . As I assume this will help me to avoid the cluster FAILED status during provisioning on a private VPC subnets with private access enabled / public access disabled in organisation account.

Guys, can we stay "as is" current outputs for backward compatibility but simply add another one that will provide a new value of vpc_config.cluster_security_group_id?
In the current state, there is no way to get security group value for managed nodes in 1.14+ if you use eks module to manage it.

Hi, any movement on this?

there is no way to get security group value for managed nodes

Do you just want the ID? Can't use you use a security_group data resource? https://www.terraform.io/docs/providers/aws/d/security_group.html

there is no way to get security group value for managed nodes

Do you just want the ID? Can't use you use a security_group data resource? https://www.terraform.io/docs/providers/aws/d/security_group.html

How are you going to get the ID if you need the name to use that data resource? Not forgetting the name is still unknown until the run as completed...

@ck3mp3r

How are you going to get the ID if you need the name to use that data resource? Not forgetting the name is still unknown until the run as completed...

Do you want the security group ID? i.e. sg-abcd123456? You don't need to use the name for this data resource. You can use tags or name to match a single SG, then get its SG ID.

Second option, the name is known but there is an output of aws_eks_node_group.workers from the module that should include some name details, like name of the ASG. Can the name of the SG be calculated from the name of the ASG?

You can't use a data source to discover the ID of the "default" security group created by EKS because it won't exist when creating a new cluster.

We really need to export this ID via an output. #792 did this plus a load of other random breaking changes.

@ck3mp3r I would fork the repo and look at the closed PR I linked in my comment above. The group ID isn't given and you can't look it up because tags don't get applied to them (there is an open PFR for this in AWS) https://github.com/aws/containers-roadmap/issues/374 . you could attempt to parse it out of the aws_eks_node_group.workers one but its not attached to the workers so you can't really parse it via TF. you would have to search the security group applied to the workers and find the sg id out of it.

792 worked but there is an issue because of old code and incompatible changes between 1.13 and 1.14. I had it working but they didn't like some stuff in there but could not provide anything other than the fact that they didn't like it. So I closed the PR and i just have my own fork of it with some other changes as well.

The current one incorrectly gives you the additional security group ID as the cluster ID, which they seem to think is not "broken"

@ck3mp3r

How are you going to get the ID if you need the name to use that data resource? Not forgetting the name is still unknown until the run as completed...

Do you want the security group ID? i.e. sg-abcd123456? You don't need to use the name for this data resource. You can use tags or name to match a single SG, then get its SG ID.

Second option, the name is known but there is an output of aws_eks_node_group.workers from the module that should include some name details, like name of the ASG. Can the name of the SG be calculated from the name of the ASG?

Why has the option of being able to specify the cluster security group been removed?! It should be optional to be able to specify a self managed one. Using the one configured as cluster_security_group_id to additionally populate the security groups in the VPC settings and NOT using it for what it was intended is dangerous. Especially considering the misleading naming of the input variables if not used for exactly that purpose.

because 1.13, you still need to specify a cluster security group, they changed it in 1.14. That's why my PR checked the version and gave either the cluster security group ID or the additional security group ID based on the version. When they rename something like that, there is no other way that I have seen to support breaking changes like that but it was said to be "bad practice", yet no other way has been recommended. Once eks 1.16 is out this entire issue goes away because 1.13 is dropped and the whole issue is moot at that point.

You can't use a data source to discover the ID of the "default" security group created by EKS because it won't exist when creating a new cluster.

Sure but you can still just add the data resource after cluster creation and filter for the tags aws:eks:cluster-name=<cluster name>, or make the data resource depend on the cluster_id output. I think that would give a clean apply until we merge a fix, no?

@slimm609 @ck3mp3r what do you think about the @fregatte suggestion https://github.com/terraform-aws-modules/terraform-aws-eks/issues/799#issuecomment-608006191 ? I think we can safely add additional_security_group_id as new output instead of renaming the existing cluster_security_group_id

The problem is the cluster security group ID is currently the additional security group ID, so even if you add it, they would be the same output, because the current cluster security group ID for 1.14 or later is never actually shown, so you have to rename since it鈥檚 currently incorrect

@slimm609 would you mind re-openning your PR and continue to work on it, so we can merge it when we (including you) will agree in how to fix this ?

Thanks again for your time on this project.

@barryib rather than reopening, I think taking a different approach would make it a lot easier. Based on the your comment and #799, i think doing 2 things would make work "for now" as well as solve it correctly in the future.

1) make a issue to rename cluster_security_group_id to additional_security_group_id once the 1.16 release comes out, since 1.13 will drop support at that time most of the trouble with the PR is around how the changes from 1.13 to 1.14 happened on the EKS side.
2) create a PR that outputs the cluster_security_group_id as primary_cluster_security_group_id or something similar to not mess with the cluster_security_group_id and just add some text to the descriptions of them to hold until 1.16 release.

Thoughts?

I'm happy with whatever is going to be "migratable" without having to tear down existing clusters...
Happy to look at any PRs and provide my 20c...

@max-rocket-internet @slimm609 @dpiddockcmp what do you think about changing #792 and add outputs like this ?

output "additional_security_group_id" {
  description = "Additional security group ID attached to the EKS cluster."
  value       = local.additional_security_group_id
}

output "cluster_security_group_id" {
  description = "Cluster security group ID attached to the EKS cluster."
  value = coalesce(
    element(concat(aws_eks_cluster.this[*].vpc_config[0].cluster_security_group_id, list("")), 0),
    local.additional_security_group_id
  )
}

I don't test it yet, but the idea here is to output vpc_config.cluster_security_group_id if it's defined otherwise output the now called additional_security_group_id ? It'll make output consistent for all version of EKS.

Renaming the variable cluster_security_group_id into additional_security_group_id is probably a breaking change, but I think it's a acceptable one and should be easy to handle during the upgrade.

For those who experiencing this issue, please follow https://github.com/terraform-aws-modules/terraform-aws-eks/issues/816#issuecomment-611404784 for a workaround for now.

Personally, I think changing the underlying source for the output variable is a dangerous thing to do. There's bound to be a few users who don't fully read the changelog and then complain when their cluster stops working correctly.

I like #828. It's a simple change just introducing the new output.

  • No weird behaviour changes when upgrading the EKS version from 1.13 to 1.14.
  • No needing to update existing plans when moving from module 11.0.0 to, I guess, 12.0.0.
  • No need to do a major release with breaking changes for this issue.
  • Module doesn't end up with technical debt to of version based quirks to remove when 1.13 is finally deleted by AWS.

We could maybe write documentation explaining the security group variable and output names? Add to the description on the variables and outputs mapping to their current aws console labels?

@dpiddockcmp thanks for your feedback. You got me :)

We could maybe write documentation explaining the security group variable and output names? Add to the description on the variables and outputs mapping to their current aws console labels?

Let's add those to descriptions on variables and outputs.

Was this page helpful?
0 / 5 - 0 ratings