Tell us about your request
Could EKS allow us to specify cloud provider parameters such as DisableSecurityGroupIngress and ElbSecurityGroup?
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
My company is trying to migrate from self-managed Kubernetes to EKS. Our current setup consists of several hundred Service resources of type LoadBalancer; each developer manages their own ingress to the cluster this way. When trying to migrate this workload, we hit the limit on inbound/outbound rules per security group. I am aware of:
https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups
and the recent updates: https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/
But even these possible increases are not enough to "lift-and-shift" our workload. It's not possible for us to undertake a significant refactor in order to migrate to EKS at this time.
Are you currently working around this issue?
How are you currently solving this problem?
We currently manage security group creation manually. We set the cloud provider config file with the following options:
[Global]
DisableSecurityGroupIngress = true
ElbSecurityGroup = sg-XXX  # a security group with 0 ingress/egress rules
We pass --cloud-config= to our control plane components and kubelets, then manage both the node security group and the ELB security group outside of Kubernetes: an SG is created for the ELB and added to the node SG's ingress rules. Finally, we add the following annotation to the Service manifest to attach it:
service.beta.kubernetes.io/aws-load-balancer-extra-security-groups
This solution gives us the control we need to stay within the security group limitations.
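As an illustration, a Service using this workaround might look like the sketch below (the name, selector, and ports are hypothetical; sg-XXX stands in for the externally managed security group, as above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                # hypothetical name
  annotations:
    # Attach the externally managed SG to the ELB (in addition to ElbSecurityGroup)
    service.beta.kubernetes.io/aws-load-balancer-extra-security-groups: sg-XXX
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 443
      targetPort: 8443
```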
Refer to:
https://github.com/kubernetes/cloud-provider-aws/commit/dbc66dc675eee8b53c399a5c300122e37d3fade0
https://forums.aws.amazon.com/thread.jspa?messageID=896987
Additional context
I am aware of other possible workarounds, such as:
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
Fixed with https://github.com/kubernetes/legacy-cloud-providers/blob/e68b55f48d8efc43af007916afc0d560838e9c7f/aws/aws.go#L153-L156
This update has been backported to EKS 1.14 and 1.15.
@kh34 Given that ServiceAnnotationLoadBalancerSecurityGroups = "service.beta.kubernetes.io/aws-load-balancer-security-groups" gives you the functionality you need, i.e. with this annotation you can provide the security group you want, do you see a need for any change?
In short, no need for a change, and this issue can probably be closed. Enforcing the use of this new annotation in an enterprise or in a multi-tenant cluster is probably out of scope for this discussion. We'll be looking into developing our own custom operator or MutatingWebhook for that.
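For reference, the newer annotation discussed above replaces the load balancer's security groups outright (whereas -extra-security-groups appends to them). A minimal sketch, with sg-XXX as a placeholder:

```yaml
metadata:
  annotations:
    # Use exactly this externally managed SG on the load balancer
    service.beta.kubernetes.io/aws-load-balancer-security-groups: sg-XXX
```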
https://github.com/kubernetes/legacy-cloud-providers/blob/e68b55f48d8efc43af007916afc0d560838e9c7f/aws/aws.go#L153-L156 is not sufficient: EKS still tries to find the worker security group and modify it. If I don't have the worker security groups tagged, it gives:
Error creating load balancer (will retry): failed to ensure load balancer for service default/svc: Multiple untagged security groups found for instance i-******; ensure the k8s security group is tagged
Our security requirements do not allow SGs to be modified by EKS, so either DisableSecurityGroupIngress or something else that disables the behavior in https://github.com/kubernetes/legacy-cloud-providers/blob/e68b55f48d8efc43af007916afc0d560838e9c7f/aws/aws.go#L3975-L3979 is required. K8s version 1.14.
We have the exact same problem as described by @dee-kryvenko.
@dee-kryvenko Did you find any workaround or are you still blocked on this? I'd also prefer to disable security group modifications by the EKS/AWS cloud provider, and instead manage them in my own configuration.
I was not able to find any workaround. I ended up creating an empty tagged SG with a single meaningless 127.0.0.1/32 rule along with the cluster; my security team were not happy about that.
@astrived is it possible to re-open this one? I don't think the issue is completely resolved. Specifying the annotation gives you control over the SG the Load Balancer uses, but does not stop the cloud provider modifying the node security groups.
@lstoll yes I am tracking this, will keep it open.
We're facing a similar issue. We don't have a lot of LoadBalancer Services, but we do have a lot of loadBalancerSourceRanges entries on our ingress Service. We've requested that the limit be increased to the maximum possible, which will help us for now.
For each entry in loadBalancerSourceRanges, EKS creates one inbound rule per traffic port plus one ICMP rule for path MTU discovery.
So for a standard NLB service exposing both ports 80 and 443, that's 3 inbound rules per CIDR. On top of that there are the health check rules for each port and the 2 standard rules for communication with the control plane.
So adding a service that uses an NLB without any loadBalancerSourceRanges already would use 2 (control plane comms) + 2 (health check for each port) + 2 (default 0.0.0.0/0 rules on each port) + 1 (mtu) = 7 inbound rules.
This leaves room for 18 loadBalancerSourceRanges before reaching the default limit of 60 rules per security group.
If the limit is increased to the maximum of 100, it allows for 32 loadBalancerSourceRanges.
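The arithmetic above can be made reusable with a small sketch (assuming the rule model described in this thread: 2 control-plane rules, 1 health-check rule per port, and, per source CIDR, one rule per traffic port plus one ICMP/MTU rule):

```python
# Inbound-rule arithmetic for the in-tree (legacy) AWS cloud provider, under the
# assumptions stated above; this is an estimate, not an authoritative formula.

def nlb_inbound_rules(num_cidrs: int, ports: int = 2) -> int:
    """Total inbound rules on the node SG for an NLB Service."""
    base = 2 + ports              # control-plane comms + one health-check rule per port
    per_cidr = ports + 1          # one rule per traffic port + one ICMP (MTU) rule
    return base + num_cidrs * per_cidr

def max_source_ranges(rule_limit: int, ports: int = 2) -> int:
    """How many loadBalancerSourceRanges entries fit under a given SG rule limit."""
    base = 2 + ports
    per_cidr = ports + 1
    return (rule_limit - base) // per_cidr

print(nlb_inbound_rules(1))      # default 0.0.0.0/0 only -> 7
print(max_source_ranges(60))     # default SG rule limit  -> 18
print(max_source_ranges(100))    # maximum SG rule limit  -> 32
```

These values reproduce the 7-rule baseline and the 18/32 source-range figures quoted above.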
We're also thinking of removing port 80 from the mix to shave off some of the inbound rule requirement.
But I wonder if EKS can do this in a more flexible way?
Perhaps create a separate SG for MTU and for each traffic port? This still has the potential of breaching the "security groups per network interface" limit. Just throwing the idea out there.
Hi everyone,
The introduction of Security Groups for Pods, along with the AWS Load Balancer ingress controller v2, may provide a workable solution to the issue of Security Group rule starvation.
When you use the Load Balancer Controller v2 solution instead of the in-tree Load Balancer controller provided by upstream, it is capable of using IP-based targets for your LB Target Groups instead of instances. In addition, the controller will recognize when your Pod is using a dedicated Security Group. If so, it will add the necessary rules only to the Pod's Security Group, and not to the cluster Security Group.
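As a sketch of the suggestion above (annotation names assume a recent AWS Load Balancer Controller v2 release; the Service name, selector, and ports are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                # hypothetical name
  annotations:
    # Hand this Service to the AWS Load Balancer Controller instead of the in-tree provider
    service.beta.kubernetes.io/aws-load-balancer-type: external
    # Register pod IPs directly in the target group instead of instances
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 443
      targetPort: 8443
```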
I encourage everyone to check out this solution and let us know whether it addresses your issues.