aws-load-balancer-controller 🚀 - Only 5 SGs per ENI allowed

I wasn't aware of this issue. By default, for every ingress we create a managed securityGroup that is attached to the worker ENIs.
For now, you can do one of the following:

1. Raise the limit of 5 securityGroups per ENI by filing a ticket with AWS support: https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups, which gives you a maximum of 15 (16-1) ingresses.
2. Manage the securityGroup yourself: create a securityGroup sg-ForLB, annotate every ingress with alb.ingress.kubernetes.io/security-groups: sg-ForLB, then add a rule to your worker node securityGroup that allows ingress traffic from sg-ForLB.
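For the self-managed option, a minimal Python sketch of adding the annotation to an Ingress manifest before applying it. The manifest here is a hypothetical, trimmed-down dict, and `sg-ForLB` is only the placeholder name used above, so treat this as an illustration rather than a complete manifest.

```python
import copy
import json

# Annotation key taken from the comment above.
SG_ANNOTATION = "alb.ingress.kubernetes.io/security-groups"


def with_shared_lb_sg(ingress, sg_id):
    """Return a copy of an Ingress manifest annotated with a self-managed LB security group."""
    patched = copy.deepcopy(ingress)
    annotations = patched.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[SG_ANNOTATION] = sg_id
    return patched


# Hypothetical manifest fragment; only the metadata matters for this sketch.
example_ingress = {
    "kind": "Ingress",
    "metadata": {"name": "demo", "namespace": "default", "annotations": {}},
}

patched_ingress = with_shared_lb_sg(example_ingress, "sg-ForLB")
print(json.dumps(patched_ingress["metadata"]["annotations"], indent=2))
```

The worker node securityGroup still needs the inbound rule from sg-ForLB; that part is not covered by the sketch.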

@bigkraig I think ideally we can use a shared instance securityGroup for all ALBs, and add/delete rules inside that instance securityGroup 😄 I'll make a PR for it

M00nF1sh on 19 Oct 2018

❤2

Thanks. That's perfect. You also need fewer than (300 / $sg_eni_limit) rules in ALL of your security groups, which meant that, for me, I could only increase to 8.

dokipen on 20 Oct 2018

Currently, I am facing the same issue with the same image quay.io/coreos/alb-ingress-controller:1.0-beta.7

vallurupallikhetan on 22 Oct 2018

Also having this issue. To increase SGs per interface, you need to decrease rules per SG so that you stay within the limit of 250 rules per interface (default hard limit per region: 5 security groups * 50 rules = 250), so we weren't able to increase the number of SGs.

rmn36 on 24 Oct 2018

Unfortunately SG limitations and dealing with the repercussions of that are a fact of life in AWS. We had the same problems at Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases. We can try to find ways of making the controller more adaptable but in the end there isn't anything that can truly solve the problem.

bigkraig on 13 Nov 2018

I ran into this issue when target-type: instance and found the best course of action was to self-manage the security group using whatever IaC tool (CloudFormation/Terraform) was used to provision the VPC and EKS cluster. I have submitted a PR to update the documentation: #734

bincyber on 16 Nov 2018

👍1

@M00nF1sh Anything we can do to help speed up the creation of a PR? :D

Multiply on 30 Nov 2018

@Multiply Hmm, @bigkraig didn't like the idea of a shared instance SG (less safe than individual SGs). Maybe this can be improved with a feature flag (though that's more complicated than simply changing it)?

M00nF1sh on 3 Dec 2018

I guess we need to balance the creation of rules between SGs depending on AWS-limits, as these might differ between AWS accounts, but that also proves a bit difficult, if your cluster spans multiple accounts.

There doesn't seem to be a simple fix, but having an option to allow merging certain SGs would be nice.

Multiply on 3 Dec 2018

It would be nice if we could merge the sgs on the instance side since it will always have the same ruleset anyway, allowing all traffic from the ALB to the worker nodes. That way, we'd only have 1 sg per worker's ENI and won't come near the limit. Anyone already on a PR to implement something similar?

encron on 26 Feb 2019

👍3

When this issue was originally filed, the SG per ENI limit times the security rule per group limit had to multiply to less than 300. The multiplicative limit is now _1,000_ which leaves a bit more headroom for raising the SG per ENI limit.
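To make the trade-off concrete, a small Python sketch of the arithmetic. It assumes the constraint is simply (SGs per ENI) x (rules per SG) <= cap, with the cap passed in as a parameter, since this thread mentions 300 and 1,000 at different times; the function name is made up for illustration.

```python
def max_rules_per_sg(sgs_per_eni, product_cap):
    """Largest rules-per-SG quota that keeps sgs_per_eni * rules within product_cap."""
    if sgs_per_eni < 1:
        raise ValueError("sgs_per_eni must be at least 1")
    return product_cap // sgs_per_eni


# Compare the older cap (300) with the newer one (1,000) for a few SG-per-ENI settings.
for cap in (300, 1000):
    for sgs in (5, 8, 16):
        print(f"cap={cap} sgs_per_eni={sgs} -> at most {max_rules_per_sg(sgs, cap)} rules per SG")
```

Under these assumptions, going to 16 SGs per ENI leaves room for 62 rules per SG with the 1,000 cap, versus 18 with the 300 cap.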

bendrucker on 4 Mar 2019

🎉1

Still the limit is only 16

To increase or decrease this limit, contact AWS Support. The maximum is 16.

If we can dynamically create new node SG or merge rules into a single SG, then this will mitigate the problems a lot.

Seems that @M00nF1sh already has something that's working (thank you:)), any idea when you would release the code?

yifan-gu on 3 Apr 2019

👍2

fwiw, I made something that's working in my branch, let me know if you want to discuss this @M00nF1sh

yifan-gu on 5 Apr 2019

@yifan-gu
Hi, thanks for providing a fix for this, really appreciate it.

I'm wondering whether we can fall back to a simpler version, as below:

Don't create instance security groups at all. Just use the existing securityGroup on the worker nodes, and allow ingress traffic from the ALB securityGroup.
(This is not less secure than the current model, since the ALB should be considered inside the trust boundary, and all traffic inside the VPC is secure.)
The existing securityGroup on the worker nodes can be resolved using the same logic as Kubernetes core (this is how serviceType=LoadBalancer works too).

Actually, I have implemented it in a new branch for ingress group: https://github.com/M00nF1sh/aws-alb-ingress-controller/tree/ingress-group/
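A rough Python sketch of what reconciling rules on one existing node securityGroup could look like. This is not the code from the branch; it only illustrates the idea under simplifying assumptions: one inbound rule per ALB securityGroup, only controller-managed rules are ever removed, and the per-SG rule limit is a parameter. All names are hypothetical.

```python
def plan_node_sg_rules(managed_sources, alb_sg_ids, unmanaged_rule_count=0, rule_limit=60):
    """Work out which source-SG rules to add to or remove from the shared node SG.

    managed_sources: source SG IDs of rules this controller previously added.
    alb_sg_ids: SG IDs of the ALBs that should currently be allowed in.
    unmanaged_rule_count: rules on the node SG that the controller must not touch.
    """
    desired = set(alb_sg_ids)
    current = set(managed_sources)
    if unmanaged_rule_count + len(desired) > rule_limit:
        raise ValueError("desired rules would exceed the per-SG rule limit")
    to_add = sorted(desired - current)
    to_remove = sorted(current - desired)
    return to_add, to_remove


# Hypothetical IDs: one ALB was deleted (a), one was created (c), one is unchanged (b).
to_add, to_remove = plan_node_sg_rules(
    managed_sources=["sg-alb-a", "sg-alb-b"],
    alb_sg_ids=["sg-alb-b", "sg-alb-c"],
)
print("add:", to_add)
print("remove:", to_remove)
```

With this approach the number of SGs attached to each ENI stays constant, and the pressure moves to the rules-per-SG limit instead.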

M00nF1sh on 6 Apr 2019

🚀2 👍1

Hit the same restriction from AWS today as well. I think we will ask AWS to increase the limit while we think of a workaround using a CF script.

Really interested to see this go into stable, so we can have a cleaner implementation.
I agree with @M00nF1sh, as it's a simpler approach. How can I test your implementation in the ingress-group branch?

monkeymon on 17 Apr 2019

@M00nF1sh would it be possible to get an estimate of when the fix for this would be released?

Spareo on 27 Apr 2019

❤1 👍1

Also hitting this issue. After a little confusion I finally checked the logs and found:

```
"Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached
```

montanaflynn on 6 May 2019

@M00nF1sh Sounds like an easier approach, although the limit on rules in an SG is not very high either (60 by default, up to 100 [1]). Maybe that's enough for most use cases.
[1] https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

yifan-gu on 10 May 2019

@M00nF1sh is there anything we can do to help you get the fix PR that you have developed merged in?

diclophis on 19 Jun 2019

It appears that no movement is being made on this issue. I'd like to comment that this is also affecting our organization. Perhaps we need to be more vocal about how we're being affected by this limitation? Is there something blocking progress?

irlevesque on 20 Jun 2019

@irlevesque @diclophis
Hi, the reason I'm hesitant to merge it is that we'll support a separate securityGroup per pod in the near future. I need to discuss with the team what the best approach is for securityGroup handling.

For instance mode, since any pod can be reached through the node (kube-proxy), there is no need to create a node securityGroup at all. We can just use the existing node securityGroup.
For ip mode, we might have to create a securityGroup and attach it to the pods that need it (there might be enterprise customers who have concerns about a securityGroup attached to all nodes/pods). There are two approaches to handling ip mode:

1. Change it to be the same as instance mode: don't create a separate securityGroup at all. Once we actually support securityGroup per pod, change it back to the current mode (create a node securityGroup and attach it to ENIs as needed).
2. Keep the current mode (create a node securityGroup and attach it to ENIs as needed), handle #824 correctly by checking the targetDeregistration status in the targetGroup, and asynchronously update the securityGroup attachment. (I haven't had enough time to work on this yet.)

@irlevesque It can be easily mitigated by manually creating a securityGroup and using alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> on the ingress. Then add an inbound rule for sgCreatedForALB to the worker node securityGroup.

M00nF1sh on 21 Jun 2019

@M00nF1sh creating each security group manually isn't really workable in our case and this issue is a big blocker for using this nice ingress-controller :(

encron on 18 Jul 2019

👍7

This is impacting some projects of mine as well. One issue I see with the alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> option is that the value would be the same for the whole Kubernetes cluster, yet we would have to track it in each of the applications we deploy into the cluster. A more viable option would be to let alb-ingress-controller users specify that value at the helm install level for the ingress controller. We would still have to create the SG manually, but since we'd make one per k8s cluster and set it cluster-wide at alb-ingress-controller install time, we would no longer need to manage it per deployed application.
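A minimal Python sketch of the default-plus-override behaviour being asked for here. The cluster-wide default is hypothetical: nothing in this thread says the controller exposes such a setting, so this only illustrates the requested resolution order (per-ingress annotation first, then the install-time default). It also assumes the annotation value is a comma-separated list.

```python
SG_ANNOTATION = "alb.ingress.kubernetes.io/security-groups"


def resolve_lb_security_groups(annotations, cluster_default):
    """Prefer the per-ingress annotation; otherwise fall back to the cluster-wide default."""
    raw = annotations.get(SG_ANNOTATION, "")
    groups = [part.strip() for part in raw.split(",") if part.strip()]
    return groups or list(cluster_default)


# Hypothetical default that would be set once at controller install time.
default_groups = ["sg-cluster-default"]

print(resolve_lb_security_groups({}, default_groups))
print(resolve_lb_security_groups({SG_ANNOTATION: "sg-a, sg-b"}, default_groups))
```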

anderm3 on 2 Aug 2019

👍2

The annotation alb.ingress.kubernetes.io/security-groups lets us specify a security group we've created in advance, which will be attached to the application load balancer created by the ALB controller.

The behavior is pretty similar to what Kubernetes AWS cloud provider suggests (Nginx ingress controller uses that library) with the similar annotation service.beta.kubernetes.io/aws-load-balancer-security-groups when you annotate Service resource. There's no mention of this annotation in the official documentation but you can still explore the source code.

The difference is how they manage security groups for an EC2 instance.

Generally speaking, the traffic needs to be delivered to an instance. It goes the following way:
[LB SG] LB -> [EC2 INSTANCE SG] EC2 INSTANCE -> KUBE-PROXY -> ...
Just specifying LB security groups is not enough (if you, of course, don't want to have ec2 instance security groups open to the world). So different controller implementations try to solve the problem uniquely.

AWS cloud provider

Once the load balancer is spun up and the security groups (specified or created) are attached to it, the controller searches for the security groups attached to the network interfaces (ENIs) of all the instances in the cluster. Once it has found the group(s), it adds a rule so that all network interfaces of all the instances in the cluster allow traffic from the load balancer.

ALB ingress controller

This one tries to be more precise in specifying rules, and only when the annotation alb.ingress.kubernetes.io/security-groups is not specified. In that case, the controller creates a group for the instance with a predefined rule that references the load balancer SG. Then it finds the relevant ENI (in case there are several on an instance) on each of the instances and tries to attach the SG to it. And we reach the limit pretty fast here.
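To show how quickly the limit is reached, a small Python sketch. It assumes each ingress without the annotation adds exactly one SG to the ENI and that the ENI already carries one SG of its own (the "16-1" arithmetic mentioned at the top of this thread); the function name is made up.

```python
def max_managed_ingresses(sgs_per_eni_limit, preexisting_sgs=1):
    """Ingresses that fit before the SG-per-ENI limit is hit, one attached SG per ingress."""
    return max(sgs_per_eni_limit - preexisting_sgs, 0)


# Default limit of 5, and the maximum of 16 after a limit increase.
for limit in (5, 16):
    print(f"limit={limit} -> {max_managed_ingresses(limit)} ingresses")
```

With the cloud provider approach described above, the count of attached SGs does not grow with the number of load balancers, so this particular ceiling does not apply.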

Is that the right logic?

I'm wondering why the ALB controller doesn't use the same logic as the AWS cloud provider. It seems pretty neat and right. In IP target type mode (the alb.ingress.kubernetes.io/target-type annotation), the ALB ingress controller removes the extra hop by not forwarding traffic through a proxy: traffic goes directly to the recipient (the Pod, via the IP allocated to it). So why do we need to make things more complicated because of security concerns? We can still address security the native Kubernetes way, with Network Policies.

The workaround.

As has been said, the workaround for this is to manage security groups externally. In our organization, we used Terraform to spin up the EKS cluster, so the groups for load balancers are going to be managed from there.

I had some doubts about whether Terraform would conflict with changes made by the AWS cloud provider (which adds rules to existing SGs attached to an instance), but as it turned out, it doesn't. When you apply a plan, it just adds additional rules to its state and doesn't manage those if there's no rule (aws_security_group_rule) matching the name.

iliazlobin on 28 Aug 2019

👍3

any news on this?

mmack-innio on 7 Nov 2019

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

TheBrigandier on 16 Jan 2020

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies worker's security group and reconciles changes, then we shouldn't hit any limitations on Security Groups. And it should work even with Managed Node Groups.

Alexx-G on 10 Feb 2020

March 24-2020.

SecurityGroupsPerInterfaceLimitExceeded was also thrown in my case.

@M00nF1sh
Is there any way the ingress controller can use NODE AFFINITY to prevent PODS from being allocated to nodes with ENI's that have reached the hard limit? If the Cluster Autoscaling is managed properly PODS can get allocated to new Nodes/ENIS thus avoiding the hard limit AT LEAST at the ENI level.
Here are the current limits:

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Thanks!

ecout on 25 Mar 2020

😄1 👎1

@Alexx-G
+1. But actually, the Security Groups themselves have a HARD limit of about 40 rules. So with your suggestion we can increase the number of ingresses in the cluster significantly, but not make it unlimited.

Edit:
The limits for SG numbers and SG rules have been increased across the board but this is still a possibility unless Services stop pointing to Nodes that can't take any more.
https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

ecout on 25 Mar 2020

@TheBrigandier I don't understand what you mean. The logs are the reason I was able to find this issue:

kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.\n\tstatus code: 400, request id: ************"  "controller"="alb-ingress-controller" "request"={"Namespace":"******","Name":"*********"}
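If you want to surface this to end users, here is a rough Python sketch that scans controller log lines for the error code and pulls out the affected ingress. It assumes the log format matches the line above; the sample line uses made-up namespace and name values, since the real ones are masked.

```python
import re

ERROR_CODE = "SecurityGroupsPerInterfaceLimitExceeded"
# Matches the trailing "request"={"Namespace":"...","Name":"..."} part of the log line.
REQUEST_RE = re.compile(r'"request"=\{"Namespace":"([^"]*)","Name":"([^"]*)"\}')


def affected_ingresses(log_lines):
    """Return (namespace, name) pairs for log lines that report the SG-per-ENI limit error."""
    hits = []
    for line in log_lines:
        if ERROR_CODE not in line:
            continue
        match = REQUEST_RE.search(line)
        hits.append((match.group(1), match.group(2)) if match else (None, None))
    return hits


# Hypothetical sample modeled on the log line above (namespace and name are invented).
sample = (
    'kubebuilder/controller "msg"="Reconciler error" '
    '"error"="failed to reconcile securityGroup associations due to '
    'SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups '
    'per interface has been reached." "controller"="alb-ingress-controller" '
    '"request"={"Namespace":"demo-ns","Name":"demo-ingress"}'
)

print(affected_ingresses([sample, "some unrelated log line"]))
```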

ecout on 25 Mar 2020

Agreed.
Grouping ALBs by ingress policy can reduce the number of security groups needed per ALB, and hence the number of instance SGs that need to be attached. If the Ingress definition is left without an explicit SG, every new ingress creates a new pair of SGs until you reach the hard limit on the ENI side.

Edit: just out of curiosity, I took a look at the cloud provider and found this feature request. Interestingly enough, it seems they handle everything in a single SG, with no option for multiple SG support.

https://github.com/kubernetes/cloud-provider-aws/issues/81

ecout on 25 Mar 2020

@bigkraig How about Node Affinity with Autoscaling? More nodes more ENIs. PODS pertaining to an Ingress that can't get on an ENI can be moved to a new node.

ecout on 25 Mar 2020

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot on 23 Jun 2020

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 23 Jul 2020

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 22 Aug 2020

@fejta-bot: Closing this issue.

k8s-ci-robot on 22 Aug 2020
