Aws-load-balancer-controller: Only 5 SGs per ENI allowed

Created on 18 Oct 2018 · 36 Comments · Source: kubernetes-sigs/aws-load-balancer-controller

I got this error today: failed association of SecurityGroups due to failed to reconcile managed Instance securityGroup attachment due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.

using quay.io/coreos/alb-ingress-controller:1.0-beta.7

lifecycle/rotten

All 36 comments

I wasn't aware of this issue... we create a managed securityGroup that will be attached to worker ENIs for every ingress by default.
For now,

  1. You can raise the default limit of 5 securityGroups per ENI by filing a ticket with AWS support: https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups, which gives you a maximum of 15 (16 - 1) ingresses.
  2. Or you can manage the securityGroup yourself: create a securityGroup sg-ForLB, annotate every ingress with alb.ingress.kubernetes.io/security-groups: sg-ForLB, then add a rule to your worker node securityGroup that allows ingress traffic from sg-ForLB.
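The second option can be scripted when many ingresses need the same annotation. The following is only a minimal sketch: the helper name `with_shared_lb_security_group` and the `example-app` manifest are hypothetical, and `sg-ForLB` stands in for the ID of a security group you created yourself.

```python
import copy

# Annotation named in this thread for supplying a pre-created LB security group.
SG_ANNOTATION = "alb.ingress.kubernetes.io/security-groups"


def with_shared_lb_security_group(ingress: dict, sg_id: str) -> dict:
    """Return a copy of an Ingress manifest annotated with the given security group.

    Hypothetical helper: it only edits the manifest dict and does not call
    Kubernetes or AWS.
    """
    annotated = copy.deepcopy(ingress)
    metadata = annotated.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    annotations[SG_ANNOTATION] = sg_id
    return annotated


# Illustrative manifest fragment; the name is made up for this example.
ingress = {
    "kind": "Ingress",
    "metadata": {
        "name": "example-app",
        "annotations": {"kubernetes.io/ingress.class": "alb"},
    },
}

patched = with_shared_lb_security_group(ingress, "sg-ForLB")
print(patched["metadata"]["annotations"][SG_ANNOTATION])
```

The input manifest is copied rather than mutated, so the same base manifest can be reused for several ingresses.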

@bigkraig I think ideally we can use a shared instance securityGroup for all ALBs, and add/delete rules inside that instance securityGroup 😄 I'll make a PR for it

Thanks. That's perfect. You also need < (300 / $sg_eni_limit) rules in ALL of your security groups, which meant that, for me, I could only increase the limit to 8.

Currently, I am facing the same issue with the same image quay.io/coreos/alb-ingress-controller:1.0-beta.7

Also having this issue. In order to increase SGs per interface you need to decrease rules per SG to maintain the limit of 250 rules per interface (Default Hard Limit = Per region 5 (security group) * 50 (Rule) = 250) therefore we weren't able to increase the number of SGs

Unfortunately SG limitations and dealing with the repercussions of that are a fact of life in AWS. We had the same problems at Ticketmaster and were able to improve the situation by reusing a lot of crafted groups for specific use cases. We can try to find ways of making the controller more adaptable but in the end there isn't anything that can truly solve the problem.

I ran into this issue when target-type: instance and found the best course of action was to self-manage the security group using whatever IaC tool (CloudFormation/Terraform) was used to provision the VPC and EKS cluster. I have submitted a PR to update the documentation: #734

@M00nF1sh Anything we can do to help speed up the creation of a PR? :D

@Multiply umm... @bigkraig didn't like the idea of a shared instance SG (less safe than individual SGs). Maybe this can be improved with a feature flag (though that's more complicated than simply changing it)?

I guess we need to balance the creation of rules between SGs depending on AWS-limits, as these might differ between AWS accounts, but that also proves a bit difficult, if your cluster spans multiple accounts.

There doesn't seem to be a simple fix, but having an option to allow merging certain SGs would be nice.

It would be nice if we could merge the sgs on the instance side since it will always have the same ruleset anyway, allowing all traffic from the ALB to the worker nodes. That way, we'd only have 1 sg per worker's ENI and won't come near the limit. Anyone already on a PR to implement something similar?

When this issue was originally filed, the SG per ENI limit times the security rule per group limit had to multiply to less than 300. The multiplicative limit is now _1,000_ which leaves a bit more headroom for raising the SG per ENI limit.

Still the limit is only 16

To increase or decrease this limit, contact AWS Support. The maximum is 16.
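The trade-off described in the comments above is simple arithmetic. As a rough sketch, assuming the combined cap is inclusive (product of the two quotas at most 1,000, or 300 when the issue was filed) and that 16 is the maximum number of security groups per ENI, the rule budget per group can be computed as below. The function name is made up for illustration.

```python
MAX_SGS_PER_ENI = 16  # maximum quoted in this thread


def max_rules_per_group(sgs_per_eni: int, product_cap: int = 1000) -> int:
    """Largest rules-per-SG quota that keeps sgs_per_eni * rules within product_cap.

    Assumes the cap is inclusive; if it is strict, subtract one when the
    division is exact.
    """
    if not 1 <= sgs_per_eni <= MAX_SGS_PER_ENI:
        raise ValueError(f"sgs_per_eni must be between 1 and {MAX_SGS_PER_ENI}")
    return product_cap // sgs_per_eni


for sgs in (5, 8, 16):
    print(f"{sgs} SGs per ENI -> at most {max_rules_per_group(sgs)} rules per SG")

# Under the older cap of 300 mentioned earlier in the thread:
print(max_rules_per_group(8, product_cap=300))
```

Raising the per-ENI quota therefore shrinks the number of rules each group may hold, which is why some commenters could not raise it as far as they wanted.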

If we can dynamically create new node SG or merge rules into a single SG, then this will mitigate the problems a lot.

Seems that @M00nF1sh already has something that's working (thank you:)), any idea when you would release the code?

fwiw, I made something that's working in my branch, let me know if you want to discuss this @M00nF1sh

@yifan-gu
Hi, thanks for providing a fix for this, really appreciate it.

I'm wondering whether we can fall back to a simpler version, as below:

  1. Don't create instance security groups at all. Just use the existing securityGroup on worker nodes, and allow ingress traffic from the ALB securityGroup.
    (This is not less secure than the current model, since the ALB should be considered inside the trust boundary, and all traffic inside the VPC is secure.)
    The existing securityGroup on the worker node can be resolved using the same logic as Kubernetes core (this is how serviceType=LoadBalancer works too).

Actually, I have implemented it in a new branch for ingress group: https://github.com/M00nF1sh/aws-alb-ingress-controller/tree/ingress-group/

Hit the same restriction from AWS today as well, and I think we will be asking AWS to increase the limit while we think of a workaround using a CF script.

Really interested to see this go into stable, so we can have a cleaner implementation.
I agree with @M00nF1sh, as it's a simpler approach. How can I test your implementation in the ingress-group branch?

@M00nF1sh would it be possible to get an estimate of when the fix for this would be released?

Also hitting this issue. After a little confusion I finally checked the logs and found:

```
"Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached
```

@M00nF1sh Sounds like an easier approach, although the limit on the rules in a SG is not too high as well (60 by default, up to 100 [1]), maybe that's enough for most of the use cases.
[1] https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

@M00nF1sh is there anything we can do to help you get the fix PR that you have developed merged in?

It appears that no movement is being made on this issue. I'd like to comment that this is also affecting our organization. Perhaps we need to be more vocal about how we're being affected by this limitation? Is there something blocking progress?

@irlevesque @diclophis
Hi, the reason I'm hesitant to merge it is that we'll support a separate securityGroup per pod in the near future. I need to discuss with the team what the best approach for securityGroup handling is.

For instance mode, since any pod can be reached through the node (kube-proxy), there is no need to create a node securityGroup at all; we can just use the existing node securityGroup.
For ip mode, we might have to create a securityGroup and attach it to the pods that need it (there might be enterprise customers that have concerns about a securityGroup attached to all nodes/pods). There are two approaches to handle ip mode:

  1. Change it to be the same as instance mode and don't create a separate securityGroup at all. Once we actually support securityGroup per pod, change it back to the current mode (create a node securityGroup and attach it to the ENI as needed).

  2. Keep the current mode (create a node securityGroup and attach it to the ENI as needed), and handle #824 correctly by checking the targetDeregistration status in the targetGroup and asynchronously updating the securityGroup attachment. (I haven't had enough time to work on this yet.)

@irlevesque It can be easily mitigated by manually creating a securityGroup and using alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> on the ingress. Then add an inbound rule to the worker node securityGroup for sgCreatedForALB.
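The inbound rule in that mitigation could be expressed as a request payload like the one below. This is a sketch under assumptions: the function name and the example group IDs are hypothetical, and the rule allows all protocols from the load balancer group, which you may want to narrow. The dict follows the parameter shape of the EC2 `AuthorizeSecurityGroupIngress` API, so it could be passed as keyword arguments to boto3's `authorize_security_group_ingress`; the sketch itself makes no AWS call.

```python
def allow_from_lb_sg(node_sg_id: str, lb_sg_id: str) -> dict:
    """Build an inbound rule for the node SG that references the LB SG as source.

    Hypothetical helper: returns the request parameters only.
    """
    return {
        "GroupId": node_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "-1",  # all protocols and ports; tighten if needed
                "UserIdGroupPairs": [
                    {
                        "GroupId": lb_sg_id,
                        "Description": "Allow traffic from the shared ALB security group",
                    }
                ],
            }
        ],
    }


# Placeholder IDs for illustration only.
params = allow_from_lb_sg("sg-workerNodes", "sg-createdForALB")
print(params["IpPermissions"][0]["UserIdGroupPairs"][0]["GroupId"])
```

Because the rule references a source security group rather than CIDR ranges, one rule covers every ALB that carries the shared group.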

@M00nF1sh creating each security group manually isn't really workable in our case and this issue is a big blocker for using this nice ingress-controller :(

This is impacting some projects of mine as well. One issue I see with the alb.ingress.kubernetes.io/security-groups: <sgCreatedForALB> option is that setting would be per Kubernetes cluster and we would have to track it on our various applications we are deploying into the cluster. It seems like a more viable option would be to allow alb-ingress-controller users to specify that value at the helm install level for the ingress controller. We would still have to manually make the SG, but since we'd make one per k8s cluster and set it cluster wide at alb-ingress-controller install time we would no longer need to manage it per our deployed application.

The annotation alb.ingress.kubernetes.io/security-groups lets us specify a security group we've created in advance, which will be attached to the application load balancer created by the ALB controller.

The behavior is pretty similar to what the Kubernetes AWS cloud provider offers (the Nginx ingress controller uses that library) with the similar annotation service.beta.kubernetes.io/aws-load-balancer-security-groups when you annotate a Service resource. There's no mention of this annotation in the official documentation, but you can still explore the source code.

The difference is how they manage security groups for an EC2 instance.

Generally speaking, the traffic needs to be delivered to an instance. It goes the following way:
[LB SG] LB -> [EC2 INSTANCE SG] EC2 INSTANCE -> KUBE-PROXY -> ...
Just specifying LB security groups is not enough (if you, of course, don't want to have ec2 instance security groups open to the world). So different controller implementations try to solve the problem uniquely.

  1. AWS cloud provider

Once the load balancer is spun up and the security groups (specified or created) are attached to it, the controller searches for the security groups attached to the network interfaces (ENIs) of all the instances in the cluster. Once it finds the group(s), it adds a rule so that all network interfaces of all the instances in the cluster permit traffic from the load balancer.

  2. ALB ingress controller

This one tries to be more precise in specifying rules, and only when the annotation alb.ingress.kubernetes.io/security-groups is not specified. In that case, the controller creates a group for the instance with a predefined rule that references the load balancer SG. Then it finds the relevant ENI (in case there are several on an instance) on each of the instances and tries to attach the SG to it. And we reach the limit pretty fast here.
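To make the difference concrete, here is a toy model of the two behaviours as described in this comment. It is not the code of either controller: the class and function names are invented, the default limit of 5 comes from the issue title, and each ENI is assumed to start with one node security group already attached.

```python
from dataclasses import dataclass, field

SG_PER_ENI_LIMIT = 5  # default quota from the issue title


class SecurityGroupsPerInterfaceLimitExceeded(Exception):
    """Stand-in for the AWS error of the same name."""


@dataclass
class Eni:
    # Assumption: one node security group is attached before any ingress exists.
    attached_sgs: list = field(default_factory=lambda: ["sg-node"])

    def attach(self, sg_id: str, limit: int = SG_PER_ENI_LIMIT) -> None:
        if len(self.attached_sgs) >= limit:
            raise SecurityGroupsPerInterfaceLimitExceeded(sg_id)
        self.attached_sgs.append(sg_id)


def per_ingress_strategy(eni: Eni, ingress_count: int) -> int:
    """Attach one managed instance SG per ingress; return how many succeeded."""
    succeeded = 0
    for i in range(ingress_count):
        try:
            eni.attach(f"sg-instance-{i}")
        except SecurityGroupsPerInterfaceLimitExceeded:
            break
        succeeded += 1
    return succeeded


def shared_rule_strategy(node_sg_rules: list, ingress_count: int) -> int:
    """Add one inbound rule per LB SG to the existing node SG; return rule count."""
    for i in range(ingress_count):
        node_sg_rules.append(f"allow from sg-lb-{i}")
    return len(node_sg_rules)


eni = Eni()
print(per_ingress_strategy(eni, 10))   # 4: the fifth attachment slot is the last one
print(shared_rule_strategy([], 10))    # 10: attachments on the ENI are unchanged
```

In the shared-rule model the constraint moves from attachments per ENI to rules per security group, which is the limit other commenters in this thread point out.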

Is that the right logic?

I'm wondering why the ALB ingress controller doesn't use the same logic as the AWS cloud provider. It seems pretty neat and right. The ALB ingress controller in IP target-type mode (alb.ingress.kubernetes.io/target-type annotation) solves the problem of an extra hop by not forwarding the traffic through a proxy. In this case, traffic goes directly to the recipient (the IP allocated to the Pod), so why do we need to make things more complicated due to security concerns? We can still solve the security issues the native Kubernetes way, using Network Policies.

The workaround.

As has been said, the workaround for this is to manage security groups externally. In our organization, we used Terraform to spin up the EKS cluster, so the groups for load balancers are going to be managed from there.

I had some doubts about whether Terraform would conflict with changes made by the AWS cloud provider (which adds rules to existing SGs attached to an instance), but as it turned out, it doesn't. When you apply a plan, it just adds additional rules to its state and doesn't manage those if there's no rule (aws_security_group_rule) matching the name.

any news on this?

Hello,

While the alb.ingress.kubernetes.io/security-groups annotation works, is there any possibility of specifying a default for this and allowing users to override it with this annotation? At my company this is causing a lot of support requests as end users are not aware we are running up against SG limits, and alb-ingress doesn't put a note in the ingress object's event log as to that being the case.

Thanks!

Hi,

@M00nF1sh #1019 fixes this issue, doesn't it?
If alb-ingress-controller modifies worker's security group and reconciles changes, then we shouldn't hit any limitations on Security Groups. And it should work even with Managed Node Groups.

March 24-2020.

SecurityGroupsPerInterfaceLimitExceeded was also thrown in my case.

@M00nF1sh
Is there any way the ingress controller can use NODE AFFINITY to prevent PODS from being allocated to nodes with ENI's that have reached the hard limit? If the Cluster Autoscaling is managed properly PODS can get allocated to new Nodes/ENIS thus avoiding the hard limit AT LEAST at the ENI level.
Here are the current limits:

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Thanks!

https://github.com/Alexx-G
+1. But actually, the Security Groups themselves have a HARD limit of about 40 rules. So with your suggestion we can increase the amount of ingresses in the cluster significantly but not really make it unlimited.

Edit:
The limits for SG numbers and SG rules have been increased across the board but this is still a possibility unless Services stop pointing to Nodes that can't take any more.
https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

I don't understand what you mean. The logs are the reason I was able to find this issue:

kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile securityGroup associations due to SecurityGroupsPerInterfaceLimitExceeded: The maximum number of security groups per interface has been reached.\n\tstatus code: 400, request id: ************"  "controller"="alb-ingress-controller" "request"={"Namespace":"******","Name":"*********"}
Agreed,
Grouping ALBs per ingress policies can reduce the number of security groups needed per ALB, and hence the number of instance SGs that need to be attached. If the Ingress definition is left without an explicit SG, every time you create an ingress a new pair of SGs gets created, until you reach the hard limit on the ENI side.

Edit, just for curiosity I took a look at cloud provider and found this feature request. Interestingly enough it seems they handle everything on a single SG with no OPTION for multiple SG support.

https://github.com/kubernetes/cloud-provider-aws/issues/81

@bigkraig How about Node Affinity with Autoscaling? More nodes more ENIs. PODS pertaining to an Ingress that can't get on an ENI can be moved to a new node.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

