Hi folks,
I have a two-node EKS cluster with a single instance of my application running on each node. Everything works great: my ingress DNS name can reach the pod on each EC2 instance, and the target group shows each instance as available and healthy. The problem started when I enabled stickiness. I do receive the AWSALB cookie, but I'm still jumping between the two EC2 instances on subsequent requests. We added a blurb to our theme JSPs to show the name of the pod being accessed. Stickiness is applied at the node (instance) level, so I don't understand how I can be randomly bouncing between the two EC2 instances listed in the target group.
My ALB values in our ingress:
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/target-group-attributes: stickness.enabled=true,stinkyness.ib_cookie.duration_seconds=1200
alb.ingress.kubernetes.io/target-type: instance
I was using the ALB ingress controller image v1 and updated to v1.1.2, and the behavior is still the same. My application's Service is exposed via NodePort, as in the 2048 example. Again, everything works like a champ, but enabling stickiness, while it does provide the cookie, seemingly does nothing to keep the client going to the same worker node.
I hope this may be useful for others, it was resolved by using this in the ALB:
alb.ingress.kubernetes.io/target-type: ip
Now I'm a little confused about why instance wouldn't work as expected, so I'll leave this open for now and see if we get a comment on that. Cheers!
Hi,
The behavior you observed happens because the NodePort on each worker node can proxy traffic to any pod in your cluster, whether it runs on the same EC2 instance or a different one.
Thanks M00nF1sh, however I'm still scratching my head over how that is OK when you have stickiness enabled and the target type set to instance. Shouldn't that send the client to the same instance (EC2 worker node) each time, as long as the AWSALB cookie is valid? IP mode sends traffic straight to the pod, which is very cool; it works great and is a valid way to use it, but instance mode should stick to the instance. Or, at a minimum, if the target type is set to instance then stickiness should not be allowed to be configured in the ingress, since it basically does nothing in EKS. Beyond that, it provides a cookie that has no use at that point either.
@georgefridrich Take a look here and work your way up. I agree with @M00nF1sh: in instance mode the Kubernetes Service is published as a NodePort so that the ALB's target group can talk to the service.
Since the routing decision happens at the Service level, there is no mechanism I know of so far that would let a Service choose the same Pod each time, which is what actual stickiness would require. Remember, the routing happens between containers, not nodes.
https://v1-11.docs.kubernetes.io/docs/concepts/services-networking/service/#the-gory-details-of-virtual-ips
@M00nF1sh The follow-up question is: how does stickiness work with AWS ELBs? Those use NodePorts as well.
Check this link out. I don't know enough about the ALB ingress controller at this point, but a NAT issue in NodePort mode is a plausible cause. The solution offered there is to use HAProxy, as described in the Kubernetes docs, which handles IP persistence across hops:
https://nishadikirielle.blogspot.com/2016/03/load-balancing-kubernetes-services-and.html
For the record in the Kubernetes world they use the term Session Affinity, not stickiness. That's how I was able to find it.
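For readers looking for what Session Affinity looks like on a Service, here is a minimal sketch. The name, label, and ports are hypothetical placeholders. Note the caveat: behind an ALB in instance mode, kube-proxy typically sees the load balancer's address as the source rather than the end user's, so this setting may not give per-user stickiness in this setup.

```yaml
# Sketch only: names, labels, and ports are placeholders, not taken from the thread.
apiVersion: v1
kind: Service
metadata:
  name: my-app                 # hypothetical
spec:
  type: NodePort
  sessionAffinity: ClientIP    # route a given source IP to the same pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 1200     # assumed to mirror the cookie duration used in the thread
  selector:
    app: my-app                # hypothetical label
  ports:
    - port: 80
      targetPort: 8080         # hypothetical container port
```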
You could try setting the service's externalTrafficPolicy to Local. This would force requests that reach a node to only get routed to pods on that node. There are side effects regarding instance health in the load balancer and distribution of traffic between pods though.
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#servicespec-v1-core
https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
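As a rough illustration of the externalTrafficPolicy suggestion, a NodePort Service could look like the sketch below. The Service name, label, and ports are hypothetical. With one pod per node, as in the original setup, a request that lands on a node would only be forwarded to the pod on that node.

```yaml
# Sketch only: names, labels, and ports are placeholders, not taken from the thread.
apiVersion: v1
kind: Service
metadata:
  name: my-app                   # hypothetical
spec:
  type: NodePort
  externalTrafficPolicy: Local   # only route to pods on the node that received the request
  selector:
    app: my-app                  # hypothetical label
  ports:
    - port: 80
      targetPort: 8080           # hypothetical container port
```

Nodes without a local pod drop the traffic and fail the load balancer's health check, which is the instance-health side effect mentioned above.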
According to the documentation here, it should be:
alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true,stickiness.lb_cookie.duration_seconds=1200
Instead of:
alb.ingress.kubernetes.io/target-group-attributes: stickness.enabled=true,stinkyness.ib_cookie.duration_seconds=1200
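For reference, here is a minimal sketch of an Ingress that combines the corrected attribute names with the ip target type reported earlier in the thread as the fix. The resource name, path, backend Service, and port are hypothetical, and the apiVersion shown is an assumption about a current cluster; clusters from the v1.1.x controller era would have used extensions/v1beta1 with a different backend syntax.

```yaml
# Sketch only: name, path, and backend are placeholders, not taken from the thread.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                   # hypothetical
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true,stickiness.lb_cookie.duration_seconds=1200
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app     # hypothetical Service
                port:
                  number: 80
```

In ip mode the target group registers pod IPs directly, so the ALB cookie pins a client to a pod rather than to a node.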
Hi @georgefridrich, have you found a solution that supports the instance target type? Thanks.