Containers-roadmap: [EKS] [request]: Drain nodes gracefully during autoscaling terminations

Created on 7 Aug 2019 · 7 Comments · Source: aws/containers-roadmap

Tell us about your request
What I would like to see happen is that, when an instance is terminated during an autoscaling event, the aws-node daemon that is running should be able to detect the event and gracefully drain the node, without having to use something like a Lambda as described here:
https://github.com/aws-samples/amazon-k8s-node-drainer
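
For context, the Lambda-based approach depends on an Auto Scaling lifecycle hook that pauses the termination until something drains the node and then reports back. Below is a minimal sketch of that handshake in Python. It assumes the standard lifecycle-hook event fields, the function names are hypothetical, and it is not the linked sample's actual code.

```python
# Transition name that Auto Scaling reports for a terminating instance.
TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"


def parse_termination_event(event):
    """Return the lifecycle details of a terminating instance, or None.

    `event` is assumed to be the lifecycle-hook notification as delivered
    through CloudWatch Events / EventBridge.
    """
    detail = event.get("detail", {})
    if detail.get("LifecycleTransition") != TERMINATING:
        return None  # not a termination, nothing to drain
    return {
        "InstanceId": detail["EC2InstanceId"],
        "AutoScalingGroupName": detail["AutoScalingGroupName"],
        "LifecycleHookName": detail["LifecycleHookName"],
        "LifecycleActionToken": detail["LifecycleActionToken"],
    }


def completion_request(details):
    """Build keyword arguments for the CompleteLifecycleAction API call.

    CONTINUE tells Auto Scaling to proceed with the termination once the
    drain has finished (or has been given up on).
    """
    return {**details, "LifecycleActionResult": "CONTINUE"}
```

After the node has been drained, the returned dictionary could be passed to boto3 as `boto3.client("autoscaling").complete_lifecycle_action(**completion_request(details))`. Whatever runs this code is exactly the kind of external dependency this request is trying to avoid.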

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

I'd like to run EKS without external dependencies, especially when a failure in one of those dependencies could take a cluster down.

Are you currently working around this issue?
Planning to implement the solution as described in https://github.com/aws-samples/amazon-k8s-node-drainer


EKS Proposed

aaronmell

👍17

Most helpful comment

We use a custom AMI based on the AMI AWS provides. We do this because we need to install some additional monitoring tools for compliance reasons. This makes managed node groups a non-starter, because you can only use the official AMI. If managed node groups allowed a custom AMI in place of the standard one, they would be exactly what we are looking for, and we would probably migrate off our existing solution.

aaronmell on 11 Dec 2019

👍9

All 7 comments

As a workaround you can have a look at:

Disclaimer: I'm the author of the second one.

pawelprazak on 30 Oct 2019

the aws-node daemon that is running should be able to detect the event

aws-node is the CNI. It wouldn't be used for node draining. But luckily AWS already have a solution that might support ASG terminations soon: https://github.com/aws/aws-node-termination-handler/issues/14

max-rocket-internet on 14 Nov 2019

@max-rocket-internet @aaronmell
An EKS managed node group performs a drain when the Auto Scaling group terminates an instance on scale-in or rebalancing. Does that satisfy the requirements? Please re-open if it doesn't.

rtripat on 10 Dec 2019

aaronmell on 11 Dec 2019

👍9

Agreed with @aaronmell - the PR was about being able to do this before managed worker nodes were introduced. I'm also not intending to move to managed nodes, so I would still like this feature to be added.

grrywlsn on 11 Dec 2019

I also don't think this issue should be closed, as managed nodes aren't an option for many people because they support neither custom AMIs nor userdata.
I did come across this way of handling node draining on termination from Zalando's kubernetes-on-aws: https://github.com/zalando-incubator/kubernetes-on-aws/blob/449f8f3bf5c60e0d319be538460ff91266337abc/cluster/userdata-worker.yaml#L92-L120
It requires a kubeconfig on the worker nodes, though, so that they can communicate with the API server and tell it to drain.
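
To illustrate the kubeconfig requirement, here is a rough Python sketch of a node draining itself by shelling out to `kubectl drain`. It is not taken from the Zalando userdata. The function names, the kubeconfig path and the timeout are assumptions for illustration only.

```python
import subprocess


def drain_command(node_name, kubeconfig, timeout_seconds=120):
    """Build the kubectl invocation that cordons the node and evicts its pods."""
    return [
        "kubectl",
        "--kubeconfig", kubeconfig,   # credentials the node needs to reach the API server
        "drain", node_name,
        "--ignore-daemonsets",        # DaemonSet pods cannot be evicted, so skip them
        f"--timeout={timeout_seconds}s",
    ]


def drain(node_name, kubeconfig):
    """Run the drain and report whether kubectl exited successfully."""
    result = subprocess.run(drain_command(node_name, kubeconfig), check=False)
    return result.returncode == 0
```

Something like `drain("ip-10-0-0-1.ec2.internal", "/etc/kubernetes/kubeconfig")` would then have to be triggered from a shutdown hook on the node. Both values are placeholders, and pods that use emptyDir volumes would block the drain unless an extra flag is passed.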