Containers-roadmap: [EKS] [request]: Drain nodes gracefully during autoscaling terminations

Created on 7 Aug 2019 · 7 Comments · Source: aws/containers-roadmap

Tell us about your request
When an instance is terminated during an autoscaling event, I would like the aws-node daemon running on that instance to detect the event and gracefully drain the node, without having to use something like a Lambda as described here:
https://github.com/aws-samples/amazon-k8s-node-drainer

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

I'd like to run EKS without external dependencies, especially ones whose failure could cause a cluster to go down.

Are you currently working around this issue?
Planning to implement the solution as described in https://github.com/aws-samples/amazon-k8s-node-drainer
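
For readers unfamiliar with that style of workaround, here is a minimal sketch of the event-handling half, assuming the drainer is triggered by an Auto Scaling terminate lifecycle hook delivered as an EventBridge event. The function names `parse_termination_event` and `completion_params` are hypothetical and are not taken from the linked sample; the returned dictionary is shaped to match the parameters of the Auto Scaling CompleteLifecycleAction API. The actual drain step is omitted.

```python
from typing import Optional

# Detail type used by Auto Scaling for terminate lifecycle hook events.
TERMINATE_DETAIL_TYPE = "EC2 Instance-terminate Lifecycle Action"


def parse_termination_event(event: dict) -> Optional[dict]:
    """Pull out the fields a drainer needs from a terminate lifecycle event.

    Returns None for any event that is not a terminate lifecycle action.
    """
    if event.get("detail-type") != TERMINATE_DETAIL_TYPE:
        return None
    detail = event.get("detail", {})
    return {
        "InstanceId": detail["EC2InstanceId"],
        "AutoScalingGroupName": detail["AutoScalingGroupName"],
        "LifecycleHookName": detail["LifecycleHookName"],
        "LifecycleActionToken": detail["LifecycleActionToken"],
    }


def completion_params(hook: dict, drained: bool) -> dict:
    """Build keyword arguments for CompleteLifecycleAction.

    For a terminating instance both results let termination proceed;
    ABANDON additionally stops any remaining lifecycle hooks.
    """
    return {**hook, "LifecycleActionResult": "CONTINUE" if drained else "ABANDON"}
```

Once the node has been drained, the result of `completion_params` could be passed to the boto3 Auto Scaling client as `complete_lifecycle_action(**params)` so the instance is released for termination.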

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

EKS Proposed

All 7 comments

As a workaround you can have a look at:

Disclaimer: I'm the author of the second one.

the aws-node daemon that is running should be able to detect the event

aws-node is the CNI. It wouldn't be used for node draining. Luckily, AWS already has a project that might support ASG terminations soon: https://github.com/aws/aws-node-termination-handler/issues/14

@max-rocket-internet @aaronmell
EKS managed node groups perform a drain when the Auto Scaling group terminates an instance on scale-in or rebalancing. Does that satisfy the requirements? Please re-open if it doesn't.

We use a custom AMI based on the AMI AWS provides. We do this because we need to install some additional monitoring tools for compliance reasons. This makes managed node groups a non-starter, because you can only use the official AMI. If managed node groups allowed a custom AMI in place of the standard one, they would be exactly what we are looking for, and we would probably migrate off our existing solution.

Agreed with @aaronmell - the PR was around being able to do this before managed worker nodes were introduced. I'm also not intending to move to managed nodes, so I would still like this feature to be added.

I also don't think this issue should be closed, as using managed nodes isn't an option for many people because they lack custom AMI and userdata support.
I did come across this way of handling node draining on termination from Zalando's kubernetes-on-aws: https://github.com/zalando-incubator/kubernetes-on-aws/blob/449f8f3bf5c60e0d319be538460ff91266337abc/cluster/userdata-worker.yaml#L92-L120
It requires a kubeconfig on the worker nodes, though, so that they can tell the API server to drain them.
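
To illustrate why the kubeconfig is needed, here is a rough sketch of a node-side drain step that a shutdown hook could run. It is not a reproduction of the linked Zalando unit; the helper name `build_drain_command`, the default timeout, and the example paths are assumptions for illustration only. The kubectl flags used are standard `kubectl drain` options.

```python
from typing import List


def build_drain_command(node_name: str, kubeconfig: str, timeout_seconds: int = 120) -> List[str]:
    """Assemble a `kubectl drain` invocation for the local node.

    The kubeconfig is what lets the node authenticate to the API server,
    which is the dependency mentioned in the comment above.
    """
    return [
        "kubectl",
        "--kubeconfig", kubeconfig,
        "drain", node_name,
        # DaemonSet pods cannot be evicted by drain, so skip them.
        "--ignore-daemonsets",
        # Allow eviction of pods using emptyDir volumes
        # (older kubectl releases call this flag --delete-local-data).
        "--delete-emptydir-data",
        # Also evict pods that are not managed by a controller.
        "--force",
        # Give up rather than block shutdown indefinitely.
        f"--timeout={timeout_seconds}s",
    ]
```

A shutdown hook could then run the result with `subprocess.run(cmd, check=True)`. The timeout would need to fit within whatever grace period the instance has before it is terminated.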

@MarcusNoble I created a separate issue to track this feature request for self-managed nodes: #783
