eksctl: 'delete nodegroup' doesn't drain nodes

Created on 4 Nov 2019 · 1 comment · Source: weaveworks/eksctl

What happened?
Running 'delete nodegroup' immediately terminated all the nodes before draining them and moving pods to the new nodegroup.

What you expected to happen?
The nodegroup should have been drained first, i.e., existing pods moved to the new nodegroup, before the nodes were deleted.
The logs show the nodes being cordoned, but in reality the nodegroup was deleted instantly, and there was massive downtime on the cluster while pods were slowly rescheduled onto the new nodegroup. No actual draining of nodes took place.

This is the behavior described in the official docs, and the EKS upgrade docs recommend the same approach.

How to reproduce it?
Follow the upgrade steps for an EKS cluster using eksctl: create a new nodegroup, then delete the previous nodegroup with:

eksctl delete nodegroup --cluster mycluster --name ng-1

Anything else we need to know?
Windows 10
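As a possible manual workaround until this is fixed, the old nodes can be drained with kubectl before the nodegroup is deleted. This is only a sketch: it assumes kubectl access to the cluster, and the node names below are the ones from the logs in this report.

```shell
# Cordon and drain each node in the old nodegroup manually.
# --ignore-daemonsets skips DaemonSet-managed pods, which cannot be evicted.
kubectl drain ip-192-168-18-105.ec2.internal --ignore-daemonsets
kubectl drain ip-192-168-62-100.ec2.internal --ignore-daemonsets
kubectl drain ip-192-168-67-209.ec2.internal --ignore-daemonsets

# Once all workloads have been rescheduled onto the new nodegroup,
# delete the old nodegroup.
eksctl delete nodegroup --cluster mycluster --name ng-1
```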

Versions

$ eksctl version 0.8
$ kubectl version  1.14

Logs

eksctl delete nodegroup --cluster mycluster --name ng-1
[ℹ]  eksctl version 0.8.0
[ℹ]  using region us-east-1
[ℹ]  combined include rules: ng-1
[ℹ]  1 nodegroup (ng-1) was included (based on the include/exclude rules)
[ℹ]  will delete 1 nodegroups from auth ConfigMap in cluster "mycluster"
[ℹ]  removing identity "arn:aws:iam::xxx:role/eksctl-mycluster-nodegroup-ng-1-NodeInstanceRole-T528C66G9SYK" from auth ConfigMap (username = "system:node:{{EC2PrivateDNSName}}", groups = ["system:bootstrappers" "system:nodes"])
[ℹ]  will drain 1 nodegroups in cluster "mycluster"
[ℹ]  cordon node "ip-192-168-18-105.ec2.internal"
[ℹ]  cordon node "ip-192-168-62-100.ec2.internal"
[ℹ]  cordon node "ip-192-168-67-209.ec2.internal"
[!]  ignoring DaemonSet-managed Pods: default/datadog-agent-ksrv9, default/dsinfra-agent-q4f9k, default/logdna-agent-m2xtw, kube-system/aws-node-jkgd2, kube-system/kube-proxy-9htpd
[!]  ignoring DaemonSet-managed Pods: default/datadog-agent-8vzkz, default/dsinfra-agent-p2lf9, default/logdna-agent-nb5wv, kube-system/aws-node-g55k4, kube-system/kube-proxy-gc6dj
[!]  ignoring DaemonSet-managed Pods: default/datadog-agent-x9d99, default/dsinfra-agent-7qnsr, default/logdna-agent-bzndz, kube-system/aws-node-b5427, kube-system/kube-proxy-8rplq
[✔]  drained nodes: [ip-192-168-18-105.ec2.internal ip-192-168-62-100.ec2.internal ip-192-168-67-209.ec2.internal]
[ℹ]  will delete 1 nodegroups from cluster "mycluster"
[ℹ]  1 task: { delete nodegroup "ng-1" [async] }
[✔]  deleted 1 nodegroups from cluster "mycluster"
Label: kind/bug

Most helpful comment

It seems the pods were deleted instantly because no PodDisruptionBudget was set.
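For context, evictions during a drain honor PodDisruptionBudgets, so setting one forces the drain to keep a minimum number of replicas running while pods move to the new nodegroup. A minimal example (the name and label selector are hypothetical, not from this cluster):

```yaml
# Hypothetical PDB keeping at least 2 replicas of the "my-app"
# workload available while nodes are being drained.
apiVersion: policy/v1beta1   # policy/v1beta1 is the API version for k8s 1.14
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```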

