Eksctl: Nodegroup Deletion blocked

Created on 2 Apr 2019 · 15 Comments · Source: weaveworks/eksctl

When trying to delete the nodegroup, eksctl includes DaemonSet-managed pods in its eviction pass and therefore refuses to delete the nodegroup. There needs to be a way for these pods to be ignored so deletion can continue. The only reason to block deletion of the group should be a pod that is not managed by a DaemonSet and has not been rescheduled elsewhere.

2019-04-01T18:54:44-04:00 [!]  removing nodegroup from auth ConfigMap: nodegroup instance role ARN is not set
2019-04-01T18:54:44-04:00 [▶]  no need to cordon node "ip-10-100-10-100.ec2.internal"
2019-04-01T18:54:44-04:00 [▶]  no need to cordon node "ip-10-100-10-101.ec2.internal"
2019-04-01T18:54:44-04:00 [▶]  no need to cordon node "ip-10-100-10-102.ec2.internal"
2019-04-01T18:54:44-04:00 [▶]  already drained: []
2019-04-01T18:54:44-04:00 [▶]  will drain: [ip-10-100-10-100.ec2.internal ip-10-100-10-101.ec2.internal ip-10-100-10-102.ec2.internal]
2019-04-01T18:54:45-04:00 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-gzx2c, kube-system/kube-proxy-x2msf
2019-04-01T18:54:45-04:00 [▶]  0 pods to be evicted from ip-10-100-100-100.ec2.internal
2019-04-01T18:54:45-04:00 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-nlfzh, kube-system/kube-proxy-nw9hd
2019-04-01T18:54:45-04:00 [▶]  0 pods to be evicted from ip-10-100-10-101.ec2.internal
2019-04-01T18:54:45-04:00 [!]  ignoring DaemonSet-managed Pods: kube-system/aws-node-6v2x2, kube-system/kube-proxy-w2s5z
2019-04-01T18:54:45-04:00 [✖]  Cannot evict pod as it would violate the pod's disruption budget.
Labels: area/deletions, area/nodegroup, kind/bug

Most helpful comment

It was my fault - one replica of pod and PDB with only one replica allowed.
Thanks!

All 15 comments

Certainly "Cannot evict pod as it would violate the pod's disruption budget" shouldn't really occur here.

As I mentioned on Slack, you want to use --drain=false to get around this for now.
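The workaround mentioned above might look like this; the cluster and nodegroup names are placeholders, not values from this issue:

```shell
# Skip eksctl's drain pass entirely so DaemonSet pods and strict PDBs
# cannot block deletion. Note: any non-DaemonSet pods still on these
# nodes will be terminated without a graceful eviction.
eksctl delete nodegroup \
  --cluster=my-cluster \
  --name=my-nodegroup \
  --drain=false
```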

@errordeveloper Any way around this? I would prefer some kind of drain for all resources other than DaemonSets.

Sorry, should have said earlier: you should try to use 'kubectl drain -l alpha.eksctl.io/nodegroup-name=<name>', followed by 'eksctl delete ng --drain=false'. Let us know how you get on!
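The suggested two-step approach, sketched with placeholder names (nothing here comes from the issue itself):

```shell
# Hypothetical cluster/nodegroup names; substitute your own.
NODEGROUP=my-nodegroup

# Drain the nodes with kubectl first, selecting them by the label
# eksctl puts on nodegroup nodes, so you control eviction behaviour.
kubectl drain -l alpha.eksctl.io/nodegroup-name="$NODEGROUP" \
  --ignore-daemonsets

# Then delete the nodegroup without eksctl's own drain pass.
eksctl delete nodegroup --cluster=my-cluster --name="$NODEGROUP" --drain=false
```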


Is the equivalent of kubectl drain --ignore-daemonsets=true what is needed?

That is what we already do; I am not sure why it is not working. If OP can reproduce it repeatedly, I would happily look into the details.


So probably "Cannot evict pod as it would violate the pod's disruption budget." refers to some non-DaemonSet pod on @cdenneen's cluster with a strict PDB that effectively prevents eviction?
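One way to check for such a PDB (a generic diagnostic, not something from this issue):

```shell
# List all PodDisruptionBudgets in the cluster. A PDB whose
# ALLOWED DISRUPTIONS column shows 0 will block eviction (and
# therefore the drain) for every pod its selector matches.
kubectl get pdb --all-namespaces
```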

I am getting the following error if I try to drain the nodes:

node/ip-192-168-19-2.ap-southeast-2.compute.internal cordoned
node/ip-192-168-39-234.ap-southeast-2.compute.internal cordoned
node/ip-192-168-54-95.ap-southeast-2.compute.internal cordoned
error: unable to drain node "ip-192-168-19-2.ap-southeast-2.compute.internal", aborting command...

There are pending nodes to be drained:
 ip-192-168-19-2.ap-southeast-2.compute.internal
 ip-192-168-39-234.ap-southeast-2.compute.internal
 ip-192-168-54-95.ap-southeast-2.compute.internal
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): default/newrelic-infra-99prl, kube-system/aws-node-tsqht, kube-system/filebeat-b4fmx, kube-system/kube-proxy-zb687
cannot delete Pods with local storage (use --delete-local-data to override): kube-system/kubernetes-dashboard-5dd89b9875-blctn, ratecity/swift-client-58766f4ffb-bzcj2, ratecity/swift-server-67c97f965b-n5lgp
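The error message itself names the flags needed; re-running the drain with them might look like this (the node name is just one of those listed above):

```shell
# --ignore-daemonsets skips DaemonSet-managed pods (they would be
# recreated immediately anyway); --delete-local-data evicts pods
# using emptyDir volumes, discarding their local data.
kubectl drain ip-192-168-19-2.ap-southeast-2.compute.internal \
  --ignore-daemonsets \
  --delete-local-data
```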

I am referring to the approach in this comment - https://github.com/weaveworks/eksctl/issues/693#issuecomment-493687068

I was having this too, but in my case it was because one of my apps was set to replicas: 1 while its PDB showed MAX UNAVAILABLE: 1.

With replicas: 2, the budget is respected and the draining works.

So it was a problem of app misconfiguration in my case.
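The fix described above amounts to giving the PDB headroom; a sketch with a hypothetical deployment name:

```shell
# With a single replica and a PDB requiring one pod available,
# eviction can never succeed. A second replica gives the budget
# room to allow one disruption at a time.
kubectl scale deployment my-app --replicas=2

# Confirm the budget now permits an eviction
# (ALLOWED DISRUPTIONS should be >= 1).
kubectl get pdb
```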

There is a bug, I can reproduce it, and I'm working on a fix.

Should be fixed in eksctl 0.7.0

Does not help:

[ℹ] eksctl version 0.7.0
...
[ℹ] will drain 1 nodegroups in cluster "dev"
[ℹ] cordon node "ip-100-64-184-47.eu-west-1.compute.internal"
[ℹ] cordon node "ip-100-64-213-197.eu-west-1.compute.internal"
[ℹ] cordon node "ip-100-64-80-222.eu-west-1.compute.internal"
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-vjvcw, kube-system/calico-node-z8hwv, kube-system/kube-proxy-h9qlk
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-184-47.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-chhtj, kube-system/calico-node-jkz9h, kube-system/kube-proxy-52tpl
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-213-197.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-z6r49, kube-system/calico-node-5ht5r, kube-system/kube-proxy-gstkf
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-80-222.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-vjvcw, kube-system/calico-node-z8hwv, kube-system/kube-proxy-h9qlk
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-184-47.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-chhtj, kube-system/calico-node-jkz9h, kube-system/kube-proxy-52tpl
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-213-197.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-z6r49, kube-system/calico-node-5ht5r, kube-system/kube-proxy-gstkf
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-80-222.eu-west-1.compute.internal – will retry after delay of 5s
...

@kovalyukm v0.7.0 provides a retry mechanism. For the PDB to be respected, you must set an appropriate PDB configuration.
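An "appropriate PDB configuration" in this sense is one that leaves room for at least one disruption. A sketch, with hypothetical names and labels (policy/v1beta1 being the PDB API version current at the time of this thread):

```shell
# With two or more replicas running, maxUnavailable: 1 lets a node
# drain evict the pods one at a time instead of being blocked.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
EOF
```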

It's an installation via eksctl.
eksctl version 0.11.0
The same issue when trying to delete the old nodegroup :(

Hi @kovalyukm I believe this was already fixed, can you confirm that you are still experiencing this issue with newer versions? Can you provide some logs?

It was my fault: one replica of the pod, and a PDB allowing only one replica.
Thanks!
