So when trying to delete the nodegroup, it includes DaemonSet pods in its eviction pass and therefore refuses to delete the nodegroup. There needs to be a way for these kinds of pods to be ignored so deletion can continue. The only reason to block group deletion should be a pod that was run without a DaemonSet and hasn't moved.
2019-04-01T18:54:44-04:00 [!] removing nodegroup from auth ConfigMap: nodegroup instance role ARN is not set
2019-04-01T18:54:44-04:00 [▶] no need to cordon node "ip-10-100-10-100.ec2.internal"
2019-04-01T18:54:44-04:00 [▶] no need to cordon node "ip-10-100-10-101.ec2.internal"
2019-04-01T18:54:44-04:00 [▶] no need to cordon node "ip-10-100-10-102.ec2.internal"
2019-04-01T18:54:44-04:00 [▶] already drained: []
2019-04-01T18:54:44-04:00 [▶] will drain: [ip-10-100-10-100.ec2.internal ip-10-100-10-101.ec2.internal ip-10-100-10-102.ec2.internal]
2019-04-01T18:54:45-04:00 [!] ignoring DaemonSet-managed Pods: kube-system/aws-node-gzx2c, kube-system/kube-proxy-x2msf
2019-04-01T18:54:45-04:00 [▶] 0 pods to be evicted from ip-10-100-100-100.ec2.internal
2019-04-01T18:54:45-04:00 [!] ignoring DaemonSet-managed Pods: kube-system/aws-node-nlfzh, kube-system/kube-proxy-nw9hd
2019-04-01T18:54:45-04:00 [▶] 0 pods to be evicted from ip-10-100-10-101.ec2.internal
2019-04-01T18:54:45-04:00 [!] ignoring DaemonSet-managed Pods: kube-system/aws-node-6v2x2, kube-system/kube-proxy-w2s5z
2019-04-01T18:54:45-04:00 [✖] Cannot evict pod as it would violate the pod's disruption budget.
Certainly "Cannot evict pod as it would violate the pod's disruption budget" shouldn't really occur here.
As I mentioned on Slack, you want to use --drain=false to get around this for now.
@errordeveloper Any way around this? I would prefer some kind of drain for all resources, not just DaemonSets.
Sorry, should have said earlier: you should try to use `kubectl drain -l alpha.eksctl.io/nodegroup-name=<name>` together with `--drain=false`. Let us know how you get on!
Is the equivalent of kubectl drain --ignore-daemonsets=true what is needed?
That is what we already do; I am not sure why it is not working. If the OP can reproduce this repeatedly, I would happily look into the details.
So probably ‘Cannot evict pod as it would violate the pod's disruption budget.’ is referring to some non-DS Pod on @cdenneen’s cluster which had a strict PDB set that effectively prevents eviction?
I am getting the following error if I try to drain the nodes:
node/ip-192-168-19-2.ap-southeast-2.compute.internal cordoned
node/ip-192-168-39-234.ap-southeast-2.compute.internal cordoned
node/ip-192-168-54-95.ap-southeast-2.compute.internal cordoned
error: unable to drain node "ip-192-168-19-2.ap-southeast-2.compute.internal", aborting command...
There are pending nodes to be drained:
ip-192-168-19-2.ap-southeast-2.compute.internal
ip-192-168-39-234.ap-southeast-2.compute.internal
ip-192-168-54-95.ap-southeast-2.compute.internal
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): default/newrelic-infra-99prl, kube-system/aws-node-tsqht, kube-system/filebeat-b4fmx, kube-system/kube-proxy-zb687
cannot delete Pods with local storage (use --delete-local-data to override): kube-system/kubernetes-dashboard-5dd89b9875-blctn, ratecity/swift-client-58766f4ffb-bzcj2, ratecity/swift-server-67c97f965b-n5lgp
I am referring to the approach in this comment - https://github.com/weaveworks/eksctl/issues/693#issuecomment-493687068
I was having this too, but in my case it was because one of my apps was set to replicas: 1, and the PDB was set to MAX UNAVAILABLE: 1.
With replicas: 2, the budget is respected and the draining works.
So it was a problem of app misconfiguration in my case.
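For anyone else hitting this, the budget arithmetic behind that misconfiguration can be sketched roughly like this (a simplification in Python; the function name and signature are mine, not the Kubernetes API server's actual code). An eviction is only permitted while the computed allowed disruptions is greater than zero, so a single-replica app whose PDB requires one pod to stay available can never be evicted:

```python
def disruptions_allowed(healthy, min_available=None, max_unavailable=None, replicas=None):
    """Sketch of PDB accounting: how many pods may currently be evicted."""
    if min_available is not None:
        desired_healthy = min_available
    else:
        # maxUnavailable is expressed relative to the desired replica count
        desired_healthy = replicas - max_unavailable
    return max(0, healthy - desired_healthy)

# One healthy replica guarded by minAvailable: 1 -> 0 allowed, eviction blocked
print(disruptions_allowed(healthy=1, min_available=1))   # 0
# Two healthy replicas with the same budget -> 1 allowed, drain can proceed
print(disruptions_allowed(healthy=2, min_available=1))   # 1
```

This is why bumping to replicas: 2 (or relaxing the budget) unblocks the drain: the budget finally leaves room for at least one voluntary disruption.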
There is a bug, I can reproduce it, and I'm working on a fix.
Should be fixed in eksctl 0.7.0
Does not help:
[ℹ] eksctl version 0.7.0
...
[ℹ] will drain 1 nodegroups in cluster "dev"
[ℹ] cordon node "ip-100-64-184-47.eu-west-1.compute.internal"
[ℹ] cordon node "ip-100-64-213-197.eu-west-1.compute.internal"
[ℹ] cordon node "ip-100-64-80-222.eu-west-1.compute.internal"
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-vjvcw, kube-system/calico-node-z8hwv, kube-system/kube-proxy-h9qlk
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-184-47.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-chhtj, kube-system/calico-node-jkz9h, kube-system/kube-proxy-52tpl
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-213-197.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-z6r49, kube-system/calico-node-5ht5r, kube-system/kube-proxy-gstkf
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-80-222.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-vjvcw, kube-system/calico-node-z8hwv, kube-system/kube-proxy-h9qlk
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-184-47.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-chhtj, kube-system/calico-node-jkz9h, kube-system/kube-proxy-52tpl
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-213-197.eu-west-1.compute.internal – will retry after delay of 5s
[!] ignoring DaemonSet-managed Pods: kube-system/aws-node-z6r49, kube-system/calico-node-5ht5r, kube-system/kube-proxy-gstkf
[!] pod eviction error ("Cannot evict pod as it would violate the pod's disruption budget.") on node ip-100-64-80-222.eu-west-1.compute.internal – will retry after delay of 5s
...
@kovalyukm v0.7.0 provides a retry mechanism. For the PDB to be respected, you must set an appropriate PDB configuration.
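The retry behaviour visible in the 0.7.0 logs above ("will retry after delay of 5s") can be sketched roughly like this (my own illustration, not eksctl's actual Go implementation): each eviction refused by the PDB is retried after a delay, so the drain only completes once the budget allows the pod to be disrupted.

```python
import time

def evict_with_retry(try_evict, attempts=5, delay=0.01):
    """Return True once try_evict() succeeds, retrying up to `attempts` times."""
    for _ in range(attempts):
        if try_evict():
            return True
        time.sleep(delay)  # the real tool logs "will retry after delay of 5s"
    return False
```

With a single-replica app behind a minAvailable: 1 budget, `try_evict` can never succeed, so the loop just exhausts its attempts, which matches the endless retries in the log above.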
It was installed via eksctl, version 0.11.0, and I hit the same issue when trying to delete the old nodegroup :(
Hi @kovalyukm, I believe this was already fixed. Can you confirm that you are still experiencing this issue with newer versions, and provide some logs?
It was my fault: the pod had one replica, and the PDB allowed only that one replica.
Thanks!