BUG REPORT
kubeadm version (use kubeadm version): v1.11.2
Environment:
kubectl version: v1.11.2
uname -a: Linux ip-172-31-35-161.eu-west-1.compute.internal 4.14.59-coreos-r2 #1 SMP Sat Aug 4 02:49:25 UTC 2018 x86_64 Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz GenuineIntel GNU/Linux

Hi, I'm trying to upgrade my control plane with custom flags, just to test. (I'm working on an Ansible solution to implement kubeadm HA.)
To upgrade I'm using kubeadm upgrade diff --config kubeadm-config.yaml and then applying with kubeadm upgrade apply --config kubeadm-config.yaml.
I have only added a flag to kube-apiserver for now. I can see it in the diff, but kubeadm still tries to restart the controller-manager and scheduler as well, and also shows me some diff between manifests (basically volumeMounts moved up and down the file compared to the file generated by kubeadm init), so it tries to restart all 3 components. This seems to work, but it gets stuck when trying to restart the scheduler, even though the scheduler is running and is waiting to acquire its lease.
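For reference, the kind of change described above looks roughly like this in the v1alpha2 MasterConfiguration format that kubeadm 1.11 uses (a sketch only; the admission-plugins flag is illustrative, not necessarily the flag that was actually added):

apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.2
apiServerExtraArgs:
  # illustrative extra flag passed through to kube-apiserver
  enable-admission-plugins: NodeRestriction,PodSecurityPolicy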
I think this issue might be related to https://github.com/kubernetes/kubernetes/issues/65071, and it might be because the hash is not changing, since it is the same Kubernetes version and no changes have been made to the pod.
Is this the proper way to modify the cluster configuration on an already bootstrapped cluster?
I expect the control plane components to restart with the flags/config I added to the kubeadm config file.
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]
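One quick diagnostic (not what kubeadm does internally, and the backup path below is an assumption based on kubeadm's usual defaults, so check the upgrade output for the exact backup directory) is to checksum the manifest kubeadm just wrote against the backup it made; identical files mean the kubelet has no reason to recreate the pod, so the hash kubeadm is waiting on never changes:

# sha256sum /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/tmp/kubeadm-backup-manifests-*/kube-scheduler.yaml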
When changing the version, the upgrade completes successfully; for example, I tried downgrading to 1.11.1 and then upgrading back to 1.11.2.
Just adding the link without typo again, so that github can link the issues: https://github.com/kubernetes/kubernetes/issues/65071
Same issue here with a single-master setup. Current Kubernetes version: 1.11.2. We are just trying to add flags, and the kubeadm upgrade process keeps getting stuck on the scheduler in the same way.
Exact same issue. 1.11.2, single master. Trying kubeadm upgrade apply --config flags.yaml and getting stuck at the API server and then the scheduler restarting.
Still an issue on 1.12.0. The docs mention that kubeadm upgrade apply is supposed to be idempotent.
/assign @timothysc
/cc @rdodev
@xiangpengzhao Thanks for the review. Done.
Hey @ArchiFleKs (and others in this thread), I want to replicate this as closely as possible. Would you folks mind sharing the config and the flags you attempted to change?
@rdodev I'm hitting this bug when I do the following:
- dump the current cluster configuration to /etc/kubeadm.yaml
- change only the kubeletConfiguration option in /etc/kubeadm.yaml, without changing kubernetesVersion
- run kubeadm upgrade apply --config /etc/kubeadm.yaml
- kubeadm updates /etc/kubernetes/manifests/ and the kubeadm-config and kubelet-config-1.11 ConfigMaps
- the upgrade gets stuck at the "[upgrade/staticpods] Waiting for the kubelet to restart the component" step
I think even a simpler scenario should work (as kubeadm is idempotent):
- kubeadm config view > /etc/kubeadm.yaml
- kubeadm upgrade apply --config /etc/kubeadm.yaml
So it should succeed even without any changes in configuration at all.
But it ends with the error:
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]
@ttarczynski great. Thanks for that. I think we found a different regression because of this :)
@ttarczynski can you check whether this PR solves the issue for you?
@bart0sh yes, I can try it, but I don't know how to get a binary with this patch applied.
Is it something I can easily do if I've never built kubernetes/kubeadm from source?
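For anyone else in the same situation, one way to build a kubeadm binary that includes an open PR is roughly the following (a sketch: the fetch refspec and make target are the standard ones for the kubernetes/kubernetes repo, but the output path can vary by release):

# git clone https://github.com/kubernetes/kubernetes.git && cd kubernetes
# git fetch origin pull/69886/head:pr-69886 && git checkout pr-69886
# make WHAT=cmd/kubeadm                        # builds only the kubeadm binary
# cp _output/bin/kubeadm /tmp/kubeadm.PR69886  # copy it somewhere convenient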
@bart0sh I've just managed to build the binary from PR #69886 and test it.
I've followed these steps:
Started with a cluster on v1.12.1:
# kubectl version --short
Client Version: v1.12.1
Server Version: v1.12.1
# rpm -qa | egrep '^kube'
kubectl-1.12.1-2.x86_64
kubelet-1.12.1-2.x86_64
kubeadm-1.12.1-2.x86_64
Built the kubeadm binary from PR #69886, saved it as /tmp/kubeadm.PR69886, and used it to dump the config (kubeadm config view seems to be broken in v1.12.1 as mentioned in #1174):
# /tmp/kubeadm.PR69886 config view > /etc/kubeadm.yaml
Then ran the upgrade with the stock v1.12.1 kubeadm to make sure it still ends with an error:
# kubeadm upgrade apply --config /etc/kubeadm.yaml
...
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-ksb-m1.grey hash: 2117f54c43e401f807b7c9744c2a63be
...
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]
Then ran the upgrade with the patched binary:
# /tmp/kubeadm.PR69886 upgrade apply --config /etc/kubeadm.yaml --force
...
[upgrade/staticpods] current and new manifests of kube-apiserver are equal, skipping upgrade
...
[upgrade/staticpods] current and new manifests of kube-scheduler are equal, skipping upgrade
...
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.12.1". Enjoy!
So it seems to me that PR #69886 does fix this issue.
@ArchiFleKs
I have only added a flag to kube-apiserver for now. I can see it in the diff, but kubeadm still tries to restart the controller-manager and scheduler as well, and also shows me some diff between manifests (basically volumeMounts moved up and down the file compared to the file generated by kubeadm init)
This should be fixed by this PR
Was this issue ever fixed in v1.11? Or only in v1.12?
@ocofaigh I think the patch (PR #69886) is only available in v1.13 and not backported to older versions.
You can find this info in release notes:
Fixed 'kubeadm upgrade' infinite loop waiting for pod restart (#69886, @bart0sh)
Hmm, is there any way I can work around this for v1.11? My scenario is that I have a kubeadm v1.11.6 cluster set up and I want to enable the PodSecurityPolicy admission plugin. So I should be able to:
- kubeadm config view > kubeadm-config.yaml
- edit the apiServerExtraArgs settings and enable the PodSecurityPolicy admission plugin, e.g.:
  apiServerExtraArgs:
    enable-admission-plugins: PodSecurityPolicy
- kubeadm upgrade apply --config=kubeadm-config.yaml
However, I get stuck in a loop outputting:
[upgrade/staticpods] Waiting for the kubelet to restart the component
I guess I could downgrade my cluster version to v1.11.5, and then enable the PodSecurityPolicy admission plugin as part of the upgrade from v1.11.5 -> v1.11.6, but that seems extreme. Anyone got a better idea?
My workaround was the same (downgrade then upgrade).
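For v1.11, the downgrade-then-upgrade workaround amounts to something like the following (a sketch, assuming a v1.11.6 control plane as in the scenario above; the kubernetesVersion field in the config file drives the target version, the exact version strings may differ in your file, and the downgrade step may need --force to be accepted):

# sed -i 's/kubernetesVersion: v1.11.6/kubernetesVersion: v1.11.5/' kubeadm-config.yaml
# kubeadm upgrade apply --config=kubeadm-config.yaml --force   # step down one patch release
# sed -i 's/kubernetesVersion: v1.11.5/kubernetesVersion: v1.11.6/' kubeadm-config.yaml
# kubeadm upgrade apply --config=kubeadm-config.yaml           # step back up with the new flags in place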
Can anyone else confirm that this fix wasn't backported to 1.12?
@brysonshepherd it wasn't.