I checked whether there was anything similar by searching for etcd or image, but didn't find an existing issue like this.
Bug report.
kubeadm version (use kubeadm version): master
Environment:
Kubernetes version (use kubectl version): 1.18 (not that it matters)
Kernel (e.g. uname -a): 5.3.18-24.9-default (not that it matters :p)

If I have changed my image repository, and I execute an upgrade, and the etcd version is the same in the new repository, then the static pod manifest will not be updated according to the configmap. No change will be written to disk.
See also https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/upgrade/staticpods.go#L325 for that logic (permalink: https://github.com/kubernetes/kubernetes/blob/e23d83eead3b5ae57731afb0209f4a2aaa4009dd/cmd/kubeadm/app/phases/upgrade/staticpods.go#L320-L326 ).
In my opinion, the static pod manifest should be rewritten whenever there is a change.
We should not only compare the version to prevent a downgrade, but also rewrite the manifest if its content differs from the expected content (we could hash the file's content and compare?)
Steps to reproduce:
1. Push an etcd image with the same version to two different repos.
2. Change the config map to point to the new image repository (see the example configuration below).
3. Run kubeadm upgrade apply and observe that the manifest is not brought in sync with the configmap.
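For illustration, a hypothetical kubeadm configuration that triggers this looks roughly like the following (the registry name and the etcd tag are made up for the example; only the etcd imageRepository changes while imageTag stays the same as the running etcd):

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.18.0
etcd:
  local:
    # Only the repository changes; the tag matches the currently running etcd,
    # so kubeadm skips rewriting the static pod manifest.
    imageRepository: registry.example.com/new-mirror
    imageTag: 3.4.3-0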
This kind of problem might not be limited to etcd static pods.
@ereslibre I think you might be interested in this.
/kind bug
Could we SHA1 the contents of the marshaled pod, and compare it to the SHA1 of the pod in disk, and continue if they differ? I can imagine sources of changes that could lead to "phantom" rewrite of the manifest in this case though (e.g. initial-cluster computation).
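A minimal sketch of that idea in Go, assuming a hypothetical helper and using SHA-256 instead of SHA-1 (none of this is existing kubeadm code):

package main

import (
	"crypto/sha256"
	"fmt"
	"os"
)

// manifestChanged reports whether desiredManifest (the marshaled static pod)
// differs from the manifest currently stored at path.
func manifestChanged(desiredManifest []byte, path string) (bool, error) {
	onDisk, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		// No manifest on disk yet, so it definitely has to be written.
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return sha256.Sum256(desiredManifest) != sha256.Sum256(onDisk), nil
}

func main() {
	changed, err := manifestChanged([]byte("apiVersion: v1\nkind: Pod\n"), "/etc/kubernetes/manifests/etcd.yaml")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("manifest needs rewrite:", changed)
}

Note that this would still report a change for purely cosmetic differences (field ordering, the "phantom" rewrites mentioned above), so it only answers "is the file different", not "is the difference meaningful".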
Hi.
Are you providing an etcd tag or only the repository?
I'd expect that if the user has passed a custom tag, 'apply' skips the etcd upgrade completely. It should still rewrite the etcd manifest with your custom repo if you only passed the repo. Otherwise we have a bug.
cc @rosti
I don't remember if we did the last changes in 1.18.
Are you providing an etcd tag or only the repository?
We are providing the kubeadm config with a slightly different imageRepository and the same etcd imageTag in this upgrade apply instance. https://github.com/kubernetes/kubernetes/blob/e23d83eead3b5ae57731afb0209f4a2aaa4009dd/cmd/kubeadm/app/phases/upgrade/staticpods.go#L320-L326 is skipping the etcd upgrade because the image tag of etcd didn't change (even though the imageRepository did change in the configuration).
I think the image tag should not be passed, otherwise kubeadm thinks you
want a custom etcd. IIRC the logic is 'tag != ""' to skip upgrade.
Maybe we want to check if they match too.
I think the image tag should not be passed, otherwise kubeadm thinks you
want a custom etcd
Yes, for kubeadm it's all custom images all over the place in our use case, because we override the version and the imageRepository in all components.
This bug was triggered because in this case we wanted to change the imageRepository of etcd while keeping the same imageTag. We always run kubeadm upgrade apply with --config, and it is in this configuration that we changed the imageRepository while keeping the same imageTag for etcd.
I don't think we need to checksum these. The problem here is in the following check.
cmpResult, err := desiredEtcdVersion.Compare(currentEtcdVersionStr)
if err != nil {
return true, errors.Wrapf(err, "failed comparing the current etcd version %q to the desired one %q", currentEtcdVersionStr, desiredEtcdVersion)
}
if cmpResult < 1 { // The problem is here, this should be cmpResult < 0
return false, errors.Errorf("the desired etcd version %q is not newer than the currently installed %q. Skipping etcd upgrade", desiredEtcdVersion, currentEtcdVersionStr)
}
The idea behind this check in the first place is to deny the ability to accidentally downgrade etcd and corrupt the data store.
The problem here is that this check skips regeneration of the static pod if the version is the same (therefore ignoring changes in other etcd related fields in the config).
so "apply" should regenerate the manifest (which also can restart etcd) even if the old tag == new tag?
(i.e. use < 0)
this guarantees that a new version of kubeadm that uses the old etcd can still apply certain manifest fixes that we introduced.
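As a sketch, the discussed change would keep the quoted fragment as-is and only bail out on an actual downgrade (this mirrors the snippet from the bug report above and is not the exact upstream diff):

cmpResult, err := desiredEtcdVersion.Compare(currentEtcdVersionStr)
if err != nil {
return true, errors.Wrapf(err, "failed comparing the current etcd version %q to the desired one %q", currentEtcdVersionStr, desiredEtcdVersion)
}
if cmpResult < 0 { // was < 1; equal versions now fall through and the manifest gets regenerated
return false, errors.Errorf("the desired etcd version %q is older than the currently installed %q. Skipping etcd upgrade", desiredEtcdVersion, currentEtcdVersionStr)
}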
@neolit123 I think that's reasonable. +1 from me.
PRs welcome btw @ereslibre @evrardjp
Wanted to verify first with you, because there's still room for changes not being applied (e.g. if dataDir, extraArgs, serverCertSANs or peerCertSANs change in the configuration, they won't be persisted to disk unless the version changes) -- this is why I was proposing something that compared sums of the marshaled configurations.
Either @evrardjp or I will submit a PR with what we discussed though.
cc @rosti
I don't remember if we did the last changes in 1.18.
It's not 1.18 related :)
I don't think we need to checksum these. The problem here is in the following check.
cmpResult, err := desiredEtcdVersion.Compare(currentEtcdVersionStr)
if err != nil {
return true, errors.Wrapf(err, "failed comparing the current etcd version %q to the desired one %q", currentEtcdVersionStr, desiredEtcdVersion)
}
if cmpResult < 1 { // The problem is here, this should be cmpResult < 0
return false, errors.Errorf("the desired etcd version %q is not newer than the currently installed %q. Skipping etcd upgrade", desiredEtcdVersion, currentEtcdVersionStr)
}
The idea behind this check in the first place is to deny the ability to accidentally downgrade etcd and corrupt the data store.
The problem here is that this check skips regeneration of the static pod if the version is the same (therefore ignoring changes in other etcd related fields in the config).
Correct, that's exactly what I meant in the bug report. Sorry if that wasn't clear, I tend to be a bit terse.
so "apply" should regenerate the manifest (which also can restart etcd) even if the old tag == new tag?
(i.e. use < 0)
this guarantees that a new version of kubeadm that uses the old etcd can still apply certain manifest fixes that we introduced.
Awesome, that's what I thought: as a first patch, allow the same version by changing those conditions. That's the first part; for me, the second part would be to actually update the files if the manifest changed, as proposed above by @ereslibre.
I opened the PR fixing this.
@evrardjp The way kubeadm works is to always regenerate static pods and deployments upon upgrade to carry over the current settings from the config maps. No detection should be required here.
Thanks for the info! Sounds logical.