What steps did you take and what happened:

Since matchInitOrJoinConfiguration checks against the entire KubeadmConfig spec, any mutation applied to a KubeadmConfig (such as adding custom files or preKubeadmCommands) will trigger a rolling upgrade of all control plane machines. This behavior does not occur when mutating KubeadmConfig resources created from a KubeadmConfigTemplate, and likewise I would not expect this behavior when mutating a KubeadmConfig derived from a KubeadmControlPlane. To add another example, Kubernetes would not do a rolling upgrade of a Deployment if the Pods derived from it were mutated in any way (e.g. sidecars injected for service meshes); a rolling upgrade is only triggered when there is a change in the pod template.

A workaround can be to mutate the KubeadmControlPlane in the same way the KubeadmConfig is mutated, but if any mutations are dynamic or unique per machine, KubeadmControlPlane will continuously try to perform a rolling upgrade of the control plane machines, since it expects KubeadmControlPlane and KubeadmConfig to match exactly.

What did you expect to happen:

matchInitOrJoinConfiguration should only match against fields that should trigger a rolling update instead of all fields in KubeadmConfigSpec, or, more ideally, a hash of the last defined KubeadmConfig spec should be stored in KubeadmControlPlane to know when a rolling update should occur. This is the same mechanism used to determine whether a rolling update of a Deployment should occur.
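As a rough illustration of the hash idea (a minimal sketch with a stand-in type, not the actual KCP or Deployment code), the hash would be computed from the spec KCP renders and recorded on the Machine, so later per-machine mutations of the generated KubeadmConfig can't trigger a rollout by themselves:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// KubeadmConfigSpec is a stand-in for the real CABPK type; only a few fields
// are shown to keep the sketch short.
type KubeadmConfigSpec struct {
	ClusterConfiguration string
	Files                []string
	PreKubeadmCommands   []string
}

// specHash hashes the spec that KubeadmControlPlane renders, similar in spirit
// to the pod-template-hash a Deployment stamps on its ReplicaSets.
func specHash(spec KubeadmConfigSpec) string {
	h := fnv.New32a()
	fmt.Fprintf(h, "%#v", spec)
	return fmt.Sprintf("%x", h.Sum32())
}

func main() {
	desired := KubeadmConfigSpec{ClusterConfiguration: "clusterName: test"}

	// Hash recorded (e.g. as an annotation) when the Machine was created.
	recorded := specHash(desired)

	// A webhook later mutates the generated, per-machine KubeadmConfig...
	mutated := desired
	mutated.PreKubeadmCommands = append(mutated.PreKubeadmCommands, "echo per-machine setup")

	// ...but a rollout is only considered when the spec KCP itself renders
	// changes, i.e. when specHash(desired) no longer matches the recorded hash.
	fmt.Println("needs rollout:", specHash(desired) != recorded) // false
}
```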
Anything else you would like to add:
Environment:
Kubernetes version: (use kubectl version):
OS (e.g. from /etc/os-release):

/kind bug
> A workaround can be to mutate the KubeadmControlPlane in the same way the KubeadmConfig is mutated, but if any mutations are dynamic or unique per machine, KubeadmControlPlane will continuously try to perform a rolling upgrade of the control plane machines, since it expects KubeadmControlPlane and KubeadmConfig to match exactly.
If you're using KCP, you should definitely have an admission webhook on KCP rather than on CABPK resources. You could do so by skipping any modification if the linked Machine is marked as a control plane, maybe?
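Roughly something like this (a sketch only: it assumes controller-runtime's admission package and the standard cluster.x-k8s.io/control-plane Machine label; the owner lookup is illustrative rather than the exact CABPK wiring):

```go
package webhooks

import (
	"context"
	"net/http"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha3"
	bootstrapv1 "sigs.k8s.io/cluster-api/bootstrap/kubeadm/api/v1alpha3"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// kubeadmConfigMutator is a mutating webhook for KubeadmConfig; Client and
// decoder are expected to be injected by the webhook server setup.
type kubeadmConfigMutator struct {
	Client  client.Client
	decoder *admission.Decoder
}

func (m *kubeadmConfigMutator) Handle(ctx context.Context, req admission.Request) admission.Response {
	cfg := &bootstrapv1.KubeadmConfig{}
	if err := m.decoder.Decode(req, cfg); err != nil {
		return admission.Errored(http.StatusBadRequest, err)
	}

	// Skip any mutation when the config is linked to a control plane Machine,
	// so the spec never drifts from what KCP rendered.
	for _, ref := range cfg.OwnerReferences {
		if ref.Kind != "Machine" {
			continue
		}
		machine := &clusterv1.Machine{}
		key := client.ObjectKey{Namespace: cfg.Namespace, Name: ref.Name}
		if err := m.Client.Get(ctx, key, machine); err != nil {
			return admission.Errored(http.StatusInternalServerError, err)
		}
		if _, isControlPlane := machine.Labels[clusterv1.MachineControlPlaneLabelName]; isControlPlane {
			return admission.Allowed("control plane machine: leaving KubeadmConfig untouched")
		}
	}

	// ...worker KubeadmConfigs would be mutated and a patch response returned here.
	return admission.Allowed("")
}
```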
> matchInitOrJoinConfiguration should only match against fields that should trigger a rolling update instead of all fields in KubeadmConfigSpec, or, more ideally, a hash of the last defined KubeadmConfig spec should be stored in KubeadmControlPlane to know when a rolling update should occur. This is the same mechanism used to determine whether a rolling update of a Deployment should occur.
This was the behavior before, although we moved away from hashing in KCP to allow adoption and do semantic diffs of kubeadm resources.
> If you're using KCP, you should definitely have an admission webhook on KCP rather than on CABPK resources. You could do so by skipping any modification if the linked Machine is marked as a control plane, maybe?
This is the "workaround" I mentioned in the issue description above. It works in some cases, but it's not ideal since the values you mutate still have to match exactly per machine. If any values are dynamic on a per-machine basis, a full upgrade is still triggered.
> This was the behavior before, although we moved away from hashing in KCP to allow adoption and do semantic diffs of kubeadm resources.
Can you expand on this a bit? What's the use-case for diffing between KubeadmControlPlane and KubeadmConfig if machines are considered immutable anyways?
Not sure if this was the intended behavior for KubeadmControlPlane, but regardless I find it counter-intuitive since no other resource (KubeadmConfigTemplate, PodTemplate in Deployment, etc) follows the same behavior / semantic.
As a reference, this issue https://github.com/kubernetes-sigs/cluster-api/issues/3170 triggered us moving away from a hash mechanism to the current approach.
We've had various issues using the hash, and we've also talked about potentially moving away from hashing on MachineDeployments.
Hashing also prevented KCP from adopting old control plane Machines coming from v1alpha2.
@andrewsykim Do you have some examples of what you'd consider something machine-specific or not?
This was one of the many things that made us move away from machine-specific mutation in CAPV.
@vincepri - this would be for any machine-specific mutation (e.g. bootstrapping a VIP through iptables on the first machine)
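For concreteness, a toy sketch of that kind of per-machine mutation (stand-in type; the VIP address and iptables rule are invented). Because only the first machine gets the extra command, each generated KubeadmConfig ends up differing from what KCP rendered, and the exact-match check keeps wanting to roll the control plane:

```go
package main

import "fmt"

// KubeadmConfigSpec is a stand-in for the CABPK type; only the field used
// here is included.
type KubeadmConfigSpec struct {
	PreKubeadmCommands []string
}

// addVIPBootstrap appends an iptables-based VIP bootstrap only on the first
// control plane machine; later machines join through the VIP instead. The
// address, port, and rule are made up for illustration.
func addVIPBootstrap(spec *KubeadmConfigSpec, firstControlPlaneMachine bool) {
	if !firstControlPlaneMachine {
		return
	}
	spec.PreKubeadmCommands = append(spec.PreKubeadmCommands,
		"iptables -t nat -A PREROUTING -d 10.0.0.100 -p tcp --dport 6443 -j DNAT --to-destination 10.0.0.11:6443",
	)
}

func main() {
	spec := &KubeadmConfigSpec{}
	addVIPBootstrap(spec, true)
	fmt.Println(spec.PreKubeadmCommands)
}
```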
> As a reference, this issue #3170 triggered us moving away from a hash mechanism to the current approach.
Thanks for the context, I definitely see the limitations with hashing, and in fact I think PodTemplates suffer from similar issues, but there are some known best practices around them at this point. Always enforcing exact matches of KubeadmControlPlane and KubeadmConfig seems like the other end of the extreme though.
Maybe there's a good middle ground here where we don't use template hashes but we don't check for matches against every field. For example, enforcing equality of KubeadmConfig Files or PreKubeadmCommands seems unexpected but enforcing equality of the Kubernetes version seems fine.
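As a toy sketch of that middle ground (stand-in type and an example field split, not the real matchInitOrJoinConfiguration): compare only the fields that should force a rollout and ignore the ones a per-machine webhook is expected to touch.

```go
package main

import (
	"fmt"
	"reflect"
)

// KubeadmConfigSpec is a stand-in type; the field split below is only an
// example of which fields might (or might not) force a rollout.
type KubeadmConfigSpec struct {
	ClusterConfiguration string
	KubernetesVersion    string
	Files                []string
	PreKubeadmCommands   []string
}

// rolloutRelevant drops the fields that per-machine mutation is allowed to
// touch, so they never count towards "needs rollout".
func rolloutRelevant(spec KubeadmConfigSpec) KubeadmConfigSpec {
	spec.Files = nil
	spec.PreKubeadmCommands = nil
	return spec
}

// matchesKCP would replace an exact DeepEqual over the whole spec.
func matchesKCP(desired, actual KubeadmConfigSpec) bool {
	return reflect.DeepEqual(rolloutRelevant(desired), rolloutRelevant(actual))
}

func main() {
	desired := KubeadmConfigSpec{KubernetesVersion: "v1.18.2"}

	actual := desired
	actual.PreKubeadmCommands = []string{"echo per-machine setup"} // webhook mutation
	fmt.Println(matchesKCP(desired, actual))                       // true: no rollout

	actual.KubernetesVersion = "v1.19.0"
	fmt.Println(matchesKCP(desired, actual)) // false: version change still rolls
}
```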
> For example, enforcing equality of KubeadmConfig Files or PreKubeadmCommands seems unexpected but enforcing equality of the Kubernetes version seems fine.
This seems like an opinionated approach? I'm sure there are some use cases that would require a rollout to be triggered.
I can see both use cases, although I'm not sure if we can/should change the behavior now that v1alpha3 is going into maintenance mode; definitely open to suggestions.
> I can see both use cases, although I'm not sure if we can/should change the behavior now that v1alpha3 is going into maintenance mode; definitely open to suggestions.
Fine with me, just giving my 2 cents that this was pretty unexpected. As @yastij mentioned above, this disallows any mutations that end up being specific to a machine.
@andrewrynhard Do we have any action items here, or should we close this issue for now?
/milestone Next
/priority awaiting-more-evidence
> Do we have any action items here, or should we close this issue for now?
Feel free to close this if I'm the only person who is complaining about it :P
To me at least, the semantics around KubeadmControlPlane, KubeadmConfigTemplate, and KubeadmConfig are confusing. If I want to mutate the bootstrap data for every machine and add some custom logic in the mutation, I should be able to mutate each machine's KubeadmConfig independently of KubeadmControlPlane and KubeadmConfigTemplate. This is true today for KubeadmConfigTemplate, but not for KubeadmControlPlane. I also wouldn't expect mutating a KubeadmConfig to result in the machine being deleted and replaced because the mutated KubeadmConfig was not an EXACT match to KubeadmControlPlane.
I think it's worth brainstorming a way for CP machine-specific customisation, whether it's through:

- MachineDeployments, where you create multiple mds with different params that join the same cluster
- CP machine-specific customisation
This seems like a fair goal; should we open a more generic issue to design it? It sounds like we might want to tackle this from a Machine (template) perspective, rather than having a custom solution for just control planes.
Agreed, I'll file one
Thanks @yastij, closing this for now.
/close
@vincepri: Closing this issue.
In response to this:
> Thanks @yastij, closing this for now.
> /close