User Story
As a user, I would like to be able to migrate from a v1alpha2 cluster to a v1alpha3 cluster that utilizes the KubeadmControlPlane object. To facilitate this, the KubeadmControlPlane object should be able to adopt existing v1alpha3 controlplane machines. To simplify the initial implementation, this should only occur when:
/kind feature
Also needs to make sure the bootstrap ref is to a KubeadmConfig.
/cc @chuckha who definitely has ideas on this one
We should be able to automagically fill in the required kubeadm config info (at least the join bits; the init/clusterconfig bits may be missing if the cluster was upgraded) using one of the referenced bootstrap configs, and we could even automagically create the required infrastructure template using one of the referenced infra configs.
That said, anything we are likely to do here could potentially conflict with the validation we have on incoming KubeadmControlPlane resources.
I think the main thing we have to define is what we expect the migration (from non-KCP to KCP) to look like during the adoption process. I think we have a few different choices here:
Does the KCP today rely on some kind of selector that could be exploited for the purpose of option 3? I think it would be a viable option to mark existing controlplane Machines in a way that tells the KCP controller to adopt.
Curious about
Might require us to provide an opt-out experience for folks that don't want adoption
Wouldn't this imply that a user wants a controlplane that is partially managed by a KCP? Is that a supportable pattern?
> Wouldn't this imply that a user wants a controlplane that is partially managed by a KCP? Is that a supportable pattern?
I'm thinking more along the lines of opting not to use KubeadmControlPlane initially (at some point I expect we'll make a control plane required, but we specifically stated that it would not be required for v1alpha3), so I would expect this to be something like a command-line flag on the controller manager (or whatever implements the automated KCP creation/adoption) that would disable the creation/adoption workflow. If a user chooses to create a new cluster that uses KCP, it should still work, but any existing clusters would remain as they were.
> Does the KCP today rely on some kind of selector that could be exploited for the purpose of option 3? I think it would be a viable option to mark existing controlplane Machines in a way that tells the KCP controller to adopt.
Yes, currently we use the same labels that we previously applied manually: https://github.com/kubernetes-sigs/cluster-api/blob/065eb539766dede097e206a7b549b5902d15f14a/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go#L488-L493
Once upgrade handling is in place (prior to the v1alpha3 release), this will also include another label that acts as a configuration "hash". That is something we could add during the adoption process, though.
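For anyone following along, the selection described above boils down to simple label matching. A minimal sketch (label keys copied from the v1alpha3-era constants; the matcher and function names here are illustrative, not the actual controller code):

```go
package main

import "fmt"

// Label keys as used by Cluster API at the time of this discussion
// (see the linked controller source for the authoritative values).
const (
	clusterLabelName      = "cluster.x-k8s.io/cluster-name"
	controlPlaneLabelName = "cluster.x-k8s.io/control-plane"
)

// controlPlaneSelector returns the labels KCP expects on its Machines.
func controlPlaneSelector(clusterName string) map[string]string {
	return map[string]string{
		clusterLabelName:      clusterName,
		controlPlaneLabelName: "",
	}
}

// matchesSelector reports whether a Machine's labels satisfy every
// key/value pair in the selector (extra labels are ignored).
func matchesSelector(machineLabels, selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := machineLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	sel := controlPlaneSelector("my-cluster")
	m := map[string]string{
		clusterLabelName:      "my-cluster",
		controlPlaneLabelName: "",
		"extra":               "ignored",
	}
	fmt.Println(matchesSelector(m, sel)) // true
}
```

Marking existing controlplane Machines for adoption would then just mean ensuring they carry these labels (plus whatever adoption marker we settle on).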
I'm digging option two:
> User creates an empty KCP that is marked in some way to avoid defaulting/validation errors and triggers an internal process to pre-populate the KCP (or error if unable to)
To be more specific about some potential gotchas:
- You should not be able to create a populated KubeadmControlPlane for a cluster that already has controlplane machines?
Anything we do here could potentially cause issues for pivoting workflows
> Anything we do here could potentially cause issues for pivoting workflows
Why's that? Naturally, I'm assuming all clusters are pivoted from my perspective. :pizza:
> Why's that? Naturally, I'm assuming all clusters are pivoted from my perspective.
Assuming resources are pivoted, they are created on the remote cluster without ordering guarantees. If we try to enforce this as part of a defaulting webhook, then KCP resource creation might fail.
If we try to handle it in the controller instead, then we'd have to provide asynchronous feedback to the user that the adoption process failed, and we'd need to differentiate the adoption workflow from the reconciliation workflow for populated resources.
We could potentially add the constraint that we do not accept populated KCP resources that have the adoption-workflow label set. That would allow us to validate that KCP resources without the label have the fields we expect, and to block populated KCP resources that have it set relatively easily in the admission webhooks.
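The proposed admission constraint is a simple two-way check. A sketch of the decision table (the label name and stub type are hypothetical; the real webhook would operate on the full KubeadmControlPlane type):

```go
package main

import (
	"errors"
	"fmt"
)

// adoptionLabel is a hypothetical marker name; the actual label signaling
// the adoption workflow was still under discussion in this thread.
const adoptionLabel = "kubeadm.controlplane.cluster.x-k8s.io/adopt"

// kcpStub models only the fields the check needs.
type kcpStub struct {
	Labels        map[string]string
	SpecPopulated bool // true if the kubeadm config fields were provided
}

// validateCreate enforces the proposed constraint: a KCP marked for
// adoption must be empty, and an unmarked KCP must be fully populated.
func validateCreate(kcp kcpStub) error {
	_, adopting := kcp.Labels[adoptionLabel]
	switch {
	case adopting && kcp.SpecPopulated:
		return errors.New("a KCP marked for adoption must not be populated")
	case !adopting && !kcp.SpecPopulated:
		return errors.New("a KCP not marked for adoption must be fully populated")
	}
	return nil
}

func main() {
	bad := kcpStub{Labels: map[string]string{adoptionLabel: ""}, SpecPopulated: true}
	fmt.Println(validateCreate(bad)) // rejected: populated + adoption marker
}
```

This shape also plays well with the pivot concern above: the check is stateless, so it does not depend on resource-creation ordering on the remote cluster.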
Ah, yep, understood.
> We could potentially make the constraint that we do not accept populated KCP resources that have the label that signals the adoption workflow set.
This makes sense to me. I think this label may have to be unset once ownership is granted to protect future "move" workflows?
Or, as part of a move, it may need to be added... 🤔 I need to diagram...
@detiber what are the potential failure modes of purely relying on the label selector and pushing adoption logic to the top of the reconcile loop?
Looks like today we're only looking at ownerRefs:
https://github.com/kubernetes-sigs/cluster-api/blob/7cdea780c77075abec93deb26ab7d2e1f96961ea/controlplane/kubeadm/controllers/kubeadm_control_plane_controller.go#L180-L186
What if, after checking ownerRefs, we just checked for controlplane machines that match the label selector before checking replica count and adopt those that match?
Assorted random thoughts:
> What if, after checking ownerRefs, we just checked for controlplane machines that match the label selector before checking replica count and adopt those that match?
I think my biggest concern is that I would expect the adoption process to provide some assurances and halt reconciliation if it hits a scenario it's not expecting. The worst-case scenario in my mind is that we attempt adoption/reconciliation against a working control plane, causing that control plane to get into a bad state (loss of quorum or deletion) and causing cluster downtime.
Two potential paths may be:
As for the replicas, I was thinking the controller would aggregate all possible children (ownerRefs OR labels) and then determine whether a scale-up or scale-down is needed.
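The "aggregate children, then decide scale direction" idea sketches out roughly like this (the `machine` stub and function names are illustrative; the real controller would list Machines via the API and check actual ownerRefs):

```go
package main

import "fmt"

// machine models the two properties the aggregation cares about.
// ownedByKCP stands in for "has an ownerRef pointing at this KCP".
type machine struct {
	name       string
	ownedByKCP bool
	labels     map[string]string
}

// matches reports whether labels satisfy every key/value in the selector.
func matches(labels, selector map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// possibleChildren aggregates machines owned by the KCP OR matching its
// label selector, per the ownerRefs-OR-labels idea above.
func possibleChildren(machines []machine, selector map[string]string) []machine {
	var out []machine
	for _, m := range machines {
		if m.ownedByKCP || matches(m.labels, selector) {
			out = append(out, m)
		}
	}
	return out
}

// scaleDelta is positive for a needed scale-up, negative for a scale-down.
func scaleDelta(desiredReplicas int, children []machine) int {
	return desiredReplicas - len(children)
}

func main() {
	sel := map[string]string{"cluster.x-k8s.io/control-plane": ""}
	machines := []machine{
		{name: "owned", ownedByKCP: true},
		{name: "labeled", labels: map[string]string{"cluster.x-k8s.io/control-plane": ""}},
		{name: "worker"},
	}
	kids := possibleChildren(machines, sel)
	fmt.Println(len(kids), scaleDelta(3, kids)) // 2 1
}
```

Adoption (taking ownership of the label-only matches) would then run before the replica comparison, so the scale decision always sees the full set.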
/assign
/lifecycle active
Working on some kind of visual/proposal based on the above discussion.
/assign @sethp-nr
We've got a proof of concept working for machine adoption over here: https://github.com/newrelic-forks/cluster-api/commit/793167850e2ed6d5a5c5d82a646b5ab86a5a941e
Currently our thinking is for the controller to compute the hash of a Machine as it's adopted, and then let the rest of the reconciliation happen as appropriate. That has the effect of leaving the control plane in its current state if the KCP would produce the same config, or using the existing update mechanisms to gradually converge the existing control plane to the KCP's spec.
My feeling is that I shouldn't open the PR until after the upgrade work from #2334 is in – all the interesting test cases depend on that work being done first, I think, and there's likely to be a bunch of conflicts to resolve once that settles.
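The hash-stamping approach above can be illustrated in miniature (the label key and hashing scheme here are stand-ins, not the actual KCP implementation):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashLabel is an illustrative key; KCP's actual hash label may differ.
const hashLabel = "kubeadm.controlplane.cluster.x-k8s.io/hash"

// configHash stands in for hashing a machine's effective kubeadm config.
func configHash(config string) string {
	h := fnv.New32a()
	h.Write([]byte(config))
	return fmt.Sprintf("%x", h.Sum32())
}

// adopt stamps the machine's labels with the hash of its current config at
// adoption time, so it looks like a machine the KCP created with that config.
func adopt(labels map[string]string, currentConfig string) {
	labels[hashLabel] = configHash(currentConfig)
}

// needsRollout reports whether normal reconciliation would replace the
// machine: only when the KCP's desired config hashes differently.
func needsRollout(labels map[string]string, desiredConfig string) bool {
	return labels[hashLabel] != configHash(desiredConfig)
}

func main() {
	labels := map[string]string{}
	adopt(labels, "join-config-v1")
	fmt.Println(needsRollout(labels, "join-config-v1")) // false: control plane left as-is
	fmt.Println(needsRollout(labels, "join-config-v2")) // true: normal update flow converges it
}
```

This is what gives the "no-op if identical, gradual convergence if not" behavior described above without a separate adoption-specific reconcile path.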
> We've got a proof of concept working for machine adoption over here: newrelic-forks@7931678
Nice, this is great work!
> Currently our thinking is for the controller to compute the hash of a Machine as it's adopted, and then let the rest of the reconciliation happen as appropriate. That has the effect of leaving the control plane in its current state if the KCP would produce the same config, or using the existing update mechanisms to gradually converge the existing control plane to the KCP's spec.
+1, with the upgrade PR this definitely seems like a simple solution. It would require some docs, since it could be a somewhat disruptive operation. This did get me thinking: if our KCP defines a version more than one minor version ahead of what the machines are using, this could get weird, since Kubernetes requires upgrading through each minor version.
We can always add something in the validation webhook to validate the upgrade version limitations.
Will look through in more depth when I can access from a computer and not just my phone
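The version check being discussed is small. A sketch of what the webhook/adoption-time validation could look like (function names are illustrative; a real webhook should use proper semver parsing and error handling rather than the shortcuts here):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf extracts the minor component from a "vMAJOR.MINOR.PATCH" string.
// Error handling is elided for brevity; real code should parse semver.
func minorOf(version string) int {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	minor, _ := strconv.Atoi(parts[1])
	return minor
}

// validUpgrade enforces the kubeadm constraint that upgrades may not skip a
// minor version: the KCP's target version may be at most one minor ahead of
// the machine being upgraded (same-major versions assumed for this sketch).
func validUpgrade(machineVersion, kcpVersion string) bool {
	return minorOf(kcpVersion)-minorOf(machineVersion) <= 1
}

func main() {
	fmt.Println(validUpgrade("v1.16.4", "v1.17.0")) // true
	fmt.Println(validUpgrade("v1.16.4", "v1.18.0")) // false: skips 1.17
}
```

At adoption time the same check would run against the oldest adopted Machine, so a KCP can't be created that would immediately attempt a skip-level upgrade.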
Oh, yeah, a validating webhook / adoption-time check to ensure the version changes are reasonable is a super good call. The other thing our steel thread doesn't do yet is make sure that the Machines it's claiming ownership over are actually Kubeadm bootstrapped.
I'll address those two issues once I've got a few test runs under my belt – I'd like to see it work once :)
/milestone v0.3.x