Clusterctl move today uses the Cluster's paused spec or pause annotation to block reconciliation on objects that are being moved. In past versions of clusterctl, we actually preferred to stop all deployments in both the source and target management clusters before moving any object, to avoid slow workers or long-running operations still being in flight while an object is moved and then reconciled somewhere else.
We can interact directly with deployments by first storing the current number of replicas in an annotation or somewhere else, and then scaling all deployments to zero. After moving all objects, only deployments in the target cluster should be scaled up.
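The bookkeeping described above could be sketched as follows. This is a minimal sketch, not actual clusterctl code: the annotation key and the trimmed-down Deployment type are hypothetical stand-ins for the real appsv1.Deployment.

```go
package main

import (
	"fmt"
	"strconv"
)

// replicasAnnotation is a hypothetical annotation key used to remember the
// original replica count before scaling a controller Deployment to zero.
const replicasAnnotation = "clusterctl.cluster.x-k8s.io/previous-replicas"

// Deployment is a minimal stand-in for appsv1.Deployment, keeping only the
// fields this sketch needs.
type Deployment struct {
	Annotations map[string]string
	Replicas    int32
}

// pause records the current replica count in an annotation, then scales the
// deployment down to zero.
func pause(d *Deployment) {
	if d.Annotations == nil {
		d.Annotations = map[string]string{}
	}
	d.Annotations[replicasAnnotation] = strconv.Itoa(int(d.Replicas))
	d.Replicas = 0
}

func main() {
	d := &Deployment{Replicas: 3}
	pause(d)
	fmt.Println(d.Replicas, d.Annotations[replicasAnnotation]) // prints "0 3"
}
```

Storing the count in an annotation (rather than in clusterctl's memory) means the information survives even if the move is interrupted and retried.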
/kind feature
/priority important-longterm
Just as an FYI, clusterctl also performs additional checks that try to detect whether all long-running operations are completed (e.g. infrastructureReady = true, plus others I can't remember off the top of my head)
It should be noted that by scaling down deployments we stop reconciliation for all clusters, not only the one we are moving, but this should not be a problem for the bootstrap/pivoting use case at least
Yeah, I'm fine with stopping for everyone, especially given that we're considering rewriting move to move the entire management cluster; the extra safety is definitely worth it
maybe the check is something like below?
https://github.com/kubernetes-sigs/cluster-api/blob/master/cmd/clusterctl/client/cluster/mover.go#L106
so the proposed way to handle this is to:
1) before the move, record the current replica count of each deployment (what about daemonsets, replicasets, etc.?) somewhere, then scale down to 0
2) move the objects to the target cluster
3) scale the replicas back up to the desired number once the move operation completes
/milestone v0.4.0
Setting v1alpha4 given that this is a behavioral change
before the move, record the current replica count of each deployment (what about daemonsets, replicasets, etc.?) somewhere, then scale down to 0
As far as I know, we only use deployments for Cluster API controllers, although we should make it a contract going forward, or standardize on a fixed set of supported ways to get a controller up
The rest looks good. If we go down this path, is it still worth pausing/unpausing the clusters?
Is the pause only there to prevent the deployments from scaling back up? If so, we can drop it. If it has an additional purpose, such as preventing new deployments, then we certainly can't, right?
we only use deployments for Cluster API controllers, although we should make it a contract going forward
this is already documented in the clusterctl contract https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html#controllers--watching-namespace
Is the pause only there to prevent the deployments from scaling back up? If so, we can drop it. If it has an additional purpose, such as preventing new deployments, then we certainly can't, right?
Not really, it's just to give us more control over the flow, although not strictly necessary if we spin down/up the controllers.
this is already documented in the clusterctl contract https://cluster-api.sigs.k8s.io/clusterctl/provider-contract.html#controllers--watching-namespace
Another example to keep in mind for conformance: I don't think there is anything in clusterctl init that stops an infrastructure provider from using, for example, a StatefulSet instead of a Deployment?
I don't think there is anything in clusterctl init that stops an infrastructure provider from using, for example, a StatefulSet instead of a Deployment?
In clusterctl we are assuming there are Deployments in order to do mutations, e.g. to watching namespaces. I have to check whether the image override feature also relies on the same assumption
whether the image override feature also relies on the same assumption
@fabriziopandini any update on this? I want to implement this, so I want to avoid going in the wrong direction... thanks
Also, we need to think about how to revert if something goes wrong. For example, if an error happens during the move action, how do we scale them back, etc.?
From the following (a move dry-run), it seems the only object to be moved is the MachineDeployment?
Creating target namespaces, if missing
Creating objects in the target cluster
Creating Cluster="wff-test" Namespace="default"
Creating AWSCluster="wff-test" Namespace="default"
Creating KubeadmConfigTemplate="wff-test-md-0" Namespace="default"
Creating KubeadmControlPlane="wff-test-control-plane" Namespace="default"
Creating MachineDeployment="wff-test-md-0" Namespace="default"
Creating AWSMachineTemplate="wff-test-control-plane" Namespace="default"
Creating AWSMachineTemplate="wff-test-md-0" Namespace="default"
Creating Machine="wff-test-control-plane-s9hsw" Namespace="default"
Creating MachineSet="wff-test-md-0-5955cfb58d" Namespace="default"
Creating Secret="wff-test-ca" Namespace="default"
Creating Secret="wff-test-etcd" Namespace="default"
Creating Secret="wff-test-kubeconfig" Namespace="default"
Creating Secret="wff-test-proxy" Namespace="default"
Creating Secret="wff-test-sa" Namespace="default"
Creating AWSMachine="wff-test-control-plane-qv95s" Namespace="default"
Creating Machine="wff-test-md-0-5955cfb58d-hbk5z" Namespace="default"
Creating KubeadmConfig="wff-test-control-plane-96gj6" Namespace="default"
Creating KubeadmConfig="wff-test-md-0-ggklr" Namespace="default"
Creating Secret="wff-test-control-plane-96gj6" Namespace="default"
Creating AWSMachine="wff-test-md-0-qrc94" Namespace="default"
Creating Secret="wff-test-md-0-ggklr" Namespace="default"
@jichenjc yes, image overrides also assume providers are using a Deployment
From the following (a move dry-run), it seems the only object to be moved is the MachineDeployment?
There might be a little bit of confusion here.
The issue is about scaling controller Deployments to 0 before the move and back to X after, not about moving them.
MachineDeployment objects instead should be moved without any change to their number of replicas (as it is now)
Hey folks, it seems there might be a lot of confusion here, so I'll do my best to summarize the proposed solution.
Let's assume the following:
clusterctl init has already been executed on both management clusters, and all versions match (this is a pre-requisite for clusterctl move).
clusterctl move is used to move the management cluster from A to B.
At this point, clusterctl usually moves objects by pausing all Clusters managed on Management Cluster A, copying them, and then unpausing them once they're created on Management Cluster B.
What we've been wanting to add is:
kubectl scale --replicas=0 on all deployments that run controllers (NOT the webhooks), in both management clusters.
Hope this helps!
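A minimal sketch of how those controller deployments could be selected, assuming the `cluster.x-k8s.io/provider` label from the clusterctl provider contract identifies them (the trimmed-down Deployment type is hypothetical, and a real implementation would also need to skip webhook-only deployments):

```go
package main

import "fmt"

// providerLabel is the label clusterctl applies to provider components;
// deployments carrying it are treated as controller deployments here.
const providerLabel = "cluster.x-k8s.io/provider"

// Deployment is a minimal stand-in for appsv1.Deployment.
type Deployment struct {
	Name   string
	Labels map[string]string
}

// controllerDeployments filters a deployment list down to the ones that
// belong to Cluster API providers.
func controllerDeployments(all []Deployment) []Deployment {
	var out []Deployment
	for _, d := range all {
		if _, ok := d.Labels[providerLabel]; ok {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	all := []Deployment{
		{Name: "capi-controller-manager", Labels: map[string]string{providerLabel: "cluster-api"}},
		{Name: "coredns", Labels: map[string]string{"k8s-app": "kube-dns"}},
	}
	for _, d := range controllerDeployments(all) {
		fmt.Println(d.Name) // prints "capi-controller-manager"
	}
}
```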
/area cluserctl
@fabriziopandini: The label(s) area/cluserctl cannot be applied, because the repository doesn't have them
/area clusterctl
/assign
let me try this ..
The following is only for my own development reference, just recording it here:
1) use kubectl scale --replicas=0 deployment capi-controller-manager -n capi-system to scale down the controller
2) Cluster.Spec.Paused will be set, and it will be used mainly to control new changes (need to check further here)