Cluster-api: Mark KCP machines for deletion

Created on 11 Nov 2020 · 14 comments · Source: kubernetes-sigs/cluster-api

User Story
As a user, I would like to mark a specific control plane Machine to be deleted by KCP when I scale it down.

Detailed Description
As an example, I have three control plane Machines (a, b, c). Now I want to scale down KCP replicas to 1, but I want specifically the b and c Machines to be deleted after I run kubectl scale kcp example --replicas=1. As we discussed with @vincepri on Slack, we can mark & delete (by scaling down) Machines by adding the cluster.x-k8s.io/delete-machine annotation to them, but only IF it is a worker Machine (controlled by a MachineDeployment), not a control plane Machine (controlled by KCP).

It would be nice to have a similar way of marking specific control plane Machines that we want to delete, as this can currently be done for MachineDeployment Machines via the cluster.x-k8s.io/delete-machine annotation.
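
For illustration, here is a minimal Go sketch (using controller-runtime) of what this workflow could look like if KCP honored the annotation the way MachineDeployment does today. The namespace, the Machine names b and c, and the programmatic approach itself are placeholders and assumptions, not existing KCP behavior:

```go
package main

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func main() {
	// Build a client from the current kubeconfig/in-cluster config.
	c, err := client.New(ctrl.GetConfigOrDie(), client.Options{})
	if err != nil {
		panic(err)
	}
	ctx := context.Background()

	// Mark Machines b and c so that a subsequent scale down would prefer
	// them. "default", "b", and "c" are placeholder names; v1alpha3 was
	// the API version current at the time of this issue.
	for _, name := range []string{"b", "c"} {
		m := &unstructured.Unstructured{}
		m.SetGroupVersionKind(schema.GroupVersionKind{
			Group:   "cluster.x-k8s.io",
			Version: "v1alpha3",
			Kind:    "Machine",
		})
		if err := c.Get(ctx, client.ObjectKey{Namespace: "default", Name: name}, m); err != nil {
			panic(err)
		}
		anns := m.GetAnnotations()
		if anns == nil {
			anns = map[string]string{}
		}
		anns["cluster.x-k8s.io/delete-machine"] = ""
		m.SetAnnotations(anns)
		if err := c.Update(ctx, m); err != nil {
			panic(err)
		}
	}
	// Afterwards: kubectl scale kcp example --replicas=1
}
```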

/kind feature

area/control-plane kind/feature lifecycle/active priority/important-longterm

All 14 comments

/cc @vincepri
/cc @furkatgofurov7

/milestone v0.4.0
/priority important-longterm

One example use case comes from Metal3: I have control plane Machines a, b, c, and for some reason the underlying bare metal servers for Machines b and c are broken. Now we want to remove those two Machines b and c (or bare metal servers, in our case) from the cluster. To do that, we need a way to inform KCP that it should delete Machines b and c during scale down.

Is this currently being worked on? I would like to pick this one up.
cc @vincepri

/area control-plane

If no one is working on this, I am interested in picking it up.

@ncdc @fabriziopandini

@furkatgofurov7 sure thing!
You can assign yourself and add /lifecycle active to signal to other contributors that you are actively working on this. See https://github.com/kubernetes-sigs/cluster-api/blob/master/CONTRIBUTING.md#contributing-a-patch

/assign
/lifecycle active

@fabriziopandini what do you think of this approach? Would that direction be the best way to fix this issue? Any high-level pointers and suggestions on how this should properly be tackled are very much appreciated. Thanks!

I have seen a couple of PRs/issues implementing/discussing why the KubeadmControlPlane controller shouldn't rely on annotations to operate, and that we should avoid annotations as much as possible in the first place. It would be good to hear your thoughts on this, @benmoss @vincepri.

@fabriziopandini what do you think of this approach?

IMO this approach is ok; possibly:

  • make sure the oldest Machine is deleted when more than one Machine is marked for deletion (see the sketch after this list)
  • add unit tests
  • update the KCP proposal with a note describing this behavior for scale down
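
A rough sketch of what such a selection could look like, using a local stand-in type instead of the real cluster-api Machine object; the oldest-overall fallback when nothing is marked is just one plausible default for this sketch, not necessarily what KCP does:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

const deleteMachineAnnotation = "cluster.x-k8s.io/delete-machine"

// machine is a local stand-in for the cluster-api Machine object.
type machine struct {
	Name              string
	Annotations       map[string]string
	CreationTimestamp time.Time
}

func marked(m machine) bool {
	_, ok := m.Annotations[deleteMachineAnnotation]
	return ok
}

// nextMachineToDelete picks the Machine to remove on scale down: the oldest
// annotated Machine if any are marked, otherwise the oldest Machine overall
// (an assumed fallback for this sketch).
func nextMachineToDelete(machines []machine) (machine, bool) {
	if len(machines) == 0 {
		return machine{}, false
	}
	candidates := make([]machine, 0, len(machines))
	for _, m := range machines {
		if marked(m) {
			candidates = append(candidates, m)
		}
	}
	if len(candidates) == 0 {
		candidates = append(candidates, machines...)
	}
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].CreationTimestamp.Before(candidates[j].CreationTimestamp)
	})
	return candidates[0], true
}

func main() {
	now := time.Now()
	machines := []machine{
		{Name: "a", CreationTimestamp: now.Add(-3 * time.Hour)},
		{Name: "b", CreationTimestamp: now.Add(-2 * time.Hour),
			Annotations: map[string]string{deleteMachineAnnotation: ""}},
		{Name: "c", CreationTimestamp: now.Add(-1 * time.Hour),
			Annotations: map[string]string{deleteMachineAnnotation: ""}},
	}
	next, _ := nextMachineToDelete(machines)
	fmt.Println("delete next:", next.Name) // b: the oldest marked Machine
}
```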

controller shouldn't rely on annotations to operate

I fully agree that controllers should not use annotations for storing their own internal state.
However, in this case we are using an annotation to let the user express their intent (to tweak some internal controller logic), so IMO we are fine.

@fabriziopandini thanks for the suggestions/pointers on the draft version of the implementation approach. Would you mind telling me a bit more about which KCP proposal you are referring to? Are you referring to this one (KCP scale out during the upgrade)?

That is a scale-in-specific update to the KCP proposal. You should create another PR to update the KCP proposal for this feature.
