kubeadm now ships a great new HA feature, but plenty of cases are not covered automatically and still require manual preparation. One such case:
I have 3 master nodes.
Then I terminate the first node (its volume is destroyed too),
and start provisioning a new master node with the same IP address (immutable deployment, for example to upgrade the Kubernetes version).
So I need to remove the old etcd member before running the join, because a member with the same name and IP is still registered and the etcd health check would fail.
Currently, to solve this problem, I need to install the etcd client tool and generate certificates (if they are missing). Example: https://github.com/jetthoughts/infrastructure/blob/cde632f0c12f1ec16816dcffd50740c2679e715a/modules/k8s_master/data/master_init.tpl.sh#L35
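To make the manual step concrete, here is a minimal sketch of the lookup it involves: find the stale member in the output of `etcdctl member list`, then build the `etcdctl member remove` command for it. It is not taken from the linked script. The helper names (`parse_member_list`, `find_stale_member`, `build_remove_command`) are hypothetical, and the parser assumes the default comma-separated output of etcdctl v3 (ID, status, name, peer URLs, ...), which may differ between etcd versions.

```python
from typing import List, NamedTuple, Optional


class Member(NamedTuple):
    member_id: str
    status: str
    name: str
    peer_urls: List[str]


def parse_member_list(output: str) -> List[Member]:
    """Parse lines printed by `etcdctl member list` (assumed v3 simple format)."""
    members = []
    for line in output.splitlines():
        # Fields are assumed to be separated by ", ": ID, status, name, peer URLs, ...
        fields = [field.strip() for field in line.split(", ")]
        if len(fields) < 4:
            continue  # skip blank or unexpected lines
        member_id, status, name, peers = fields[:4]
        members.append(Member(member_id, status, name, peers.split(",")))
    return members


def find_stale_member(members: List[Member], node_name: str,
                      peer_url: str) -> Optional[Member]:
    """Return the first member that reuses the new node's name or peer URL."""
    for member in members:
        if member.name == node_name or peer_url in member.peer_urls:
            return member
    return None


def build_remove_command(member_id: str, endpoint: str,
                         cacert: str = "/etc/kubernetes/pki/etcd/ca.crt",
                         cert: str = "/etc/kubernetes/pki/etcd/peer.crt",
                         key: str = "/etc/kubernetes/pki/etcd/peer.key") -> List[str]:
    """Build the argument list for `etcdctl member remove`.

    The certificate paths are assumptions based on a stacked-etcd kubeadm
    layout; on a freshly provisioned node they may have to be generated first.
    Older etcdctl builds also need ETCDCTL_API=3 in the environment.
    """
    return [
        "etcdctl",
        "--endpoints=" + endpoint,
        "--cacert=" + cacert,
        "--cert=" + cert,
        "--key=" + key,
        "member", "remove", member_id,
    ]
```

The endpoint has to point at a surviving member (one of the two remaining masters), since the terminated node is gone. The resulting argument list can be passed to `subprocess.run(..., check=True)` before `kubeadm join` runs.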
There is an existing merged feature, but it only applies to kubeadm reset: https://github.com/kubernetes/kubernetes/pull/74112. I would like to reuse it.
There are multiple solutions:
Some thoughts and discussion on this topic are in the thread of the same PR: https://github.com/kubernetes/kubernetes/pull/74112#issuecomment-467616897
/assign @yagonobre
I'm out of office due to a conference, but I'll start working on it tomorrow.
I created this doc to discuss the phases split: https://docs.google.com/document/d/1r13T8X7IE-FJOnjjDP5YP6Ekbj3Q2Es3Md_gEbwyy88/edit
/lifecycle active
There are a few other problems with reset too:
We might consider fixing everything along with phases in some of the next cycles.
@yagonobre @rosti
I think this issue raises a point that should be considered before proceeding.
kubeadm commands (and the related phases) are designed to affect only the node where the command is executed, while the request from @miry (if I got this right) is to run kubeadm on machine A to clean up an etcd member created on machine B (with machine B actually lost).
What is the impact on UX of this new behaviour (e.g. additional flags or a different command)?
Is it possible to achieve those cleanups without access to the machine file system?
The reset phases should continue to operate on the same node.