Kubeadm: Proposal for kubeadm reset phases

Created on 26 Feb 2019 · 5 Comments · Source: kubernetes/kubeadm

With the release of the new awesome HA feature in kubeadm, there are plenty of cases that are not covered automatically and require manual preparation. One of these cases:

I have 3 master nodes.
Then I terminated the first node (its volume was terminated too),
and started provisioning a new master node with the same IP address (immutable deployment, for example to upgrade the Kubernetes version).

So I need to remove the old etcd member before running join, because the old member with the same name and IP is still registered and the etcd health check would fail.

Currently, to solve this problem, I need to install the etcd client tool and generate certificates (if they are missing). Example: https://github.com/jetthoughts/infrastructure/blob/cde632f0c12f1ec16816dcffd50740c2679e715a/modules/k8s_master/data/master_init.tpl.sh#L35
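To illustrate the manual cleanup described above, here is a minimal sketch that picks the stale member out of `etcdctl member list` output and prints the matching `etcdctl member remove` command. It assumes the simple output format with fields separated by a comma and a space, in the order ID, status, name, peer URLs; the node names, IDs, and addresses in `SAMPLE` are made up for the example and do not come from the linked script.

```python
from typing import List, NamedTuple, Optional


class Member(NamedTuple):
    member_id: str
    status: str
    name: str
    peer_url: str


def parse_member_list(output: str) -> List[Member]:
    """Parse the simple output of `etcdctl member list` (assumed field order)."""
    members = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split(", ")]
        if len(fields) < 4:
            continue  # skip blank or unexpected lines
        members.append(Member(fields[0], fields[1], fields[2], fields[3]))
    return members


def find_stale_member(
    members: List[Member], name: str, peer_url: str
) -> Optional[Member]:
    """Return the first member that matches the node name or the peer URL."""
    for member in members:
        if member.name == name or member.peer_url == peer_url:
            return member
    return None


# Made-up sample output; real IDs, names, and addresses will differ.
SAMPLE = """\
8211f1d0f64f3269, started, master-1, https://10.0.0.11:2380, https://10.0.0.11:2379
91bc3c398fb3c146, started, master-2, https://10.0.0.12:2380, https://10.0.0.12:2379
fd422379fda50e48, started, master-3, https://10.0.0.13:2380, https://10.0.0.13:2379
"""

stale = find_stale_member(
    parse_member_list(SAMPLE), "master-1", "https://10.0.0.11:2380"
)
if stale is not None:
    # The command is only printed here; run it against a healthy member.
    print(f"etcdctl member remove {stale.member_id}")
```

In practice the listing would come from a surviving master, with the usual endpoint and certificate flags passed to etcdctl, which is exactly the extra setup this issue would like to avoid.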

There is an existing merged feature, but it applies to kubeadm reset only: https://github.com/kubernetes/kubernetes/pull/74112. I would like to reuse it.

There are multiple solutions:

  • Build a kubeadm reset phases workflow, similar to the init and join phases (just to spare users from installing and using etcdctl).
  • Fix joining a master to an HA cluster when a fresh instance reuses an old IP.
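As a rough sketch of what the first option means, a phased workflow is an ordered list of named steps that can be run all together or with some of them skipped. The runner below is a simplified illustration in Python, not kubeadm's actual Go implementation; the phase names and the placeholder actions are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Phase = Tuple[str, Callable[[], None]]


def run_phases(phases: List[Phase], skip: Iterable[str] = ()) -> List[str]:
    """Run phases in order, skipping the named ones; return the names that ran."""
    skip_set = set(skip)
    unknown = skip_set - {name for name, _ in phases}
    if unknown:
        raise ValueError(f"unknown phases: {sorted(unknown)}")
    executed = []
    for name, action in phases:
        if name in skip_set:
            continue
        action()
        executed.append(name)
    return executed


# Hypothetical phase names and placeholder actions, for illustration only.
log: List[str] = []
RESET_PHASES: List[Phase] = [
    ("preflight", lambda: log.append("checked preconditions")),
    ("remove-etcd-member", lambda: log.append("removed local etcd member")),
    ("cleanup-node", lambda: log.append("cleaned node state")),
]

# Run only the steps up to the etcd member removal and leave the node untouched.
ran = run_phases(RESET_PHASES, skip=["cleanup-node"])
print(ran)
```

With such a split, a user who only needs the etcd member removal could invoke that single step instead of installing etcdctl separately.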

Some thoughts and communication about the topic are in the thread on the same PR: https://github.com/kubernetes/kubernetes/pull/74112#issuecomment-467616897

area/UX kind/design lifecycle/active priority/important-longterm

All 5 comments

/assign @yagonobre

I'm out of office due to a conference, but I'll start to work on it tomorrow.
I created this doc to discuss the phases split: https://docs.google.com/document/d/1r13T8X7IE-FJOnjjDP5YP6Ekbj3Q2Es3Md_gEbwyy88/edit

/lifecycle active

There are a few other problems with reset too:

  • Needs to be made Windows friendly (extract non-Windows code from the main logic).
  • Test coverage is insufficient.
  • Code needs some structure to it. Currently we have a ~80 LOC function that does most of the stuff in it.

We might consider fixing everything along with phases in some of the next cycles.

@yagonobre @rosti
I think that this issue has a catch that should be considered before proceeding.
Kubeadm commands (and related phases) are designed to affect the node where the command is executed, while the request from @miry (if I got this right) is to run kubeadm on machine A to clean up an etcd member created on machine B (with machine B actually lost).

What is the impact on UX of this new behaviour (e.g. additional flags or a different command)?
Is it possible to achieve those cleanups without access to the machine file system?

The reset phases should continue to operate on the same node.
