Current inventory:
[kube-master]
master1
master2
master3
[etcd]
master1
master2
master3
[kube-node]
node1
node2
node3
After adding the new masters (master4, master5):
[kube-master]
master1
master2
master3
master4
master5
[etcd]
master1
master2
master3
master4
master5
[kube-node]
node1
node2
node3
Now I have 3 masters/etcd and 45 nodes. I've already referenced #1122 but couldn't fix it. Extending etcd succeeded, but extending the masters failed. kubectl shows this error:
Unable to connect to the server: x509: certificate is valid for "new master ip"
And my extend command is:
ansible-playbook -i inventory/mycluster/host.ini cluster.yml -l master1,master2,master3,master4,master5
My Kubernetes cluster version is 1.9.3. How can I fix this?
The feature of scaling master nodes seems imperfect, but it is possible to scale the etcd cluster separately. To do so, just add the etcd nodes under [etcd] and rerun cluster.yml.
I'm facing the same issue with adding new masters. I'm using Kubespray v2.10.x and the reason it fails is that Kubespray does not update the apiserver certificates to add the new master to the SAN list.
You can check your certificate with
openssl x509 -text -noout -in /etc/kubernetes/ssl/apiserver.crt
... and the new master IP and hostname should be listed in the Subject Alternative Name section.
X509v3 Subject Alternative Name:
DNS:infra00-lab, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:infra00-lab, DNS:lb-apiserver.kubernetes.local, IP Address:10.233.0.1, IP Address:172.31.134.110, IP Address:172.31.134.110, IP Address:10.233.0.1, IP Address:127.0.0.1, IP Address:172.31.134.110
The execution of cluster.yml adds the new master IP and hostname to /etc/kubernetes/kubeadm-config.yaml as expected. It seems, however, that Kubespray is not calling kubeadm to replace the certificate before trying to join the new master node. We fixed this by using kubeadm manually to recreate the certificate.
NOTE: This works for v2.10.x. I never tested it on older versions of Kubespray.
On your first master, recreate the apiserver certificate:
cd /etc/kubernetes/ssl
mv apiserver.crt apiserver.crt.old
mv apiserver.key apiserver.key.old
cd /etc/kubernetes
kubeadm init phase certs apiserver --config kubeadm-config.yaml
If you are doing this after you ended up with a broken master, be sure to run reset.yml with --limit=<broken_master_hostname> before continuing. If you take the precaution of recreating the certificate before adding the new master node, you won't need this.
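As a sketch, the reset run limited to the broken master might look like this (the inventory path and hostname are illustrative, not from the original post):

```shell
# Reset only the broken master; everything after -i and --limit= is an example,
# substitute your own inventory file and the broken master's hostname.
ansible-playbook -i inventory/mycluster/hosts.ini reset.yml --limit=broken-master-1
```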
Run cluster.yml to include the new master node. You should end up with a working cluster.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Is it possible to add a master, or replace a failed master with a new one?
You should be able to. In the past, we managed to replace all nodes in the cluster: master, etcd and workers. But... there are some missteps you need to be careful of along the way. After a lot of experiments and retries in our lab environment, we came up with a few guidelines.
Adding/replacing a master node
1) Recreate the apiserver certs manually to include the new master node in the cert SAN field.
For some reason, Kubespray will not update the apiserver certificate.
Edit /etc/kubernetes/kubeadm-config.yaml and include the new host in the certSANs list.
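As a sketch, the relevant part of kubeadm-config.yaml might look like this (the hostnames, IPs, and v1beta1 layout are illustrative; older kubeadm config versions use a top-level apiServerCertSANs key instead):

```yaml
# Illustrative fragment only; your file will contain many more fields.
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  certSANs:
    - kubernetes
    - kubernetes.default
    - master1
    - master4           # new master hostname
    - 172.31.134.110
    - 172.31.134.111    # new master IP
```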
Use kubeadm to recreate the certs.
cd /etc/kubernetes/ssl
mv apiserver.crt apiserver.crt.old
mv apiserver.key apiserver.key.old
cd /etc/kubernetes
kubeadm init phase certs apiserver --config kubeadm-config.yaml
Check the certificate; the new host needs to be there:
openssl x509 -text -noout -in /etc/kubernetes/ssl/apiserver.crt
2) Run cluster.yml
Add the new host to the inventory and run cluster.yml.
3) Restart kube-system/nginx-proxy
On all hosts, restart the nginx-proxy pod. This pod is a local proxy for the apiserver. Kubespray will update its static config, but it needs to be restarted in order to reload it.
# run in every host
docker ps | grep k8s_nginx-proxy_nginx-proxy | awk '{print $1}' | xargs docker restart
4) Remove old master nodes
If you are replacing a node, remove the old one from the inventory and remove it from the cluster runtime.
kubectl drain --force --ignore-daemonsets --grace-period 300 --timeout 360s --delete-local-data NODE_NAME
kubectl delete node NODE_NAME
After that, the old node can be safely shut down. Also, make sure to restart nginx-proxy on all remaining nodes (step 3).
From any active master that remains in the cluster, re-upload kubeadm-config.yaml
kubeadm config upload from-file --config /etc/kubernetes/kubeadm-config.yaml
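You can check that the upload took effect by inspecting the kubeadm-config ConfigMap (this is the standard kubeadm location, also mentioned later in this thread):

```shell
# Dump the cluster-wide kubeadm configuration and confirm the new
# master appears in the certSANs list.
kubectl -n kube-system get cm kubeadm-config -o yaml
```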
Adding/replacing a worker node
This should be the easiest.
1) Add the new node to the inventory.
2) Run upgrade-cluster.yml
You can use --limit=node1 to avoid disturbing other nodes in the cluster.
3) Drain the node that will be removed
kubectl drain --force=true --grace-period=10 --ignore-daemonsets=true --timeout=0s --delete-local-data NODE_NAME
4) Run the remove-node.yml playbook
With the old node still in the inventory, run remove-node.yml. You need to pass -e node=NODE_NAME to limit the execution to the node being removed.
5) Remove the node from the inventory
That's it.
Adding/Replacing an etcd node
You need to make sure there is always an odd number of etcd nodes in the cluster. In that way, this is always a replace or scale-up operation: either add two new nodes or remove an old one.
1) Add the new node running cluster.yml
Update the inventory and run cluster.yml, passing --limit=etcd,kube-master -e ignore_assert_errors=yes.
Run upgrade-cluster.yml also passing --limit=etcd,kube-master -e ignore_assert_errors=yes. This is necessary to update all etcd configuration in the cluster.
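The two playbook runs above might look like this (the inventory path is an example, not from the original comment):

```shell
# Scale up etcd; the inventory path is illustrative.
ansible-playbook -i inventory/mycluster/hosts.ini cluster.yml \
  --limit=etcd,kube-master -e ignore_assert_errors=yes

# Propagate the new etcd configuration to the whole cluster.
ansible-playbook -i inventory/mycluster/hosts.ini upgrade-cluster.yml \
  --limit=etcd,kube-master -e ignore_assert_errors=yes
```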
At this point, you will have an even number of nodes. Everything should still be working, and you should only have problems if the cluster decides to elect a new etcd leader before you remove a node. Even so, running applications should continue to be available.
2) Remove an old etcd node
With the node still in the inventory, run remove-node.yml, passing -e node=NODE_NAME with the name of the node that should be removed.
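A full remove-node.yml invocation might look like this (the inventory path and node name are examples):

```shell
# Remove one node from the cluster; substitute your inventory file
# and the actual node name for the placeholders.
ansible-playbook -i inventory/mycluster/hosts.ini remove-node.yml -e node=NODE_NAME
```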
3) Make sure the remaining etcd members have their config updated
On each etcd host that remains in the cluster:
cat /etc/etcd.env | grep ETCD_INITIAL_CLUSTER
Only active etcd members should be in that list.
4) Remove old etcd members from the cluster runtime
Acquire a shell prompt in one of the etcd containers and use etcdctl to remove the old member.
# list all members
etcdctl member list
# remove old member
etcdctl member remove MEMBER_ID
# careful!!! if you remove a wrong member you will be in trouble
# note: these command lines are actually much bigger, since you need to pass all certificates to etcdctl.
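As noted above, the real command lines are longer because etcdctl needs the certificates. A sketch of a full invocation, assuming the etcd v3 API and Kubespray's usual certificate layout under /etc/ssl/etcd/ssl (your paths and endpoint may differ):

```shell
# Run on an etcd host (or inside the etcd container). All paths below are
# typical Kubespray defaults and should be verified in your environment.
export ETCDCTL_API=3
etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/member-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/member-$(hostname)-key.pem \
  member list

# Then remove the stale member by the ID shown in the list output.
# Careful: removing the wrong member will break the cluster.
# etcdctl ... member remove MEMBER_ID
```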
5) Make sure the apiserver config is correctly updated
On every master node, edit /etc/kubernetes/manifests/kube-apiserver.yaml. Make sure only active etcd nodes are still present in the apiserver command-line parameter --etcd-servers=....
6) Shut down the old instance
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Can https://github.com/kubernetes-sigs/kubespray/blob/48a182844c9c3438e36c78cbc4518c962e0a9ab2/docs/recover-control-plane.md be applied for adding new master/etcd nodes? @qvicksilver
@yujunz Not sure, haven't really tried that use case. Also I'm a bit unsure of the state of that playbook. Haven't had time to add it to CI. But please do try.
The procedure to add\remove masters belongs in the readme, not hidden away in a comment in this issue.
To be sure everybody sees this: it went in via PR #5570 and you can now find it here: https://kubespray.io/#/docs/nodes
docker ps | grep k8s_nginx-proxy_nginx-proxy | awk '{print $1}' | xargs docker restart
I think this line doesn't work anymore; there is no k8s_nginx-proxy_nginx-proxy pod.
Hello!
I have some issues with these commands:
quersys@node1:/etc/kubernetes$ sudo kubeadm init phase certs apiserver --config kubeadm-config.yaml
W0810 11:08:48.479307 31818 utils.go:26] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0810 11:08:48.479525 31818 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[certs] Using existing apiserver certificate and key on disk
What version of K8s are you using? It's been almost a year since I posted. Did something change in kubeadm since then?
I would start by searching for official instructions on how to renew and recreate certs.
quersys@node1:/etc/kubernetes$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:51:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
I tried to find information for about 6 hours :(
W0810 11:08:48.479307 31818 utils.go:26] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0810 11:08:48.479525 31818 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
Those look like warnings. Most people seem to ignore them. Are you sure no error messages appear as well? Does it hang and never return? If that's the case, I'd wait for a timeout to hopefully get some actual error messages.
Yeah, I also get a timeout error from my node when I run cluster.yml and try to add the node as a master.
Sounds like a connectivity problem, or something that leads to one. If you can provide further logs and relevant messages, that would be helpful.
Thanks!!!
I have my new node IP in apiserver.crt:
DNS:node1, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:node1, DNS:node3, DNS:lb-apiserver.kubernetes.local, DNS:node1.cluster.local, DNS:node3.cluster.local, IP Address:10.233.0.1, IP Address:172.26.1.225, IP Address:172.26.1.225, IP Address:10.233.0.1, IP Address:127.0.0.1, IP Address:172.26.1.225, IP Address:172.26.1.130
but when I run ansible-playbook -i inventory/quersyscluster/hosts.yml cluster.yml I get a connection timeout.
Please post relevant log messages for more context. At this level, "connection timeout" is a broad error message.
Hi,
I am interested in replacing the first master (and the others) in a Kubernetes cluster using the Kubespray scripts. Is it possible?
Story:
I built a k8s cluster using the Kubespray scripts on OpenStack with an old CentOS 7 image. Next I want to upgrade the OS, e.g. from 7.7 to 7.8. I have a newer OS image prepared on OpenStack, and I am able to deploy new masters and new workers with it. But there is a problem with the first master: I need to delete the whole VM and bring up a new one with the new OS. Did you have a similar problem?
I tried to force master2 to be the first one, but when I run the join task on a new master (e.g. master4), it looks like kubeadm still wants to connect to master1 (6.0.1.57):
kubeadm join --config kubeadm-controlplane.yaml --ignore-preflight-errors=all
W1028 12:37:17.050916 1666 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING FileExisting-ebtables]: ebtables not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Get https://6.0.1.57:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s: dial tcp 6.0.1.57:6443: connect: no route to host
To see the stack trace of this error execute with --v=5 or higher
Present state, e.g.:
master1, centos7.7
master2, centos7.7
master3, centos7.7
worker1, centos7.7
worker2, centos7.7
worker3, centos7.7
Expected:
master2, centos7.8 - master2 becomes the first one
master3, centos7.8
master4, centos7.8
worker1, centos7.8
worker2, centos7.8
worker3, centos7.8
How did you manage to recreate the first master?
@juliohm1978, maybe you can help?
Thanks!