There should be an option for rke up to update kubelet iteratively, one node at a time and only when all nodes are Ready.
RKE version: 0.1.15
Docker version: (docker version, docker info preferred) 17.03.2-ce
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) bare-metal
It would be great to drain a node before upgrading it, and to support PodDisruptionBudget to avoid disruption.
This PR should make it easier to drain a node:
https://github.com/kubernetes/kubernetes/pull/72827
Meanwhile, do you think that manually draining nodes and running rke up on a partial cluster.yml is a way to achieve a no-downtime update?
One of the reasons to do that is https://github.com/kubernetes/kubernetes/issues/74669: kubelet might fail to start, so the current worker plane upgrade can cause multiple node failures.
Another option, along the lines of what @remche suggested, is to have an additional CLI option that limits worker plane upgrades to particular nodes. People can then choose to drain before the upgrade if desired.
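For anyone trying the manual route in the meantime, here is a minimal sketch of the per-node sequence. It only builds the command lists and does not run anything. The function name and the upgrade_cmd parameter are hypothetical; whatever limits the upgrade to a single node (for example rke up against a partial cluster.yml) is left to the caller, and this thread does not confirm that the approach gives a no-downtime update. kubectl drain evicts pods through the eviction API, so PodDisruptionBudgets are honored during the drain.

```python
from typing import List


def drain_upgrade_plan(node: str, upgrade_cmd: List[str]) -> List[List[str]]:
    """Return the commands for one node, in order: drain, upgrade, uncordon.

    upgrade_cmd is supplied by the caller (hypothetical placeholder for
    whatever restricts the upgrade to this node).
    """
    return [
        # Cordon the node and evict its pods; DaemonSet pods are skipped.
        ["kubectl", "drain", node, "--ignore-daemonsets"],
        # Caller-supplied upgrade step for this node.
        list(upgrade_cmd),
        # Make the node schedulable again once the upgrade succeeded.
        ["kubectl", "uncordon", node],
    ]


if __name__ == "__main__":
    for cmd in drain_upgrade_plan("node1", ["rke", "up"]):
        print(" ".join(cmd))
```

Each command should be checked for success before moving on, and the next node should only be started once all nodes report Ready again.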
I can confirm that we've encountered kubernetes/kubernetes#74669 when trying to upgrade the k8s version with rke 0.1.17. One of the nodes failed to start the kubelet while workloads were still running on it (and were simultaneously being recreated on the surviving nodes). Ultimately, this led to loss of data on rook-ceph block volumes that weren't being unmounted on the failed node. I'd very much appreciate it if someone could suggest workarounds to perform a rolling upgrade while this enhancement is still not on the horizon.
@clkao @remche and anyone else looking for a solution:
Specify the ssh keys for access to nodes on a per-node basis in the cluster.yml, and comment out the key for every node that should not be touched yet:
nodes:
  - node1
    ...
    # ssh_key_path: something
    ...
  - nodeN
    ...
    ssh_key_path: something
Then run rke up, uncomment one more ssh_key_path, run rke up again, and so on until all nodes are updated.
In this case the k8s components on the untouched nodes are not restarted. Specifically, a node that is subjected to rke up for the first time restarts all components. Nodes that have the ssh key commented out are untouched. A node that is subjected to rke up for a second or later time restarts etcd, etcd-rolling-snapshots and kube-apiserver. This seems to be a workable, though hacky, way to achieve a rolling upgrade.
We should add an upgrade strategy to all relevant k8s system components. Kubelet and kube-proxy are the most critical ones.
https://github.com/rancher/rke/pull/1800
This PR adds the required changes.
It upgrades the control plane components one at a time and worker components such as kubelet/kube-proxy in user-configurable batches.
Users can optionally drain nodes before the upgrade.
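To illustrate what "user-configurable batches" means in practice, here is a minimal sketch of how a maximum-unavailable setting, given either as a count or as a percentage, could be turned into upgrade batches. This is not the code from the PR. The function names are hypothetical, and rounding percentages down with a minimum of one node is an assumption made for the example.

```python
from typing import List, Union


def batch_size(max_unavailable: Union[int, str], total_nodes: int) -> int:
    """Resolve a count (e.g. 1) or a percentage string (e.g. "25%") to a node count."""
    if isinstance(max_unavailable, str) and max_unavailable.endswith("%"):
        percent = int(max_unavailable[:-1])
        # Assumption: percentages round down to whole nodes.
        size = total_nodes * percent // 100
    else:
        size = int(max_unavailable)
    # Assumption: always upgrade at least one node, never more than exist.
    return max(1, min(size, total_nodes))


def upgrade_batches(nodes: List[str], max_unavailable: Union[int, str]) -> List[List[str]]:
    """Split worker nodes into consecutive batches of at most batch_size nodes."""
    if not nodes:
        return []
    size = batch_size(max_unavailable, len(nodes))
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


if __name__ == "__main__":
    workers = [f"worker{i}" for i in range(1, 9)]
    for batch in upgrade_batches(workers, "25%"):
        print(batch)
```

With eight workers and 25%, each batch holds two nodes. With a value of 1, nodes are upgraded strictly one at a time, which matches how the control plane is described above.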
Tested with RKE version v1.1.0-rc6
Upgrade strategy = rollingUpdate
maxUnavailable=1
Upgrade strategy = rollingUpdate
maxUnavailable=1
Upgrade strategy = rollingUpdate
maxUnavailable=1 and maxSurge=25%
Upgrade strategy = rollingUpdate
maxUnavailable=25% and maxSurge=25%