RKE: Rolling upgrade for k8s system components

Created on 1 Feb 2019 · 8 Comments · Source: rancher/rke

There should be an option for rke up to update kubelet iteratively, one node at a time and only when all nodes are Ready.
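The requested behavior can be pictured as a small control loop. This is only an illustration of the idea, not RKE code: `upgrade_node` and `all_nodes_ready` are hypothetical callables supplied by the caller (for example, wrappers around a kubelet restart and a node readiness query), and the polling interval and timeout are assumed values.

```python
import time
from typing import Callable, Iterable, List


def rolling_upgrade(
    nodes: Iterable[str],
    upgrade_node: Callable[[str], None],   # hypothetical: upgrades kubelet on one node
    all_nodes_ready: Callable[[], bool],   # hypothetical: True when every node is Ready
    poll_seconds: float = 5.0,             # assumed polling interval
    timeout_seconds: float = 600.0,        # assumed wait limit before each node
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Upgrade nodes one at a time, only starting a node when the cluster is Ready."""
    upgraded: List[str] = []
    for node in nodes:
        waited = 0.0
        # Gate: do not touch the next node until all nodes report Ready.
        while not all_nodes_ready():
            if waited >= timeout_seconds:
                raise TimeoutError(
                    f"cluster not Ready before upgrading {node}; upgraded so far: {upgraded}"
                )
            sleep(poll_seconds)
            waited += poll_seconds
        upgrade_node(node)
        upgraded.append(node)
    return upgraded
```

With this shape, a node that fails to come back Ready stops the loop before any further node is touched, which is the property the request is after.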

RKE version: 0.1.15

Docker version: (docker version, docker info preferred) 17.03.2-ce

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) bare-metal

cluster.yml file:

Steps to Reproduce:

Results:

internal kind/enhancement

All 8 comments

It would be great to drain a node before upgrading it, and to support PodDisruptionBudget to avoid disruption!

This will make it easier to drain a node:

https://github.com/kubernetes/kubernetes/pull/72827

Meanwhile, do you think that manually draining nodes and running rke up on a partial cluster.yml is a way to achieve a no-downtime update?

One reason to do that is https://github.com/kubernetes/kubernetes/issues/74669: kubelet might fail to start, so the current worker plane upgrade can cause multiple node failures.

Another option, as @remche suggested, is to have an additional CLI option that limits worker plane upgrades to particular nodes. People could then choose to drain those nodes before the upgrade if desired.

I can confirm that we encountered kubernetes/kubernetes#74669 when trying to upgrade the k8s version with rke 0.1.17. One of the nodes failed to start the kubelet while its workloads were still running (and were simultaneously being recreated on the surviving nodes). Ultimately, this led to data loss on rook-ceph block volumes that were not unmounted on the failed node. I'd very much appreciate it if someone could suggest workarounds for performing a rolling upgrade while this enhancement is not yet on the horizon.

@clkao @remche and anyone else looking for a solution:
One can specify the ssh keys for access to nodes on a per-node basis in the cluster.yml, leaving some of them commented out:

nodes:
- node1
  ...
  # ssh_key_path: something
...
- nodeN
  ...
  ssh_key_path: something

Then run rke up, uncomment one more ssh_key_path, run rke up again, and so on until all nodes are updated.
This way the k8s components are not restarted on every node at once. Specifically, a node that is subjected to rke up for the first time restarts all of its components. Nodes that still have the ssh key commented out are untouched. A node that is subjected to rke up for a second or later time restarts etcd, etcd-rolling-snapshots and kube-apiserver. This seems to be a workable, though hacky, way to achieve a rolling upgrade.
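The sequence of edits in this workaround could be generated rather than made by hand. The sketch below only models which nodes have `ssh_key_path` enabled in each round. The function name, the node names and the key path are placeholders, the `address` key is used here just to label each node, and nothing in it writes a cluster.yml or invokes rke.

```python
from typing import Dict, Iterator, List


def staged_node_configs(addresses: List[str], key_path: str) -> Iterator[List[Dict[str, str]]]:
    """Yield one node list per round; round k enables ssh_key_path on the first k nodes."""
    for enabled in range(1, len(addresses) + 1):
        round_nodes = []
        for index, address in enumerate(addresses):
            node = {"address": address}
            if index < enabled:
                # Per the comment above, only nodes with a key are touched by `rke up`.
                node["ssh_key_path"] = key_path
            round_nodes.append(node)
        yield round_nodes
```

Each yielded list corresponds to one `rke up` run, so for N nodes there are N rounds and exactly one node is upgraded for the first time in each.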

We should add an upgrade strategy to all relevant k8s system components. Kubelet and kube-proxy are the most critical ones.

https://github.com/rancher/rke/pull/1800
This PR adds the required changes.
It upgrades the controlplane components one at a time and worker components such as kubelet/kube-proxy in user-configurable batches.
Users can optionally drain nodes before the upgrade.
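The ordering described in this comment can be pictured roughly as follows. This is a sketch and not the code from the PR: the function names, the handling of percentages (rounded down, with a minimum of one node per batch) and the example values are assumptions made for illustration.

```python
from typing import List, Union


def resolve_batch_size(max_unavailable: Union[int, str], node_count: int) -> int:
    """Turn an absolute number or a percentage string such as "10%" into a batch size."""
    if isinstance(max_unavailable, str) and max_unavailable.endswith("%"):
        percent = int(max_unavailable[:-1])
        size = node_count * percent // 100  # assumption: percentages round down
    else:
        size = int(max_unavailable)
    return max(1, size)  # assumption: always upgrade at least one node per batch


def plan_upgrade(
    controlplane: List[str],
    workers: List[str],
    max_unavailable_worker: Union[int, str],
) -> List[List[str]]:
    """Return upgrade batches in order: controlplane nodes singly, then workers in batches."""
    batches = [[node] for node in controlplane]
    size = resolve_batch_size(max_unavailable_worker, len(workers))
    for start in range(0, len(workers), size):
        batches.append(workers[start:start + size])
    return batches
```

For example, two controlplane nodes and five workers with a 40% worker limit would give two single-node batches followed by worker batches of two, two and one. Draining, when enabled, would fit in as a per-node step just before that node is upgraded.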

Tested with RKE version v1.1.0-rc6

  1. Verified the default upgrade strategy values for the ingress addon:
     Upgrade strategy = rollingUpdate
     maxUnavailable=1
  2. Verified the default upgrade strategy values for the networking addon:
     Upgrade strategy = rollingUpdate
     maxUnavailable=1
  3. Verified the default upgrade strategy values for the DNS addon:
     Upgrade strategy = rollingUpdate
     maxUnavailable=1 and maxSurge=25%
  4. Verified the default upgrade strategy values for the metrics server addon:
     Upgrade strategy = rollingUpdate
     maxUnavailable=25% and maxSurge=25%
  5. Verified that changes to the maxUnavailable and maxSurge fields in an addon's upgrade strategy are applied correctly. Verified using kubectl commands.
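For the percentage values listed above, it may help to see what they mean in pods. The sketch below applies the rounding that Kubernetes documents for a Deployment's rolling update (maxUnavailable rounds down, maxSurge rounds up). The replica counts in the usage note are made-up examples, and the helper names are not from RKE or Kubernetes.

```python
from typing import Tuple, Union


def to_absolute(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Convert an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating point surprises when rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(
    replicas: int,
    max_unavailable: Union[int, str],
    max_surge: Union[int, str],
) -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    unavailable = to_absolute(max_unavailable, replicas, round_up=False)
    surge = to_absolute(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge
```

With maxUnavailable=25% and maxSurge=25%, a hypothetical 4-replica Deployment would stay between 3 available and 5 total pods during the rollout. With maxUnavailable=1 and maxSurge=25%, a hypothetical 2-replica Deployment would stay between 1 and 3.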