Currently in kops, the kubelet and the container runtimes use cgroupfs as the cgroup driver. Kubernetes recommends using systemd as the default cgroup driver for both the kubelet and the container runtimes, and projects such as kubeadm and minikube have already moved to systemd as the default cgroup driver.
https://kubernetes.io/docs/setup/production-environment/container-runtimes/
https://github.com/kubernetes/minikube/pull/6651
https://github.com/freenas/freenas/pull/5263
https://github.com/kubernetes/kubernetes/pull/73837
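For context, the kubelet side of the recommendation comes down to the `cgroupDriver` field of the upstream KubeletConfiguration (or the equivalent `--cgroup-driver` flag). A minimal illustration, independent of how kops ultimately wires it up:

```yaml
# Minimal upstream KubeletConfiguration showing the recommended cgroup driver.
# How kops renders this (flag vs. config file) is not assumed here.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```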
/assign @bharath-123
I was having a discussion with @bmelbourne about this, referring to PR 9879. It seems there was a consensus that moving to the systemd cgroup driver would be a big breaking change for users.
I do agree that this can be a breaking change. Defaulting to systemd for the kubelet and the container runtimes is pretty straightforward; we only have to set a couple of default options, nothing too involved.
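As a rough sketch of what those options could look like, here is a hypothetical cluster spec excerpt. The field names (`kubelet.cgroupDriver`, `containerd.configOverride`) and the containerd override are assumptions for illustration, not the final implementation:

```yaml
# Hypothetical kops cluster spec excerpt; field names are assumed, not the merged change.
spec:
  kubelet:
    cgroupDriver: systemd
  containerd:
    # Override so containerd's runc runtime also uses the systemd cgroup driver.
    configOverride: |
      version = 2
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true
```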
@hakman @olemarkus @bmelbourne I would love to know your thoughts before I put any effort into this.
One can already set systemd as driver for kubelet, right? So what is missing is setting the appropriate driver for containerd.
For me that sounds straightforward enough.
I am not sure what the breaking change would be. From what I can tell, there are mostly benefits from this change.
Compared to the change done for docker, I would really like to see some validation logic ensuring that the containerd and kubelet config match.
I would like to not have so much logic for a setting that most people won't notice or change. Defaulting to this for 1.20+ should be ok. Validating the kubelet and container runtime match seems overkill to me. Documentation should be good enough.
I don't remember the discussion about the breaking changes, but it may be related to the old Debian image.
Had a more detailed discussion with @olemarkus and agreed on what to do here:
Alright, got these points. Will raise a PR with code keeping all of these in mind. Thanks!
Thanks also for doing this.
@bharath-123
The reason for considering this to be a breaking change for users with existing clusters was the following _caution note_ documented here...
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers
> Changing the cgroup driver of a Node that has joined a cluster is strongly not recommended.
> If the kubelet has created Pods using the semantics of one cgroup driver, changing the container runtime to another cgroup driver can cause errors when trying to re-create the Pod sandbox for such existing Pods. Restarting the kubelet may not solve such errors.
> If you have automation that makes it feasible, replace the node with another using the updated configuration, or reinstall it using automation.
From a Kops perspective, this was deemed too complex for the _rolling update_ feature to deal with, as updates only work on existing nodes in a k8s cluster, which is why systemd wasn't set as the default cgroup driver many months ago.
@bmelbourne from what I know, kOps just replaces each node, so it should not matter much. There won't be any mismatch between the kubelet and the runtime on any one of the nodes, even if some use cgroupfs and others systemd.
That's good to know. I'll be happy to test whatever the final solution might be once the PR has been raised.
And I will be happy to review it 😁
Working on it. Ran into a bunch of issues with nodeup and containerd configs. Mostly resolved now :)