Kubeadm: Kubelet 'failed to get cgroup stats for "/system.slice/kubelet.service"' error messages

Created on 26 Mar 2020 · 20 comments · Source: kubernetes/kubeadm

BUG REPORT

Versions

kubeadm version : 1.17.3

Environment:

  • Kubernetes version : 1.17.3
  • Cloud provider or hardware configuration: on prem Dell R740XD
  • OS (e.g. from /etc/os-release): RHEL 7.7
  • Kernel (e.g. uname -a): 3.10.0-1062.12.1.el7.x86_64
  • Others: docker-ce 19.03.7-3.el7.x86_64

What happened?

Kubelet is printing regularly to the logs:
summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

This looks like the issue described here: https://github.com/kubernetes/kops/issues/4049

This was fixed by restarting the kubelet. The problem persists after rebooting the machine, so I think it's related to the systemd start order.

What you expected to happen?

No error message logging, and correct reporting of container stats

How to reproduce it (as minimally and precisely as possible)?

kubeadm cluster on RHEL 7.7

Anything else we need to know?

kind/bug priority/awaiting-more-evidence sig/node

Most helpful comment

@cjreyn Did you solve the issue? I have it on CentOS 7.

Yes, I put the following in the file /usr/lib/systemd/system/kubelet.service.d/11-kubeadm.conf

[Service]
After=docker.service
ExecStartPre=/bin/sleep 10

In theory, adding just the "After=docker.service" should be enough, but in my testing it also needed the sleep.

All 20 comments

the RPM specs are here:
https://github.com/kubernetes/release/tree/master/cmd/kubepkg/templates/latest/rpm

is this solution working for you:
https://github.com/kubernetes/kops/issues/4049#issuecomment-354539302
?

/priority backlog
/kind bug

is this solution working for you:

or does any of the proposed solutions in https://github.com/kubernetes/kops/issues/4049 work for you?

No, this workaround is not recommended. See here for why: https://github.com/kontena/pharos-cluster/issues/440#issuecomment-399014418 Their proposed modification is to add the following to the systemd service file:

[Service]
CPUAccounting=true
MemoryAccounting=true

When I do this, and reboot, the problem persists.

Maybe the rpm spec should have something that makes kubelet start after docker? Simply restarting kubelet once the machine has booted does fix the issue. This points to an order issue at boot time.

For reference, kubelet cgroup driver is the same as Docker:

[root@cs05r-sc-cloud-10 ~]# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1"
[root@cs05r-sc-cloud-10 ~]# docker info | grep -i 'cgroup driver'
 Cgroup Driver: cgroupfs

does it happen with the systemd cgroup driver?
(throwing random ideas out there)
to change it you must edit the docker daemon configuration and pass --cgroup-driver=systemd to the kubelet.
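As a concrete illustration of that suggestion, the change touches two files. This is only a sketch: the $DEMO prefix exists so the snippet can run without root, and on a real node the same contents would go to /etc/docker/daemon.json and /var/lib/kubelet/kubeadm-flags.env, followed by a restart of both services.

```shell
# Sketch of switching both docker and the kubelet to the systemd cgroup
# driver. $DEMO stands in for / so the snippet runs unprivileged.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/etc/docker" "$DEMO/var/lib/kubelet"

# Docker side: set native.cgroupdriver=systemd in daemon.json.
cat <<'EOF' > "$DEMO/etc/docker/daemon.json"
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

# Kubelet side: change --cgroup-driver=cgroupfs to systemd in the flags
# file that kubeadm writes on init/join.
cat <<'EOF' > "$DEMO/var/lib/kubelet/kubeadm-flags.env"
KUBELET_KUBEADM_ARGS="--cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.1"
EOF

# On the real node, apply with: systemctl restart docker kubelet
```

Note that kubeadm only regenerates kubeadm-flags.env on init/join, so a hand edit like this survives normal operation but should be rechecked after re-running kubeadm.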

This does happen from time to time. It's different from the other cgroup driver systemd related error.

It's really hard to reproduce but rarely happens when you start docker and kubelet at the 'same' time. (e.g. Node reboot)

kubelet is trying to get stats for docker.service and can't get them,
but /system.slice/docker.service can be found through commands like systemd-cgls

You can resolve this simply by restarting kubelet.
As mentioned above, I think it has to do with order of the boot of kubelet and docker.

Wondering if anyone else had the same issue

i think the kubelet should just retry until the container runtime is ready. if it's not doing that, let's file a kubernetes/kubernetes bug and tag it with /sig node.

one possibility is that we can do:
https://stackoverflow.com/questions/43001223/how-to-ensure-that-there-is-a-delay-before-a-service-is-started-in-systemd

but i think the kubelet should be adaptable instead..

/sig node

@cjreyn @onesolpark

please provide ideas what action item can be taken here:

  • fix something in the kubelet?
  • patch our DEB/RPM specs?

this doesn't seem like a kubeadm specific ticket, per se.

I agree that it's not a kubeadm specific ticket per se.
I don't think putting something in the systemd (deb/rpm) spec is a good answer either.

Adding a sleep might have side effects (e.g. slower node-ready time), and we can't make kubelet start after docker because users can use other CRIs.

I will gather some logs and a way to reproduce this and put in an issue to kubernetes repo and tag sig-node to have a better look.

@cjreyn @neolit123 does that sound okay?

seems good, please ping me on that ticket. thanks.
/close

@neolit123: Closing this issue.

In response to this:

seems good, please ping me on that ticket. thanks.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

I guess if the CRI is known to kubeadm at the time init is called, one could add the systemd directive to start kubelet after it?

After all, init is used to craft kubelet args which are pretty static.

Btw this happens to every node in our cluster.

ideally we would like to move away from writing the dynamic kubeadm-flags.env file on init/join.
currently it holds flags for dockershim (which are only available as flags) and the rest should move to the KubeletConfiguration which is written to /var/lib/kubelet/config.yaml.

dockershim itself is splitting from the kubelet and the plan there is not very clear, but probably it will run as a separate optional service.

Is it safe/advisable to patch the systemd service file for my CRI to control start order? Does it get altered during kubeadm upgrade?

kubeadm does not modify the kubelet.service. currently it only populates some flags in /var/lib/kubelet/kubeadm-flags.env

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/#the-kubelet-drop-in-file-for-systemd

From the systemd docs it looks like I can drop in a file named "11-kubeadm.conf" to append an "After=docker.service" to the service spec. This should be straightforward. Thanks for all your help!

we can add a note about this in: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/
but i'd want to get feedback from sig-node before that - re: the kubernetes/kubernetes ticket.

Makes sense. Thanks

@cjreyn Did you solve the issue? I have it on CentOS 7.

@cjreyn Did you solve the issue? I have it on CentOS 7.

Yes, I put the following in the file /usr/lib/systemd/system/kubelet.service.d/11-kubeadm.conf

[Service]
After=docker.service
ExecStartPre=/bin/sleep 10

In theory, adding just the "After=docker.service" should be enough, but in my testing it also needed the sleep.
