Origin: Openshift Origin server 3.9 Failed to get system container stats for "/user.slice/user-0.slice/session-4.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-4.scope": failed to get container info for "/user.slice/user-0.slice/session-4.scope": unknown container "/user.slice/user-0.slice/session-4.scope"

Created on 27 Apr 2018 · 22 comments · Source: openshift/origin

I'm having issues when starting the OpenShift server. When I start it with
./openshift start (as instructed in the Origin docs)
it produces this error continuously (see Current Result below).

Version

openshift v3.9.0+191fece
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.16

Steps To Reproduce
  1. ./openshift start
Current Result

I'm getting two errors.
[One]
E0427 12:19:34.473195 11668 summary.go:92] Failed to get system container stats for "/user.slice/user-0.slice/session-4.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-4.scope": failed to get container info for "/user.slice/user-0.slice/session-4.scope": unknown container "/user.slice/user-0.slice/session-4.scope"

[Two]
E0427 12:19:06.220993 11668 dnsmasq.go:105] unable to periodically refresh dnsmasq status: The name uk.org.thekelleys.dnsmasq was not provided by any .service files

Expected Result

OpenShift all-in-one cluster up and running and accessible via the web console.

Additional Information

I'm a newbie to OpenShift, Docker, and Kubernetes, so I can't figure out what is generating this error. Please help.

component/containers lifecycle/rotten priority/P2 sig/containers

Most helpful comment

I have the same issue:

Environment info

openshift-origin-server-v3.9.0-191fece-linux-64bit
# cat /etc/redhat-release 
CentOS Linux release 7.0.1406 (Core)
# uname -a
Linux i-lwfsocpz 4.16.6-1.el7.elrepo.x86_64 #1 SMP Sun Apr 29 16:50:56 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Resulting errors

E0503 11:51:02.187057   13590 summary.go:92] Failed to get system container stats for "/user.slice/user-0.slice/session-189.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-189.scope": failed to get container info for "/user.slice/user-0.slice/session-189.scope": unknown container "/user.slice/user-0.slice/session-189.scope"
E0503 11:51:02.388686   13590 watcher.go:208] watch chan error: etcdserver: mvcc: required revision has been compacted
W0503 11:51:02.389072   13590 reflector.go:341] github.com/openshift/origin/vendor/k8s.io/client-go/informers/factory.go:86: watch of *v1beta1.PodSecurityPolicy ended with: The resourceVersion for the provided watch is too old.

All 22 comments

I have the same issue as above, with exactly the same OpenShift version. The OS is RHEL 7.4 64-bit.

[root@OpenShift]# uname -a
Linux xxx-network-12 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@OpenShift]# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.4 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.4"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.4 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.4:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.4"
[root@OpenShift]# docker version
Client:
 Version:      17.06.2-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   cec0b72
 Built:        Tue Sep  5 19:59:06 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.06.2-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   cec0b72
 Built:        Tue Sep  5 20:00:25 2017
 OS/Arch:      linux/amd64
 Experimental: false
[root@OpenShift]# docker info |grep Cgroup
Cgroup Driver: systemd
WARNING: overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior.
         Reformat the filesystem with ftype=1 to enable d_type support.
         Running without d_type support will not be supported in future releases.

Error message goes

E0530 01:23:17.720369    9856 summary.go:92] Failed to get system container stats for "/user.slice/user-0.slice/session-4.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-4.scope": failed to get container info for "/user.slice/user-0.slice/session-4.scope": unknown container "/user.slice/user-0.slice/session-4.scope"
E0530 01:23:27.800522    9856 summary.go:92] Failed to get system container stats for "/user.slice/user-0.slice/session-4.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-4.scope": failed to get container info for "/user.slice/user-0.slice/session-4.scope": unknown container "/user.slice/user-0.slice/session-4.scope"
E0530 01:23:37.862243    9856 summary.go:92] Failed to get system container stats for "/user.slice/user-0.slice/session-4.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-4.scope": failed to get container info for "/user.slice/user-0.slice/session-4.scope": unknown container "/user.slice/user-0.slice/session-4.scope"
E0530 01:23:45.658357    9856 dnsmasq.go:105] unable to periodically refresh dnsmasq status: The name uk.org.thekelleys.dnsmasq was not provided by any .service files
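Note that the repeated summary.go errors in a run like this all name the same transient user-session scope rather than any pod cgroup. One quick way to confirm that from a node log is to extract the distinct paths the kubelet complains about; a minimal Python sketch (the function name is just an illustration, and the sample lines are abbreviated from the log excerpt above):

```python
import re

def unknown_cgroups(log_lines):
    """Collect the distinct cgroup paths the kubelet reports as unknown containers."""
    pat = re.compile(r'unknown container "([^"]+)"')
    found = []
    for line in log_lines:
        m = pat.search(line)
        if m and m.group(1) not in found:
            found.append(m.group(1))
    return found

# Sample lines abbreviated from the log excerpt above
log = [
    'E0530 01:23:17.720369 9856 summary.go:92] ... unknown container "/user.slice/user-0.slice/session-4.scope"',
    'E0530 01:23:27.800522 9856 summary.go:92] ... unknown container "/user.slice/user-0.slice/session-4.scope"',
]
print(unknown_cgroups(log))  # ['/user.slice/user-0.slice/session-4.scope']
```

If the output only ever contains user-session scopes, the errors are noise about login sessions rather than about workload containers.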

I have the same issue as above.

Failed to get system container stats for "/user.slice/user-0.slice/session-882.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-882.scope": failed to get container info for "/user.slice/user-0.slice/session-882.scope": unknown container "/user.slice/user-0.slice/session-882.scope"
E0603 17:36:08.269943 2033 dnsmasq.go:105] unable to periodically refresh dnsmasq status: The name uk.org.thekelleys.dnsmasq was not provided by any .service files

I have the same issue.

I'm facing the same issue while trying to install OpenShift. I'll dig around and try to come up with a solution.

Did anybody manage to get this resolved? I am facing the same issue.

I have the same issue, can anyone help?

same issue here, also so disappointed

I had the same issue, then used 'sudo oc cluster up' instead, as recommended here.
That showed an error about an insecure registry, so I had to update /etc/docker/daemon.json as described here.

I have the same issue, i.e. the /user.slice/user-0.slice/session-1.scope and dnsmasq error messages when running the binary with openshift start... :(
@openshift team, can you help us out? What was suggested above about /etc/docker/daemon.json did not help in my case :(

By the way, this is a CentOS 7.5 minimal install with OpenShift 3.10.0. As a hint, the service can be started successfully with oc cluster up, but that is not the desired production setup. And I can't help complaining that there is no single A-to-Z guide for installing and setting up OpenShift 3.10 from the binary out there: the guides are either too old and no longer applicable to the modern approach, or they fall back to oc cluster up, which honestly does not get you far...

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

Same issue on 3.11 with CentOS 7.5, every 10 seconds:

origin-node: E0301 09:48:46.768464 6434 summary.go:102] Failed to get system container stats for "/system.slice/origin-node.service": failed to get cgroup stats for "/system.slice/origin-node.service": failed to get container info for "/system.slice/origin-node.service": unknown container "/system.slice/origin-node.service"

version:
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://example.com:8443
openshift v3.11.0+d0c29df-98
kubernetes v1.11.0+d4cacc0
uname -a
Linux example.com 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Steps to reproduce:
Single-node standup with the openshift-ansible playbook deploy_cluster.yml

Is everyone getting this error when using xfs with d_type = 0 ?

I noticed @dove-young's error. I'm going to try to redeploy with d_type = 1 per docker (https://docs.docker.com/v17.09/engine/userguide/storagedriver/overlayfs-driver/#prerequisites) and see if it gets rid of the stats error.

[root@OpenShift]# docker info
...
WARNING: overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior.
         Reformat the filesystem with ftype=1 to enable d_type support.
         Running without d_type support will not be supported in future releases.
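Whether missing d_type support is actually the trigger here is only a hypothesis, but when checking several nodes it helps to scan the docker info output for this warning automatically. A minimal Python sketch (the function name is just an illustration; the sample text is the warning quoted above):

```python
def has_dtype_warning(docker_info_output: str) -> bool:
    """Return True if `docker info` warned about missing d_type support
    (i.e. the backing xfs was formatted without ftype=1)."""
    return "without d_type support" in docker_info_output

# Sample: the warning quoted from the comment above
sample = (
    "WARNING: overlay: the backing xfs filesystem is formatted without d_type "
    "support, which leads to incorrect behavior.\n"
    "Reformat the filesystem with ftype=1 to enable d_type support."
)
print(has_dtype_warning(sample))  # True
```

Nodes where this returns False but the stats error still appears (like the report below) would argue against the d_type theory.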

I'm getting the same error message as listed above but I don't have the above warning message from docker info.
In my case, a reboot of the node cleared the error message.

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

I have the same issue on 3.11. You should add some config under kubeletArguments in /etc/origin/node/node-config.yaml. My fix is as follows:

# vim /etc/origin/node/node-config.yaml
kubeletArguments:
  runtime-cgroups:
  - /systemd/system.slice
  kubelet-cgroups:
  - /systemd/system.slice

Do not forget to restart the node service.
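Before restarting, it can be worth sanity-checking that the two keys actually landed in the file. A minimal stdlib-only Python sketch (plain substring checks rather than a YAML parser; the sample mirrors the fragment above, and the function name is just an illustration):

```python
# Sample config text mirroring the node-config.yaml fragment above
sample_config = """\
kubeletArguments:
  runtime-cgroups:
  - /systemd/system.slice
  kubelet-cgroups:
  - /systemd/system.slice
"""

def has_cgroup_fix(config_text: str) -> bool:
    """Check that both cgroup keys and the systemd slice value are present."""
    return ("runtime-cgroups:" in config_text
            and "kubelet-cgroups:" in config_text
            and "/systemd/system.slice" in config_text)

print(has_cgroup_fix(sample_config))  # True
```

In practice you would read /etc/origin/node/node-config.yaml instead of the sample string.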

This issue should be fixed in openshift-ansible; refer to https://github.com/vanloswang/openshift-ansible/commit/cabe815bc1733ba62088fc1c8c04ac0ca8bd0cd7. I CANNOT make a pull request right now because GitHub returns an Ooops with code 500 to me.

I see the same/similar when I try to deploy a cluster on Centos 7.x with:

openshift-ansible-playbooks-3.11.37-1.git.0.3b8b341.el7.noarch
openshift-ansible-roles-3.11.37-1.git.0.3b8b341.el7.noarch
openshift-ansible-docs-3.11.37-1.git.0.3b8b341.el7.noarch
openshift-ansible-3.11.37-1.git.0.3b8b341.el7.noarch

a cluster of KVM guests, 3 nodes of which 1 is a master, and all three nodes show:
...
E0408 18:32:03.587431 30412 summary.go:102] Failed to get system container stats for "/system.slice/origin-node.service": failed to get cgroup stats for "/system.slice/origin-node.service": failed to get container info for "/system.slice/origin-node.service": unknown container "/system.slice/origin-node.service"
E0408 18:32:13.606530 30412 summary.go:102] Failed to get system container stats for "/system.slice/origin-node.service": failed to get cgroup stats for "/system.slice/origin-node.service": failed to get container info for "/system.slice/origin-node.service": unknown container "/system.slice/origin-node.service"
E0408 18:32:23.612086 30412 summary.go:102] Failed to get system container stats for "/system.slice/origin-node.service": failed to get cgroup stats for "/system.slice/origin-node.service": failed to get container info for "/system.slice/origin-node.service": unknown container "/system.slice/origin-node.service"

The above is logged while the deploy_cluster process is running, which spits out:
...
FAILED - RETRYING: Verify that the catalog api server is running (57 retries left).
FAILED - RETRYING: Verify that the catalog api server is running (56 retries left).
FAILED - RETRYING: Verify that the catalog api server is running (55 retries left).
FAILED - RETRYING: Verify that the catalog api server is running (54 retries left).
...
