RKE version: 1.0.4
Kubernetes version: 1.16.6
Docker version: (docker version,docker info preferred)
Server Version: 19.03.8
Storage Driver: devicemapper
Security Options:
seccomp
Profile: default
selinux
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
RHEL 7.7 - 3.10.0-1062.12.1.el7.x86_64
(but upgraded from RHEL 7.2 gradually to 7.7)
Results:
On a host with SELinux enabled in Docker's daemon.json, the following error occurs:
level=debug msg="Found selinux in DockerInfo.SecurityOptions on host [worker2]"
level=debug msg="Found [io.rancher.rke.container.name=service-sidekick] in Labels on host [worker2], applying MCSLabel [label=level:s0:c1000,c1001]"
level=fatal msg="[workerPlane] Failed to bring up Worker Plane: [Failed to start [kubelet] container on host [worker2]: Error response from daemon: error setting label on mount source '/var/lib/docker': failed to set file label on /var/lib/docker/containers/239041eeca9c7f19d722373fd3f80d75700e2a68f10c4ec60fd5a155f715e281/mounts/shm: operation not supported]"
If SELinux is disabled for Docker, rke runs fine.
More info:
sudo ls -lsahZ /var/lib/docker/containers/239041eeca9c7f19d722373fd3f80d75700e2a68f10c4ec60fd5a155f715e281/mounts/shm
total 8.0K
drwx------. root root system_u:object_r:container_file_t:s0 .
drwx------. root root system_u:object_r:container_file_t:s0 ..
sudo docker inspect kubelet | grep MountLabel
"MountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c457,c994"
sudo docker inspect kubelet | grep SecurityOpt -A5
"SecurityOpt": [
"label=disable"
],
It seems that RKE has disabled SELinux labeling for the kubelet container.
I believe rancher/rancher#23662 is the same situation? At least I'm waiting for the release that supports both 1.16.8 and 1.17.4 in order to upgrade.
I think it may be different.
In my situation, kubelet was not able to start: the container has a MountLabel but no ProcessLabel, because rke deploys kubelet as a Docker container with --security-opt="label=disable". This makes me think the problem is in the rke deployment code.
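To check a node for that combination, a minimal sketch along these lines could summarize the relevant fields. The helper name `check_labels` is hypothetical. It assumes the pretty-printed `docker inspect` JSON (one key per line) is piped in on stdin, and it only matches text rather than parsing JSON properly.

```shell
# Hypothetical helper: reads `docker inspect <container>` output on stdin and
# reports whether SELinux labeling is disabled and which MountLabel is set.
# Assumption: one JSON key per line, as docker inspect prints by default.
check_labels() {
  input=$(cat)

  # "label=disable" in SecurityOpt means the container runs without an
  # SELinux process label.
  if printf '%s\n' "$input" | grep -q '"label=disable"'; then
    echo "selinux-labeling: disabled"
  else
    echo "selinux-labeling: enabled"
  fi

  # Take the first MountLabel value; an empty value is reported as "none".
  mount_label=$(printf '%s\n' "$input" \
    | sed -n 's/.*"MountLabel": *"\([^"]*\)".*/\1/p' \
    | head -n 1)
  echo "mount-label: ${mount_label:-none}"
}
```

On an affected host it could be run as `sudo docker inspect kubelet | check_labels`. A result of "disabled" together with a non-empty mount label would match the situation described above.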
I have manually edited the config.v2.json for kubelet and here are the results:
Just tried with RKE 1.1.0-rc16, which supports 1.16.8, and still hit the same problem.
Downgrading docker-ce to 19.03.6 seems to help
@The-Loeki : 19.03.6 does not work on my side
we're also still running on 2.3.2 because of another selinux issue?
Is rke still broken when using SELinux? I'm trying to set up a fedora-coreos cluster with rke, but I'm getting this during rke up:
WARN[0180] Can't start Docker container [kube-proxy] on host [d1.lan]: Error response from daemon: error setting label on mount source '/usr/lib/modules': relabeling content in /usr is not allowed
FATA[0180] [workerPlane] Failed to bring up Worker Plane: [Failed to start [kube-proxy] container on host [d1.lan]: Error response from daemon: error setting label on mount source '/usr/lib/modules': relabeling content in /usr is not allowed]
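When several containers fail this way, it can help to list which mount sources Docker refused to relabel. The following is a rough sketch, assuming the rke output has been saved to a file or is piped in; the function name `failing_mount_sources` is made up for illustration and it relies only on the error wording shown above.

```shell
# Hypothetical helper: reads rke log text on stdin and prints each distinct
# mount source named in an "error setting label on mount source" message.
failing_mount_sources() {
  sed -n "s/.*error setting label on mount source '\([^']*\)'.*/\1/p" | sort -u
}
```

For example, `rke up 2>&1 | failing_mount_sources` would print one path per line, with duplicates from the WARN and FATA lines collapsed.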
Can confirm, got the same issue on latest FCOS stable.
I have replicated the above errors mentioned by @tailtwo and @xeor on FCOS. Perhaps it is due to the fact that /usr is mounted read-only.
EDIT: it seems that running sudo setenforce 0 on the hosts does not fix the issue, so it has to do with /usr being ro.
EDIT 2: I have created an issue to address this bug here: #2194
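One way to test the read-only hypothesis on a host is to look up the mount options for /usr. This is only a sketch: the function name `mount_mode` is hypothetical, and it assumes the path is listed as its own mount point in /proc/mounts (or in a file with the same format passed as the second argument).

```shell
# Hypothetical helper: prints "ro" or "rw" for a mount point, or
# "not-a-mountpoint" if the path has no entry of its own.
# Usage: mount_mode <mountpoint> [mounts_file]   (default: /proc/mounts)
mount_mode() {
  target=$1
  mounts_file=${2:-/proc/mounts}
  awk -v t="$target" '
    $2 == t {
      # Field 4 holds the comma-separated mount options.
      n = split($4, opts, ",")
      mode = "rw"
      for (i = 1; i <= n; i++) if (opts[i] == "ro") mode = "ro"
      found = 1   # if the path appears more than once, the last entry wins
    }
    END { if (found) print mode; else print "not-a-mountpoint" }
  ' "$mounts_file"
}
```

Running `mount_mode /usr` on the affected FCOS host should print `ro` if the hypothesis above holds.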
This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.