RKE version:
rke version v0.0.8-dev
Docker version: (docker version,docker info preferred)
Docker version 1.12.6, build 85d7426/1.12.6
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
CentOS Linux 7 (Core)
uname -r: 3.10.0-693.11.1.el7.x86_64
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
kvm guest on my RHELv7 laptop.
KVM VM network set to Virtual network 'default': NAT
cluster.yml file:
# If you intend to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: 192.168.122.148
  role:
  - controlplane
  - worker
  - etcd
  user: bgehman
  docker_socket: /var/run/docker.sock
services:
  etcd:
    image: quay.io/coreos/etcd:latest
    extra_args: {}
  kube-api:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {}
    service_cluster_ip_range: 10.233.0.0/18
  kube-controller:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {}
    cluster_cidr: 10.233.64.0/18
    service_cluster_ip_range: 10.233.0.0/18
  scheduler:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {}
  kubelet:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {}
    cluster_domain: cluster.local
    infra_container_image: gcr.io/google_containers/pause-amd64:3.0
    cluster_dns_server: 10.233.0.3
  kubeproxy:
    image: rancher/k8s:v1.8.3-rancher2
    extra_args: {}
network:
  plugin: flannel
  options: {}
auth:
  strategy: x509
  options: {}
system_images: {}
ssh_key_path: ~/.ssh/rke_rsa
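As a side check on the address ranges in this cluster.yml, here is a small Python sketch (not something rke provides; the function name check_ranges is made up for illustration). It assumes the usual Kubernetes expectations that the service range and the pod range should not overlap, and that the cluster DNS address should fall inside the service range.

```python
import ipaddress


def check_ranges(service_cidr, cluster_cidr, dns_ip):
    """Return a list of problems found in the address plan (empty if none)."""
    service = ipaddress.ip_network(service_cidr)
    pods = ipaddress.ip_network(cluster_cidr)
    dns = ipaddress.ip_address(dns_ip)

    problems = []
    if service.overlaps(pods):
        problems.append("service and pod ranges overlap")
    if dns not in service:
        problems.append("cluster DNS address is outside the service range")
    return problems


# Values copied from the cluster.yml above.
print(check_ranges("10.233.0.0/18", "10.233.64.0/18", "10.233.0.3"))
```

With the values from this report the list comes back empty, so the address plan does not look like a factor in the failure.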
Steps to Reproduce:
On a minimal CentOS 7 install in a KVM VM on my laptop, I installed docker from the CentOS repos. I hit the problem mentioned in https://github.com/rancher/rke/issues/93, which I worked around by creating a non-root user account in the VM and adding that account to the docker group. I verified that the non-root account works with docker. Attempting to bring up the "everything in one VM" environment with rke then fails with:
...
...
INFO[0194] [worker] Successfully pulled [kubelet] image on host [192.168.122.148]
FATA[0197] [workerPlane] Failed to bring up Worker Plane: Failed to start [kubelet] container on host [192.168.122.148]: Error response from daemon: linux mounts: Path /var/lib/kubelet is mounted on / but it is not a shared mount.
Results:
Full run output:
$ ./rke_linux-amd64 up
INFO[0000] Building Kubernetes cluster
INFO[0000] [ssh] Setup tunnel for host [192.168.122.148]
INFO[0000] [ssh] Setup tunnel for host [192.168.122.148]
INFO[0000] [ssh] Setup tunnel for host [192.168.122.148]
INFO[0000] [certificates] Generating kubernetes certificates
INFO[0000] [certificates] Generating CA kubernetes certificates
INFO[0001] [certificates] Generating Kubernetes API server certificates
INFO[0001] [certificates] Generating Kube Controller certificates
INFO[0001] [certificates] Generating Kube Scheduler certificates
INFO[0002] [certificates] Generating Kube Proxy certificates
INFO[0002] [certificates] Generating Node certificate
INFO[0002] [certificates] Generating admin certificates and kubeconfig
INFO[0002] [reconcile] Reconciling cluster state
INFO[0002] [reconcile] This is newly generated cluster
INFO[0002] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0020] Successfully Deployed local admin kubeconfig at [./.kube_config_cluster.yml]
INFO[0020] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0020] [etcd] Building up Etcd Plane..
INFO[0020] [etcd] Pulling Image on host [192.168.122.148]
INFO[0028] [etcd] Successfully pulled [etcd] image on host [192.168.122.148]
INFO[0031] [etcd] Successfully started [etcd] container on host [192.168.122.148]
INFO[0031] [etcd] Successfully started Etcd Plane..
INFO[0031] [controlplane] Building up Controller Plane..
INFO[0031] [remove/nginx-proxy] Checking if container is running on host [192.168.122.148]
INFO[0031] [remove/nginx-proxy] Container doesn't exist on host [192.168.122.148]
INFO[0031] [controlplane] Pulling Image on host [192.168.122.148]
INFO[0185] [controlplane] Successfully pulled [kube-api] image on host [192.168.122.148]
INFO[0188] [controlplane] Successfully started [kube-api] container on host [192.168.122.148]
INFO[0188] [controlplane] Pulling Image on host [192.168.122.148]
INFO[0188] [controlplane] Successfully pulled [kube-controller] image on host [192.168.122.148]
INFO[0191] [controlplane] Successfully started [kube-controller] container on host [192.168.122.148]
INFO[0191] [controlplane] Pulling Image on host [192.168.122.148]
INFO[0192] [controlplane] Successfully pulled [scheduler] image on host [192.168.122.148]
INFO[0194] [controlplane] Successfully started [scheduler] container on host [192.168.122.148]
INFO[0194] [controlplane] Successfully started Controller Plane..
INFO[0194] [worker] Building up Worker Plane..
INFO[0194] [worker] Pulling Image on host [192.168.122.148]
INFO[0194] [worker] Successfully pulled [kubelet] image on host [192.168.122.148]
FATA[0197] [workerPlane] Failed to bring up Worker Plane: Failed to start [kubelet] container on host [192.168.122.148]: Error response from daemon: linux mounts: Path /var/lib/kubelet is mounted on / but it is not a shared mount.
I'm currently searching for a work-around. BTW, rke seems quite nice, I'm a fan already. :+1:
@galal-hussein hit this yesterday and is looking into it. Thanks for reporting.
@bgehman @superseb Yes, I think it may be related to the MountFlags setting in the docker service. I believe it should be MountFlags=0 in the service configuration. We are still looking into it; thanks for reporting the issue.
@bgehman Since we are using a containerized kubelet, I think you need to reset MountFlags to an empty value instead of slave; see https://github.com/kubernetes/kubernetes/issues/4869#issuecomment-195696990
I just tested with upstream docker (which has MountFlags disabled) and it worked.
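For anyone who wants to check the propagation state directly, here is a minimal Python sketch (not part of rke or Docker). It reads text in the /proc/<pid>/mountinfo format and reports whether the mount covering a path carries a shared: tag, which is what the "not a shared mount" error refers to. The function names are made up for illustration, and the sketch skips the octal escapes mountinfo uses for spaces in paths.

```python
import os


def parse_mountinfo(text):
    """Yield (mount_point, optional_fields) for each mountinfo line."""
    for line in text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # blank or malformed line
        sep = fields.index("-")
        if sep < 6:
            continue  # too few fields before the separator
        # fields[4] is the mount point, fields[5] the mount options;
        # optional fields (shared:N, master:N, ...) sit before the "-".
        yield fields[4], fields[6:sep]


def covering_mount(text, path):
    """Return (mount_point, optional_fields) of the mount containing path."""
    path = os.path.normpath(path)
    best = None
    for mount_point, optional in parse_mountinfo(text):
        mp = os.path.normpath(mount_point)
        inside = mp == "/" or path == mp or path.startswith(mp + "/")
        # ">=" lets a later mount stacked on the same point win.
        if inside and (best is None or len(mp) >= len(best[0])):
            best = (mp, optional)
    return best


def is_shared(text, path):
    """True if the mount covering path is tagged shared:N."""
    found = covering_mount(text, path)
    if found is None:
        return False
    return any(opt.startswith("shared:") for opt in found[1])
```

To check a live host, pass the contents of /proc/self/mountinfo and the path /var/lib/kubelet. Note that with MountFlags=slave the docker daemon runs in its own mount namespace, so the daemon's view is in /proc/<dockerd pid>/mountinfo and may differ from what a login shell sees.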
@galal-hussein Thanks, I was able to find and edit /etc/systemd/system/multi-user.target.wants/docker.service and set MountFlags= (no value), which worked around the problem. However, I immediately ran into other issues with this CentOS minimal install: swap is enabled by default, which Kubernetes doesn't support. I disabled that and am now running into a kubelet crash loop:
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Will try with docker from docker.io, and not the version packaged with RHEL/CentOS -- maybe will have better luck. Thanks!
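The cgroup driver mismatch above can be spotted before starting the kubelet. The following Python sketch is an illustration only, with made-up function names: it assumes the driver shows up as a "Cgroup Driver:" line in the output of docker info, and that the kubelet is configured through its --cgroup-driver flag, which defaults to cgroupfs.

```python
def docker_cgroup_driver(docker_info_text):
    """Extract the value of the 'Cgroup Driver' line from `docker info` text."""
    for line in docker_info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "cgroup driver":
            return value.strip()
    return None  # line not present


def kubelet_cgroup_driver(args, default="cgroupfs"):
    """Read --cgroup-driver from a kubelet argument list."""
    for i, arg in enumerate(args):
        if arg.startswith("--cgroup-driver="):
            return arg.split("=", 1)[1]
        if arg == "--cgroup-driver" and i + 1 < len(args):
            return args[i + 1]
    return default  # flag not given, assume the kubelet default


def drivers_match(docker_info_text, kubelet_args):
    """True when docker and the kubelet agree on the cgroup driver."""
    docker = docker_cgroup_driver(docker_info_text)
    return docker is not None and docker == kubelet_cgroup_driver(kubelet_args)
```

In the situation reported here, docker reports systemd while the kubelet uses cgroupfs, so drivers_match would return False.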
@bgehman We actually resolved this problem in https://github.com/rancher/rke/pull/125, and the fix will be available in the next release. You will then be able to work with both the systemd and cgroupfs drivers.
@galal-hussein With CentOSv7 minimal + docker v1.12.6 from upstream docker (installed with https://raw.githubusercontent.com/rancher/install-docker/master/1.12.6.sh) -- seeing a different kubelet failure now.
rke v0.0.8 up output:
...
INFO[0037] [worker] Container [kubelet] is already running on host [192.168.122.3]
INFO[0037] [worker] Container [kube-proxy] is already running on host [192.168.122.3]
INFO[0037] [worker] Successfully started Worker Plane..
INFO[0037] [certificates] Save kubernetes certificates as secrets
FATA[0067] [certificates] Failed to Save Kubernetes certificates: Failed to save certificate [kube-scheduler] to kubernetes: [certificates] Timeout waiting for kubernetes to be ready
kubelet container logs (looping with):
...
W1211 22:19:05.591966 2791 cni.go:196] Unable to update cni config: No networks found in /etc/cni/net.d
E1211 22:19:05.592270 2791 kubelet.go:2095] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
...
Perhaps I should be using Ubuntu until rke+CentOS is a little more baked? Thanks!
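The kubelet warning above says it found no network config in /etc/cni/net.d. A quick way to see what is actually there is sketched below in Python. This is an assumption-laden illustration, not kubelet code: the function name is made up, and the extension list (.conf, .conflist, .json) is my understanding of what the CNI loader accepts.

```python
import os

# Extensions assumed to be recognised as CNI network configs.
CNI_EXTENSIONS = (".conf", ".conflist", ".json")


def cni_config_files(conf_dir="/etc/cni/net.d"):
    """Return the sorted CNI config files in conf_dir, or [] if none exist."""
    try:
        names = os.listdir(conf_dir)
    except FileNotFoundError:
        return []  # directory itself is missing
    return sorted(
        os.path.join(conf_dir, name)
        for name in names
        if name.endswith(CNI_EXTENSIONS)
        and os.path.isfile(os.path.join(conf_dir, name))
    )
```

An empty result on the node would be consistent with the "cni config uninitialized" message, meaning the network plugin has not written its config yet.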
This issue still applies to my environment, using rancher server alpha:21
I am also getting this with:
2018/03/30 17:32:06 [INFO] cluster [cluster-hqm7l] provisioning: [authz] system:node ClusterRoleBinding created successfully
2018/03/30 17:32:06 [INFO] cluster [cluster-hqm7l] provisioning: [certificates] Save kubernetes certificates as secrets
2018/03/30 17:32:07 [INFO] cluster [cluster-hqm7l] provisioning: [certificates] Successfully saved certificates as kubernetes secret [k8s-certs]
2018/03/30 17:32:07 [INFO] cluster [cluster-hqm7l] provisioning: [state] Saving cluster state to Kubernetes
2018/03/30 17:32:07 [INFO] cluster [cluster-hqm7l] provisioning: [state] Successfully Saved cluster state to Kubernetes ConfigMap: cluster-state
2018/03/30 17:32:07 [INFO] cluster [cluster-hqm7l] provisioning: [worker] Building up Worker Plane..
2018/03/30 17:32:07 [INFO] cluster [cluster-hqm7l] provisioning: [sidekick] Sidekick container already created on host [147.75.76.197]
E0330 17:32:07.809962 1 generic_controller.go:204] ClusterController cluster-hqm7l [cluster-provisioner-controller] failed with : [workerPlane] Failed to bring up Worker Plane: Failed to start [kubelet] container on host [147.75.76.197]: Error response from daemon: linux mounts: Path /var/lib/kubelet is mounted on / but it is not a shared mount.
I'm getting this in Rancher 2.0 Beta 1, Docker 17.03.2-ce
I'm getting this in Rancher 2.0.0-beta3, Docker 17.03.2-ce
+1 on Rancher 2.0.0-beta3, Docker 1.13.1
+1 Rancher 2.0.0-beta3, Docker 17.03.2
+1 Rancher 2.0 Docker version 17.09.1-ce, build 19e2cf
+1 Rancher 2.0 Docker 1.13.1 build 092cba3
With self signed certificates
This fixed it for me.
Create the file /etc/systemd/system/docker.service.d/mount_flags.conf with the contents shown below, then restart the docker service.
[ubuntu@k8s01][~]$ cat /etc/systemd/system/docker.service.d/mount_flags.conf
[Service]
MountFlags=shared
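If you need to apply the same drop-in on several hosts, the file can be generated with a few lines of Python. This is only a sketch with made-up function names; it writes the same two lines shown above. Passing an empty string produces "MountFlags=" with no value, which is the variant suggested earlier in this thread.

```python
import os

DROPIN_NAME = "mount_flags.conf"


def render_dropin(mount_flags="shared"):
    """Return the text of a systemd drop-in that sets MountFlags."""
    return "[Service]\nMountFlags={}\n".format(mount_flags)


def write_dropin(unit_dir, mount_flags="shared"):
    """Write the drop-in into unit_dir and return the file path."""
    os.makedirs(unit_dir, exist_ok=True)
    path = os.path.join(unit_dir, DROPIN_NAME)
    with open(path, "w") as f:
        f.write(render_dropin(mount_flags))
    return path
```

On a real host unit_dir would be /etc/systemd/system/docker.service.d (root required). After writing the file, run systemctl daemon-reload and then systemctl restart docker so the change takes effect.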
+1 Rancher 2.2 Docker 18.06.0-ce 14.04.1-Ubuntu
I tried setting Docker MountFlags=shared and restarting the docker service, but I still get the same problem.
@cw1427 This ticket is old (and closed). Best for you to open a new ticket with all the details.