Rke: /opt/rke/var/lib/kubelet mounted thousands of times

Created on 23 May 2018  ·  29 Comments  ·  Source: rancher/rke

This line appears in a df -h command over a thousand times:

/dev/sda9        494G  2.1G  472G   1% /opt/rke/var/lib/kubelet

This is massively slowing down each node and we're encountering problems with the kubelet not passing healthcheck.

RKE version v0.1.7

internal kind/bug

All 29 comments

Update: Once I upgraded from kubernetes_version v1.10.1-rancher2-1 to v1.10.1-rancher2-1, rke-tools stopped remounting /opt/rke/var/lib/kubelet. This might also have been causing #648: once there were so many mounts, the host became largely unresponsive.

@HighwayofLife I am unable to reproduce this with the latest RC. Did you ever hit it again after the upgrade?

Yes. I hit it again multiple times, and again today. This only happens to existing clusters, and I think the trigger might be something that causes the node to drop offline (kubelet no longer reporting). I haven't figured out whether the solution is to simply restart the node or to reprovision it entirely.

@HighwayofLife can you please share the kubelet logs and the debug logs from the rke runs? Which RC did you use?

I am able to reproduce this with some regularity now, so working on gathering some useful info.

I provisioned a series of six nodes in Azure (CoreOS virtual machines) and ran rke, which completed successfully. Then, when I update some components (for example the kubelet), the kubelet itself restarts multiple times, and there are over a thousand mounts to /var/lib/kubelet on at least one of the worker nodes.

/dev/sda9        246G  3.3G  233G   2% /opt/rke/var/lib/kubelet

$ df -h | wc -l
1038
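
To see which mount point is responsible for a count like this, the mount table can be grouped by mount point. The following is a minimal sketch, not something from the original report: the function names are hypothetical, and it assumes /proc/mounts-style input where the second whitespace-separated field is the mount point.

```python
from collections import Counter


def count_mount_points(mounts_text):
    """Count how often each mount point appears in /proc/mounts-style text."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 is the mount point.
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def duplicated(counts, threshold=2):
    """Keep only mount points that appear at least `threshold` times."""
    return {point: n for point, n in counts.items() if n >= threshold}


# Illustrative sample; on a node you would pass open("/proc/mounts").read().
sample = (
    "/dev/sda9 / ext4 rw 0 0\n"
    "/dev/sda9 /opt/rke/var/lib/kubelet ext4 rw 0 0\n"
    "/dev/sda9 /opt/rke/var/lib/kubelet ext4 rw 0 0\n"
)
print(duplicated(count_mount_points(sample)))
```

On an affected node, a single entry with a count in the hundreds or thousands would point at the stacked mount.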

Kubelet logs --tail 500:
https://gist.github.com/HighwayofLife/93e81a4eb659e35891931a061c7ca74e

RKE error:

FATA[0239] [workerPlane] Failed to bring up Worker Plane: Failed to verify healthcheck: Failed to check https://localhost:10250/healthz for service [kubelet] on host [10.18.160.19]: Get https://localhost:10250/healthz: Failed to dial to localhost:10250: ssh: rejected: connect failed (Connection refused), log: + umount /var/lib/docker/volumes/f12546359200ccbc35d2039ec464d6e57135b7ee535af8291295108554e9c5ac/_data/var/lib/kubelet/pods/cb6d67f4-7bef-11e8-a210-000d3a37ea40/volumes/kubernetes.io~secret/canal-token-hnr7l 

Other pertinent info: this problem has appeared on Docker 1.12 as well as on the latest stable version of Docker installed on these machines (CoreOS stable channel):

Server:
 Engine:
  Version:      18.03.1-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.4
  Git commit:   9ee9f40
  Built:        Thu Apr 26 04:27:49 2018
  OS/Arch:      linux/amd64
  Experimental: false

Docker info:

$ docker info
Containers: 17
 Running: 14
 Paused: 0
 Stopped: 3
Images: 13
Server Version: 18.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: v0.13.2 (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.14.48-coreos-r2
Operating System: Container Linux by CoreOS 1745.7.0 (Rhyolite)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.771GiB
Name: devcluster-worker0
ID: 7FHT:QWET:FCHQ:5AD5:KYSE:HYGB:SXY6:TUH6:I2T3:77MQ:5FB2:A5Z6
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

docker ps:

$ docker ps
CONTAINER ID        IMAGE                                           COMMAND                  CREATED             STATUS                          PORTS               NAMES
681213508cdf        rancher/hyperkube:v1.11.0-rancher1              "/opt/rke/entrypoint…"   37 hours ago        Restarting (255) 37 hours ago                       kubelet
7c075287c126        rancher/rke-tools:v0.1.10                       "nginx-proxy CP_HOST…"   37 hours ago        Up 37 hours                                         nginx-proxy
4b107686e288        rancher/k8s-dns-sidecar-amd64                   "/sidecar --v=2 --lo…"   2 days ago          Up 2 days                                           k8s_sidecar_kube-dns-56d486c5c-dcg7x_kube-system_192663b8-7c07-11e8-a210-000d3a37ea40_0
c6a051b9871b        rancher/k8s-dns-dnsmasq-nanny-amd64             "/dnsmasq-nanny -v=2…"   2 days ago          Up 2 days                                           k8s_dnsmasq_kube-dns-56d486c5c-dcg7x_kube-system_192663b8-7c07-11e8-a210-000d3a37ea40_0
c0465964a6d6        rancher/k8s-dns-kube-dns-amd64                  "/kube-dns --domain=…"   2 days ago          Up 2 days                                           k8s_kubedns_kube-dns-56d486c5c-dcg7x_kube-system_192663b8-7c07-11e8-a210-000d3a37ea40_0
3ee1ac79ae30        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 days ago          Up 2 days                                           k8s_POD_kube-dns-56d486c5c-dcg7x_kube-system_192663b8-7c07-11e8-a210-000d3a37ea40_0
ec987ebd100a        rancher/cluster-proportional-autoscaler-amd64   "/cluster-proportion…"   2 days ago          Up 2 days                                           k8s_autoscaler_kube-dns-autoscaler-6c4b786f5-kh5qq_kube-system_cec6df30-7bef-11e8-a210-000d3a37ea40_0
575e2eaf9434        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 days ago          Up 2 days                                           k8s_POD_kube-dns-autoscaler-6c4b786f5-kh5qq_kube-system_cec6df30-7bef-11e8-a210-000d3a37ea40_0
59215c87a089        rancher/coreos-flannel                          "/opt/bin/flanneld -…"   2 days ago          Up 2 days                                           k8s_kube-flannel_canal-l8mcr_kube-system_cb6d67f4-7bef-11e8-a210-000d3a37ea40_0
7dad6ed7cd23        rancher/nginx-ingress-controller                "/usr/bin/dumb-init …"   2 days ago          Up 2 days                                           k8s_nginx-ingress-controller_nginx-ingress-controller-m6pd9_ingress-nginx_d160a293-7bef-11e8-a210-000d3a37ea40_0
5bc27cc9c0a6        rancher/calico-cni                              "/install-cni.sh"        2 days ago          Up 2 days                                           k8s_install-cni_canal-l8mcr_kube-system_cb6d67f4-7bef-11e8-a210-000d3a37ea40_0
9f04782a2573        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 days ago          Up 2 days                                           k8s_POD_nginx-ingress-controller-m6pd9_ingress-nginx_d160a293-7bef-11e8-a210-000d3a37ea40_0
36a62230d371        rancher/calico-node                             "start_runit"            2 days ago          Up 2 days                                           k8s_calico-node_canal-l8mcr_kube-system_cb6d67f4-7bef-11e8-a210-000d3a37ea40_0
a1ec239ea3af        gcr.io/google_containers/pause-amd64:3.0        "/pause"                 2 days ago          Up 2 days                                           k8s_POD_canal-l8mcr_kube-system_cb6d67f4-7bef-11e8-a210-000d3a37ea40_0
6d96cfadd698        rancher/hyperkube:v1.10.3-rancher2              "/opt/rke/entrypoint…"   2 days ago          Up 2 days                                           kube-proxy

I've also noticed this on multiple versions of hyperkube 1.10, and now also 1.11. I feel it has something to do with rke-tools, but I'm having difficulty nailing it down.

cluster.yml file:

nodes:
- address: 10.<snip>
  hostname_override: devcluster-master0
  internal_address: 10.<snip>
  role:
  - controlplane
  - etcd
  user: devadmin
- address: 10.<snip>
  hostname_override: devcluster-master1
  internal_address: 10.<snip>
  role:
  - controlplane
  - etcd
  user: devadmin
- address: 10.<snip>
  hostname_override: devcluster-master2
  internal_address: 10.<snip>
  role:
  - controlplane
  - etcd
  user: devadmin
- address: 10.<snip>
  hostname_override: devcluster-worker0
  internal_address: 10.<snip>
  role:
  - worker
  user: devadmin
  labels:
    app: ingress
- address: 10.<snip>
  hostname_override: devcluster-worker1
  internal_address: 10.<snip>
  role:
  - worker
  user: devadmin
  labels:
    app: ingress
- address: 10.<snip>
  hostname_override: devcluster-worker2
  internal_address: 10.<snip>
  role:
  - worker
  user: devadmin
  labels:
    app: ingress
private_registries:
- url: <snip>
  user: <snip>
  password: <snip>
authentication:
  strategy: x509
  sans:
  - <snip>
ssh_key_path: <snip>
cloud_provider:
  name: azure
  azureCloudProvider:
    aadClientId: <snip>
    aadClientSecret: <snip>
    aadTenantId: <snip>
    cloud: AzurePublicCloud
    location: <snip>
    primaryAvailabilitySetName: <snip>
    resourceGroup: <snip>
    securityGroupName: <snip>
    subnetName: <snip>
    subscriptionId: <snip>
    tenantId: <snip>
    vnetName: <snip>
    vnetResourceGroup: <snip>
network:
  plugin: canal
authorization:
  mode: rbac
services:
  etcd:
  kube-api:
    service_cluster_ip_range: 10.43.0.0/16
    extra_args:
      oidc-client-id: "spn:<snip>"
      oidc-issuer-url: "https://sts.windows.net/<snip>/"
      oidc-username-claim: "upn"
      oidc-groups-claim: "groups"
      v: 2
  kube-controller:
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
    extra_args:
      v: 2
  scheduler:
  kubelet:
    cluster_domain: dev.<snip>.net
    cluster_dns_server: 10.43.0.10
    infra_container_image: gcr.io/google_containers/pause-amd64:3.0
    extra_args:
      v: 2
      read-only-port: 10256
      authentication-token-webhook: true
  kubeproxy:
ignore_docker_version: true
cluster_name: "devcluster"
ingress:
  provider: nginx
  node_selector:
    app: ingress

system_images:
  kubernetes: rancher/hyperkube:v1.11.0-rancher1

Docker inspect kubelet: https://gist.github.com/HighwayofLife/9a30dffa9759a233d3e3f3abf8725bd7

Snippet from state:

"Error": "OCI runtime create failed: container_linux.go:348: starting container process caused \"process_linux.go:402: container init caused \\\"rootfs_linux.go:58: mounting \\\\\\\"/opt/rke/var/lib/kubelet\\\\\\\" to rootfs \\\\\\\"/var/lib/docker/overlay2/670c83d6be395da0a505fdf7faa61567802cdd6257c93fbfd5fc31180f06b48d/merged\\\\\\\" at \\\\\\\"/var/lib/docker/overlay2/670c83d6be395da0a505fdf7faa61567802cdd6257c93fbfd5fc31180f06b48d/merged/opt/rke/var/lib/kubelet\\\\\\\" caused \\\\\\\"no space left on device\\\\\\\"\\\"\": unknown",

rke version last run:

$ ../bin/rke --version
rke version v0.1.8-rc11

On the node, df -h:

$ df -h
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3.9G     0  3.9G   0% /dev
tmpfs            3.9G     0  3.9G   0% /dev/shm
tmpfs            3.9G  888K  3.9G   1% /run
tmpfs            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda9        246G  3.3G  233G   2% /
/dev/mapper/usr  985M  713M  221M  77% /usr
none             3.9G  184M  3.8G   5% /run/torcx/unpack
tmpfs            3.9G     0  3.9G   0% /media
/dev/sda1        128M   49M   80M  38% /boot
tmpfs            3.9G  4.0K  3.9G   1% /tmp
/dev/sda6        108M   45M   55M  45% /usr/share/oem
/dev/sdb1         16G   45M   15G   1% /mnt/resource
/dev/sda9        246G  3.3G  233G   2% /opt/rke/var/lib/kubelet
... repeated 1000+ times ... 
tmpfs            796M     0  796M   0% /run/user/1000

kube-proxy and kubelet are fighting over port 10256, which you configured as the read-only port for the kubelet but which is also the kube-proxy healthz port. If the kubelet starts first, all is good; if it dies or restarts and kube-proxy takes the port, the kubelet can't start. I have to check why kube-proxy doesn't exit when it can't bind while the kubelet does, but this looks like a configuration error.
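
A collision like this can be caught before running rke by listing every port a host-network service is expected to bind and looking for duplicates. This is only a sketch under stated assumptions: the function name and the (service, purpose, port) tuple layout are hypothetical, and the port values are taken from the cluster.yml, the RKE error message and the comment above rather than read from any real API.

```python
from collections import defaultdict


def find_port_conflicts(listeners):
    """Return {port: [(service, purpose), ...]} for ports claimed more than once.

    A port of 0 is treated as "disabled" and ignored.
    """
    by_port = defaultdict(list)
    for service, purpose, port in listeners:
        if port:
            by_port[port].append((service, purpose))
    return {port: owners for port, owners in by_port.items() if len(owners) > 1}


# Values as they appear in this thread (assumed, not queried from the cluster).
listeners = [
    ("kubelet", "read-only-port", 10256),   # extra_args in cluster.yml
    ("kubelet", "healthz checked by rke", 10250),
    ("kube-proxy", "healthz", 10256),       # per the comment above
]

for port, owners in find_port_conflicts(listeners).items():
    print(port, owners)
```

With the read-only port set to 0, as in the later kubelet arguments in this thread, the same check reports no conflict.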

Any update on this?

Let me know if this needs more investigation.

Yes, I'm still getting this issue. Had a chance to look into it a bit today.

RKE run hit the following error trying to bring up the kubelet.

FATA[0178] [workerPlane] Failed to bring up Worker Plane: Failed to verify healthcheck: Failed to check https://localhost:10250/healthz for service [kubelet] on host [10.18.160.17]: Get https://localhost:10250/healthz: Unable to access the Docker socket (localhost:10250). Please check if the configured user can execute `docker ps` on the node, and if the SSH server version is at least version 6.7 or higher. If you are using RedHat/CentOS, you can't use the user `root`. Please refer to the documentation for more instructions. Error: ssh: rejected: connect failed (Connection refused), log: + umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/aece7468-8c64-11e8-97a7-000d3a377774/volumes/rook.io~rook/jenkins-jobs 

rke version b71fd3b (v0.1.9-rc2 + the 4 commits on master)

docker ps works on the node.

127 mounts to var/lib/kubelet...

...
/dev/sda9        246G   25G  211G  11% /opt/rke/var/lib/kubelet
...
$ df -h | grep "/var/lib/kubelet" | wc -l
127

kubelet config:

        "Args": [
            "kubelet",
            "--fail-swap-on=false",
            "--root-dir=/opt/rke/var/lib/kubelet",
            "--tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
            "--cgroups-per-qos=True",
            "--allow-privileged=true",
            "--authentication-token-webhook=true",
            "--cluster-dns=10.43.0.10",
            "--client-ca-file=/etc/kubernetes/ssl/kube-ca.pem",
            "--volume-plugin-dir=/var/lib/kubelet/volumeplugins",
            "--cluster-domain=sandbox.pipeline.vpc.starbucks.net",
            "--pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.0",
            "--enforce-node-allocatable=",
            "--network-plugin=cni",
            "--cni-conf-dir=/etc/cni/net.d",
            "--cni-bin-dir=/opt/cni/bin",
            "--anonymous-auth=false",
            "--v=2",
            "--cadvisor-port=0",
            "--read-only-port=0",
            "--cloud-config=/etc/kubernetes/cloud-config",
            "--cloud-provider=azure",
            "--kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-node.yaml",
            "--address=0.0.0.0",
            "--resolv-conf=/etc/resolv.conf",
            "--hostname-override=nonprod-worker3"
        ],

...

            "Entrypoint": [
                "/opt/rke/entrypoint.sh",
                "kubelet",
                "--fail-swap-on=false",
                "--root-dir=/opt/rke/var/lib/kubelet",
                "--tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
                "--cgroups-per-qos=True",
                "--allow-privileged=true",
                "--authentication-token-webhook=true",
                "--cluster-dns=10.43.0.10",
                "--client-ca-file=/etc/kubernetes/ssl/kube-ca.pem",
                "--volume-plugin-dir=/var/lib/kubelet/volumeplugins",
                "--cluster-domain=sandbox.pipeline.vpc.starbucks.net",
                "--pod-infra-container-image=gcr.io/google_containers/pause-amd64:3.0",
                "--enforce-node-allocatable=",
                "--network-plugin=cni",
                "--cni-conf-dir=/etc/cni/net.d",
                "--cni-bin-dir=/opt/cni/bin",
                "--anonymous-auth=false",
                "--v=2",
                "--cadvisor-port=0",
                "--read-only-port=0",
                "--cloud-config=/etc/kubernetes/cloud-config",
                "--cloud-provider=azure",
                "--kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-node.yaml",
                "--address=0.0.0.0",
                "--resolv-conf=/etc/resolv.conf",
                "--hostname-override=nonprod-worker3"
            ],

Relevant mount:

            {
                "Source": "/opt/rke/var/lib/kubelet",
                "Destination": "/opt/rke/var/lib/kubelet",
                "Mode": "shared,z",
                "RW": true,
                "Propagation": "shared"
            }

        "HostConfig": {
            "Binds": [
                "/opt/rke/etc/kubernetes:/etc/kubernetes:z",
                "/etc/cni:/etc/cni:rw,z",
                "/opt/cni:/opt/cni:rw,z",
                "/opt/rke/var/lib/cni:/var/lib/cni:z",
                "/var/lib/calico:/var/lib/calico:z",
                "/etc/resolv.conf:/etc/resolv.conf",
                "/sys:/sys:rprivate",
                "/var/lib/docker:/var/lib/docker:rw,rslave,z",
                "/opt/rke/var/lib/kubelet:/opt/rke/var/lib/kubelet:shared,z",
                "/var/lib/kubelet/volumeplugins:/var/lib/kubelet/volumeplugins:shared,z",
                "/var/lib/rancher:/var/lib/rancher:shared,z",
                "/var/run:/var/run:rw,rprivate",
                "/run:/run:rprivate",
                "/opt/rke/etc/ceph:/etc/ceph",
                "/dev:/host/dev:rprivate",
                "/var/log/containers:/var/log/containers:z",
                "/var/log/pods:/var/log/pods:z",
                "/usr:/host/usr:ro",
                "/etc:/host/etc:ro"
            ],

The kubelet is running and has been up for 2 hours.

      "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 123535,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2018-07-23T18:38:37.2207815Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },

Yet RKE still fails on the healthcheck.

The entire kubelet log file.

$ docker logs kubelet
+ '[' kubelet = kubelet ']'
++ DOCKER_API_VERSION=1.24
++ /opt/rke/bin/docker info
++ grep -i 'docker root dir'
++ cut -f2 -d:
+ for i in $(DOCKER_API_VERSION=1.24 /opt/rke/bin/docker info 2>&1  | grep -i 'docker root dir' | cut -f2 -d:) /var/lib/docker /run /var/run
++ tac /proc/mounts
++ grep '^/var/lib/docker/'
++ awk '{print $2}'
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a5a7a47d-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a5a7a47d-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a5a7a47d-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a55fbdea-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/tiller-token-shvn6 '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a55fbdea-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/tiller-token-shvn6 '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a55fbdea-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/tiller-token-shvn6
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4fcdfed-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-autoscaler-token-zgcr2 '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4fcdfed-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-autoscaler-token-zgcr2 '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4fcdfed-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-autoscaler-token-zgcr2
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4e7c5b6-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-token-ngvbq '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4e7c5b6-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-token-ngvbq '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4e7c5b6-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/kube-dns-token-ngvbq
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4adb843-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-5qzfg '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4adb843-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-5qzfg '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/a4adb843-8dcf-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-5qzfg
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/9eedb712-8d3c-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/9eedb712-8d3c-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/9eedb712-8d3c-11e8-8f76-000d3a377662/volumes/kubernetes.io~secret/default-token-db99j
+ for m in $(tac /proc/mounts | awk '{print $2}' | grep ^${i}/)
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/aece7468-8c64-11e8-97a7-000d3a377774/volumes/rook.io~rook/jenkins-jobs '!=' /var/run/nscd ']'
+ '[' /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/aece7468-8c64-11e8-97a7-000d3a377774/volumes/rook.io~rook/jenkins-jobs '!=' /run/nscd ']'
+ umount /var/lib/docker/volumes/1fbd8a343365cd6be00ec705ac31864d221cca831ae19654e5fb2b156a20a475/_data/var/lib/kubelet/pods/aece7468-8c64-11e8-97a7-000d3a377774/volumes/rook.io~rook/jenkins-jobs

It looks like the kubelet hangs at that point (above). I had to run docker stop and docker rm on the kubelet container and the kubelet-old container, then re-run RKE, and it seems to have worked this time... although the problem isn't solved, because this keeps popping up.
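
For readers trying to follow the trace above, the selection logic it shows (walk /proc/mounts in reverse, keep entries under each prefix, skip the two nscd paths) can be restated as a dry run that only lists what would be unmounted. This is a sketch reconstructed from the trace alone, not the actual entrypoint script; the function name is hypothetical and the shell's regex match is approximated with a plain prefix test.

```python
# Paths the trace compares against before calling umount.
SKIP = {"/var/run/nscd", "/run/nscd"}


def unmount_candidates(mounts_text, prefixes):
    """List mount points the traced loop would try to unmount, in order.

    Nothing is unmounted here; this only builds the plan.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.append(fields[1])
    points.reverse()  # same effect as `tac /proc/mounts`

    plan = []
    for prefix in prefixes:
        for point in points:
            # Approximates `grep ^${i}/` from the trace.
            if point.startswith(prefix + "/") and point not in SKIP:
                plan.append(point)
    return plan


# Illustrative mount table with shortened, made-up paths.
sample = "\n".join([
    "tmpfs /run tmpfs rw 0 0",
    "tmpfs /run/nscd tmpfs rw 0 0",
    "/dev/sda9 /var/lib/docker/volumes/x/_data/pods/a ext4 rw 0 0",
    "tmpfs /var/lib/docker/volumes/x/_data/pods/a/volumes/secret tmpfs rw 0 0",
])
for point in unmount_candidates(sample, ["/var/lib/docker", "/run", "/var/run"]):
    print("would umount", point)
```

Because duplicate entries are not collapsed, a mount point that appears hundreds of times in the table produces hundreds of umount attempts, which may be part of why the container takes so long to get past this stage.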

I also manually ran sudo umount -l /opt/rke/var/lib/kubelet, which seems to have worked. The -f flag still runs into the "device is busy" error.

Just ran into this as well and am trying to understand what is causing it. Running on CoreOS / Azure with Docker 1.12, almost the same specs as @HighwayofLife.

I noticed that if I roll back to 0.1.7 this stops, so possibly this is a regression, or something added recently is causing it?

Docker doesn't handle nested mount volumes very well, which is the root cause in this case. You can see this using this simple example:

root@ip-172-31-1-170:/home/ubuntu# mount | grep /tmp| wc -l
root@ip-172-31-1-170:/home/ubuntu# for i in {1..10}; do docker  run -d -v /tmp:/tmp:shared -v /tmp/test:/tmp/test alpine ; sleep 1 ; done  > /dev/null
root@ip-172-31-1-170:/home/ubuntu# mount | grep /tmp| wc -l
1023
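
The figure of 1023 after ten containers equals 2^10 - 1, which is what a simple "double the existing entries and add one" rule produces when starting from zero. The sketch below is only a toy model that matches that number; it assumes a starting count of zero and says nothing about how the kernel actually propagates shared mounts.

```python
def mounts_after(runs, start=0):
    """Toy recurrence: each run doubles the existing entries and adds one."""
    count = start
    for _ in range(runs):
        count = 2 * count + 1
    return count


for runs in (1, 2, 7, 10):
    print(runs, mounts_after(runs))
```

Under the same model, seven iterations give 127, the count reported on one node earlier in this thread, though that match may be coincidental.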

Kubernetes containers deployed by rke use a volume from the rancher/rke-tools image for entry scripts and other scripts/tools. This volume is mounted under /opt/rke/:

            {
                "Name": "dd6677cfc583a64171246b04f33cb97bd9c94bc640087d10575daffaaf264038",
                "Source": "/var/lib/docker/volumes/dd6677cfc583a64171246b04f33cb97bd9c94bc640087d10575daffaaf264038/_data",
                "Destination": "/opt/rke",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },

In some OSs, including CoreOS and RancherOS, we persist data under prefix_path, which is also /opt/rke. This creates a nested-mount situation, where the mounts are doubled with each kubelet restart!
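
The nesting described here can be spotted mechanically by comparing a container's mount destinations pairwise. A minimal sketch follows, assuming you have already collected the destination paths (for example from docker inspect output); the function name is hypothetical.

```python
import posixpath


def nested_mounts(destinations):
    """Return (parent, child) pairs where child lies strictly inside parent."""
    dests = sorted({posixpath.normpath(d) for d in destinations})
    pairs = []
    for parent in dests:
        prefix = parent.rstrip("/") + "/"
        for child in dests:
            if child != parent and child.startswith(prefix):
                pairs.append((parent, child))
    return pairs


# Destinations taken from the inspect snippets in this thread.
kubelet_mounts = ["/opt/rke", "/opt/rke/var/lib/kubelet", "/etc/kubernetes"]
print(nested_mounts(kubelet_mounts))

# With the tools volume at a different path, the pair disappears.
print(nested_mounts(["/opt/rke-tools", "/opt/rke/var/lib/kubelet"]))
```

The prefix test appends a trailing slash so that /opt/rke-tools is not mistaken for a child of /opt/rke.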

I pushed a fix for this. Please feel free to test it once it's merged in master. We would love to get your feedback.

@moelsayed Added feedback to #836. Thanks for pushing the fix. It is currently still not working for me, but it gave me some insight into where to look.

@rossedman Thank you for your feedback! #836 is not complete yet. The fix requires a new rke-tools image rancher/rke-tools:v0.1.13 which hasn't been added to types yet.

If you still would like to try, add this to your cluster.yml:

system_images:
  alpine: rancher/rke-tools:v0.1.13
  nginx_proxy: rancher/rke-tools:v0.1.13
  cert_downloader: rancher/rke-tools:v0.1.13
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.13

This should use the correct image without needing the types update.

@moelsayed 👏 👏 👏 it totally worked! 🐱

I've adjusted the system images to take v0.1.13, but am getting the following error now on a new cluster:

FATA[0151] [controlPlane] Failed to bring up Control Plane: Failed to start [kube-apiserver] container on host [10.18.160.20]: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"/opt/rke-tools/entrypoint.sh\": stat /opt/rke-tools/entrypoint.sh: no such file or directory": unknown 

@HighwayofLife you also need to apply #836.

I had, but it wasn't working. It turns out that I needed to move to Kubernetes 1.11; it wasn't working with v1.10.5-rancher1. I was able to get my new cluster online. RKE doesn't seem to be able to fix a halfway-configured cluster some of the time.

Oh, sorry about that. That will be handled in the types update. It should be fully functional in the next rc.

@HighwayofLife I just updated #836 with types from my own branch. It should work with our supported versions.

@HighwayofLife We've released v0.1.9-rc7, which should contain the entire fix. Let me know if you still see issues.

@moelsayed Upgrading rke without also upgrading the Kubernetes version and rke-tools will cause the cluster to break on any rke up run, because the k8s components will use /opt/rke-tools while the rke-tools image still uses /opt/rke.
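
This kind of mismatch can be checked for by comparing a container's entrypoint path with the destinations of its mounted volumes. The sketch below is hypothetical: the function name is made up, and it is only meaningful for containers whose entrypoint is supplied by a mounted volume rather than by the image itself, as with the rke-tools volume in this thread.

```python
def entrypoint_is_provided(entrypoint, mount_destinations):
    """True if the entrypoint binary lives under one of the mount destinations."""
    path = entrypoint[0]
    for dest in mount_destinations:
        if path == dest or path.startswith(dest.rstrip("/") + "/"):
            return True
    return False


# New-style entrypoint against an old-style tools volume: not found.
print(entrypoint_is_provided(
    ["/opt/rke-tools/entrypoint.sh", "kube-apiserver"],
    ["/opt/rke", "/etc/kubernetes"],
))

# Entrypoint and tools volume agree.
print(entrypoint_is_provided(
    ["/opt/rke-tools/entrypoint.sh", "kube-apiserver"],
    ["/opt/rke-tools", "/etc/kubernetes"],
))
```

A False result for the first case corresponds to the "no such file or directory" error quoted earlier for /opt/rke-tools/entrypoint.sh.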

rke version: v0.1.9-rc9

I was able to confirm that the problem no longer happens with ROS. I was also able to test the upgrade of Kubernetes clusters created by rke 0.1.8 to the latest RC.
