Awesome project, though I am having some trouble with iscsi:
Describe the bug
When a pod is created with an iscsi volume, the pod fails to mount the block device. Running the command from the event manually works and allows the pod to succeed.
To Reproduce
Run iscsiadm -m discovery -t sendtargets -p $IP on the kube worker node, then create the pod. The mount fails with:
Warning FailedMount 2m5s kubelet, lab02.libretinker.com MountVolume.SetUp failed for volume "iscsipd-rw" : mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/rancher/k3s/agent/kubelet/pods/42351a5d-3c02-11e9-87ea-b827eb93df71/volumes/kubernetes.io~iscsi/iscsipd-rw --scope -- mount -o bind /var/lib/rancher/k3s/agent/kubelet/plugins/kubernetes.io/iscsi/iface-default/192.168.1.2:3260-iqn.2000-01.com.synology:hds.Target-21.336328a62a-lun-1 /var/lib/rancher/k3s/agent/kubelet/pods/42351a5d-3c02-11e9-87ea-b827eb93df71/volumes/kubernetes.io~iscsi/iscsipd-rw
Output: Running scope as unit: run-r430a1e406f8f488da063a8c2ee197b9a.scope
mount: bad address '/var/lib/rancher/k3s/agent/kubelet/plugins/kubernetes.io/iscsi/iface-default/192.168.1.2'
Workaround (mkdir followed by the failed command from the error):
$ mkdir /var/lib/rancher/k3s/agent/kubelet/pods/42351a5d-3c02-11e9-87ea-b827eb93df71/volumes/kubernetes.io~iscsi/iscsipd-rw
$ systemd-run --description="Kubernetes transient mount for /var/lib/rancher/k3s/agent/kubelet/pods/42351a5d-3c02-11e9-87ea-b827eb93df71/volumes/kubernetes.io~iscsi/iscsipd-rw" --scope -- mount -o bind /var/lib/rancher/k3s/agent/kubelet/plugins/kubernetes.io/iscsi/iface-default/192.168.1.2:3260-iqn.2000-01.com.synology:hds.Target-21.336328a62a-lun-1 /var/lib/rancher/k3s/agent/kubelet/pods/42351a5d-3c02-11e9-87ea-b827eb93df71/volumes/kubernetes.io~iscsi/iscsipd-rw
Expected behavior
The mount should bind to the pod directory and the pod should start on its own.
Screenshots
n/a
Additional context
Thanks!
After looking at the structure of your repo as it pertains to the full kubernetes repo, I am guessing this is actually a bug in kubernetes proper. If you would rather close this in that case, let me know.
@iluminae this very well could be an issue with k3s. k3s includes a full userspace stashed away in /var/lib/rancher/k3s/data/${HASH} which is based on busybox. It could be that the mount command in there is conflicting.
Do you know this is an upstream k8s issue, or do you just suspect it? I'm curious whether systemd-run is just executing mount. The busybox mount is probably not capable of doing the mount, so this does sound like a legitimate issue we should fix.
@erikwilson We should move the mount binary to mount.real and then make mount a shell script that looks for a host version of mount; if one exists, exec that instead of the busybox mount, otherwise fall back to mount.real. I suspect we will continue to get issues in this area (ceph, nfs, etc.).
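A minimal sketch of what that wrapper could look like (just a sketch: the mount.real name follows the proposal above, and the probed host paths are illustrative):
#!/bin/sh
# hypothetical bin/mount wrapper: prefer the host's mount, which can exec
# /sbin/mount.<fstype> helpers; fall back to the packaged busybox binary
for m in /usr/sbin/mount /usr/bin/mount /sbin/mount /bin/mount; do
    if [ -x "$m" ]; then
        exec "$m" "$@"
    fi
done
# no host mount found: use the renamed busybox applet
exec "$(dirname "$0")/mount.real" "$@"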
@ibuildthecloud it was only a guess that the k8s code was the issue after glancing at the code that executes systemd-run. I didn't realize mount was provided by BusyBox here. I should also test this on a non-systemd host to see if it reproduces with a more straightforward error there.
I tested on an Alpine Linux worker host and got an entirely different failure. Possibly unrelated to the above failure altogether; it may need a separate issue? Let me know if so.
E0302 03:48:48.273013 3236 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/iscsi/192.168.1.2:3260:iqn.2000-01.com.synology:hds.Target-21.336328a62a:1\"" failed. No retries permitted until 2019-03-02 03:49:04.272952562 -0500 EST m=+1533.849915120 (durationBeforeRetry 16s). Error: "MountVolume.WaitForAttach failed for volume \"iscsipd-rw\" (UniqueName: \"kubernetes.io/iscsi/192.168.1.2:3260:iqn.2000-01.com.synology:hds.Target-21.336328a62a:1\") pod \"iscsipd\" (UID: \"8c3b10b4-3cc7-11e9-8ce6-b827eb93df71\") : failed to get any path for iscsi disk, last err seen:\nCould not attach disk: Timeout after 10s"
I0302 03:49:04.314366 3236 operation_generator.go:501] MountVolume.WaitForAttach entering for volume "iscsipd-rw" (UniqueName: "kubernetes.io/iscsi/192.168.1.2:3260:iqn.2000-01.com.synology:hds.Target-21.336328a62a:1") pod "iscsipd" (UID: "8c3b10b4-3cc7-11e9-8ce6-b827eb93df71") DevicePath ""
E0302 03:49:13.319595 3236 iscsi_util.go:417] Could not attach disk: Timeout after 10s
E0302 03:49:13.324302 3236 iscsi_util.go:430] iscsi: failed to get any path for iscsi disk, last err seen:
Could not attach disk: Timeout after 10s
This time, the iscsi session is created, and the block device shows up in /dev, but the initial mount is not made. On my systemd system, the initial mount of that block device is made, but the bind mount into the container fails. In Alpine, it does not make it that far.
To prepare the node on Alpine, I ran:
echo "cgroup /sys/fs/cgroup cgroup defaults 0 0" >> /etc/fstab
mount -a
apk add open-iscsi
iscsiadm -m discovery -t sendtargets -p $IP
./k3s agent -s https://$MASTER:6443 -t $TOKEN &
After further testing, this seems to affect NFS mounts as well (all remote mounts?).
Warning FailedMount 6s (x6 over 22s) kubelet, lab01 MountVolume.SetUp failed for volume "pvc-2e15f26e-3d80-11e9-8ce6-b827eb93df71" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-2e15f26e-3d80-11e9-8ce6-b827eb93df71 /var/lib/rancher/k3s/agent/kubelet/pods/2e18a4f9-3d80-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-2e15f26e-3d80-11e9-8ce6-b827eb93df71
Output: mount: mounting 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-2e15f26e-3d80-11e9-8ce6-b827eb93df71 on /var/lib/rancher/k3s/agent/kubelet/pods/2e18a4f9-3d80-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-2e15f26e-3d80-11e9-8ce6-b827eb93df71 failed: Connection refused
If I run the mount command from the error event, it fails - but if I run mkdir first, it will succeed - and the pod will start correctly. This is exactly the behavior of the iscsi issue on a systemd host.
Could the issue be as simple as the mounting code needing to run os.Mkdir() before it mounts?
edit: also reproduced on master (currently f90cbed4081e7e1e6972861c196543b2d253bfcc)
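For reference, that suspected fix is exactly what the manual workaround does; a sketch, with hypothetical $POD_UID and $PV_NAME placeholders standing in for the values shown in the mount event above:
# ensure the target directory exists before mounting into it
TARGET=/var/lib/rancher/k3s/agent/kubelet/pods/$POD_UID/volumes/kubernetes.io~nfs/$PV_NAME
mkdir -p "$TARGET"
mount -t nfs 192.168.1.2:/volume1/AppStoreNG/default-slzc-$PV_NAME "$TARGET"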
@ibuildthecloud you may be right about the mount binary. On my NFS test system:
lab01:~# /var/lib/rancher/k3s/data/4df430e1473d0225734948e562863c82f20d658ed9c420c77e168aec42eccdb5/bin/mount -t nfs 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 /var/lib/rancher/k3s/agent/kubelet/pods/abafcf51-3d84-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-abad135e-3d84-11e9-8ce6-b827eb93df71
mount: mounting 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 on /var/lib/rancher/k3s/agent/kubelet/pods/abafcf51-3d84-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 failed: No such file or directory
# create missing directory
lab01:~# mkdir $_
# use packaged mount - fails
lab01:~# /var/lib/rancher/k3s/data/4df430e1473d0225734948e562863c82f20d658ed9c420c77e168aec42eccdb5/bin/mount -t nfs 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 /var/lib/rancher/k3s/agent/kubelet/pods/abafcf51-3d84-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-abad135e-3d84-11e9-8ce6-b827eb93df71
mount: mounting 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 on /var/lib/rancher/k3s/agent/kubelet/pods/abafcf51-3d84-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 failed: Connection refused
# use Alpine built-in mount - succeeds
lab01:~# mount -t nfs 192.168.1.2:/volume1/AppStoreNG/default-slzc-pvc-abad135e-3d84-11e9-8ce6-b827eb93df71 /var/lib/rancher/k3s/agent/kubelet/pods/abafcf51-3d84-11e9-8ce6-b827eb93df71/volumes/kubernetes.io~nfs/pvc-abad135e-3d84-11e9-8ce6-b827eb93df71
Your packaged busybox vs. the Alpine busybox:
lab01:~# /var/lib/rancher/k3s/data/4df430e1473d0225734948e562863c82f20d658ed9c420c77e168aec42eccdb5/bin/busybox | head -n1
BusyBox v1.29.3 (2019-01-09 16:41:51 UTC) multi-call binary.
lab01:~# busybox | head -n1
BusyBox v1.29.3 (2019-01-24 07:45:07 UTC) multi-call binary.
lab01:~# ldd /var/lib/rancher/k3s/data/4df430e1473d0225734948e562863c82f20d658ed9c420c77e168aec42eccdb5/bin/busybox
ldd: /var/lib/rancher/k3s/data/4df430e1473d0225734948e562863c82f20d658ed9c420c77e168aec42eccdb5/bin/mount: Not a valid dynamic program
lab01:~# ldd `which busybox`
/lib/ld-musl-x86_64.so.1 (0x7ff8f872a000)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7ff8f872a000)
I'm guessing the musl version is different enough to do the trick?
I can confirm that mounting NFS volumes does work if you symlink the mount within the fake root to the system's mount.
e.g. on Alpine: ln -sfn /bin/busybox /var/lib/rancher/k3s/data/{hash}/bin/mount
I have not tried it with iscsi yet, but it seems to completely work around the issue as far as I can tell. It probably has something to do with access to the installed mount.nfs helper binary, which is installed into /sbin/.
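If the helper theory holds, it should be visible directly: a host util-linux (or Alpine) mount execs a per-filesystem mount.<fstype> helper from /sbin, while the packaged static busybox has no such helper to call. A quick check (assumes the file utility is installed):
# a host mount delegates to /sbin/mount.<fstype> when one is present
lab01:~# ls -l /sbin/mount.nfs
# the packaged mount is a single static binary (see the ldd output above),
# so it has to speak NFS itself rather than exec a helper
lab01:~# file /var/lib/rancher/k3s/data/*/bin/mount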
Just to confirm, I also ran into this mounting an NFS volume with a PV. Running mkdir as @iluminae describes makes the pod run.
I can also confirm this problem with NFS volumes. The weird thing is that I have one node that doesn't have this problem or at least hasn't run into it yet. Let me know if there are any interesting logs/system information that could help.
I also ran into this issue on Arch with k3s + openebs, symlinking the k3s mount binary to the system one did the trick and now my pvc is mounted as expected.
ln -sf /usr/sbin/mount /var/lib/rancher/k3s/data/{hash}/bin/mount
Hi, thanks for the fix/workaround, it seems to work with OpenEBS and volumes mount successfully. Since I use Ansible to deploy the K3S cluster, how can I know the value of the hash so that I can automate the symlink at installation time? Thanks!
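One possibility, since there is normally only one data directory per install: glob for it. A sketch (assumes a single entry under /var/lib/rancher/k3s/data and a host mount at /usr/sbin/mount):
# resolve the (assumed single) k3s data dir, then point the packaged mount
# at the host one; adjust /usr/sbin/mount to wherever your distro installs it
K3S_DATA_DIR=$(ls -d /var/lib/rancher/k3s/data/*/ | head -n 1)
ln -sf /usr/sbin/mount "${K3S_DATA_DIR}bin/mount"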
@iluminae @tfpereira @lentzi90 @pysen @vitobotta
Can you guys test with v0.5.0-rc3 to see if it's working for you?
v0.5.0-rc3 doesn't seem to work for me :(
Actually, when I try to mount an nfs volume I first see the same problem as before:
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/rancher/k3s/agent/kubelet/pods/a6af917e-6dc7-11e9-aac5-b827eb6bcfdd/volumes/kubernetes.io~nfs/test --scope -- mount -t nfs nasse:/Public/data /var/lib/rancher/k3s/agent/kubelet/pods/a6af917e-6dc7-11e9-aac5-b827eb6bcfdd/volumes/kubernetes.io~nfs/test
Output: Running scope as unit: run-r2f362e62291a4a32ad95e0734940ad0a.scope
mount: mounting nasse:/Public/data on /var/lib/rancher/k3s/agent/kubelet/pods/a6af917e-6dc7-11e9-aac5-b827eb6bcfdd/volumes/kubernetes.io~nfs/test failed: Connection refused
Warning FailedMount 8m39s kubelet, pippin MountVolume.SetUp failed for volume "test" : mount failed: exit status 255
When I create the missing folder it gets deleted after a while. If I create it and manually mount (using the command from the logs) it works. Maybe it was this way before and I just happened to create the folder at the right time, I don't know.
Is anything special required for this? I simply downloaded the v0.5.0-rc3 binary on one node, removed it from the cluster, and ran ./k3s server from this binary. Is that alright? Should I reboot, or clean something up?
@lentzi90 can you check to see if the packaged mount binary is still in /var/lib/rancher/k3s/data/**/bin/mount? If it's there, then it might be a problem with the upgrade.
Sorry @lentzi90 the latest RC doesn't have the fix, but will be available in the next RC!
@lentzi90 v0.5.0-rc4 is now available to test!
Working like a charm! :+1: Can't wait for v0.5.0 now :)
Thanks for testing @lentzi90 !