As long as there are still healthy replicas available, we can attach the volume.
It should cover node reboot #375, Docker reboot #762, Kubernetes upgrade #703, and recovery from a volume remounted as read-only #381.
https://github.com/longhorn/longhorn-manager/pull/453 should help.
Yes!!!
That's what I have been waiting for.
Validation: Failed
Longhorn version: 0.7.0-rc1
Steps to test:
- kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn-manager/master/examples/storageclass.yaml
- kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn-manager/master/examples/pvc.yaml
- /data directory
- systemctl restart docker
Expected result: the pod should be restarted, and the Longhorn volume should be detached and reattached. (PASSED)
- Check the content of the file
FAILED: listing the content of /data shows no files
I had to manually delete and re-create the pods to be able to access the created file.
@shuo-wu can you check? Seems remount doesn't work.
Need to restart the pod container to bring the volume back.
We need to document this.
In short, the pod should be configured with a liveness probe to allow it to restart if it cannot access the volume.
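For illustration, here is a minimal sketch of such a pod. The pod name, image, and the `ls /data/lost+found` check match what shows up in the events later in this thread; the claim name `longhorn-volv-pvc` and the probe timings are assumptions, so adjust them to your own PVC and workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  containers:
  - name: volume-test
    image: nginx:stable-alpine
    # The probe fails once the mount returns I/O errors, so the kubelet
    # restarts the container and the volume gets mounted again.
    livenessProbe:
      exec:
        command:
        - ls
        - /data/lost+found
      initialDelaySeconds: 5   # assumed value
      periodSeconds: 5         # assumed value
    volumeMounts:
    - name: vol
      mountPath: /data
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc   # assumed name, use your own PVC
```

The check relies on `lost+found` existing at the root of the filesystem, which is typically the case for ext4; for other filesystems any command that touches the mount should serve the same purpose.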
Validation: FAILED
Case: Kubernetes upgrade
Steps to reproduce:
- v2.3.2
- v1.14.8
- v0.7.0-rc1
- https://github.com/shuo-wu/longhorn/blob/14b524130eccb0c32eb3d1fecfaed51d7612b1d0/docs/restore-volume.md
- v1.14.8 to v1.15.5, wait for upgrade to complete.
Error: the pod gets stuck in a CrashLoopBackOff loop
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 35m (x2 over 35m) default-scheduler pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
Normal Scheduled 35m default-scheduler Successfully assigned default/volume-test to lh-worker2
Normal SuccessfulAttachVolume 35m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-3fa008c1-0586-11ea-8955-f23c92f1b630"
Normal Pulling 34m kubelet, lh-worker2 Pulling image "nginx:stable-alpine"
Normal Pulled 34m kubelet, lh-worker2 Successfully pulled image "nginx:stable-alpine"
Normal Created 34m kubelet, lh-worker2 Created container volume-test
Normal Started 34m kubelet, lh-worker2 Started container volume-test
Warning FailedMount 25m (x5 over 25m) kubelet, lh-worker2 MountVolume.MountDevice failed for volume "pvc-3fa008c1-0586-11ea-8955-f23c92f1b630" : driver name driver.longhorn.io not found in the list of registered CSI drivers
Normal SandboxChanged 25m kubelet, lh-worker2 Pod sandbox changed, it will be killed and re-created.
Normal Pulled 25m kubelet, lh-worker2 Container image "nginx:stable-alpine" already present on machine
Normal Created 25m kubelet, lh-worker2 Created container volume-test
Normal Started 25m kubelet, lh-worker2 Started container volume-test
Warning Unhealthy 20m kubelet, lh-worker2 Liveness probe errored: rpc error: code = Unknown desc = cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: no such file or directory
Warning FailedMount 16m kubelet, lh-worker2 MountVolume.MountDevice failed for volume "pvc-3fa008c1-0586-11ea-8955-f23c92f1b630" : driver name driver.longhorn.io not found in the list of registered CSI drivers
Warning FailedMount 16m (x6 over 16m) kubelet, lh-worker2 MountVolume.SetUp failed for volume "pvc-3fa008c1-0586-11ea-8955-f23c92f1b630" : rpc error: code = InvalidArgument desc = There is no block device frontend for volume pvc-3fa008c1-0586-11ea-8955-f23c92f1b630
Normal SandboxChanged 15m (x2 over 15m) kubelet, lh-worker2 Pod sandbox changed, it will be killed and re-created.
Normal Pulled 15m kubelet, lh-worker2 Container image "nginx:stable-alpine" already present on machine
Normal Created 15m kubelet, lh-worker2 Created container volume-test
Normal Started 15m kubelet, lh-worker2 Started container volume-test
Normal Pulled 8m57s (x3 over 9m58s) kubelet, lh-worker2 Container image "nginx:stable-alpine" already present on machine
Normal Created 8m57s (x3 over 9m58s) kubelet, lh-worker2 Created container volume-test
Normal Started 8m57s (x3 over 9m58s) kubelet, lh-worker2 Started container volume-test
Normal Killing 8m38s (x4 over 9m58s) kubelet, lh-worker2 Container volume-test failed liveness probe, will be restarted
Warning Unhealthy 8m38s (x9 over 9m53s) kubelet, lh-worker2 Liveness probe failed: ls: /data/lost+found: I/O error
Warning BackOff 4m54s (x21 over 9m23s) kubelet, lh-worker2 Back-off restarting failed container
Validation: PARTIALLY PASSED
Steps to reproduce:
- v2.3.2
- v1.14.8
- v0.7.0-rc1
- https://github.com/shuo-wu/longhorn/blob/14b524130eccb0c32eb3d1fecfaed51d7612b1d0/docs/restore-volume.md
Steps to reproduce:
- v2.3.2
- v1.14.8
- v0.7.0-rc1
- https://github.com/shuo-wu/longhorn/blob/14b524130eccb0c32eb3d1fecfaed51d7612b1d0/docs/restore-volume.md
- systemctl restart docker.service
Validation: PASSED
Case: Kubernetes upgrade