Rook: pod/csi-rbdplugin - CrashLoopBackOff

Created on 8 Feb 2020 · 4 Comments · Source: rook/rook

I'm trying to get rook-ceph running on a DigitalOcean K8S cluster.

Pods

k -n rook-ceph get pods                        
NAME                                            READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-provisioner-565ffd64f5-slqlb   4/4     Running            0          146m
csi-cephfsplugin-provisioner-565ffd64f5-t49z5   4/4     Running            0          146m
csi-cephfsplugin-qrvr9                          3/3     Running            0          146m
csi-cephfsplugin-r4jzc                          3/3     Running            0          146m
csi-rbdplugin-4hk6k                             2/3     CrashLoopBackOff   33         146m
csi-rbdplugin-provisioner-7bb78d6c66-8kl4z      5/5     Running            0          146m
csi-rbdplugin-provisioner-7bb78d6c66-ssxql      5/5     Running            0          146m
csi-rbdplugin-vklmw                             2/3     CrashLoopBackOff   33         146m
rook-ceph-mon-a-canary-55b6c9df66-xp2s2         1/1     Running            0          5m46s
rook-ceph-mon-b-canary-5598b47dfc-dpqgw         1/1     Running            0          5m45s
rook-ceph-mon-c-canary-54fc8b799b-mz5cl         0/1     Pending            0          5m44s
rook-ceph-operator-6d74795f75-wl9l9             1/1     Running            0          146m
rook-discover-52d2c                             1/1     Running            0          146m
rook-discover-smpzk                             1/1     Running            0          146m

The logs from the failed pods show:

k -n rook-ceph logs csi-rbdplugin-vklmw -c csi-rbdplugin
I0208 16:16:10.977115   11801 cephcsi.go:104] Driver version: v1.2.2 and Git version: f8c854dc7d6ffff02cb2eed6002534dc0473f111
I0208 16:16:10.977545   11801 cephcsi.go:131] Initial PID limit is set to -1
I0208 16:16:10.977724   11801 cephcsi.go:140] Reconfigured PID limit to -1 (max)
I0208 16:16:10.977803   11801 cephcsi.go:159] Starting driver type: rbd with name: rook-ceph.rbd.csi.ceph.com
I0208 16:16:10.980418   11801 mount_linux.go:170] Cannot run systemd-run, assuming non-systemd OS
I0208 16:16:10.980605   11801 mount_linux.go:171] systemd-run failed with: exit status 1
I0208 16:16:10.980692   11801 mount_linux.go:172] systemd-run output: Failed to create bus connection: No such file or directory
F0208 16:16:10.981047   11801 httpserver.go:25] listen tcp 10.131.11.201:9090: bind: address already in use
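
For anyone reading logs like the one above: klog prefixes each line with a severity letter (I, W, E or F), so the only fatal entry here is the last one, not the systemd-run messages. Below is a minimal Python sketch that pulls the fatal lines out of saved log text. The helper name fatal_lines is made up for illustration, and the sample lines are copied from the log above.

```python
def fatal_lines(log_text):
    """Return the klog lines whose severity letter is F (fatal)."""
    result = []
    for line in log_text.splitlines():
        stripped = line.strip()
        # klog lines begin with a severity letter followed by MMDD, e.g. "F0208".
        if len(stripped) >= 5 and stripped[0] == "F" and stripped[1:5].isdigit():
            result.append(stripped)
    return result


# Sample lines copied from the csi-rbdplugin log in this issue.
sample = """\
I0208 16:16:10.980418   11801 mount_linux.go:170] Cannot run systemd-run, assuming non-systemd OS
I0208 16:16:10.980605   11801 mount_linux.go:171] systemd-run failed with: exit status 1
I0208 16:16:10.980692   11801 mount_linux.go:172] systemd-run output: Failed to create bus connection: No such file or directory
F0208 16:16:10.981047   11801 httpserver.go:25] listen tcp 10.131.11.201:9090: bind: address already in use
"""

for line in fatal_lines(sample):
    print(line)
```

With the sample above, only the httpserver.go line is printed, which points at the port conflict rather than at systemd-run.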

Versions

k version 
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-21T22:17:28Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.6", GitCommit:"72c30166b2105cd7d3350f2c28a219e6abcd79eb", GitTreeState:"clean", BuildDate:"2020-01-18T23:23:21Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
bug

Most helpful comment

Encountered a similar issue while installing in both IKS and Minikube:

$ kgp -n rook-ceph
NAME                                                 READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-7gn75                               3/3     Running            0          14m
csi-cephfsplugin-provisioner-fd87698db-5krpb         5/5     Running            0          14m
csi-cephfsplugin-provisioner-fd87698db-lvgk8         5/5     Running            0          14m
csi-rbdplugin-b88qj                                  2/3     CrashLoopBackOff   7          14m
csi-rbdplugin-provisioner-7c9c6578b-6psnp            6/6     Running            0          14m
csi-rbdplugin-provisioner-7c9c6578b-w4cl4            6/6     Running            0          14m
rook-ceph-crashcollector-minikube-5bd86fc9c7-swhm5   1/1     Running            0          14m
rook-ceph-mgr-a-657b685994-scrsk                     1/1     Running            0          14m
rook-ceph-mon-a-75db5784b8-68z8r                     1/1     Running            0          14m
rook-ceph-operator-9bd79cdcf-mtrzn                   1/1     Running            0          15m
rook-ceph-osd-prepare-minikube-wblkh                 0/1     Completed          0          14m
rook-discover-9dsg6                                  1/1     Running            0          15m

$ klogs csi-rbdplugin-b88qj --all-containers -n rook-ceph
...
F0418 14:10:04.435189   12096 httpserver.go:25] listen tcp 192.168.99.106:9090: bind: address already in use
I0418 13:54:07.799329    3005 cephcsi.go:113] Driver version: v2.0.1 and Git version: be6318716e08f6307206508d41a63eaa3781d408
I0418 13:54:07.800228    3005 cephcsi.go:168] Starting driver type: liveness with name: liveness.csi.ceph.com
I0418 13:54:07.800233    3005 liveness.go:84] Liveness Running
I0418 13:54:07.800356    3005 connection.go:151] Connecting to unix:///csi/csi.sock
W0418 13:54:17.800716    3005 connection.go:170] Still connecting to unix:///csi/csi.sock
W0418 13:54:27.803248    3005 connection.go:170] Still connecting to unix:///csi/csi.sock
...

It turned out that this was also caused by a port conflict, and changing the port in operator.yaml helped:

# CSI_RBD_GRPC_METRICS_PORT: "9090"
CSI_RBD_GRPC_METRICS_PORT: "9092"

All 4 comments

Fixed this with help from the Rook Slack channel. I was misled by the "systemd-run output: Failed to create bus connection: No such file or directory" message; the actual issue was that port 9090 was already in use on the node.
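
To check for this kind of conflict before changing any settings, a small probe can try to bind the port and report whether the bind fails with "address already in use". This is only a sketch: the function name port_in_use is hypothetical, and it assumes you run it on the affected node itself (not inside an unrelated pod), since the plugin in the log above binds to the node's IP.

```python
import errno
import socket


def port_in_use(host, port):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other error (permissions, bad address) is not a port conflict.
            raise
    return False


# Example call on the node: port_in_use("0.0.0.0", 9090)
```

If the call returns True for 9090, something else on the node already holds the port, which matches the fatal log line in this issue.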

@lilHermit How were you able to fix this? I'm also having this issue on DigitalOcean.

The above is a port issue. You can configure the port in the operator.yaml file.
