Longhorn: [BUG] ARM64 - Error Attaching Volume to Pods

Created on 19 Aug 2020 · 3 Comments · Source: longhorn/longhorn

Describe the bug
After looking at the arm64 support PR, I deployed Longhorn on my 8-node Raspberry Pi 4 K8S cluster. I was able to find all the needed Docker images, built by ivang, on Docker Hub. I used the official Longhorn Helm chart and modified the images and their respective tags in the values.yaml file before deploying. I also made sure to run the environment check script, install open-iscsi on all nodes, and have the iscsid daemon running.
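
The iSCSI prerequisite can be spot-checked on each node with something like the sketch below. It assumes systemd manages the daemon under the unit name iscsid and that the iscsiadm binary comes from the open-iscsi package, as on Ubuntu 20.04. The function name is made up for illustration.

```shell
# Report whether this node looks ready for Longhorn's iSCSI requirement.
# Assumption: systemd unit "iscsid" and the "iscsiadm" binary from open-iscsi.
check_iscsi_ready() {
  if ! command -v iscsiadm >/dev/null 2>&1; then
    echo "iscsiadm not found: install open-iscsi"
    return 1
  fi
  if [ "$(systemctl is-active iscsid 2>/dev/null)" != "active" ]; then
    echo "iscsid is not active: try 'sudo systemctl enable --now iscsid'"
    return 1
  fi
  echo "open-iscsi looks ready"
}
```

Running `check_iscsi_ready` on every node (for example over SSH) gives a quick pass/fail per node. It does not replace the environment check script mentioned above.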

From the Longhorn UI, I can successfully create volumes and attach them to any worker node. I can also successfully create PVCs using the provided storage class.

However, I keep running into an issue where the created volume cannot be attached to any pod. For example, when I ran the sample nginx deployment provided in the documentation, I got the following errors:

running "VolumeBinding" filter plugin for pod "volume-test": pod has unbound immediate PersistentVolumeClaims
AttachVolume.Attach failed for volume "pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c" : attachdetachment timeout for volume pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c
Unable to attach or mount volumes: unmounted volumes=[volv], unattached volumes=[default-token-64qvs volv]: timed out waiting for the condition
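
One way to narrow down an attach timeout like this is to look for Longhorn or CSI sidecar pods that are not running. A minimal sketch, assuming Longhorn is installed in the longhorn-system namespace; the function name is hypothetical.

```shell
# Print the name and status of every pod in longhorn-system that is
# neither Running nor Completed (for example CrashLoopBackOff sidecars).
# Assumption: Longhorn was installed into the longhorn-system namespace.
not_running_pods() {
  kubectl -n longhorn-system get pods --no-headers \
    | awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

For any pod it prints, `kubectl -n longhorn-system logs <pod>` is the natural next step. An `exec format error` in the log usually means the image was built for a different CPU architecture than the node.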

For reference, this is the created PVC:

Name:          longhorn-volv-pvc
Namespace:     test
StorageClass:  usb
Status:        Bound
Volume:        pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: driver.longhorn.io
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      2Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    volume-test
Events:
  Type    Reason                 Age                From                                                                                     Message
  ----    ------                 ----               ----                                                                                     -------
  Normal  Provisioning           19m                driver.longhorn.io_csi-provisioner-67f6b9d8f-gnwht_06721115-849b-44d4-8272-c8e22ffbb189  External provisioner is provisioning volume for claim "test/longhorn-volv-pvc"
  Normal  ExternalProvisioning   19m (x3 over 19m)  persistentvolume-controller                                                              waiting for a volume to be created, either by external provisioner "driver.longhorn.io" or manually created by system administrator
  Normal  ProvisioningSucceeded  19m                driver.longhorn.io_csi-provisioner-67f6b9d8f-gnwht_06721115-849b-44d4-8272-c8e22ffbb189  Successfully provisioned volume pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c

This is the PV:

Name:            pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: driver.longhorn.io
Finalizers:      [kubernetes.io/pv-protection external-attacher/driver-longhorn-io]
StorageClass:    usb
Status:          Bound
Claim:           test/longhorn-volv-pvc
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        2Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            driver.longhorn.io
    FSType:            ext4
    VolumeHandle:      pvc-7cc9759f-474a-4fe8-8eb2-ee9ad5757d6c
    ReadOnly:          false
    VolumeAttributes:      diskSelector=usb
                           fromBackup=
                           numberOfReplicas=2
                           staleReplicaTimeout=30
                           storage.kubernetes.io/csiProvisionerIdentity=1597799569063-8081-driver.longhorn.io
Events:                <none>

From the Longhorn UI, I can see that the PV is healthy and is attached to the correct worker node.

I also tried scaling the longhorn-driver-deployer down to 0, deleting the deployments, and then scaling it back up. This made the pod 'has unbound immediate PersistentVolumeClaims' error go away for the first PVC, but it returned when I tested creating another PVC.

Expected behavior
The volume should attach to the pod.

Environment:

  • Kubernetes version: v1.18.8
  • Node OS type and version: Ubuntu Server 20.04.01 (64-bit)

Most helpful comment

Update:

I was able to get it to work! After reading @ivang's comment, I found CSI Docker images that did the trick. I was able to deploy the sample nginx deployment, and I just finished deploying cluster monitoring, which uses persistent volumes for Grafana and Prometheus. Works great so far!

For anyone that comes across this, here is what I did to get Longhorn up and running:

- git clone https://github.com/longhorn/longhorn.git
- cd longhorn/chart/
- nano values.yaml

Replace the contents of the YAML file with the following (or manually change the Docker images and their tags):

# Default values for longhorn.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
image:
  longhorn:
    engine: ivanangelov/longhorn-engine
    engineTag: b0a22eb
    manager: ivanangelov/longhorn-manager
    managerTag: 92a5f6a5_arm64
    ui: ivanangelov/longhorn-ui
    uiTag: 1607c9e
    instanceManager: ivanangelov/longhorn-instance-manager
    instanceManagerTag: v1_20200711
  pullPolicy: IfNotPresent

service:
  ui:
    type: ClusterIP
    nodePort: null
  manager:
    type: ClusterIP
    nodePort: ""

persistence:
  defaultClass: true
  # The default replica count is 3
  defaultClassReplicaCount: 2

csi:
  attacherImage: raspbernetes/csi-external-attacher
  attacherImageTag: latest
  provisionerImage: raspbernetes/csi-external-provisioner
  provisionerImageTag: latest
  nodeDriverRegistrarImage: raspbernetes/csi-node-driver-registrar
  nodeDriverRegistrarImageTag: latest
  resizerImage: raspbernetes/csi-external-resizer
  resizerImageTag: latest
  kubeletRootDir: ~
  attacherReplicaCount: ~
  provisionerReplicaCount: ~
  resizerReplicaCount: ~

defaultSettings:
  backupTarget: ~
  backupTargetCredentialSecret: ~
  createDefaultDiskLabeledNodes: ~
  defaultDataPath: ~
  replicaSoftAntiAffinity: ~
  storageOverProvisioningPercentage: ~
  storageMinimalAvailablePercentage: ~
  upgradeChecker: ~
  defaultReplicaCount: ~
  guaranteedEngineCPU: ~
  defaultLonghornStaticStorageClass: ~
  backupstorePollInterval: ~
  taintToleration: ~
  priorityClass: ~
  registrySecret: ~
  autoSalvage: ~
  disableSchedulingOnCordonedNode: ~
  replicaZoneSoftAntiAffinity: ~
  volumeAttachmentRecoveryPolicy: ~
  mkfsExt4Parameters: ~

privateRegistry:
  registryUrl: ~
  registryUser: ~
  registryPasswd: ~

resources: {}
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #  cpu: 100m
  #  memory: 128Mi
  # requests:
  #  cpu: 100m
  #  memory: 128Mi
  #

ingress:
  ## Set to true to enable ingress record generation
  enabled: false


  host: xip.io

  ## Set this to true in order to enable TLS on the ingress record
  ## A side effect of this will be that the backend service will be connected at port 443
  tls: false

  ## If TLS is set to true, you must declare what secret will store the key/certificate for TLS
  tlsSecret: longhorn.local-tls

  ## Ingress annotations done as key:value pairs
  ## If you're using kube-lego, you will want to add:
  ## kubernetes.io/tls-acme: true
  ##
  ## For a full list of possible ingress annotations, please see
  ## ref: https://github.com/kubernetes/ingress-nginx/blob/master/docs/annotations.md
  ##
  ## If tls is set to true, annotation ingress.kubernetes.io/secure-backends: "true" will automatically be set
  annotations:
  #  kubernetes.io/ingress.class: nginx
  #  kubernetes.io/tls-acme: true

  secrets:
  ## If you're providing your own certificates, please use this to add the certificates as secrets
  ## key and certificate should start with -----BEGIN CERTIFICATE----- or
  ## -----BEGIN RSA PRIVATE KEY-----
  ##
  ## name should line up with a tlsSecret set further up
  ## If you're using kube-lego, this is unneeded, as it will create the secret for you if it is not set
  ##
  ## It is also possible to create and manage the certificates outside of this helm chart
  ## Please see README.md for more information
  # - name: longhorn.local-tls
  #   key:
  #   certificate:

# Configure a pod security policy in the Longhorn namespace to allow privileged pods
enablePSP: true

Then install the chart:

kubectl create namespace longhorn-system
helm install longhorn . --namespace longhorn-system
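
As an alternative to editing values.yaml, the same image overrides could probably be passed on the command line. This is only a sketch: the keys are copied from the values file above, and whether they match the chart version you cloned is an assumption. The function wrapper and its name are for illustration only. Run it from longhorn/chart/ after creating the namespace.

```shell
# Install the chart with ARM64 image overrides instead of a modified values.yaml.
# --set-string keeps tags such as 1607c9e from being parsed as anything but text.
install_longhorn_arm64() {
  helm install longhorn . --namespace longhorn-system \
    --set-string image.longhorn.engine=ivanangelov/longhorn-engine \
    --set-string image.longhorn.engineTag=b0a22eb \
    --set-string image.longhorn.manager=ivanangelov/longhorn-manager \
    --set-string image.longhorn.managerTag=92a5f6a5_arm64 \
    --set-string image.longhorn.ui=ivanangelov/longhorn-ui \
    --set-string image.longhorn.uiTag=1607c9e \
    --set-string image.longhorn.instanceManager=ivanangelov/longhorn-instance-manager \
    --set-string image.longhorn.instanceManagerTag=v1_20200711 \
    --set-string csi.attacherImage=raspbernetes/csi-external-attacher \
    --set-string csi.attacherImageTag=latest \
    --set-string csi.provisionerImage=raspbernetes/csi-external-provisioner \
    --set-string csi.provisionerImageTag=latest \
    --set-string csi.nodeDriverRegistrarImage=raspbernetes/csi-node-driver-registrar \
    --set-string csi.nodeDriverRegistrarImageTag=latest \
    --set-string csi.resizerImage=raspbernetes/csi-external-resizer \
    --set-string csi.resizerImageTag=latest
}
```

This leaves every other chart default untouched, so settings such as defaultClassReplicaCount would still need their own override if you want them changed.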

To test if it works, create a test-longhorn.yaml file with the following content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx:stable-alpine
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc

If everything goes according to plan, this should spin up an nginx pod with a volume created and provisioned by Longhorn!
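
Applying and checking the test manifest could look like the sketch below. Note that the PVC above has no namespace while the pod pins `namespace: default`, so the sketch applies everything with `-n default` to keep the two together. The 180-second timeout and the function name are arbitrary choices, not values from the Longhorn documentation.

```shell
# Apply test-longhorn.yaml, wait for the pod to become Ready, then show the PVC.
# Assumption: the file is in the current directory and uses the names above.
run_volume_test() {
  # -n default puts the PVC in the same namespace as the pod.
  kubectl apply -n default -f test-longhorn.yaml || return 1
  # Blocks until the pod reports Ready or the timeout expires.
  kubectl wait -n default --for=condition=Ready pod/volume-test --timeout=180s || return 1
  kubectl get pvc -n default longhorn-volv-pvc
}
```

If the wait times out, `kubectl describe pod volume-test -n default` shows the attach and mount events for the pod.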

Thank you @ivang and @yasker for your help!

All 3 comments

@ivang Is there anything specific we need to do with the CSI driver for ARM64?

Yes, the original CSI drivers are not yet available for ARM64. Thus, I had to build them myself (see this comment), push them to my Docker Hub account, and modify these lines of the deployment scripts to make use of them. I didn't want to clutter the PR with my local changes to the deployment scripts, but I can include them if necessary.

PS. It seems that support for ARM64 in CSI is on its way.
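
Whether a given image publishes an arm64 variant can be checked before deploying. A sketch, assuming a Docker CLI with the `manifest` subcommand available (older releases require experimental CLI features to be enabled). The function name is hypothetical.

```shell
# Succeed if the image's manifest data mentions an arm64 platform.
# --verbose is used so platform details are included in the output.
image_has_arm64() {
  docker manifest inspect --verbose "$1" 2>/dev/null \
    | grep -Eq '"architecture": *"arm64"'
}
```

For example, `image_has_arm64 raspbernetes/csi-external-attacher:latest && echo "arm64 available"`. On the cluster side, `kubectl get nodes -L kubernetes.io/arch` lists the architecture each node reports.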
