Version:
k3s version v0.10.2 (8833bfd9)
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --bind-address 0.0.0.0" sh -
Describe the bug
I've created the EBS StorageClass with this manifest:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2019-11-08T19:45:47Z"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
  name: gp2
  resourceVersion: "338717"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/gp2
  uid: 4af986cd-acdd-4bdc-bcef-fdb73f23daa4
parameters:
  type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: Immediate
When I create and then describe a PVC, I get:
Name: test
Namespace: default
StorageClass: gp2
Status: Pending
Volume:
Labels: <none>
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 93s (x381 over 96m) persistentvolume-controller no volume plugin matched
journalctl:
Nov 08 19:08:16 ip-10-0-1-216 k3s[2474]: I1108 19:08:16.185328 2474 event.go:255] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-vpc", UID:"749940b8-1bcc-4385-b3f6-66ad02d2e991", APIVersion:"v1", ResourceVersion:"333562", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' no volume plugin matched
Nov 08 19:08:31 ip-10-0-1-216 k3s[2474]: E1108 19:08:31.184450 2474 pv_controller.go:1331] error finding provisioning plugin for claim default/test-vpc: no volume plugin matched
To Reproduce
Follow recommended EBS steps.
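For example, a minimal claim like this against the default gp2 class reproduces it (a sketch; the exact manifest wasn't posted, the name test matches the describe output above, and the size and access mode are illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi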
Expected behavior
The PVC is provisioned and bound to an EBS volume.
Actual behavior
Provisioning fails with "no volume plugin matched", as shown in the logs above.
Additional context
So, I can't find any documented successes of getting this working in k3s. Per the cloud-provider-aws repo, the only requirement is to run the Kubernetes components with --cloud-provider=external.
I see some components are deployed with it, but when I deploy with
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --bind-address 0.0.0.0 --kube-apiserver-arg cloud-provider=external --kube-controller-arg cloud-provider=external --kubelet-arg cloud-provider=external" sh -
I'm greeted with the error:
time="2019-11-08T19:56:23Z" level=fatal msg="flag provided but not defined: -cloud-provider"
I somehow missed that the cloud provider plugins are removed in k3s. Looking for other solutions.
You should be able to set those flags in v0.10.2. I am not able to reproduce the error level=fatal msg="flag provided but not defined: -cloud-provider"; can you give more context? We are now running our own cloud controller manager, which you can disable with --disable-cloud-controller.
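For example, to run without the built-in cloud controller (a sketch; --bind-address is kept from the original command, and any external cloud controller manager would still need to be deployed separately):

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --bind-address 0.0.0.0 --disable-cloud-controller" sh -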
@erikwilson I'm attempting to reproduce.
What would --disable-cloud-controller do to help me here? Is there a way to do dynamic storage with EBS like regular Kubernetes, or do I need to look into a solution like OpenEBS?
Hi Everyone,
I managed to resolve the issue by installing the AWS EBS and EFS CSI drivers in my k3s cluster.
References
- https://github.com/kubernetes-sigs/aws-ebs-csi-driver
- https://github.com/kubernetes-sigs/aws-efs-csi-driver
Important notes
EBS is good for its IOPS; EFS is good for its multiple ReadWrite (ReadWriteMany) capability.
kubectl set image daemonset/efs-csi-node efs-plugin=amazon/aws-efs-csi-driver:v0.2.0 -n kube-system
kubectl set image daemonset/efs-csi-node csi-driver-registrar=quay.io/k8scsi/csi-node-driver-registrar:v1.1.0 -n kube-system
kubectl set image daemonset/efs-csi-node liveness-probe=quay.io/k8scsi/livenessprobe:v1.1.0 -n kube-system
kubectl set image daemonset/ebs-csi-node liveness-probe=quay.io/k8scsi/livenessprobe:v1.1.0 -n kube-system
kubectl set image daemonset/ebs-csi-node node-driver-registrar=quay.io/k8scsi/csi-node-driver-registrar:v1.1.0 -n kube-system
kubectl set image daemonset/ebs-csi-node ebs-plugin=amazon/aws-ebs-csi-driver:v0.4.0 -n kube-system
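For context, these image overrides assume the drivers were first installed from the upstream kustomize overlays, roughly like this (a sketch; the EBS path appears later in this thread, the EFS one is assumed to mirror it, and refs/paths may have moved since):

kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"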
@vkim-rogers, thanks for your help. I'm attempting to follow the AWS EBS CSI driver instructions you mentioned, but I don't have an aws-auth ConfigMap as described there. Hence running
kubectl -n kube-system describe configmap aws-auth
Just gives:
Error from server (NotFound): configmaps "aws-auth" not found
How did you get the rolearn without that command? Thanks again.
Thanks @vkim-rogers, that got me going as well!
@ndjhartman:
You can skip that part if you assign the policies to the EC2 instances some other way. You just need to be able to run commands like aws ec2 describe-volumes from your EC2 host without explicitly logging in.
For example, I have my EC2 InstanceProfiles set up like this in CloudFormation now:

#
# IAM configuration to support Session Manager
#
NodeProfile:
  Type: AWS::IAM::InstanceProfile
  Properties:
    Roles:
      - Ref: NodeRole
    InstanceProfileName: NodeProfile

NodeRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service: ec2.amazonaws.com
          Action: sts:AssumeRole
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM
      - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

#
# Policy for EBS access
#
EBSPolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: !Sub ${AWS::StackName}-ebs-policy
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Action:
            - ec2:AttachVolume
            - ec2:CreateSnapshot
            - ec2:CreateTags
            - ec2:CreateVolume
            - ec2:DeleteSnapshot
            - ec2:DeleteTags
            - ec2:DeleteVolume
            - ec2:DescribeAvailabilityZones
            - ec2:DescribeInstances
            - ec2:DescribeSnapshots
            - ec2:DescribeTags
            - ec2:DescribeVolumes
            - ec2:DescribeVolumesModifications
            - ec2:DetachVolume
            - ec2:ModifyVolume
          Resource: "*"
    Roles:
      - !Ref NodeRole
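To sanity-check that the instance profile credentials work from a node (a sketch; the region is just an example):

aws ec2 describe-volumes --region us-west-2 --max-items 1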
I ran the following commands in this order:
kubectl-tools apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
kubectl-tools set image daemonset/ebs-csi-node liveness-probe=quay.io/k8scsi/livenessprobe:v2.1.0 -n kube-system
kubectl-tools set image daemonset/ebs-csi-node ebs-plugin=amazon/aws-ebs-csi-driver:v0.7.1 -n kube-system
kubectl-tools set image daemonset/ebs-csi-node node-driver-registrar=quay.io/k8scsi/csi-node-driver-registrar:v2.0.1 -n kube-system
Using the old versions of these above resulted in the ebs-csi-node pods failing their healthchecks. This seemed to be because of this issue: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/494 when running ebs-plugin=amazon/aws-ebs-csi-driver:v0.4.0
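A quick way to confirm the pods settle after the image bumps (a sketch; the app=ebs-csi-node label is assumed from the upstream manifests):

kubectl -n kube-system rollout status daemonset/ebs-csi-node
kubectl -n kube-system get pods -l app=ebs-csi-node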
I did edit the DaemonSet with the following based on some things I found in the aws-ebs-csi-driver repo:
kubectl-tools get daemonset -n kube-system -o yaml
Old:
tolerations:
  - operator: Exists
New:
tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
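If you'd rather not edit by hand, something like this merge patch applies the same change (a sketch; note a merge patch replaces the whole tolerations list):

kubectl -n kube-system patch daemonset ebs-csi-node --type merge -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"CriticalAddonsOnly","operator":"Exists"}]}}}}'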
My StorageClass definition looks like this:
Note that enabling allowedTopologies (commented out below) resulted in the pods never getting scheduled to a node. Not sure what that's about yet.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
volumeBindingMode: WaitForFirstConsumer
#allowedTopologies:
#- matchLabelExpressions:
#  - key: topology.kubernetes.io/zone
#    values:
#      - us-west-2a
#      - us-west-2b
#      - us-west-2c
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redacted-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: 'standard'
  resources:
    requests:
      storage: 1Gi
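Since the class uses volumeBindingMode: WaitForFirstConsumer, the PVC stays Pending until a pod actually mounts it. A minimal consumer pod looks roughly like this (a sketch; the pod name, image, and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-consumer
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: redacted-pvc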