Bug
When installing a CSI driver, PVCs and PVs are created, but the volume will not attach to a pod, failing with this error:
E0727 14:03:55.220699 2205 csi_attacher.go:93] kubernetes.io/csi: attacher.Attach failed: volumeattachments.storage.k8s.io is forbidden: User "system:node:master1" cannot create resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: can only get individual resources of this type
To Reproduce
Expected behavior
PVs are able to attach to a pod without issue
Additional context
The issue has been reported for multiple CSI drivers and everything points to k3s. I can personally confirm that this works normally with a kubespray cluster.
hetznercloud/csi-driver: https://github.com/hetznercloud/csi-driver/issues/46
longhorn CSI driver: https://forums.rancher.com/t/longhorn-on-k3s-pv-attach-error/14920
I tried with Azure's hostpath driver from https://github.com/Azure/kubernetes-volume-drivers/tree/master/csi/hostpath and got the same message on k3s:
AttachVolume.Attach failed for volume "kubernetes-dynamic-pv-42256afbbc7411e9" : volumeattachments.storage.k8s.io is forbidden: User "system:node:k3s-test" cannot create resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: can only get individual resources of this type
Unable to mount volumes for pod "my-csi-app_default(3ac675b9-bc74-11e9-a41e-96000029eacb)": timeout expired waiting for volumes to attach or mount for pod "default"/"my-csi-app". list of unmounted volumes=[my-csi-volume]. list of unattached volumes=[my-csi-volume default-token-8d8kh]
0.6.1: works
0.7.x: broken
0.8.x: broken
Same problem here 😔
As a temporary workaround, are there any changes I can make to permissions or something to make it work? Thanks
I don't know much about authentication and authorization yet... but I was playing a little and got a volume attached and mounted by doing this:
Because the system:node ClusterRole had only the verb 'get' for 'volumeattachments', I added 'create', 'delete', 'patch', 'update', 'list' and 'watch' after seeing what's in the other sections of the role...
I edited the ClusterRoleBinding for system:node and since there were no subjects I tried adding these:
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
I have no idea what I'm doing but with this change it works LOL :D The volume is mounted and working correctly. Is this something that should be done in K3s?
@vitobotta I see two possible issues with your workaround (which isn't to say you shouldn't use it, just FYI):
1. the default RBAC roles and bindings can be auto-reconciled by the API server, which could silently revert your change;
2. it may pose a security issue: every node can now "steal" volumes, which might be a problem depending on what you want to use your cluster for.
If those are acceptable trade-offs, then I think you're good to go (no guarantees though: I may be missing some other reason)
Hi @costela, like I said, I had no idea what I was doing :D Anyway, it's just me using my small clusters, so I would be worried about the first point only. I'm going to try restarting k3s and see if I lose the changes I made.
Just tried restarting K3s on both master and workers, and the changes are still there and volumes still attach fine.
@vitobotta cool, so auto-reconciliation isn't an issue.
As for your other question:
Is this something that should be done in K3s?
Probably not. The system:nodes Group should not be used after k8s 1.8.
Still a valid workaround, though.
I checked on another cluster deployed with Rancher (not K3s) and I see the same thing about that group. So if this isn't something that should be fixed in k3s... where should it be fixed?
@vitobotta The system:nodes Group is deprecated in favor of the Node authorizer and the NodeRestriction admission plugin. Both are on by default in k3s (and k8s, for that matter).
So this specific solution should not be done by k3s. Which isn't to say the final fix for this issue won't turn out to be in k3s. It just probably won't involve the system:nodes Group.
My current bet is on the service-account token auth. I suspect the attachdetach-controller isn't getting the credentials it should. Maybe related to the "embedded" way k3s runs it. Didn't get a chance to dig deep enough to confirm that yet, though :disappointed:
@vitobotta the link above also says
To opt out of this reconciliation, set the rbac.authorization.kubernetes.io/autoupdate annotation on a default cluster role or rolebinding to false. Be aware that missing default permissions and subjects can result in non-functional clusters.
so the auto-reconciliation issue would be easily avoidable if ever present. But it's a hack ;)
@costela
My current bet is on the service-account token auth. I suspect the attachdetach-controller isn't getting the credentials it should. Maybe related to the "embedded" way k3s runs it. Didn't get a chance to dig deep enough to confirm that yet, though
I agree, I think we can pin the issue on the fact that in k3s system:node:<node_name> credentials are being used (authorized via NodeAuthorizer or via RBACAuthorizer in @vitobotta's hack) instead of the attachdetach-controller serviceaccount. I checked and in vanilla Kubernetes this service account has all the needed permissions on volumeattachments.
I also dug a bit through the Kubernetes codebase and the process goes like this:
attachdetach-controller finds and initializes the csiPlugin via the VolumePluginMgr, passing its credentials to the plugin. The csiAttacher of the csiPlugin would then attempt to create a volumeattachment in this line:
https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_attacher.go#L105
and fail in k3s.
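That create call can be sketched with plain structs. Note that these types are illustrative stand-ins for the real storage.k8s.io/v1 client-go types, and the real attachment name is a hash rather than a plain concatenation:

```go
package main

import "fmt"

// Illustrative stand-ins for the real storage.k8s.io/v1 types (not client-go).
type VolumeAttachmentSpec struct {
	Attacher   string // CSI driver name
	NodeName   string // node the volume should attach to
	VolumeName string // PV backing the claim
}

type VolumeAttachment struct {
	Name string
	Spec VolumeAttachmentSpec
}

// attachmentName mimics the upstream naming scheme ("csi-" plus a hash of
// driver/volume/node); here we simply concatenate for readability.
func attachmentName(driver, volume, node string) string {
	return fmt.Sprintf("csi-%s-%s-%s", driver, volume, node)
}

func main() {
	va := VolumeAttachment{
		Name: attachmentName("csi.example.com", "pv-1234", "k3s-test"),
		Spec: VolumeAttachmentSpec{
			Attacher:   "csi.example.com",
			NodeName:   "k3s-test",
			VolumeName: "pv-1234",
		},
	}
	// In the real code this object is handed to the API server via the
	// controller's client. In k3s that request arrives with
	// system:node:<nodeName> credentials instead of the attachdetach-controller
	// service account, so it is rejected with the "forbidden" error quoted above.
	fmt.Println(va.Name)
}
```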
The thing is that I need to create a couple of clusters for development asap and I would like to use K3s because it's cheaper to run. I have spent quite a bit of time so far learning Kubernetes (I'm still a noob sadly) so I would like to go back to coding...
Is it safe to use the hack I posted earlier or do I risk for example that future versions of K3s might screw it up?
@ilyasotkov from what you say it seems that the problem is with K3s, if vanilla Kubernetes has all the correct permissions. Right?
@vitobotta It's without doubt a k3s issue because I tried it with a vanilla Kubernetes cluster (installed via kubespray / kubeadm on identical Hetzner Cloud infrastructure) and didn't face any similar issue.
Yeah I tried it with Kubernetes deployed with Rancher and didn't have any problems.
Is there a solution which can be incorporated into k3s to resolve:
it may pose a security issue: every node can now "steal" volumes, which might be a problem depending on what you want to use your cluster for.
in the suggested workaround? I couldn't find the source where k3s does magic stuff to initialize the attacher with the wrong service account. Any hint where I can find this?
@DracoBlue if we knew exactly where the error was, it would probably be fixed already :wink:
k3s starts the controller-manager (which includes the attachdetach-controller) here. The controller-manager starts loading the service account here, if I'm not mistaken. Somewhere down this path lies our issue (or at least that's my current unverified theory).
As for the workaround problem: no, I don't think there's a solution for it, other than fixing the attachdetach-controller authentication (or whatever causes it as a side-effect).
I'd gladly be proven wrong, though!
I'm also seeing this issue with the DigitalOcean CSI driver. I actually had it all working, then destroyed my instance and started over again for automation, and now it doesn't work. Based on the releases page I believe I was using 0.7.0. So it must have broken in the 0.8.0 release.
This appears to be broken in v0.7.0. That version introduced a certs refactor which should utilize node authorization (https://kubernetes.io/docs/reference/access-authn-authz/node/). I am not sure if kubespray uses node authorization, but from a quick look it appears not.
The NodeRestriction docs at https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction say:
kubelets must use credentials in the system:nodes group, with a username in the form
system:node:<nodeName>
If I understand your responses correctly, the right way would be to ensure there is an extra service account for the attachdetach-controller instead of using the system:node:<nodeName> user. It gets initialized in k8s by https://github.com/kubernetes/kubernetes/blob/103e926604de6f79161b78af3e792d0ed282bc06/cmd/kube-controller-manager/app/controllermanager.go#L404 with a call to https://github.com/kubernetes/kubernetes/blob/9bae1bc56804db4905abebcd408e0f02e199ab93/cmd/kube-controller-manager/app/core.go#L250.
Looks like the authorization is created here https://github.com/kubernetes/kubernetes/blob/9c973c6d2c33e88521ebcebec2fdd9cbddccd857/pkg/controller/client_builder.go#L111
In k8s the account's role binding is created here https://github.com/kubernetes/kubernetes/blob/22ff2673249d39f4559aa16f985c15cb4b66488c/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/testdata/controller-role-bindings.yaml#L17
# kubectl describe clusterrolebindings/system:controller:attachdetach-controller
Name:         system:controller:attachdetach-controller
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
Role:
  Kind:  ClusterRole
  Name:  system:controller:attachdetach-controller
Subjects:
  Kind            Name                     Namespace
  ----            ----                     ---------
  ServiceAccount  attachdetach-controller  kube-system

# kubectl describe clusterroles/system:controller:attachdetach-controller
Name:         system:controller:attachdetach-controller
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources                         Non-Resource URLs  Resource Names  Verbs
  ---------                         -----------------  --------------  -----
  volumeattachments.storage.k8s.io  []                 []              [create delete get list watch]
  events                            []                 []              [create patch update]
  nodes                             []                 []              [get list watch]
  csidrivers.storage.k8s.io         []                 []              [get list watch]
  persistentvolumeclaims            []                 []              [list watch]
  persistentvolumes                 []                 []              [list watch]
  pods                              []                 []              [list watch]
  nodes/status                      []                 []              [patch update]
I will try to prepare a PR with such changes!
PS: I installed 0.6.1 and retried it. Worked out of the box.
Thanks for looking into this @DracoBlue! Were you able to make any progress?
It does seem related to the attachdetach controller logic somehow; I found this issue https://github.com/LINBIT/linstor-csi/issues/4 which has a similar error when that controller is disabled. The controller does run and an attachdetach-controller service account exists on the system, though.
Turning on more debugging with -v 9 shows that the node authorizer is attempted but falls back to RBAC:
I0825 21:22:34.910001 24619 node_authorizer.go:161] NODE DENY: k3s-1 &authorizer.AttributesRecord{User:(*user.DefaultInfo)(0xc004d41180), Verb:"create", Namespace:"", APIGroup:"storage.k8s.io", APIVersion:"v1", Resource:"volumeattachments", Subresource:"", Name:"", ResourceRequest:true, Path:"/apis/storage.k8s.io/v1/volumeattachments"}
I0825 21:22:34.910100 24619 rbac.go:118] RBAC DENY: user "system:node:k3s-1" groups ["system:nodes" "system:authenticated"] cannot "create" resource "volumeattachments.storage.k8s.io" cluster-wide
I0825 21:22:34.910113 24619 authorization.go:73] Forbidden: "/apis/storage.k8s.io/v1/volumeattachments", Reason: "can only get individual resources of this type"
It looks like the node authorizer is hard coded to reject the request: https://github.com/kubernetes/kubernetes/blob/v1.14.6/plugin/pkg/auth/authorizer/node/node_authorizer.go#L161
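In simplified form, that hard-coded rule looks roughly like this (a sketch of the observed behavior, not the upstream implementation, which also checks that a named attachment actually belongs to the requesting node):

```go
package main

import "fmt"

// authorizeNodeVolumeAttachment sketches the node authorizer's behavior for
// storage.k8s.io/volumeattachments: node credentials may only "get" individual
// named objects; create/list/watch are denied outright with the same reason
// string seen in the error messages quoted above.
func authorizeNodeVolumeAttachment(verb, name string) (allowed bool, reason string) {
	if verb == "get" && name != "" {
		return true, ""
	}
	return false, "can only get individual resources of this type"
}

func main() {
	allowed, reason := authorizeNodeVolumeAttachment("create", "")
	fmt.Printf("allowed=%v reason=%q\n", allowed, reason)
}
```

Since the node authorizer issues a hard deny for create, the request then falls through to RBAC, which also denies it, producing the two DENY log lines above.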
Similarly we are running with --use-service-account-credentials=true for controller-manager and appear to be using the attach detach controller service account.
I think, as @vitobotta found, an RBAC rule is going to be the easiest fix at the moment. Instead of modifying the system:node role to allow creating volumeattachments and binding system:nodes to it, I just created a new role and binding for system:nodes with minimal permissions:
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:nodes:volumeattachments
rules:
- apiGroups:
  - storage.k8s.io
  resources:
  - volumeattachments
  verbs:
  - create
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:nodes:volumeattachments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:nodes:volumeattachments
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
EOF
@erikwilson yup! So I won't try to fix this at the other end. The ClusterRoleBinding will be shipped with k3s then and CSI should be usable with >0.6.1 again ;)
Similarly we are running with
--use-service-account-credentials=true for controller-manager and appear to be using the attach detach controller service account.
The service account is being created but not used, which is the core of the problem.
The request (create volumeattachment) in k3s is coming from system:node:<nodeName> but should be coming from the attachdetach-controller service account.
I agree that it is weird that system:node:<nodeName> auth is being used. Looking through the kubelet code, it defers to the attachdetach-controller for the attach, so I'm not sure how those credentials end up being used. I think that service account is used for other things too, because if you remove the role binding there will be other issues. The RBAC change is just a temporary stop-gap; I would like to get to the bottom of what is going on here.
This has something to do with us running a combined binary, it looks like some CSI operations are being picked up by the kubelet instead of the attachdetach controller. If you run the server with --disable-agent and run a separate agent the CSI attach will work correctly. Stack trace of attacher code:
github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/csi.(*csiAttacher).Attach(0xc006719e30, 0xc00669cc20, 0xc00453019b, 0x5, 0xc008cfeb30, 0x10, 0x10, 0x2f99aa0)
	/Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/csi/csi_attacher.go:90 +0x2d6
github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/operationexecutor.(*operationGenerator).GenerateAttachVolumeFunc.func2(0x8, 0x3557a98, 0xc004458d00, 0xc002a07720)
	/Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/operationexecutor/operation_generator.go:346 +0x9b
github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations.(*nestedPendingOperations).Run.func1(0xc006f76200, 0xc002a07720, 0x4e, 0x0, 0x0, 0xc0087e18c0, 0xc00659de00, 0xc0087e1940, 0x38, 0x40, ...)
	/Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations/nestedpendingoperations.go:143 +0x146
created by github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations.(*nestedPendingOperations).Run
	/Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations/nestedpendingoperations.go:130 +0x2ce
The issue appears to be that basically csi_plugin.go is maintaining some global state that makes it difficult for us to run kubelet and the attachdetach controller in the same process.
We tried to work around it with https://github.com/rancher/k3s/commit/4ce15bb26bd0e1db7e6b3095878d180413de6d70, which was okay with the previous node permissions but is now having issues with the node authorizer. Basically there are three global variables in the upstream csi_plugin.go code: csiDrivers, nim, and PluginHandler, which need to be made into instance variables for the kubelet and attachdetach routines (see note here).
Still trying to figure out the best way to approach the problem, unfortunately getting rid of those global variables appears not to be easy, suggestions welcome! :)
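The package-level-state problem can be sketched in isolation (the types below are illustrative, not the real csi_plugin.go code):

```go
package main

import "fmt"

// hostName stands in for package-level state like nim in csi_plugin.go: there
// is exactly one copy per process, no matter how many components initialize it.
var hostName string

// globalPlugin writes its host into the shared package variable, so whichever
// component initializes last wins for everyone in the process.
type globalPlugin struct{}

func (p *globalPlugin) Init(host string) { hostName = host }

// instancePlugin keeps its host as an instance field, so the kubelet and the
// attachdetach controller can coexist in one binary with different views.
type instancePlugin struct{ host string }

func (p *instancePlugin) Init(host string) { p.host = host }

func main() {
	// Combined binary: the kubelet initializes, then the controller does.
	kubelet, controller := &globalPlugin{}, &globalPlugin{}
	kubelet.Init("kubelet-host")
	controller.Init("controller-host")
	fmt.Println(hostName) // both components now see the controller's host

	k, c := &instancePlugin{}, &instancePlugin{}
	k.Init("kubelet-host")
	c.Init("controller-host")
	fmt.Println(k.host, c.host) // each keeps its own view
}
```

This is why running a separate agent with --disable-agent hides the problem: with kubelet and controller in separate processes, each gets its own copy of the package-level state.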
This will be fixed in the next release, which uses k8s v1.15. The nim package variable is the main culprit here since it contains host information, which differs between the attachdetach controller and the kubelet. In general we can say that package-level state like this is a bad idea; unfortunately there is no good way for us to audit the code for variables like this. We could try forking to help isolate processes, but that would likely come with an additional memory cost.
Hi, is there a planned date for the next release? Thanks!
Yes! I just tried 0.9.0 RC2 and I didn't need the hack this time. Thanks! :)
Closing as this has been fixed in the v0.9.0 release.