1. What kops version are you running? The command kops version will display this information.
Version 1.10.0
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.6", GitCommit:"a21fdbd78dde8f5447f5f6c331f7eb6f80bd684e", GitTreeState:"clean", BuildDate:"2018-07-26T10:17:47Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.7", GitCommit:"0c38c362511b20a098d7cd855f1314dad92c2780", GitTreeState:"clean", BuildDate:"2018-08-20T09:56:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
3. What cloud provider are you using?
aws
4. What commands did you run? What is the simplest way to reproduce this issue?
kops update cluster
5. What happened after the commands executed?
After enabling bootstrap tokens (https://github.com/kubernetes/kops/blob/master/docs/cluster_spec.md#bootstrap-tokens) and node authorization (https://github.com/kubernetes/kops/blob/master/docs/node_authorization.md), we got:
Error from server (Forbidden): Forbidden (user=kubelet-api, verb=get, resource=nodes, subresource=proxy) ( pods/log etcd-server-events-xxxxxx)
The cluster spec:
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: test
spec:
  api:
    loadBalancer:
      type: Internal
  channel: stable
  authorization:
    rbac: {}
  cloudLabels:
    env: infra
  cloudProvider: aws
  configBase: s3://bucket/test
  dnsZone: dreamteam.internal
  docker:
    storage: overlay2
    storageOpts:
    - overlay2.override_kernel_check=true
    version: 17.03.2
    liveRestore: true
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
      encryptedVolume: true
    - instanceGroup: master-us-east-1b
      name: b
      encryptedVolume: true
    - instanceGroup: master-us-east-1c
      name: c
      encryptedVolume: true
    name: main
    version: 3.2.18
    enableEtcdTLS: true
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
      encryptedVolume: true
    - instanceGroup: master-us-east-1b
      name: b
      encryptedVolume: true
    - instanceGroup: master-us-east-1c
      name: c
      encryptedVolume: true
    name: events
    version: 3.2.18
    enableEtcdTLS: true
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    authorizationMode: Node,RBAC
    authorizationRbacSuperUser: admin
    enableBootstrapTokenAuth: true
    runtimeConfig:
      rbac.authorization.k8s.io/v1: "true"
      authentication.k8s.io/v1beta1: "true"
  kubelet:
    anonymousAuth: false
    authorizationMode: Webhook
    authenticationTokenWebhook: true
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.10.7
  masterInternalName: api.internal.test
  masterPublicName: api.test
  networkCIDR: xxxxxxxxxxxxxx
  networkID: xxxxxxxxxxxxxx
  networking:
    cilium: {}
  kubeDNS:
    provider: CoreDNS
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: us-east-1a
    type: Private
    zone: us-east-1a
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: us-east-1c
    type: Private
    zone: us-east-1c
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: utility-us-east-1a
    type: Utility
    zone: us-east-1a
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  - cidr: xxxxxxxxxxxxxx
    id: xxxxxxxxxxxxxx
    name: utility-us-east-1c
    type: Utility
    zone: us-east-1c
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
  nodeAuthorization:
    nodeAuthorizer: {}
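For reference, the error above surfaces on any request that the API server proxies through the kubelet (logs, exec, port-forward). A minimal way to trigger it, reusing the redacted pod name from the error message:

kubectl logs -n kube-system etcd-server-events-xxxxxx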
I'm not sure this is the right solution, but I faced the same problem after upgrading a cluster from 1.9.9 to 1.10.5 while adding the following to my cluster spec to support a newer version of kube-prometheus:
kubelet:
  anonymousAuth: false
  authorizationMode: Webhook
  authenticationTokenWebhook: true
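For context, these kops settings translate roughly to the following kubelet flags, which make the kubelet authenticate and authorize incoming API-server requests against the cluster (flag names per the kubelet documentation; exact rendering may vary by kops version):

--anonymous-auth=false
--authorization-mode=Webhook
--authentication-token-webhook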
After the upgrade, I got the same errors when attempting to fetch logs with kubectl logs or via the Kubernetes Dashboard. I had a similar error when trying to exec into a pod via the dashboard.
I noticed the cluster role system:kubelet-api-admin had been added during the upgrade, so I created a ClusterRoleBinding for user kubelet-api using this role:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubelet-api-admin
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
This fixed the log/exec errors for me, but I'd appreciate any advice on whether this is a wise solution.
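A quick way to apply and verify the binding above (the file name is arbitrary, and the --subresource flag requires a reasonably recent kubectl):

kubectl apply -f kubelet-api-admin.yaml

# Check that the kubelet-api user is now allowed to proxy to the kubelet;
# this should print "yes" once the binding exists:
kubectl auth can-i get nodes --subresource=proxy --as=kubelet-api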
I did the same trick, but I guess it shouldn't be the default behavior?
I ran into the same issue too, and creating the ClusterRoleBinding fixed it for me as well.
@justinsb can you please suggest what's the right thing to do here?
Same issue here. At the moment the way to go for me is to add the ClusterRoleBinding as an addon, so that it is automatically and seamlessly created during cluster creation.
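A minimal sketch of that approach, assuming a kops version that supports custom addons via spec.addons; the S3 path is a hypothetical placeholder pointing at a channels-format Addons manifest that in turn references the ClusterRoleBinding YAML:

spec:
  addons:
  - manifest: s3://bucket/test/custom-addons/addons.yaml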
Same problem here, kops 1.10, K8s 1.10.12.
Same here. Thank you for providing a solution @or1can
I noticed that the ClusterRoleBindings created by kops for the system always contain the annotation rbac.authorization.kubernetes.io/autoupdate: "true" and the label kubernetes.io/bootstrapping: rbac-defaults. Should we add those?
Below is a modified version of @or1can's glorious solution :1st_place_medal:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kubelet-api-admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
subjects:
- kind: User
  name: kubelet-api
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:kubelet-api-admin
  apiGroup: rbac.authorization.k8s.io
Note that the name changed from kubelet-api-admin to system:kubelet-api-admin.
@nvanheuverzwijn, maybe kops uses those labels to manage the ClusterRoleBinding in the future, but the binding likely won't change after the cluster has been bootstrapped. It's good to have the labels anyway in case this role changes in the future.
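As context for those markers: on startup, the API server's RBAC bootstrapping reconciles default objects that carry the rbac.authorization.kubernetes.io/autoupdate: "true" annotation, and the existing defaults can be listed by their bootstrap label:

kubectl get clusterrolebindings -l kubernetes.io/bootstrapping=rbac-defaults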
Thanks @or1can and @nvanheuverzwijn for the workaround! I was wondering if there's a permanent (more automated) fix for this.
Running into the same issue when creating a brand new cluster using kops 1.11.1 with the following config for kubelet:
kubelet:
  anonymousAuth: false
  authenticationTokenWebhook: true
  authorizationMode: Webhook
Without the fix provided by @or1can and @nvanheuverzwijn, Helm doesn't work either, failing with this error message:
Forbidden (user=kubelet-api, verb=get, resource=nodes, subresource=proxy)
The documentation for this role seems sparse for such an important ClusterRoleBinding. There is a related documentation request, https://github.com/kubernetes/website/issues/7388, but it only relates to https://github.com/kubernetes/website/pull/8363.
Thanks @or1can, you're the best. I got exactly the same problem on my cluster today; even after cleaning and reinstalling Tiller, it wouldn't work. But after I applied your fix, it works like a charm. Thanks mate 🚀
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
c'mon... no attention has been paid by the official maintainers
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.