After installing Tekton Pipelines v0.16.3 with kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.16.3/release.yaml, the controller and webhook pods fail:
kubectl get pods -n tekton-pipelines -w
NAME READY STATUS RESTARTS AGE
tekton-pipelines-controller-767f44b5f5-92q4n 0/1 CrashLoopBackOff 9 23m
tekton-pipelines-webhook-7f9888f9b-gl457 0/1 CrashLoopBackOff 9 23m
Both pods fail with the following error (events shown for the webhook pod):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned tekton-pipelines/tekton-pipelines-webhook-7f9888f9b-gl457 to minikube
Normal Pulled 10m (x5 over 11m) kubelet, minikube Container image "gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3@sha256:5087d4022a4688990cf04eee003d76fc736a939b011a62a160c89ae5bd6b7c20" already present on machine
Normal Created 10m (x5 over 11m) kubelet, minikube Created container webhook
Warning Failed 10m (x5 over 11m) kubelet, minikube Error: failed to start container "webhook": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied": unknown
Warning BackOff 105s (x53 over 11m) kubelet, minikube Back-off restarting failed container
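(Events captured with something like the following; the pod name will differ per cluster.)

```sh
kubectl describe pod -n tekton-pipelines tekton-pipelines-webhook-7f9888f9b-gl457
```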
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.16.3/release.yaml
kubectl get pods -n tekton-pipelines
minikube version
minikube version: v1.5.2
kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
v0.16.3
This is caused by the security settings in the Pod.
Temporary workaround: comment out the "securityContext" definition in the Deployments.
Warning: any change to the security settings should be tested before going to production.
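A minimal sketch of that workaround, assuming the container layout of the v0.16.3 release.yaml (surrounding fields elided):

```yaml
# Fragment of the tekton-pipelines-webhook Deployment (sketch).
# Commenting out the securityContext lets the container fall back to the
# image's own USER, which owns /home/nonroot:
containers:
  - name: webhook
    image: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3
    # securityContext:
    #   runAsUser: 1001
```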
I am experiencing the same problem on kubernetes v1.18.3 with tekton pipelines v0.16.3.
The version v0.14.2 worked fine.
@zops can you elaborate on what you mean? Thanks
I had this exact issue and had to edit the Deployments for both the webhook and the controller, changing runAsUser from 1001 to 65532.
kustomize patches to fix it (one for each Deployment):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
spec:
  template:
    spec:
      containers:
        - name: tekton-pipelines-controller
          securityContext:
            runAsUser: 65532
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-webhook
  namespace: tekton-pipelines
spec:
  template:
    spec:
      containers:
        - name: webhook
          securityContext:
            runAsUser: 65532
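For completeness, a hypothetical kustomization.yaml to wire those patches in (file names are illustrative; save each Deployment patch above to its own file):

```yaml
# kustomization.yaml (sketch)
resources:
  - release.yaml            # the v0.16.3 release manifest, downloaded locally
patchesStrategicMerge:
  - controller-patch.yaml   # tekton-pipelines-controller patch above
  - webhook-patch.yaml      # tekton-pipelines-webhook patch above
```

Then kubectl apply -k . (or kustomize build . | kubectl apply -f -) should install the patched Deployments.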
cc @mattmoor @ImJasonH
I see that the images do have "User": "65532", which makes me think we are actually doing something wrong in the controller.yaml. When did this change to user 65532?
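One way to confirm what the image ships, assuming a local docker daemon (the digest is the one from the events above):

```sh
# Pull the exact webhook image and print the USER baked into its config
docker pull gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3@sha256:5087d4022a4688990cf04eee003d76fc736a939b011a62a160c89ae5bd6b7c20
docker inspect --format '{{.Config.User}}' gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3@sha256:5087d4022a4688990cf04eee003d76fc736a939b011a62a160c89ae5bd6b7c20
# -> 65532
```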
Might be worth a bugfix release (0.17.1, and maybe even 0.16.4 :sweat:)
@vdemeester I saw that too... the image is running as 65532 but all manifests point to 1001
Edit: I would also point out that this happens for the webhook Deployment as well
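To see the manifest side of the mismatch, something like:

```sh
# Every securityContext in the v0.16.3 release manifest should show
# runAsUser: 1001, while the images themselves run as 65532
curl -sL https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.16.3/release.yaml | grep -n 'runAsUser'
```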
Let's fix this by using 65532 in the deployment (webhook and controller).
/assign
@bobcatfish @sbwsg @afrittoli @pritidesai @ImJasonH @dibyom is it worth doing a 0.16.4? (my initial thought is yes :upside_down_face:)
It will be in 0.17.1 for sure :wink:
I think we'd need this for Triggers too: https://github.com/tektoncd/triggers/issues/781
Yep, indeed :sweat_drops:
@vdemeester This was probably my bad when I switched things over to the :nonroot base images.
:nonroot has used 65532 since its inception, but Tekton only moved over (relatively) recently (though still probably a few months back?).
Having just fixed something similar for our rebuild of Contour, I'd love to know if/how we can make this fail on GKE to avoid future issues like this.
cc @mikedanese @cjcullen
cc @bentheelder too since I'd like it to fail on KinD too 😅
@mattmoor this is a difference between containerd and docker rather than minikube and kind, and has to do with default permissions.
we've already attempted to resolve this upstream. https://github.com/kubernetes-sigs/kind/issues/1331
someone needs to land https://github.com/containerd/cri/pull/1397
Once https://github.com/containerd/containerd/pull/4669 (I'm carrying forward https://github.com/containerd/cri/pull/1397 after the repo merge and Lantao moving on) lands this will fail on future kind clusters.
EDIT: We ship containerd fixes quickly. But it has to land first... We avoid forks.
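For anyone who wants to see the failure mode directly, a minimal repro sketch (pod name illustrative; image and UID taken from this issue):

```yaml
# Forces runAsUser to mismatch the image's USER (65532). On a docker-based
# runtime this fails with the "chdir to cwd (/home/nonroot) ... permission
# denied" error from the events above; on containerd it will only start
# failing once the cri fix lands.
apiVersion: v1
kind: Pod
metadata:
  name: chdir-cwd-repro
spec:
  containers:
    - name: webhook
      image: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3
      securityContext:
        runAsUser: 1001
```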