Pipeline: Failed to install v0.16.3 on minikube

Created on 22 Sep 2020 · 13 comments · Source: tektoncd/pipeline

Expected Behavior

  • Pipeline installation should succeed

Actual Behavior

  • After installing Pipeline v0.16.3 (kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.16.3/release.yaml), the controller and webhook pods fail
kubectl get pods -n tekton-pipelines  -w
NAME                                           READY   STATUS             RESTARTS   AGE
tekton-pipelines-controller-767f44b5f5-92q4n   0/1     CrashLoopBackOff   9          23m
tekton-pipelines-webhook-7f9888f9b-gl457       0/1     CrashLoopBackOff   9          23m

with the following error:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  <unknown>            default-scheduler  Successfully assigned tekton-pipelines/tekton-pipelines-webhook-7f9888f9b-gl457 to minikube
  Normal   Pulled     10m (x5 over 11m)    kubelet, minikube  Container image "gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.16.3@sha256:5087d4022a4688990cf04eee003d76fc736a939b011a62a160c89ae5bd6b7c20" already present on machine
  Normal   Created    10m (x5 over 11m)    kubelet, minikube  Created container webhook
  Warning  Failed     10m (x5 over 11m)    kubelet, minikube  Error: failed to start container "webhook": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "chdir to cwd (\"/home/nonroot\") set in config.json failed: permission denied": unknown
  Warning  BackOff    105s (x53 over 11m)  kubelet, minikube  Back-off restarting failed container

Steps to Reproduce the Problem

  1. Install the released pipeline: kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.16.3/release.yaml
  2. kubectl get pods -n tekton-pipelines

Additional Info

  • Minikube version:
minikube version
minikube version: v1.5.2

  • Kubernetes version:
kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

  • Tekton Pipeline version:
kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
v0.16.3
kind/bug


All 13 comments

This is caused by the security settings in the Pod.

Temporary workaround:
Comment out the "securityContext" definition.

Warning:
Any change to the security settings should be tested before going to production.
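As a sketch of that workaround (the file name patch.yaml is my own; a strategic-merge patch that drops the security settings entirely, rather than fixing the user ID):

```yaml
# patch.yaml (hypothetical file name) — strategic-merge patch that removes
# the pod-level and container-level securityContext from the controller
# Deployment. Setting a map field to null deletes it in a strategic merge.
spec:
  template:
    spec:
      securityContext: null          # drop pod-level security settings
      containers:
      - name: tekton-pipelines-controller
        securityContext: null        # drop container-level security settings
```

Applied with something like kubectl patch deployment tekton-pipelines-controller -n tekton-pipelines -p "$(cat patch.yaml)"; the webhook Deployment would need the same treatment.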

I am experiencing the same problem on kubernetes v1.18.3 with tekton pipelines v0.16.3.
The version v0.14.2 worked fine.

@zops can you elaborate on what you mean? Thanks

I had this exact issue and had to change the Deployments for both the webhook and the controller, changing runAsUser from 1001 to 65532.

kustomization.yaml to patch it

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tekton-pipelines-controller
  namespace: tekton-pipelines
spec:
  template:
    spec:
      containers:
      - name: tekton-pipelines-controller
        securityContext:
          runAsUser: 65532
      securityContext:
        runAsUser: 65532
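To wire that patch into kustomize, a kustomization.yaml along these lines should work (a sketch; assumes release.yaml has been downloaded locally and the patch above is saved as deployment-patch.yaml — both file names are my own):

```yaml
# kustomization.yaml — minimal sketch
resources:
- release.yaml              # downloaded from the release URL above
patchesStrategicMerge:
- deployment-patch.yaml     # the Deployment patch shown above
```

Applied with kubectl apply -k . ; a matching patch is needed for the tekton-pipelines-webhook Deployment as well.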

cc @mattmoor @ImJasonH
I see that the images do have "User": "65532", which makes me think we are actually doing something wrong in the controller.yaml. When did this change to user 65532?

Might be worth a bugfix release (0.17.1, and maybe even 0.16.4 :sweat:)

@vdemeester I saw that too... the image is running as 65532 but all manifests point to 1001

Edit: note that this happens for the webhook Deployment as well

Let's fix this by using 65532 in the deployment (webhook and controller).
/assign

@bobcatfish @sbwsg @afrittoli @pritidesai @ImJasonH @dibyom is it worth doing a 0.16.4? (my initial thought is yes :upside_down_face:)
It will be in 0.17.1 for sure :wink:

I think we'd need this for Triggers too: https://github.com/tektoncd/triggers/issues/781


Yep, indeed :sweat_drops:

@vdemeester This was probably my bad when I switched things over to the :nonroot base images.

:nonroot has used 65532 since its inception, but Tekton only moved over (relatively) recently (though still probably a few months back?).

Having just fixed something similar for our rebuild of Contour, I'd love to know if/how we can make this fail on GKE to avoid future issues like this

cc @mikedanese @cjcullen

cc @bentheelder too since I'd like it to fail on KinD too

@mattmoor this is a difference between containerd and docker rather than minikube and kind, and has to do with default permissions.

We've already attempted to resolve this upstream: https://github.com/kubernetes-sigs/kind/issues/1331

someone needs to land https://github.com/containerd/cri/pull/1397

Once https://github.com/containerd/containerd/pull/4669 lands (I'm carrying https://github.com/containerd/cri/pull/1397 forward after the repo merge and Lantao moving on), this will fail on future kind clusters.
EDIT: We ship containerd fixes quickly, but it has to land first... We avoid forks.
