The PipelineRun should check whether the given PVC exists, and the PipelineRun should fail if the PVC is not present.
Instead, the PipelineRun remains in the Running state and the TaskRuns remain in the Running(Pending) state.
tkn commands to run the pipeline
Kubernetes version:
Output of kubectl version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2020-01-29T21:26:39Z", GoVersion:"go1.14beta1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.3+8b0a82f", GitCommit:"8b0a82f", GitTreeState:"clean", BuildDate:"2020-07-10T05:34:00Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
$ tkn version
Client version: dev
Pipeline version: v0.14.2
Triggers version: unknown
Command logs
$ tkn t ls
NAME DESCRIPTION AGE
workspace-test-task 29 minutes ago
[varadhya@localhost workspace-testing]$ tkn p ls
NAME AGE LAST RUN STARTED DURATION STATUS
workspace-test-pipeline 28 minutes ago --- --- --- ---
[varadhya@localhost workspace-testing]$ tkn tr ls
No TaskRuns found
[varadhya@localhost workspace-testing]$ tkn p start workspace-test-pipeline
Please give specifications for the workspace: shared-workspace
? Name for the workspace : shared-workspace
? Value of the Sub Path :
? Type of the Workspace : pvc
? Value of Claim Name : simple-poc-non-existent
PipelineRun started: workspace-test-pipeline-run-xs5sw
In order to track the PipelineRun progress run:
tkn pipelinerun logs workspace-test-pipeline-run-xs5sw -f -n veeresh-testing
[varadhya@localhost workspace-testing]$ tkn pr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-xs5sw 2 seconds ago --- Running
[varadhya@localhost workspace-testing]$ tkn tr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-xs5sw-list-repo-fzkzv 6 seconds ago --- Running(Pending)
Running using kubectl
[varadhya@localhost workspace-testing]$ kubectl apply -f test-run.yaml
pipelinerun.tekton.dev/workspace-test-pipeline-run-test-invlaid created
[varadhya@localhost workspace-testing]$ tkn pr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid 3 seconds ago --- Running
workspace-test-pipeline-run-xs5sw 1 minute ago --- Running
[varadhya@localhost workspace-testing]$ tkn tr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid-list-repo-48n4h 4 seconds ago --- Running(Pending)
workspace-test-pipeline-run-xs5sw-list-repo-fzkzv 1 minute ago --- Running(Pending)
[varadhya@localhost workspace-testing]$ tkn tr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid-list-repo-48n4h 54 seconds ago --- Running(Pending)
workspace-test-pipeline-run-xs5sw-list-repo-fzkzv 2 minutes ago --- Running(Pending)
[varadhya@localhost workspace-testing]$ tkn pr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid 58 seconds ago --- Running
workspace-test-pipeline-run-xs5sw 2 minutes ago --- Running
[varadhya@localhost workspace-testing]$ tkn tr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid-list-repo-48n4h 1 minute ago --- Running(Pending)
workspace-test-pipeline-run-xs5sw-list-repo-fzkzv 2 minutes ago --- Running(Pending)
[varadhya@localhost workspace-testing]$ tkn tr ls
NAME STARTED DURATION STATUS
workspace-test-pipeline-run-test-invlaid-list-repo-48n4h 5 minutes ago --- Running(Pending)
workspace-test-pipeline-run-xs5sw-list-repo-fzkzv 6 minutes ago --- Running(Pending)
task:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: workspace-test-task
spec:
  steps:
    - name: write-message
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -xe
        echo hello! > $(workspaces.messages.path)/message
        cat $(workspaces.messages.path)/message
  workspaces:
    - name: messages
      description: The folder where we write the message to
pipeline:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: workspace-test-pipeline
spec:
  workspaces:
    - name: shared-workspace
  tasks:
    - name: list-repo
      taskRef:
        name: workspace-test-task
        kind: Task
      workspaces:
        - name: messages
          workspace: shared-workspace
pipelinerun:
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: workspace-test-pipeline-run-test-invlaid
  namespace: veeresh-testing
spec:
  pipelineRef:
    name: workspace-test-pipeline
  timeout: 1h0m0s
  workspaces:
    - name: shared-workspace
      persistentVolumeClaim:
        claimName: simple-pvc-non-existent
/cc @jlpettersson @sbwsg
I'm not sure that I agree that a PipelineRun should fail if a PVC doesn't exist the moment a Pod runs. Isn't it possible that the volume is being provisioned by some other service and will appear some time after the Pod launches? Or will a PVC _always_ exist prior to a Pod launching?
Yes, this is the way Kubernetes works. It is hard to say if we should do it differently.
Kubernetes is an eventual-consistency system. E.g. you may create a PVC and a Pod/TaskRun in the same "kubectl create" command. But they are handled by different controllers that have different responsibilities, independently of each other.
If the taskrun-controller were responsible for this, it would need to look up PVCs, and there might be a race within the same "kubectl create" command: a missing PVC might be created just a few milliseconds after the taskrun-controller does the PVC lookup.
I agree that the UX might be better if this were surfaced earlier. But that is contrary to the loosely coupled Kubernetes architecture with eventual consistency, which makes it very scalable, with clearly bounded responsibilities for controllers.
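The reconcile-and-requeue pattern described above can be sketched in a few lines. This is a minimal, hypothetical Python simulation (not Tekton code): `reconcile` and `pvc_store` are stand-ins for a controller pass and an informer-backed PVC cache, illustrating why a single fail-fast lookup would race with a PVC created milliseconds later.

```python
def reconcile(taskrun, pvc_store):
    """One reconcile pass: requeue while the referenced PVC is missing.

    pvc_store stands in for the cluster's PVC lister; in a real
    controller this would be an informer-backed cache.
    """
    if taskrun["pvc"] not in pvc_store:
        return "requeue"  # the PVC may appear later; don't fail the run
    return "create-pod"

# Simulate the race: the PVC is created shortly after the TaskRun.
pvc_store = set()
taskrun = {"name": "workspace-test", "pvc": "simple-pvc"}

first_pass = reconcile(taskrun, pvc_store)   # PVC missing -> requeue
pvc_store.add("simple-pvc")                  # another controller creates it
later_pass = reconcile(taskrun, pvc_store)   # PVC present -> create the Pod
```

The key design point is that the first pass requeues rather than fails, so a PVC that appears between passes is picked up on the next reconcile.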
@jlpettersson @sbwsg maybe we should at least update the Reason message to "say" that we are waiting for a PVC to be provisioned.
I agree that this is Kubernetes behavior and I'd be cautious about changing it. This is just one manifestation of a non-existent resource. You can get the same behavior for example when referencing a configmap that doesn't yet exist.
I did look into improving the display status for issue #2268 but didn't get a lot of feedback. Part of the problem is that if there are multiple pipeline branches active there can be multiple things going "wrong". It's hard to concisely summarize what's going on.
@jlpettersson @sbwsg maybe we should at least update the Reason message to "say" that we are waiting for a PVC to be provisioned.
This is an interesting idea. I wonder if we can detect from Pod status / events whether it is waiting on a PVC.
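One way to detect this from Pod events could look roughly like the following. This is an illustrative Python sketch, not Tekton code: `FailedScheduling` and `FailedMount` are event reasons the kube-scheduler and kubelet typically emit, but the exact string matching here is an assumption for demonstration.

```python
def waiting_on_pvc(reason, message):
    """Rough heuristic: does this Pod event suggest we're blocked on a PVC?

    The reason/message patterns are illustrative; a real implementation
    would need to handle the full set of scheduler and kubelet events.
    """
    if reason == "FailedScheduling" and "persistentvolumeclaim" in message:
        return True  # scheduler can't place the Pod: PVC missing/unbound
    if reason == "FailedMount":
        return True  # kubelet can't attach/mount the volume
    return False

blocked = waiting_on_pvc(
    "FailedScheduling",
    'persistentvolumeclaim "simple-pvc-non-existent" not found',
)
```

A classifier like this could feed a more specific Reason message on the TaskRun status while the controller keeps requeuing.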
You can get the same behavior for example when referencing a configmap that doesn't yet exist.
Ah, good point, I hadn't considered the non-PVC scenarios too.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Send feedback to tektoncd/plumbing.
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
Send feedback to tektoncd/plumbing.
/remove-lifecycle rotten
This seems similar to https://github.com/tektoncd/pipeline/issues/3563, in that it's a question of whether or not we should add PVC-specific behavior that differs from the k8s paradigm.
Since this issue went rotten, and the discussion pointed out that this is standard k8s behavior (and the same question applies to other types like configmaps, as @GregDritschler pointed out), I'm inclined to close this issue.
This seems similar to #3563 in that it's a question of whether or not we should add PVC-specific behavior that differs from the k8s paradigm.
I am not sure I fully agree on this. This issue is about what we should do when the PVC is not ready at the time we try to create the TaskRun/PipelineRun. Right now, we report Pending because it might become available at some point in the future. But we never reconcile again (i.e., when it becomes available), so we always end up timing out (after 1h or more).
From a user perspective, I would either assume we are waiting because it will be reconciled at some point, or it should fail directly. I think we need to decide on this.
But we never reconcile again (i.e., when it becomes available), so we always end up timing out (after 1h or more).
OH. That sounds like a bug, I 100% agree. I thought we'd only been discussing the messaging when a PVC isn't available; I definitely did not understand that this issue also describes a bug where the reconcile doesn't occur again!
From a user perspective, I would either assume we are waiting because it will be reconciled at some point, or it should fail directly. I think we need to decide on this.
I don't think we should fail directly, as this is asynchronous (e.g. how Kubernetes works). This problem is very similar to https://github.com/tektoncd/pipeline/issues/3378
Ah kk thanks for explaining that extra detail @vdemeester I agree this is a bug.
I don't think we should fail directly, as this is asynchronous (e.g. how Kubernetes works).
I agree, I think we might want to subscribe to updates on the PVC if we can - does anyone know which part of the tekton code requires the PVC to exist?
But we never reconcile again (i.e., when it becomes available), so we always end up timing out (after 1h or more).
OH. That sounds like a bug, I 100% agree. I thought we'd only been discussing the messaging when a PVC isn't available; I definitely did not understand that this issue also describes a bug where the reconcile doesn't occur again!
This is news to me as well. I just tried it again, and when the missing PVC is created and bound to a PV, the pod runs and the TaskRun completes exactly as one would expect.
It is true that if the PVC is never created, the PipelineRun will time out. Again, that is what one would expect.
What is the scenario where the PipelineRun does not continue when the PVC is created and bound? I have not seen that reported before.
does anyone know which part of the tekton code requires the PVC to exist?
The TaskRun controller could hold off on creating a Pod until the PVC exists, perhaps? When creating a Pod from a TaskRun we know whether a PVC is passed. But isn't this preventing concurrent work? E.g. a node could pull the image concurrently while it waits for the PVC to be created.
We could perhaps have a shorter timeout for the case where the Pod never starts to run? 1h is a bit long for that case.
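The idea of a shorter timeout for a never-started Pod can be sketched as two separate budgets. This is an illustrative Python sketch with made-up knob values, not actual Tekton configuration or code:

```python
def check_timeout(elapsed_s, pod_running, pending_timeout_s, run_timeout_s):
    """Return a failure reason, or None if the run may keep going.

    A Pod stuck in Pending gets a shorter budget (pending_timeout_s)
    than the overall run timeout (run_timeout_s); both are hypothetical.
    """
    if not pod_running and elapsed_s > pending_timeout_s:
        return "timed out waiting for the Pod to be schedulable"
    if elapsed_s > run_timeout_s:
        return "run timed out"
    return None

# e.g. 10 minutes to leave Pending, 1 hour for the whole run
reason = check_timeout(700, pod_running=False,
                       pending_timeout_s=600, run_timeout_s=3600)
```

With something like this, a missing PVC would surface after minutes rather than only at the 1h run timeout, while a Pod that does start keeps the full budget.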
What is the scenario where the pipelinerun does not continue when the PVC is created and bound? I have not seen that reported before.
Yes, there are many scenarios here if we are to handle them all. If we watch PVCs or similar, Tekton can do more... e.g. if the PV is located in a different AZ, the Pod cannot be scheduled either, and there are more reasons a Pod may fail to schedule.
I'm a bit confused. I feel like there are two different conversations happening at once here:
1. There is apparently a bug in Tekton where a reconcile does not happen even when a PVC is created and bound successfully after the TaskRun is created. I cannot reproduce this using a simple TaskRun + PVC combo in Minikube. Definitely need more input here on specifically how to repro @vdemeester .
2. There is a broader question about how TaskRuns should report that they are waiting on PVCs.
I am _much_ more immediately interested in fixing the possible bug described in (1) (from comment) than the messaging part of this. That sounds like a serious bug that needs to be fixed.