tkn pr cancel should be able to cancel a Tekton Pipeline that has two or more Tasks and at least one Condition on a Task at any given time before the Pipeline completes.
tkn pr cancel fails to cancel a Pipeline when its .status.taskRuns includes an entry for the Condition but the Condition has not been evaluated yet.
Further, tkn pr list shows the status of the pipeline to be Running(PipelineRunCouldntCancel) and the pipeline stays stuck / will not time out.
The pipeline runs two Hello World tasks in a sequence. The first Hello World task additionally sleeps for 5 minutes. The second Hello World task has a Condition on it and the Condition simply exits 0.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: hello-world
spec:
  steps:
    - image: ubuntu
      name: echo-hello-world
      script: echo "Hello, world!"
---
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: hello-world-and-sleep
spec:
  steps:
    - image: ubuntu
      name: echo-hello-world
      script: echo "Hello, world!" && sleep 300
---
apiVersion: tekton.dev/v1alpha1
kind: Condition
metadata:
  name: foo-condition
spec:
  check:
    image: ubuntu
    script: |
      #!/bin/bash
      exit 0
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: cancel-bug-reproduction
spec:
  tasks:
    - name: hello1
      taskRef:
        name: hello-world-and-sleep
    - name: hello2
      taskRef:
        name: hello-world
      conditions:
        - conditionRef: foo-condition
      runAfter:
        - hello1
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: cancel-bug-reproduction-pr
spec:
  pipelineRef:
    name: cancel-bug-reproduction
While the first Hello World task sleeps for 300 seconds, run tkn pr cancel cancel-bug-reproduction-pr.
tkn pr list then shows something like the following:
NAME STARTED DURATION STATUS
cancel-bug-reproduction-pr 15 minutes ago --- Running(PipelineRunCouldntCancel)
kubectl get pr cancel-bug-reproduction-pr -o yaml should show .status.taskRuns that includes two taskRun entries, one for the first Task and the other for the Condition to be executed.
Kubernetes version:
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"20c265fef0741dd71a66480e35bd69f18351daea", GitTreeState:"clean", BuildDate:"2019-10-15T19:16:51Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.0", GitCommit:"e19964183377d0ec2052d1f1fa930c4d7575bd50", GitTreeState:"clean", BuildDate:"2020-08-26T14:23:04Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Tekton Pipeline version:
Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
Client version: 0.13.0
Pipeline version: v0.17.1
I spent some time digging into the controller code and the cancellation failure is caused by https://github.com/tektoncd/pipeline/blob/master/pkg/reconciler/pipelinerun/cancel.go#L60.
The code loops over pr.Status.TaskRuns, which has two entries for the example PipelineRun. The entry for the Condition causes the Patch call to fail because that TaskRun does not exist yet.
pr.Status.TaskRuns is updated by https://github.com/tektoncd/pipeline/blob/master/pkg/reconciler/pipelinerun/pipelinerun.go#L531.
The code that adds new .status.taskRuns entries looks at the second Hello World Task and sees that it has a non-nil ResolvedConditionChecks, so the line 80 if -> continue check is skipped and an entry is added before the TaskRun exists.
Thanks for reporting this @riceluxs1t !
Hopefully we'll be able to get this resolved quickly. In the meantime, I want to draw your attention to the fact that we have deprecated the Conditions CRD in favor of when expressions. I'd be interested in hearing whether you are able to use when expressions to achieve the same thing you are using Conditions for, and whether you run into the same problems.
@bobcatfish
Hi Christie. We are currently running Tekton 0.14.2 which doesn't yet support when expressions. We certainly plan to migrate to a version >= 0.16.0 which introduces when expressions.
I will test a similar pipeline that uses a when expression instead and see if cancellation happens as expected :)
The same pipeline rewritten to use a when expression doesn't seem to have the issue. Does a when expression execute in a TaskRun? If not, I don't think it has the same issue.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: hello-world
spec:
  steps:
    - image: ubuntu
      name: echo-hello-world
      script: echo "Hello, world!"
---
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: hello-world-and-sleep
spec:
  steps:
    - image: ubuntu
      name: echo-hello-world
      script: echo "Hello, world!" && sleep 5
---
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: cancel-bug-reproduction
spec:
  tasks:
    - name: hello1
      taskRef:
        name: hello-world-and-sleep
    - name: hello2
      taskRef:
        name: hello-world
      when:
        - input: "1"
          operator: in
          values: ["1"]
      runAfter:
        - hello1
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: cancel-bug-reproduction-pr12
spec:
  pipelineRef:
    name: cancel-bug-reproduction
  serviceAccountName: sdp-builder
Does a when expression execute in a TaskRun?
Nope, no explicit TaskRun; when expressions are evaluated by the controller. The controller evaluates when expressions before scheduling the guarded task. The task (along with the whole branch) is not executed if the when expressions evaluate to false.
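To make the in-controller evaluation concrete, here is a rough Go sketch of how a guard made of input/operator/values triples could be checked before scheduling a task. The type `whenExpression` and the functions `isTrue` and `allowsExecution` are illustrative names, not the Tekton API types, and the sketch assumes two operators (`in` and `notin`) and that every expression must hold for the task to run.

```go
package main

import "fmt"

// whenExpression mirrors the input/operator/values fields used in the
// pipeline YAML. It is an illustrative type, not the Tekton API type.
type whenExpression struct {
	Input    string
	Operator string // assumed to be "in" or "notin"
	Values   []string
}

// isTrue reports whether a single expression holds.
// Unknown operators are treated as false in this sketch.
func (w whenExpression) isTrue() bool {
	found := false
	for _, v := range w.Values {
		if v == w.Input {
			found = true
			break
		}
	}
	switch w.Operator {
	case "in":
		return found
	case "notin":
		return !found
	default:
		return false
	}
}

// allowsExecution reports whether a guarded task may be scheduled:
// every expression must hold, and an empty guard always allows execution.
func allowsExecution(exprs []whenExpression) bool {
	for _, e := range exprs {
		if !e.isTrue() {
			return false
		}
	}
	return true
}

func main() {
	// The guard from the hello2 task in the pipeline above.
	guard := []whenExpression{{Input: "1", Operator: "in", Values: []string{"1"}}}
	fmt.Println("schedule hello2:", allowsExecution(guard))
}
```

Because the check is plain in-process logic, no extra TaskRun is created for the guard, which is consistent with the cancellation problem not appearing for when expressions.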
I think it's safe to conclude then that when expressions do not have this issue. Any near future plan to fix the identified problem?
thanks @riceluxs1t
Any near future plan to fix the identified problem?
Nope, when expressions replaced conditions in 0.16, no more fixes happening in conditions. Would when expressions work for your use case?
@bobcatfish @jerop please correct me if I am wrong.
👍 to what @pritidesai said - since we've deprecated Conditions we'd like to avoid making changes to that functionality if possible - but if when expressions don't work for your use cases plz let us know so we can make sure we do!
Maybe it would make sense to write up a short migration guide, but basically, since Conditions run as tasks underneath, you should be able to get the same functionality you get from Conditions today by rewriting your Condition as a Task and using its result in a when expression (with the caveat that instead of having your Condition fail in the "skip" case, you need it to return a result).
OK! Because we are upgrading our internal Tekton version, we will upgrade to a when expression when the upgrade completes. Closing this PR
Excellent!!!