Pipeline: Resource upload containers should be able to run even if steps in the pod have failed

Created on 2 Aug 2019 · 7 Comments · Source: tektoncd/pipeline

Currently, the step(s) added by GetUploadContainerSpec() won't run if an earlier step fails, since the pod won't run any steps after a failure. This makes things like reporting a build failure via a PullRequestResource or uploading the workspace to a storage resource even if the build fails impossible. This should be configurable, ideally both as something that can be inherent to the resource type (i.e., you're always going to want to run the PullRequestResource upload) and something that can be configured on an individual resource (i.e., you're not _always_ going to want to upload to a storage resource after a failure, but sometimes you might!).

kind/feature priority/important-soon triage/duplicate

All 7 comments

At first glance, it looks like we could do this by adding a flag to the entrypoint binary to ignore the presence of a [wait file].err file and run the command anyway. Also seems like this kind of logic is how we'd want to do a more general conditional execution of steps within a Task? (cc @dibyom)
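A minimal sketch of that idea, written in Python rather than the entrypoint's Go, purely to illustrate the decision logic. It assumes the convention described above: a step waits for a wait file, and a sibling file with an `.err` suffix signals that an earlier step failed. The function name `decide` and the `ignore_wait_error` parameter are hypothetical, not an existing entrypoint flag.

```python
import os
import tempfile


def decide(wait_file: str, ignore_wait_error: bool = False) -> str:
    """Return "run", "skip", or "wait" for a step guarded by wait_file.

    Assumed convention: the previous step writes wait_file on success and
    wait_file + ".err" on failure. ignore_wait_error is a hypothetical flag.
    """
    if os.path.exists(wait_file):
        return "run"
    if os.path.exists(wait_file + ".err"):
        # An earlier step failed: normally skip, unless this step opted in.
        return "run" if ignore_wait_error else "skip"
    return "wait"  # neither file exists yet; the caller keeps polling


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        wait_file = os.path.join(d, "0")
        open(wait_file + ".err", "w").close()  # simulate a failed earlier step
        print(decide(wait_file))
        print(decide(wait_file, ignore_wait_error=True))
```

With a flag like this, an upload step could opt in to running after a failure while ordinary steps keep today's skip behaviour.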

This seems like a finally-type clause for a Task. How do we add this to the API? Should something like a continueOnFailure field be part of the Task? (Or, for something like a pull request resource, does it make sense to add it as part of the resource spec or maybe even the resource interface?)

I think for these particular purposes it'd be step-specific, so it would probably make sense to add it to Step in some form. Then you could specify it in either a resource spec or the resource type, or just specify it on an arbitrary Step as well.
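To make the step-level variant concrete, here is a rough sketch in Python (not Tekton's Go, and not its API). The `Step` class and the `run_on_failure` field are hypothetical stand-ins for whatever the flag would end up being called; the point is only that the opt-in lives on the step, so a resource type or an individual resource could set it on the steps it generates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    run: Callable[[], bool]        # returns True on success
    run_on_failure: bool = False   # hypothetical per-step opt-in


def run_steps(steps: List[Step]) -> Dict[str, str]:
    """Run steps in order; after a failure, only opted-in steps still run."""
    failed = False
    results: Dict[str, str] = {}
    for step in steps:
        if failed and not step.run_on_failure:
            results[step.name] = "skipped"
            continue
        ok = step.run()
        results[step.name] = "succeeded" if ok else "failed"
        failed = failed or not ok
    return results


if __name__ == "__main__":
    steps = [
        Step("build", lambda: False),  # simulated failing build
        Step("test", lambda: True),
        Step("upload-pr-status", lambda: True, run_on_failure=True),
    ]
    print(run_steps(steps))
```

In this sketch the overall run still counts as failed; the flag only controls whether a later step gets to execute.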

> Currently, the step(s) added by GetUploadContainerSpec() won't run if an earlier step fails, since the pod won't run any steps after a failure. This makes things like reporting a build failure via a PullRequestResource or uploading the workspace to a storage resource even if the build fails impossible.

Right now we have an implicit assumption that a failed step will stop the rest of the steps in a Task, and that a failed Task will stop the execution of the pipeline (except for tasks that are already running). I think the assumption is correct for sequential items, like steps, but we should have some finally type of clause, as proposed in the design for conditionals.
For Tasks it's a bit different: the part of the DAG that depends on the failed Task should not be executed, but parts that are independent might still be executed, so this could be a configuration option in the Pipeline.
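The DAG behaviour described here can be sketched as follows. This is an illustrative Python model, not Tekton's scheduler: `run_dag` and the task names are made up, and tasks run one at a time rather than in parallel. A task is skipped when any of its dependencies failed or was itself skipped, so a failure propagates only down its own branch.

```python
from typing import Callable, Dict, List


def run_dag(deps: Dict[str, List[str]], run: Callable[[str], bool]) -> Dict[str, str]:
    """deps maps each task to the tasks it depends on.

    Tasks downstream of a failed (or skipped) task are skipped;
    independent branches still run.
    """
    results: Dict[str, str] = {}
    pending = dict(deps)
    while pending:
        # Tasks whose dependencies have all been resolved in earlier rounds.
        ready = [t for t, ds in pending.items() if all(d in results for d in ds)]
        if not ready:
            raise ValueError("cycle or unknown dependency")
        for task in ready:
            task_deps = pending.pop(task)
            if any(results[d] != "succeeded" for d in task_deps):
                results[task] = "skipped"
            else:
                results[task] = "succeeded" if run(task) else "failed"
    return results


if __name__ == "__main__":
    deps = {
        "build": [],
        "lint": [],
        "test": ["build"],
        "deploy": ["test"],
        "report": ["lint"],
    }
    # Simulate a failing "build"; everything else would succeed if run.
    print(run_dag(deps, lambda task: task != "build"))
```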

> This should be configurable, ideally both as something that can be inherent to the resource type (i.e., you're always going to want to run the PullRequestResource upload) and something that can be configured on an individual resource (i.e., you're not _always_ going to want to upload to a storage resource after a failure, but sometimes you might!).

The way I solved this in the past was by using async Tasks, i.e. using a cloud event (which is sent regardless of the exit condition of the Task) to trigger an external TaskRun that is responsible for updating GitHub. While this increases complexity, it decouples both the notification behaviour and the status from the initial pipeline. For instance, I can add extra postprocessing (like a PR update) without changing the original pipeline, and the status of the original pipeline won't be affected by the outcome of the postprocessing itself.

My personal feeling is that Notifications (#49) could be a nice answer here. Notifications could use PipelineResources and their job would be to:
1) execute after Tasks have completed
2) have access to metadata about the execution of previous Tasks / execution of a Pipeline

Closing this in favor of https://github.com/tektoncd/pipeline/issues/2448
/triage duplicate
/close

@dibyom: Closing this issue.

