Idea: Pipeline Mutexes

Created on 18 Jun 2020 · 13 comments · Source: tektoncd/pipeline

This was an idea that @k floated to me a while back, but I finally got around to making an issue to discuss it. What I'm curious about:

  • Is this a use case we want to focus on?
  • Is this worth making a built-in feature (as opposed to a Catalog feature)?
  • Any other features / alternatives to consider

Details intentionally vague - this is a "should we do this?" issue, not a "how we'll do this" issue.

Idea

I may want to control how Pipelines run in relation to others and ensure that only one pipeline for a given selector can run at a time (hence a "mutex").
I may want to reject new Pipelines if a similar one is already running, or queue them up and make sure they do not run in parallel. This might be because:

  • I have a presubmit pipeline that I want to only run one instance at a time per pull request to reduce costs (e.g. in case someone pushes multiple commits. I only need to run the most recent and can cancel the currently running pipelines).
  • My pipeline mutates some external state, and I want to make sure only one thing operates on it at a time.

Possible solution

Have a mechanism to select conditions to allow Pipeline execution, as well as a strategy for what to do in response.

Examples

If a new Pipeline is created that was labelled as a pull request, cancel existing runs.

selector: repo=foo, type=pullrequest
strategy: cancel

Only run 1 pipeline at a time that was labelled as being started by a push to master. (does not guarantee ordering)

selector: repo=foo, type=push, ref=master
strategy: queue

Deny new pipeline creation requests if they match a pipeline that is currently running.

selector: repo=foo, type=push, ref=master
strategy: deny
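To make the proposed semantics concrete, here is a small, purely illustrative in-memory model of the three strategies. The `MutexPolicy` name and `admit` method are invented for this sketch and are not a real Tekton API:

```python
# Hypothetical model of the selector/strategy proposal, for discussion only.
from dataclasses import dataclass, field

@dataclass
class MutexPolicy:
    selector: dict                       # labels a run must match, e.g. {"repo": "foo"}
    strategy: str                        # "cancel" | "queue" | "deny"
    running: list = field(default_factory=list)
    queued: list = field(default_factory=list)

    def matches(self, labels):
        return all(labels.get(k) == v for k, v in self.selector.items())

    def admit(self, name, labels):
        """Decide what happens when a new run arrives."""
        if not self.matches(labels):
            return "started"             # policy does not apply to this run
        if not self.running:
            self.running.append(name)
            return "started"
        if self.strategy == "cancel":
            self.running.clear()         # cancel the existing runs
            self.running.append(name)
            return "started (cancelled others)"
        if self.strategy == "queue":
            self.queued.append(name)     # run later; ordering not guaranteed
            return "queued"
        return "denied"                  # strategy == "deny"

policy = MutexPolicy({"repo": "foo", "type": "pullrequest"}, "cancel")
print(policy.admit("run-1", {"repo": "foo", "type": "pullrequest"}))  # started
print(policy.admit("run-2", {"repo": "foo", "type": "pullrequest"}))  # started (cancelled others)
```

A real implementation would live in a controller watching PipelineRuns rather than in memory, but the admission decision would look roughly like this.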

Alternatives

Implement as a task

  • Cancellation could be handled by having the first step of every pipeline include something along the lines of

    kubectl delete pipelinerun -l foo=bar
    

    This would clobber any other Pipelines with a particular label.

  • Queueing could be handled by having a Condition that runs kubectl get for running pods, and proceeds only if the condition is true. This is difficult since you'd have to get creative in inspecting runtime information of other runs (e.g. are they also in a wait state, or are they actually running?). This also creates container waste, since the waiting pipelines would all be running.

  • Deny could not be implemented this way.
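The queueing alternative above could be roughly sketched as a Tekton Task that polls with kubectl before the rest of the pipeline proceeds. Everything here is illustrative (the task name, label selector, and image are made up); the current PipelineRun's name is passed in as a param so the task doesn't wait on itself:

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: wait-for-mutex          # hypothetical name
spec:
  params:
    - name: ownRun              # name of the current PipelineRun,
      type: string              # so it doesn't block on itself
  steps:
    - name: wait
      image: bitnami/kubectl    # any image with kubectl works
      script: |
        # Poll until no *other* PipelineRun with the same labels is
        # still running ("Unknown" is Tekton's in-progress status).
        while kubectl get pipelinerun -l repo=foo,type=push,ref=master \
            -o jsonpath='{range .items[?(@.status.conditions[0].status=="Unknown")]}{.metadata.name}{"\n"}{end}' \
            | grep -v '^$(params.ownRun)$' | grep -q .; do
          sleep 10
        done
```

As the bullet above notes, all the queued runs would still be consuming a pod while they poll, which is the container waste being described.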

kind/feature lifecycle/rotten

All 13 comments

/kind feature

We've built a queueing system to manage our way around this problem, so a +1 for it being a useful thing to tackle. I don't know whether it should be a core primitive or in the catalog, but in our use case it was necessary at a very early stage, and for deployments specifically it seems to me that having one deployment per app per environment at a time, and ideally in a sensible order, is going to be a very common requirement. So I'm leaning towards 'core'.

I'd really appreciate this as well, perhaps with Task granularity rather than Pipeline. My use case has to do with cross-talk between concurrent runs of integration tests and database management. In an integration test scenario, for example, the tasks depend on an external resource. If that resource is stateful (like a database), some tasks may be rebuilding the database while others are executing tests that use it. I'd love to be able to single-thread pipeline runs through the integration test phase.

I also think this is a very common use case for a CI/CD pipeline.
Our scenario is that we have one test cluster for all created PRs. For instance, when 2 developers open 2 PRs, the pipeline should test one PR against the test cluster first and set the status on the corresponding PR. Meanwhile, all other PipelineRuns for other PRs should be queued until the cluster is free for the next run.

I got inspired by @tragiclifestories 's suggestion of a queueing system as a workaround, so I made one too. I documented the steps - hopefully it's useful to someone else while this is pending: https://medium.com/@holly.k.cummins/using-lease-resources-to-manage-concurrency-in-tekton-builds-344ba84df297

Interesting!

We took a different approach, storing the queue data in ConfigMaps and defining all the queue operations as scripts that run in task steps. There's no explicit modelling through CRDs, but it works well enough for our use case.

Hopefully we'll get around to the blog-post stage of the project soon.

> I got inspired by @tragiclifestories 's suggestion of a queueing system as a workaround, so I made one too. I documented the steps - hopefully it's useful to someone else while this is pending: https://medium.com/@holly.k.cummins/using-lease-resources-to-manage-concurrency-in-tekton-builds-344ba84df297

Nice :)
@pritidesai finally in action

Here is the use case we currently have. Imagine this simplified CD pipeline:
--> DeployToDev --> TestOnDev --> DeployToPreProd --> TestOnPreProd --> DeployToProd --> SmoketestOnProd
There are three sections: Dev, PreProd, and Prod.
The pipeline is started for a commit in a Git repository that holds the configuration of an application (GitOps). Now here are some requirements:

  • if a PipelineRun is in TestOnPreProd we don't want another PipelineRun to start DeployToPreProd because the test should be the result of the configuration that the first PipelineRun was started with.
  • but it is ok for another PipelineRun to start DeployToDev.
  • a PipelineRun must not overtake another PipelineRun, because the CD pipeline should deliver changes in the order the configurations were committed to the Git repository.
  • to make things worse - we use the same pipeline to deploy different applications. So the exclusivity should be per section and application.

Currently we use a task in front of the section that polls a REST service to "ask to enter the section". The implementation of the REST service is specific to our pipeline and uses the Tekton API to analyse the state of all the PipelineRuns. It's ugly :blush: but it works so far.
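As a sketch, the core of such an "ask to enter the section" service could be a FIFO gate keyed by (section, application), so only one PipelineRun at a time holds each section and runs are admitted in arrival order. The `SectionGate` class and its method names below are invented for illustration, not the actual service described above:

```python
# Illustrative FIFO gate per (section, application); runs poll request()
# until it returns True, do their work, then call release().
from collections import defaultdict, deque

class SectionGate:
    def __init__(self):
        self.queues = defaultdict(deque)   # (section, app) -> waiting run names

    def request(self, section, app, run):
        """Register interest; returns True once `run` is at the head."""
        q = self.queues[(section, app)]
        if run not in q:
            q.append(run)
        return q[0] == run                 # only the head may enter

    def release(self, section, app, run):
        """Leave the section so the next queued run may enter."""
        q = self.queues[(section, app)]
        if q and q[0] == run:
            q.popleft()

gate = SectionGate()
print(gate.request("PreProd", "app-a", "run-1"))  # True: head of queue
print(gate.request("PreProd", "app-a", "run-2"))  # False: must wait
gate.release("PreProd", "app-a", "run-1")
print(gate.request("PreProd", "app-a", "run-2"))  # True: now at the head
```

Keying the queues by (section, application) gives the per-section, per-application exclusivity described above, and the deque preserves commit order.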

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

I'd prefer mutexes at both task and pipeline granularity.

Hi @pritidesai , is there any update about this issue? Thanks :)

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

