Argo: Workflow semaphores

Created on 31 Mar 2020 · 19 Comments · Source: argoproj/argo

Summary

It should be possible to prevent two workflows from running at the same time based on a key (semaphore name).

Motivation

TODO

Proposal

TODO



Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

enhancement

Most helpful comment

Motivation

A use case that has come up multiple times: many workflows are submitted at once, but the number of parallel executions of a workflow, or even of an individual step, needs to be limited or made mutually exclusive. This issue proposes adding Mutex or Semaphore functionality to workflows to limit how many workflows can concurrently execute the same workflow or step.

Note that we already have a parallelism setting in the controller. However, it applies to all workflows in the system and cannot be scoped to a class of workflows or to a step. There is also a parallelism setting at the workflow and template level, but it only restricts the total concurrent executions of steps within the same workflow.

Workflows should support a separate concept, a semaphore, which can be referenced from inside a workflow and which applies across workflows.

Semaphores would address two use cases:

  1. restricting a certain class of workflows from parallel execution without restricting others
  2. limiting a specific step (e.g. the deploy step of a CI/CD pipeline)

Proposal

A proposed syntax is to reference a separately defined "semaphore" setting, which is defined as an integer value in a configmap:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-semaphore-
spec:
  semaphoreRef:
    configMap: my-semaphore
    key: concurrency
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-semaphore
data:
  concurrency: "3"

In the above example, only three workflows which all reference the my-semaphore configmap would be allowed to execute at the same time. Other workflows, which do not reference the semaphore, would be allowed to run.
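To make the intended semantics concrete, here is a minimal Python sketch of a named counting semaphore. It is only an illustration of the behaviour described above, not how the Argo controller implements or would implement it; `SemaphoreTable`, `try_acquire` and `release` are hypothetical names, and the limit of 3 stands in for the value read from the configmap.

```python
from collections import defaultdict


class SemaphoreTable:
    """Tracks which workflows currently hold each named semaphore."""

    def __init__(self, limits):
        # limits maps a semaphore name to its maximum number of holders,
        # standing in for the integer stored in the configmap.
        self.limits = dict(limits)
        self.holders = defaultdict(set)

    def try_acquire(self, name, workflow):
        """Return True if the workflow may run now, False if it must wait."""
        if workflow in self.holders[name]:
            return True  # already holding the semaphore
        if len(self.holders[name]) >= self.limits[name]:
            return False  # limit reached, the workflow stays pending
        self.holders[name].add(workflow)
        return True

    def release(self, name, workflow):
        """Free the slot when the workflow finishes."""
        self.holders[name].discard(workflow)


table = SemaphoreTable({"my-semaphore": 3})

# Four workflows reference the same semaphore; the fourth has to wait.
results = [table.try_acquire("my-semaphore", f"wf-{i}") for i in range(4)]

# Once one holder finishes, the waiting workflow can proceed.
table.release("my-semaphore", "wf-0")
retry = table.try_acquire("my-semaphore", "wf-3")
```

Workflows that never call `try_acquire` for `my-semaphore` are unaffected, which matches the statement that workflows not referencing the semaphore are still allowed to run.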

A second example limits concurrency at the step level across workflows:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: template-semaphore-
spec:
  entrypoint: sleep
  templates:
  - name: sleep
    semaphoreRef:
      configMap: my-semaphore
      key: concurrency
    container:
      image: alpine:latest
      command: [sh, -c, sleep 10]
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-semaphore
data:
  concurrency: "3"

The above example would not restrict any workflows from being executed, but when concurrent workflows attempted to run the sleep template, only three would be allowed to run it at a time.

All 19 comments

In terms of motivation: how about exclusive resources (e.g. a GPU) ? Is the idea behind the feature that argo would queue and serialize the workflows ?

That is one use case - more suggestions welcome!

A complementary concept could be mutexes. Although a mutex is simply a binary semaphore, we would introduce it as a convenience so that deploying a separate configmap object is not necessary to define the value of the semaphore:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-mutex-
spec:
  mutex: foo
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: template-mutex-
spec:
  entrypoint: sleep
  templates:
  - name: sleep
    mutex: foo
    container:
      image: alpine:latest
      command: [sh, -c, sleep 10]
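The point that a mutex is a binary semaphore created on first use, with no separately deployed limit, can be sketched in a few lines of Python. This uses the standard library's `threading.BoundedSemaphore` purely to illustrate the semantics; `mutex_for` is a hypothetical helper and says nothing about how Argo would store or coordinate the lock.

```python
import threading

# Mutexes are created lazily by name; no configmap-style limit is needed
# because the limit is always 1.
mutexes = {}


def mutex_for(name):
    """Return the binary semaphore for this name, creating it on first use."""
    return mutexes.setdefault(name, threading.BoundedSemaphore(1))


# Two workflows name the same mutex: only the first one gets it.
first = mutex_for("foo").acquire(blocking=False)
second = mutex_for("foo").acquire(blocking=False)

# A different mutex name is independent.
other = mutex_for("bar").acquire(blocking=False)

# After the holder releases, the next workflow can take the mutex.
mutex_for("foo").release()
third = mutex_for("foo").acquire(blocking=False)
```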

@jessesuen given the relative simplicity of mutexes, could we implement them instead of semaphores? It'll be quicker and less buggy.

A mutex is simple and easy, but it won't give the flexibility of a configurable level of parallelism for a group of workflows. A semaphore will cover more use cases (including the mutex use case). I think a semaphore is the right choice.

I get that. I want to be confident that the additional complexity is valuable, as I don't see how you could implement semaphores with the same quality as mutexes.

So my question is: what are the additional use cases we cannot support with mutexes?

what are the additional use cases we cannot support with mutexes?

A mutex alone cannot support the workqueue pattern when there is more than one worker.

I agree that semaphores will be more complex to implement, but I do think it will be worthwhile. I also believe the implementations of semaphores and mutexes will be entirely different, so we could consider breaking this feature into two pieces.

Design requirements:

  1. Mutexes and semaphores are namespaced. Workflows in separate namespaces lock a mutex or semaphore of the same name independently of each other.
  2. When notifying blocked workflows, we should probably consider workflow priority, as an option or even as the default behavior, when deciding which workflow to notify about availability of the mutex/semaphore. Doing so will enable a prioritized workqueue pattern.
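The two requirements above can be sketched together in Python, as a rough illustration only. The assumptions here are mine: semaphores are keyed by `(namespace, name)`, waiters are kept in a heap ordered by priority (higher first) and then by arrival order, and `PrioritySemaphore` and `get` are hypothetical names.

```python
import heapq
import itertools


class PrioritySemaphore:
    """Counting semaphore that wakes the highest-priority waiter first."""

    def __init__(self, limit):
        self.limit = limit
        self.holders = set()
        self._waiting = []             # heap of (-priority, arrival, workflow)
        self._arrival = itertools.count()

    def acquire(self, workflow, priority=0):
        """Take a slot if one is free, otherwise queue the workflow."""
        if len(self.holders) < self.limit:
            self.holders.add(workflow)
            return True
        heapq.heappush(self._waiting, (-priority, next(self._arrival), workflow))
        return False

    def release(self, workflow):
        """Free a slot and return the workflow notified next, if any."""
        self.holders.discard(workflow)
        if self._waiting and len(self.holders) < self.limit:
            _, _, next_workflow = heapq.heappop(self._waiting)
            self.holders.add(next_workflow)
            return next_workflow
        return None


semaphores = {}


def get(namespace, name, limit):
    """Look up a semaphore by (namespace, name), creating it on first use."""
    return semaphores.setdefault((namespace, name), PrioritySemaphore(limit))


team_a = get("team-a", "deploy", 1)
team_b = get("team-b", "deploy", 1)

# Same semaphore name, different namespaces: both acquisitions succeed.
a1 = team_a.acquire("a-1")
b1 = team_b.acquire("b-1")

# Two workflows queue behind a-1; the higher priority one is notified first.
team_a.acquire("a-low", priority=1)
team_a.acquire("a-high", priority=10)
next_up = team_a.release("a-1")
```

Breaking ties by arrival order keeps equal-priority workflows in FIFO order, which is one reasonable default for a workqueue.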

@jessesuen can we break this into two issues please?

@alexec @jessesuen we would really appreciate this feature.

Here is our use-case:

We use workflows to make automated updates to GitHub repos in our CI/CD pipeline. The events that trigger these workflows are pushes to our Docker registry.

There are scenarios where multiple commits need to be made to the same repository at the same time. Each workflow effectively handles a single commit, and so if 2 workflows try to commit to the same git repo at once, one of the workflows will fail.

We want to effectively serialize workflows via a mutex keyed on the GitHub repo they'd be committing to.

It should be possible to set the semaphore based on variables.

@jessesuen can we break this into two issues please?

I opened https://github.com/argoproj/argo/issues/2677 separately for workflow mutexes.

Use case:
We use templates to trigger builds per repo and per committer. We would like to limit the number of concurrent builds for a given user on a given repo, based on that template. How would that work here?

For context, we include the Git committer as a workflow-level variable, along with the Git repo.

Multiple semaphores per workflow?

Would it be possible to specify multiple semaphores per workflow? We would like to limit 1) the number of concurrent workflows from a given workflow template and 2) the number of concurrent workflows from a given group of users or individual user.

Example:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-semaphore-
spec:
  semaphoreRefs:
    - configMap: wftmpl-limit
      key: concurrency
    - configMap: user-group-limit
      key: concurrency

Can we know that a semaphore is blocking execution?

Will we be able to tell, using the Argo API, that a semaphore is blocking a workflow's execution? And, if multiple semaphores are allowed, will we be able to tell which one(s) are blocking execution?
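One way to think about both questions is an all-or-nothing acquisition that reports which semaphores are in the way. The Python below is a sketch under that assumption and is not a description of any Argo API; `try_acquire_all` and the dictionaries it takes are hypothetical.

```python
def try_acquire_all(limits, holders, names, workflow):
    """Acquire every named semaphore or none of them.

    Returns the list of semaphore names that block the workflow; an empty
    list means all semaphores were acquired.
    """
    blocking = [
        name for name in names
        if workflow not in holders.setdefault(name, set())
        and len(holders[name]) >= limits[name]
    ]
    if blocking:
        return blocking  # take nothing, so no slot is held while waiting
    for name in names:
        holders[name].add(workflow)
    return []


limits = {"wftmpl-limit": 2, "user-group-limit": 1}
holders = {}
refs = ["wftmpl-limit", "user-group-limit"]

first = try_acquire_all(limits, holders, refs, "wf-1")
second = try_acquire_all(limits, holders, refs, "wf-2")
```

Here `second` names only `user-group-limit`, and `wf-2` does not occupy a `wftmpl-limit` slot while it waits. Surfacing that list in the workflow's status would be one way to answer "which one(s) are blocking".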

One other suggestion, given that there is potentially a queue of pending workflows, is that you may want to provide a TTL for queuing a new workflow if it's blocked by an existing workflow, that is separate from the workflow execution TTL.
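A queue TTL of this kind could be checked independently of the execution TTL. The following Python sketch only illustrates the idea; the `Waiter` fields and `expire_waiters` are hypothetical, and times are plain seconds passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    workflow: str
    enqueued_at: float  # time the workflow started waiting, in seconds
    queue_ttl: float    # how long it may wait for the semaphore, in seconds


def expire_waiters(queue, now):
    """Split the queue into (still waiting, timed out) at the given time."""
    waiting = [w for w in queue if now - w.enqueued_at <= w.queue_ttl]
    expired = [w for w in queue if now - w.enqueued_at > w.queue_ttl]
    return waiting, expired


queue = [
    Waiter("wf-old", enqueued_at=0.0, queue_ttl=60.0),
    Waiter("wf-new", enqueued_at=90.0, queue_ttl=60.0),
]
waiting, expired = expire_waiters(queue, now=100.0)
```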

A use case that is vital to us:
Being able to label a semaphore at runtime. For example, we want to raise an employee's salary as part of a workflow or step. In the meantime, other business logic (i.e. other workflows or steps) should not be allowed to read or write the value. However, other employees would be allowed to read their own salary. In other words: we would want to be able to "instantiate" the salary-semaphore with parameters.

apiVersion: v1
kind: ConfigMap
metadata:
  name: salary-semaphore
data:
  concurrency: 1
  inputs:
    parameters:
    - name: user-id

Then a step / workflow for raising a salary could look like this:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: raise-salary-workflow-
spec:
  entrypoint: raise-salary

  templates:
  - name: raise-salary

    inputs:
      parameters:
      - name: user-id
      - name: new-salary

    # per-step semaphore
    semaphoreRef:
      configMap: salary-semaphore
      key: concurrency
      # parameterised semaphore, so it requires inputs
      inputs:
        parameters:
        - name: user-id
          # pipe the input that was passed to the step
          value: "{{inputs.parameters.user-id}}"

    container:
      # ...

If the workflow is executed three times with the following parameter arguments:

  • E1:
    • user-id: 1
    • new-salary: 100
  • E2:
    • user-id: 1
    • new-salary: 110
  • E3:
    • user-id: 2
    • new-salary: 90

Then E1 and E2 cannot run in parallel with each other, whereas E3 can run in parallel with either of them.
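The E1/E2/E3 behaviour falls out if the lock key combines the semaphore name with its parameter values. Here is a small Python sketch of that idea, assuming a concurrency of 1 as in the configmap above; `lock_key` and `try_lock` are hypothetical names, not part of any existing API.

```python
def lock_key(semaphore, params):
    """Build one lock identity per (semaphore name, parameter values)."""
    return (semaphore, tuple(sorted(params.items())))


held = {}  # lock key -> execution currently holding it


def try_lock(semaphore, params, execution):
    """Return True if the execution obtained the parameterised lock."""
    key = lock_key(semaphore, params)
    if key in held:
        return False
    held[key] = execution
    return True


e1 = try_lock("salary-semaphore", {"user-id": "1"}, "E1")
e2 = try_lock("salary-semaphore", {"user-id": "1"}, "E2")
e3 = try_lock("salary-semaphore", {"user-id": "2"}, "E3")
```

Only `user-id` takes part in the key, so `new-salary` can differ between E1 and E2 without changing which lock they contend for.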

I would love this addition! Right now we are running into issues where we hit resource limits within our cluster, specifically on volume claims. Although not required for our specific use case, a way to increment the semaphore by numbers other than 1 would be great. If my workflow uses 2 PVCs I'd like to be able to indicate that so that we can easily limit resource consumption of Argo as a whole.
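Acquiring more than one unit at a time is usually called a weighted semaphore. As a sketch of what that could look like for the PVC case (hypothetical names throughout, and a capacity of 5 picked only for the example):

```python
class WeightedSemaphore:
    """Semaphore where each holder takes a configurable number of units."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = {}  # workflow -> units held

    def try_acquire(self, workflow, weight=1):
        """Take `weight` units if that many are free."""
        free = self.capacity - sum(self.used.values())
        if weight > free:
            return False
        self.used[workflow] = weight
        return True

    def release(self, workflow):
        """Return all units held by the workflow."""
        self.used.pop(workflow, None)


# Assume the cluster can spare 5 PVCs for workflows.
pvcs = WeightedSemaphore(5)
a = pvcs.try_acquire("wf-a", weight=2)  # 2 of 5 units in use
b = pvcs.try_acquire("wf-b", weight=2)  # 4 of 5 units in use
c = pvcs.try_acquire("wf-c", weight=2)  # only 1 unit left, must wait
d = pvcs.try_acquire("wf-d", weight=1)  # fits in the remaining unit
```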

@YourPsychiatrist I have opened a very closely related enhancement issue for mutexes that would also satisfy your use case: #3955. Just linking it here for reference and good measure.
