It should be possible to prevent two workflows from running at the same time based on a key (semaphore name).
TODO
TODO
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
In terms of motivation: how about exclusive resources (e.g. a GPU)? Is the idea behind the feature that Argo would queue and serialize the workflows?
That is one use case - more suggestions welcome!
A use case which has come up multiple times is where many workflows are submitted at once, but the number of parallel executions of the workflow, or even of an individual step in the workflow, needs to be limited or made mutually exclusive. This issue is to introduce some type of mutex or semaphore functionality in workflows to limit how many workflows can execute the same workflow or step concurrently.
Note that currently we already have a parallelism configuration in the controller. However, this setting applies to all workflows in the system, and is not granular to a class of workflows, or step. There is also a parallelism setting at a workflow and template level, but this only restricts total concurrent executions of steps from within the same workflow.
Workflows should support a separate concept of a semaphore, which can be referenced inside a workflow and which is cross-cutting across workflows.
The use cases that semaphores would solve are:
A proposed syntax is to reference a separately defined "semaphore" setting, which is defined as an integer value in a configmap:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-semaphore-
spec:
  semaphoreRef:
    configMap: my-semaphore
    key: concurrency
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-semaphore
data:
  concurrency: "3"
In the above example, only three workflows which all reference the my-semaphore configmap would be allowed to execute at the same time. Other workflows, which do not reference the semaphore, would be allowed to run.
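To make the intended semantics concrete, here is a minimal sketch of the bookkeeping a controller would need, written in Python rather than as controller code. The class `SemaphoreRegistry` and the workflow names are hypothetical and only illustrate the behaviour described above; this is not how Argo implements it.

```python
import threading


class SemaphoreRegistry:
    """Hypothetical in-memory stand-in for the controller's semaphore state."""

    def __init__(self, limits):
        # limits maps a semaphore name to its configured concurrency,
        # e.g. the integer read from the configmap key.
        self._limits = dict(limits)
        self._holders = {name: set() for name in self._limits}
        self._lock = threading.Lock()

    def try_acquire(self, name, workflow):
        """Return True if the workflow may run now, False if it must wait."""
        with self._lock:
            holders = self._holders[name]
            if workflow in holders:
                return True  # already holds a slot
            if len(holders) >= self._limits[name]:
                return False  # all slots taken, workflow stays pending
            holders.add(workflow)
            return True

    def release(self, name, workflow):
        """Free the slot when the workflow finishes."""
        with self._lock:
            self._holders[name].discard(workflow)


registry = SemaphoreRegistry({"my-semaphore": 3})
results = [registry.try_acquire("my-semaphore", f"wf-{i}") for i in range(5)]
print(results)  # [True, True, True, False, False]

registry.release("my-semaphore", "wf-0")
print(registry.try_acquire("my-semaphore", "wf-3"))  # True
```

Workflows that never call `try_acquire` are unaffected, which matches the statement that workflows not referencing the semaphore are allowed to run.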
A second example limits the concurrency of a single step across workflows:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: template-semaphore-
spec:
  entrypoint: sleep
  templates:
  - name: sleep
    semaphoreRef:
      configMap: my-semaphore
      key: concurrency
    container:
      image: alpine:latest
      command: [sh, -c, sleep 10]
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-semaphore
data:
  concurrency: "3"
The above example would not restrict any workflows from being executed, but when concurrent workflows attempted to run the sleep template, only three would be allowed to run it at the same time.
A complementary concept could also be mutexes. Although a mutex is simply a binary semaphore, we would introduce it as a convenience so that deploying a separate configmap object is not necessary for defining the value of the semaphore:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-mutex-
spec:
  mutex: foo
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: template-mutex-
spec:
  entrypoint: sleep
  templates:
  - name: sleep
    mutex: foo
    container:
      image: alpine:latest
      command: [sh, -c, sleep 10]
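The convenience of a mutex is that it needs no separately deployed limit: the name alone identifies the lock, and the limit is always one. A rough Python sketch of that idea follows; `MutexTable` and the workflow names are hypothetical and stand in for whatever state a controller would keep.

```python
class MutexTable:
    """Hypothetical table of named mutexes, created lazily on first use."""

    def __init__(self):
        self._owner = {}  # mutex name -> workflow currently holding it

    def try_lock(self, name, workflow):
        """Return True if the workflow holds the mutex after this call."""
        # setdefault records the caller as owner only if the name is free.
        return self._owner.setdefault(name, workflow) == workflow

    def unlock(self, name, workflow):
        """Release the mutex, but only if the caller is the owner."""
        if self._owner.get(name) == workflow:
            del self._owner[name]


mutexes = MutexTable()
first = mutexes.try_lock("foo", "workflow-mutex-a")   # True
second = mutexes.try_lock("foo", "workflow-mutex-b")  # False, "foo" is held
mutexes.unlock("foo", "workflow-mutex-a")
third = mutexes.try_lock("foo", "workflow-mutex-b")   # True
print(first, second, third)  # True False True
```

This is the same behaviour as a semaphore with a limit of one, without any configmap lookup.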
@jessesuen given the relative simplicity of mutexes, could we implement them instead of semaphores? It'll be quicker and less buggy.
A mutex is simple and easy, but it won't give the flexibility of a configurable level of parallelism for a group of workflows. A semaphore will solve more use cases (including the mutex use case). I think a semaphore is the right choice.
I get that. I want to be confident that the additional complexity is valuable, as I don't see how you could implement semaphores with the same quality as mutexes.
So my question is: what are the additional use cases we cannot support with mutexes?
what are the additional use cases we cannot support with mutexes?
A mutex alone will not be able to support the work-queue pattern when there is more than one worker.
I agree with your point that semaphores will be more complex to implement, but I do think it will be worthwhile. I also believe the implementations of semaphores and mutexes will be entirely different, so we could consider breaking this feature into two pieces.
Design requirements:
@jessesuen can we break this into two issues please?
@alexec @jessesuen we would really appreciate this feature.
Here is our use-case:
We use workflows to make automated updates to GitHub repos in our CI/CD pipeline. The events that trigger these workflows are pushes to our Docker registry.
There are scenarios where multiple commits need to be made to the same repository at the same time. Each workflow effectively handles a single commit, so if two workflows try to commit to the same Git repo at once, one of the workflows will fail.
We want to effectively serialize workflows via a mutex keyed on the GitHub repo they'd be committing to.
It should be possible to set the semaphore based on variables.
@jessesuen can we break this into two issues please?
I opened https://github.com/argoproj/argo/issues/2677 separately for workflow mutexes.
Use case:
We use templates to trigger builds on a per-repo and per-committer basis. We would like to limit the number of concurrent builds for a given user on a given repo, based on that template. How would that work here?
For context, we include the Git committer as a workflow-level variable, along with the Git repo.
Would it be possible to specify multiple semaphores per workflow? We would like to limit 1) the number of concurrent workflows from a given workflow template and 2) the number of concurrent workflows from a given group of users or individual user.
Example:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-semaphore-
spec:
  semaphoreRefs:
  - configMap: wftmpl-limit
    key: concurrency
  - configMap: user-group-limit
    key: concurrency
Will we be able to tell, using the Argo API, that a semaphore is blocking a workflow's execution? And, if multiple semaphores are allowed, will we be able to tell which one(s) are blocking execution?
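One way to picture both questions (multiple semaphores per workflow, and reporting which ones block) is an all-or-nothing acquire that returns the names of the semaphores that are currently full. The Python sketch below is only an illustration; `try_acquire_all`, the limits, and the workflow names are hypothetical, and nothing here reflects an actual Argo API.

```python
def try_acquire_all(limits, holders, workflow, names):
    """Take a slot in every named semaphore, or in none of them.

    Returns the list of semaphores that block the workflow; an empty
    list means all slots were acquired.
    """
    blocked = [
        name for name in names
        if workflow not in holders[name] and len(holders[name]) >= limits[name]
    ]
    if blocked:
        return blocked  # take nothing, so no partially held slots remain
    for name in names:
        holders[name].add(workflow)
    return []


# Assumed limits for the two configmaps referenced in the example above.
limits = {"wftmpl-limit": 2, "user-group-limit": 1}
holders = {name: set() for name in limits}
refs = ["wftmpl-limit", "user-group-limit"]

blocking_a = try_acquire_all(limits, holders, "wf-a", refs)
blocking_b = try_acquire_all(limits, holders, "wf-b", refs)
print(blocking_a)  # []
print(blocking_b)  # ['user-group-limit']
```

Acquiring all-or-nothing avoids a workflow holding one slot while waiting on another, and the returned list is the kind of information a status field could expose.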
One other suggestion, given that there is potentially a queue of pending workflows, is that you may want to provide a TTL for queuing a new workflow if it's blocked by an existing workflow, that is separate from the workflow execution TTL.
A use case that is vital to us:
Being able to label a semaphore at runtime. For example, we want to raise an employee's salary as part of a workflow or step. In the meantime, other business logic (i.e. other workflows or steps) should not be allowed to read or write the value. However, other employees would be allowed to read their own salary. In other words: we would want to be able to "instantiate" the salary-semaphore with parameters.
apiVersion: v1
kind: ConfigMap
metadata:
  name: salary-semaphore
data:
  concurrency: "1"
inputs:
  parameters:
  - name: user-id
Then a step / workflow for raising a salary could look like this:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: raise-salary-workflow-
spec:
  entrypoint: raise-salary
  templates:
  - name: raise-salary
    inputs:
      parameters:
      - name: user-id
      - name: new-salary
    # per-step semaphore
    semaphoreRef:
      configMap: salary-semaphore
      key: concurrency
      # parameterised semaphore, so it requires inputs
      inputs:
        parameters:
        - name: user-id
          # pipe the input that was passed to the step
          value: "{{inputs.parameters.user-id}}"
    container:
      # ...
If the workflow is executed three times with the parameter arguments:
Then only E1 and E2 cannot be run in parallel, whereas E3 can run in parallel to either one of the other executions.
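Put differently, the parameter becomes part of the lock key, so each user ID gets its own independent instance of the semaphore. A small Python sketch of that keying follows; `KeyedSemaphore`, the user IDs, and the run names are made up for illustration and are not the arguments from the example above.

```python
from collections import defaultdict


class KeyedSemaphore:
    """Hypothetical semaphore that keeps a separate holder set per key."""

    def __init__(self, limit):
        self.limit = limit
        self._holders = defaultdict(set)  # key (e.g. user-id) -> holders

    def try_acquire(self, key, holder):
        holders = self._holders[key]
        if holder not in holders and len(holders) >= self.limit:
            return False  # this key's instance is full
        holders.add(holder)
        return True

    def release(self, key, holder):
        self._holders[key].discard(holder)


salary = KeyedSemaphore(limit=1)  # concurrency "1" from the configmap
run_1 = salary.try_acquire("user-a", "run-1")  # True
run_2 = salary.try_acquire("user-a", "run-2")  # False, same user-id
run_3 = salary.try_acquire("user-b", "run-3")  # True, different user-id
print(run_1, run_2, run_3)  # True False True
```

Runs that share a user ID are serialized, while a run for a different user ID proceeds in parallel.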
I would love this addition! Right now we are running into issues where we hit resource limits within our cluster, specifically on volume claims. Although not required for our specific use case, a way to increment the semaphore by numbers other than 1 would be great. If my workflow uses 2 PVCs I'd like to be able to indicate that so that we can easily limit resource consumption of Argo as a whole.
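A weighted acquire along these lines could be sketched as follows in Python. The class `WeightedSemaphore`, the capacity of five, and the workflow names are assumptions chosen for illustration only.

```python
class WeightedSemaphore:
    """Hypothetical semaphore where each holder takes `weight` units."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._held = {}  # holder -> units currently held

    @property
    def in_use(self):
        return sum(self._held.values())

    def try_acquire(self, holder, weight=1):
        if weight > self.capacity:
            raise ValueError("weight exceeds total capacity; would never run")
        if holder in self._held:
            return True  # already admitted
        if self.in_use + weight > self.capacity:
            return False  # not enough free units right now
        self._held[holder] = weight
        return True

    def release(self, holder):
        self._held.pop(holder, None)


pvcs = WeightedSemaphore(capacity=5)  # e.g. five volume claims available
a = pvcs.try_acquire("wf-a", weight=2)  # True, 2 of 5 in use
b = pvcs.try_acquire("wf-b", weight=2)  # True, 4 of 5 in use
c = pvcs.try_acquire("wf-c", weight=2)  # False, would need 6
d = pvcs.try_acquire("wf-d", weight=1)  # True, 5 of 5 in use
print(a, b, c, d)  # True True False True
```

Rejecting a weight larger than the capacity up front keeps a workflow from queueing forever on a request that can never be satisfied.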
@YourPsychiatrist I have opened a very closely related enhancement issue for mutexes that would also satisfy your use case: #3955. Just linking it here for reference and good measure.