What happened/what you expected to happen?
Running synchronization-tmpl-level.yaml, the locks are acquired but are not released once the step finishes. The workflow keeps running, and the remaining steps wait with the message "Waiting for default/configmap/workflow-synchronization/template lock. Lock status: 0/2" (same behavior with a DAG). Synchronization works at the workflow level.
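For reference, the semaphore named in that message would be backed by a ConfigMap roughly like the one below. This is a sketch, not the reporter's actual manifest: the name, namespace, and key are taken from the lock path default/configmap/workflow-synchronization/template, and the limit of 2 is inferred from the "0/2" lock status.

```yaml
# Assumed ConfigMap, reconstructed from the configMapKeyRef in the workflow
# and the "Lock status: 0/2" message; not copied from the reporter's cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-synchronization
  namespace: default
data:
  template: "2"   # semaphore limit: at most two holders at a time
```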
What version of Argo Workflows are you running?
v2.10.2
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    argo: workflows
  creationTimestamp: "2020-09-16T06:59:40Z"
  generateName: synchronization-tmpl-level-
  generation: 8
  labels:
    workflows.argoproj.io/phase: Running
  managedFields:
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
      f:spec:
        .: {}
        f:arguments: {}
        f:entrypoint: {}
        f:templates: {}
      f:status:
        .: {}
        f:finishedAt: {}
    manager: argo
    operation: Update
    time: "2020-09-16T06:59:40Z"
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:argo: {}
        f:labels:
          .: {}
          f:workflows.argoproj.io/phase: {}
      f:spec:
        f:parallelism: {}
        f:serviceAccountName: {}
        f:ttlStrategy:
          .: {}
          f:secondsAfterCompletion: {}
          f:secondsAfterFailure: {}
          f:secondsAfterSuccess: {}
      f:status:
        f:nodes:
          .: {}
          f:synchronization-tmpl-level-xxgc2:
            .: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-327139691:
            .: {}
            f:boundaryID: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-633772542:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-1878609776:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2314512256:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2913002658:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-3085788296:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
        f:phase: {}
        f:startedAt: {}
        f:synchronization:
          .: {}
          f:semaphore:
            .: {}
            f:holding: {}
            f:waiting: {}
    manager: workflow-controller
    operation: Update
    time: "2020-09-16T07:00:01Z"
  name: synchronization-tmpl-level-xxgc2
  namespace: default
  resourceVersion: "39383194"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/workflows/synchronization-tmpl-level-xxgc2
  uid: 1630345b-c478-4013-b4e7-1435c5ba901c
spec:
  arguments: {}
  entrypoint: synchronization-tmpl-level-example
  parallelism: 3
  serviceAccountName: argo
  templates:
  - arguments: {}
    inputs: {}
    metadata: {}
    name: synchronization-tmpl-level-example
    outputs: {}
    steps:
    - - arguments:
          parameters:
          - name: seconds
            value: '{{item}}'
        name: synchronization-acquire-lock
        template: acquire-lock
        withParam: '["1","2","3","4","5"]'
  - arguments: {}
    container:
      args:
      - sleep 10; echo acquired lock
      command:
      - sh
      - -c
      image: alpine:latest
      name: ""
      resources: {}
    inputs: {}
    metadata: {}
    name: acquire-lock
    outputs: {}
    synchronization:
      semaphore:
        configMapKeyRef:
          key: template
          name: workflow-synchronization
  ttlStrategy:
    secondsAfterCompletion: 600
    secondsAfterFailure: 43200
    secondsAfterSuccess: 600
status:
  finishedAt: null
  nodes:
    synchronization-tmpl-level-xxgc2:
      children:
      - synchronization-tmpl-level-xxgc2-327139691
      displayName: synchronization-tmpl-level-xxgc2
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2
      name: synchronization-tmpl-level-xxgc2
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Steps
    synchronization-tmpl-level-xxgc2-327139691:
      boundaryID: synchronization-tmpl-level-xxgc2
      children:
      - synchronization-tmpl-level-xxgc2-3085788296
      - synchronization-tmpl-level-xxgc2-2913002658
      - synchronization-tmpl-level-xxgc2-1878609776
      - synchronization-tmpl-level-xxgc2-633772542
      - synchronization-tmpl-level-xxgc2-2314512256
      displayName: '[0]'
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-327139691
      name: synchronization-tmpl-level-xxgc2[0]
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: StepGroup
    synchronization-tmpl-level-xxgc2-633772542:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(3:4)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-633772542
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(3:4)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-1878609776:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(2:3)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-1878609776
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(2:3)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2314512256:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(4:5)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-2314512256
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(4:5)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2913002658:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(1:2)
      finishedAt: "2020-09-16T06:59:56Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-2913002658
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 23
        memory: 23
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-3085788296:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(0:1)
      finishedAt: "2020-09-16T06:59:59Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-3085788296
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 26
        memory: 26
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
  phase: Running
  startedAt: "2020-09-16T06:59:40Z"
  synchronization:
    semaphore:
      holding:
      - holders:
        - synchronization-tmpl-level-xxgc2-3085788296
        - synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
      waiting:
      - holders:
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
Paste the logs from the workflow controller:
time="2020-09-16T06:59:40Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Updated phase -> Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Steps node synchronization-tmpl-level-xxgc2 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="StepGroup node synchronization-tmpl-level-xxgc2-327139691 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-3085788296 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) (synchronization-tmpl-level-xxgc2-3085788296)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2913002658 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) (synchronization-tmpl-level-xxgc2-2913002658)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-1878609776 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-633772542 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2314512256 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow step group node synchronization-tmpl-level-xxgc2-327139691 not yet completed" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383030 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383043 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:47Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Pending -> Running"
time="2020-09-16T06:59:47Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383086 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Pending -> Running"
time="2020-09-16T06:59:51Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383104 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:56Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:58Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-2913002658 outputs"
time="2020-09-16T06:59:58Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Running -> Succeeded"
time="2020-09-16T06:59:58Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383134 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:59Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-3085788296 outputs"
time="2020-09-16T06:59:59Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-2913002658 completed"
time="2020-09-16T06:59:59Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383142 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Running -> Succeeded"
time="2020-09-16T07:00:01Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383194 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-3085788296 completed"
time="2020-09-16T07:00:02Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
I believe the parallelism: 3 is conflicting with the semaphore code for some reason
Found the bug, fixing
We've also seen this issue when a pod has acquired the lock and is running, but the workflow gets deleted during this phase. The lock is never released.
Some sort of cleanup is probably also required during workflow deletion.
Yup, this will be included as part of this bug fix
@simster7 @sarabala1979 this looks like an issue that makes semaphores unusable - how can we quickly get this fixed, back-ported and released?
I am currently working on a fix and refactor of the code, as I've found multiple issues with it.
To be clear, this does not make semaphores unusable – it only makes semaphores unusable _while using parallelism_ at the same time.
Dropping to P3 as work-around would be to either not use parallelism or not use semaphores.
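As a rough illustration of the first work-around, the reporter's workflow could be submitted without spec.parallelism, leaving the semaphore as the only concurrency limit. This is a sketch trimmed from the workflow posted above, not a verified fix, and it only addresses the parallelism interaction described here.

```yaml
# Sketch of the work-around: same workflow as in the report, with
# spec.parallelism dropped so that only the semaphore limits concurrency.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: synchronization-tmpl-level-
spec:
  entrypoint: synchronization-tmpl-level-example
  # parallelism: 3   # removed for the work-around
  templates:
  - name: synchronization-tmpl-level-example
    steps:
    - - name: synchronization-acquire-lock
        template: acquire-lock
        arguments:
          parameters:
          - name: seconds
            value: "{{item}}"
        withParam: '["1","2","3","4","5"]'
  - name: acquire-lock
    synchronization:
      semaphore:
        configMapKeyRef:
          name: workflow-synchronization
          key: template
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["sleep 10; echo acquired lock"]
```

With the semaphore limit at 2, at most two acquire-lock pods should run at a time, which is stricter than the original parallelism of 3.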
@simster7 Any workaround for the lock acquired during workflow issue. Is there a way to manually reset the lock?
Restarting the controller seems like the only way, unfortunately. Will try to get a fix out soon.
We use Argo 2.7.1 and also noticed this problem on our cluster, even though we had already removed parallelism from our workflow. Restarting the workflow-controller, as mentioned above, does fix our issue.
We think that deleting a running workflow does not free the lock, but we are not 100% sure of that. We also use workflowgc to delete completed workflows 5 minutes after completion, so maybe that can cause some issues?
@simster7 We are still facing issues of the lock not getting released in a running workflow. Argo version is 2.11.2
All the *.publish templates use the same semaphore

The value of the semaphore is 2, so the first two publish steps succeed, but the third one is stuck waiting for the lock to be released. Let me know if you need me to create a sample workflow to make this easier to debug.
@sarabala1979 can you investigate, please?
I will take a look
@firecast can you provide the workflow controller logs and reproducible workflow?
I tried multiple examples, and it works for me. The last step's message and status may not have been updated, but the step may already have started.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
data:
  template: "3"
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: synchronization-tmpl-level-
spec:
  entrypoint: synchronization-tmpl-level-example
  templates:
  - name: synchronization-tmpl-level-example
    steps:
    - - name: synchronization-acquire-lock
        template: acquire-lock
        arguments:
          parameters:
          - name: seconds
            value: "{{item}}"
        withParam: '["1","2","3","4","5"]'
  - name: acquire-lock
    synchronization:
      semaphore:
        configMapKeyRef:
          name: my-config
          key: template
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["sleep 10; echo acquired lock"]
Sure. Will try to create a reproducible one from my end and share within a day @sarabala1979
apiVersion: v1
kind: ConfigMap
metadata:
  name: semaphore
data:
  template: "1"
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-whalesay-template
spec:
  entrypoint: whalesay-template
  templates:
  - name: whalesay-template
    synchronization:
      semaphore:
        configMapKeyRef:
          name: semaphore
          key: template
    inputs:
      parameters:
      - name: message
    container:
      image: docker/whalesay
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
---
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-hello-world
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    dag:
      tasks:
      - name: call-whalesay-template-1
        templateRef:
          name: workflow-template-whalesay-template
          template: whalesay-template
        arguments:
          parameters:
          - name: message
            value: "hello world"
      - name: call-whalesay-template-2
        dependencies:
        - call-whalesay-template-1
        templateRef:
          name: workflow-template-whalesay-template
          template: whalesay-template
        arguments:
          parameters:
          - name: message
            value: "hello world 2"
      - name: call-whalesay-template-3
        dependencies:
        - call-whalesay-template-2
        templateRef:
          name: workflow-template-whalesay-template
          template: whalesay-template
        arguments:
          parameters:
          - name: message
            value: "hello world 3"
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hello-world
spec:
  workflowTemplateRef:
    name: workflow-template-hello-world
@sarabala1979 This example replicates the issue.
