Argo-cd: Controller stuck, probably deadlock

Created on 19 May 2020 · 5 comments · Source: argoproj/argo-cd

The controller got stuck after updating the Argo CD config map:

Goroutine dump: dump.txt

Version : v1.5.4

Labels: bug, high, critical, core

All 5 comments

I don't know if it's the same issue, but I have observed similar behaviour. I have a simple Argo CD application consisting only of secrets and config maps. Occasionally (once every few runs) the application never reaches the Synced/Healthy state. What's more, the application-controller stops working altogether: it neither syncs anything nor self-heals. Restarting the application-controller pod fixes things immediately.

I have created a script that reproduces this issue fairly consistently (although it takes some time, because a few iterations are usually needed). I think it occurs more frequently the more secrets/config maps there are.

The code and results are available here:
https://github.com/code4free/argocd-stuck

If you feel that it might be a separate issue, I will create a separate ticket.

I am also seeing this issue on v1.5.4 with a non-HA configuration and at small scale (~70 applications). I've been resorting to restarting the application-controller whenever it gets wedged.

Thank you @code4free! I owe you a beer 🍺

I can consistently reproduce the deadlock using the attached application. The bug is in the SettingsManager.notifySubscribers method:

https://github.com/argoproj/argo-cd/blob/0fdef4861e12026e133224f7c9413072340e2983/util/settings/settings.go#L1029-L1039

Apparently I already fixed it in master:

https://github.com/argoproj/argo-cd/blob/fe8d47e0eae9fe5bd9dbcf52ca898fe4b4ba9ce1/util/settings/settings.go#L1052-L1066

I think the fix should be cherry-picked into 1.5. Working on it.
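
For readers unfamiliar with how a subscriber-notification method can deadlock, here is a minimal Go sketch of one common pattern: sending on an unbuffered channel while holding a mutex that the receiver also needs. It is not the Argo CD code, and the thread does not confirm that the bug has exactly this shape. `Manager`, `Settings`, `notifyLocked`, `notifyUnlocked` and `tryNotify` are hypothetical names chosen for illustration only.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Settings is a hypothetical stand-in for a settings payload.
type Settings struct{ Revision int }

// Manager is a hypothetical, simplified settings manager.
// It is NOT Argo CD's SettingsManager.
type Manager struct {
	mu          sync.Mutex
	current     Settings
	subscribers []chan<- Settings
}

// Subscribe registers a channel that receives settings updates.
func (m *Manager) Subscribe(ch chan<- Settings) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.subscribers = append(m.subscribers, ch)
}

// Get returns the current settings; it takes the same mutex as the notifiers.
func (m *Manager) Get() Settings {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.current
}

// notifyLocked sends to subscribers while still holding the mutex.
// If a subscriber is waiting in Get instead of receiving, both sides block.
func (m *Manager) notifyLocked(s Settings) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.current = s
	for _, sub := range m.subscribers {
		sub <- s
	}
}

// notifyUnlocked copies the subscriber list under the mutex and sends only
// after releasing it, so a subscriber calling Get can make progress.
func (m *Manager) notifyUnlocked(s Settings) {
	m.mu.Lock()
	m.current = s
	subs := make([]chan<- Settings, len(m.subscribers))
	copy(subs, m.subscribers)
	m.mu.Unlock()
	for _, sub := range subs {
		sub <- s
	}
}

// tryNotify reports whether the chosen notifier returns within a timeout
// when the subscriber calls Get before it receives from its channel.
func tryNotify(useLocked bool) bool {
	m := &Manager{}
	ch := make(chan Settings) // unbuffered: a send blocks until received
	m.Subscribe(ch)

	go func() { // subscriber
		time.Sleep(50 * time.Millisecond) // let the notifier start first
		_ = m.Get()                       // needs the mutex
		<-ch                              // only then receives the update
	}()

	done := make(chan struct{})
	go func() {
		if useLocked {
			m.notifyLocked(Settings{Revision: 1})
		} else {
			m.notifyUnlocked(Settings{Revision: 1})
		}
		close(done)
	}()

	select {
	case <-done:
		return true
	case <-time.After(500 * time.Millisecond):
		return false // notifier is still blocked: treated as stuck
	}
}

func main() {
	fmt.Println("send while holding lock completes:", tryNotify(true))
	fmt.Println("send after releasing lock completes:", tryNotify(false))
}
```

In this sketch the first call is expected to time out and the second to complete. The sleep and timeout values are arbitrary and only serve to make the ordering easy to observe.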

Cherry-picked into 1.5.

@alexmt cool, happy to help. I tried to reduce our application to a minimal example that still produces the same result, and that's how I ended up with an application containing only secrets and configmaps. I have also noticed that the more secrets/configmaps there are, the more often this error occurs.

Also, I didn't observe the same behaviour (or I was just lucky) when Argo CD itself was in a different namespace than the application's "destination" namespace.
