Argo-cd: sync fails when deleting namespace but no retry

Created on 6 Sep 2019 · 13 Comments · Source: argoproj/argo-cd

Describe the bug

I have a bootstrap-application where the namespaces get created, and a sub-application where the other objects in this namespace get created.
Both applications have auto-sync and self-heal enabled. I delete the namespace where the sub-application is deployed with "kubectl delete namespace ". The following message then appears in the Argo CD web console:

kubectl failed exit status 1: Error from server (Forbidden): deployments.apps "t-maeve-api-maeve" is forbidden: unable to create new content in namespace t-maeve-api because it is being terminated

To Reproduce

Steps to reproduce the behavior:

  1. Create a bootstrap-application that creates a namespace and defines a sub-application whose destination is that namespace. Enable auto-sync and self-heal in both applications.
  2. Let Argo CD create everything.
  3. Delete the namespace with "kubectl delete namespace ".
  4. The bootstrap-application is out-of-sync because the namespace no longer exists, and the sub-application is out-of-sync because all of its objects are gone. The sub-application tries to create the objects but fails with the error message described above.
  5. The bootstrap-application syncs and creates the namespace.
  6. The sub-application doesn't sync anymore. It gets stuck in this out-of-sync state.
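
For readers who want to reproduce the setup, the sub-application from step 1 might look roughly like the manifest below. This is only a sketch: the repository URL, path, and the `argocd` namespace are placeholders that do not come from the report; only the application name, the destination namespace, and the enabled auto-sync and self-heal settings reflect what is described above.

```yaml
# Sketch of the sub-application; values marked "hypothetical" are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sub-application
  namespace: argocd                # hypothetical: namespace where Argo CD runs
spec:
  project: default                 # hypothetical project name
  source:
    repoURL: https://example.com/org/repo.git   # hypothetical repository
    path: sub-application                       # hypothetical path
    targetRevision: HEAD
  destination:
    server: https://kubernetes.default.svc
    namespace: t-maeve-api         # namespace created by the bootstrap-application
  syncPolicy:
    automated:
      selfHeal: true               # auto-sync with self-heal, as in the report
```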

Expected behavior

The sub-application should retry the sync.

Version

v1.2.0

bug workaround

All 13 comments

I tried a second time, and this time the bootstrap-application also got stuck in out-of-sync, because the bootstrap-application also creates a network policy for the deleted namespace. To summarize, I think there should be a retry, whether or not you use the app-of-apps pattern.

Maybe this log message from the controller helps:
time="2019-09-06T14:24:28Z" level=warning msg="Skipping auto-sync: failed previous sync attempt to 0580ffb40f36d40b834ab88034b5de17b1942547" application=bootstrap-application

I am sorry, I read through https://github.com/argoproj/argo-cd/issues/2156 and now I wonder whether I configured something wrong. I will check and come back later.

I checked it again. I have selfHeal enabled and the version is v1.2.0. However, it gets stuck in "out-of-sync" as I described, so there is no obvious configuration error on my side. When I start a manual sync, everything works again.

logfile.txt

Here is the log file of the application controller.

As you manually deleted the namespace, can you manually resolve this?

Yes, I can. I just need to sync the bootstrap-application manually in the Argo CD web console, and then I also need to sync the sub-application manually.

However, I assumed that with self-heal enabled, Argo CD always tries to reach the desired state automatically. Is this assumption wrong? Are there still situations where I must do a manual sync by design?

Mostly, you have a work-around.

Hi, I got the exact same problem on ArgoCD 1.4.2.

The app tries to self-heal during the namespace deletion and gets stuck in an error state, even after the parent app re-creates the namespace.

The only way to clear the error state is to do a manual sync, which is not something expected from a self-heal feature ;-)

I am also having an issue with this. Due to some TLS certificate synchronisation, I have one repo with only secrets and another with the rest of the application. From time to time, Argo CD checks the secrets repo first; the sync then fails because of the missing namespace and stays stuck until it is synced manually.

I don't think this is a completely uncommon use case.

The ability to retry failed sync attempts has been implemented by https://github.com/argoproj/argo-cd/pull/3997.

Retry is supported in both auto and manual syncs. By default, retries are enabled for auto sync.
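
As an illustration, a retry policy can be set explicitly in the `syncPolicy` of an Application. The fragment below is a minimal sketch; the limit and backoff values are arbitrary examples chosen for illustration, not defaults or recommendations taken from this thread.

```yaml
# Fragment of an Application spec; the numbers are illustrative assumptions.
spec:
  syncPolicy:
    automated:
      selfHeal: true
    retry:
      limit: 5            # maximum number of retry attempts for a failed sync
      backoff:
        duration: 5s      # wait before the first retry
        factor: 2         # multiply the wait by this factor after each failed retry
        maxDuration: 3m   # upper bound for the wait between retries
```

With a policy like this, a sync that fails while the namespace is still terminating would be attempted again after the backoff interval instead of staying stuck until someone syncs manually.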
