Describe the bug
I have a bootstrap-application where the namespaces get created, and a sub-application where the other objects in this namespace get created.
Both applications have auto-sync and self-heal enabled. When I delete the namespace where the sub-application is deployed with "kubectl delete namespace", the sub-application's sync fails with:
kubectl failed exit status 1: Error from server (Forbidden): deployments.apps "t-maeve-api-maeve" is forbidden: unable to create new content in namespace t-maeve-api because it is being terminated
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The sub-application should retry the sync.
Version
v1.2.0
I tried a second time, and this time the bootstrap-application also got stuck in out-of-sync, because it also creates a network policy for the deleted namespace. To summarize: I think there should be a retry, whether or not the applications-of-applications pattern is used.
Maybe this log message from the controller helps:
time="2019-09-06T14:24:28Z" level=warning msg="Skipping auto-sync: failed previous sync attempt to 0580ffb40f36d40b834ab88034b5de17b1942547" application=bootstrap-application
I am sorry, I read through https://github.com/argoproj/argo-cd/issues/2156 and now I wonder whether I configured something wrong. I will check and come back later.
I checked again. I have selfHeal enabled and the version is v1.2.0. However, it still gets stuck in "out-of-sync" as I described, so there is no obvious configuration error on my side. When I start a manual sync, everything works again.
If I understand https://github.com/argoproj/argo-cd/blob/9e486dfad4e57be359a4caef7a2da2629b7e49a8/controller/appcontroller.go#L904 correctly, the condition is True in my case and so it logs https://github.com/argoproj/argo-cd/blob/9e486dfad4e57be359a4caef7a2da2629b7e49a8/controller/appcontroller.go#L906
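To make the condition above easier to discuss, here is a minimal Python sketch of how I read that guard. It is a simplified model and not the actual controller code: the function name and its three boolean parameters are hypothetical, and the rule itself is an assumption based on the log message "Skipping auto-sync: failed previous sync attempt to ..." and the behavior described in this thread.

```python
def should_skip_auto_sync(already_attempted: bool,
                          self_heal: bool,
                          last_attempt_succeeded: bool) -> bool:
    """Simplified model of the auto-sync guard (hypothetical names).

    already_attempted:      a sync to the same target revision was already tried
    self_heal:              the application has selfHeal enabled
    last_attempt_succeeded: that previous attempt finished successfully
    """
    # Assumption: once a revision has been attempted, auto-sync only runs again
    # if selfHeal is on AND the previous attempt succeeded. A failed attempt to
    # the same revision therefore blocks further automatic syncs.
    return already_attempted and (not self_heal or not last_attempt_succeeded)


if __name__ == "__main__":
    # The situation from this issue: selfHeal is enabled, but the previous sync
    # to the same revision failed while the namespace was terminating.
    print(should_skip_auto_sync(True, True, False))
```

Under this reading, selfHeal does not help once a sync to the current revision has failed, which would match the application staying out-of-sync until a manual sync is started.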
here is the logfile of the application controller.
As you manually deleted the namespace, can you manually resolve this?
Yes, I can. I just need to sync the bootstrap-application manually in the Argo CD web console, and then I also need to sync the sub-application manually.
However, I assumed that with selfHeal enabled Argo CD always tries to reach the desired state automatically. Is this assumption wrong? Are there still situations where I must do a manual sync by design?
Mostly, you have a work-around.
Hi, I got the exact same problem on ArgoCD 1.4.2.
The app tries to self-heal during the namespace deletion and gets stuck in an error state, even after the parent app re-creates the namespace.
The only way to remove the error state is to do a manual sync, which is not something expected from a self heal feature ;-)
I am also having an issue with this. Because of some TLS certificate synchronisation, I have one repo with only secrets and another with the rest of the application. From time to time Argo CD checks the secrets repo first, and the sync then fails because of the missing namespace and stays stuck until it is synced manually.
I am not sure if this is a completely uncommon use-case.
The ability to retry failed sync attempts has been implemented in https://github.com/argoproj/argo-cd/pull/3997.
Retry is supported in both auto and manual syncs. By default retries are enabled for auto sync.
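For readers who want a picture of what retrying a failed sync means in practice, here is a minimal Python sketch of retry with capped exponential backoff. It is only an illustration of the general technique, not the code from the pull request: `sync_with_retry`, `backoff_delays`, and all parameter names and default values are hypothetical.

```python
import time
from typing import Callable, List


def backoff_delays(limit: int, base: float, factor: float, max_delay: float) -> List[float]:
    """Delays in seconds to wait before each retry, capped at max_delay."""
    return [min(base * factor ** i, max_delay) for i in range(limit)]


def sync_with_retry(sync: Callable[[], None],
                    limit: int = 5,
                    base: float = 5.0,
                    factor: float = 2.0,
                    max_delay: float = 180.0,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call sync() until it succeeds and return the number of attempts made.

    At most `limit` retries follow the first attempt. When they are
    exhausted, the last RuntimeError is re-raised to the caller.
    """
    delays = backoff_delays(limit, base, factor, max_delay)
    attempt = 0
    while True:
        attempt += 1
        try:
            sync()
            return attempt
        except RuntimeError:
            if attempt > limit:
                raise  # retries exhausted, surface the failure
            sleep(delays[attempt - 1])  # wait before the next attempt
```

In the scenario from this issue, `sync` would fail while the namespace is still terminating and succeed on a later attempt once the parent application has re-created it, so no manual sync would be needed.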