Argo CD: auto-sync applies only once per commit

Created on 12 Jun 2019 · 6 comments · Source: argoproj/argo-cd

Describe the bug
According to the Argo CD documentation, "Argo CD has the ability to automatically sync an application when it detects differences between the desired manifests in Git, and the live state in the cluster."

However, it does not work as expected.

To Reproduce
Deploy the test application and enable the auto-sync option as documented (a sketch of the resulting policy follows the output below):
~/$ ./argocd app get guestbook
Name: guestbook
[...]
Repo: https://github.com/argoproj/argocd-example-apps.git
Path: guestbook
Sync Policy: Automated (Prune)
Sync Status: Synced to HEAD (0ad95c5)
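
For reference, a minimal sketch of the syncPolicy behind "Automated (Prune)" above, assuming the standard Application spec fields; the CLI equivalent should be something like argocd app set guestbook --sync-policy automated --auto-prune:

  # Application spec excerpt: enable automated sync with pruning
  syncPolicy:
    automated:
      prune: true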

The status is Synced. Now let's delete the service via kubectl -n NamespaceUsed delete service/guestbook-ui.
Argo CD correctly reports OutOfSync. At this step it is expected that Argo CD resyncs automatically, which is not the case even after 30 minutes.

~$ ./argocd app get guestbook
Name: guestbook
[...]
Repo: https://github.com/argoproj/argocd-example-apps.git
Path: guestbook
Sync Policy: Automated (Prune)
Sync Status: OutOfSync from HEAD (0ad95c5)

Expected behavior
Argo CD should synchronize automatically according to the auto-sync policy configuration.

Version

argocd: v1.0.1+5fe1447.dirty
BuildDate: 2019-05-28T17:26:35Z
GitCommit: 5fe1447b722716649143c63f9fc054886d5b111c
GitTreeState: dirty
GoVersion: go1.11.4
Compiler: gc
Platform: linux/amd64
argocd-server: v1.0.1+5fe1447.dirty
BuildDate: 2019-05-28T17:27:38Z
GitCommit: 5fe1447b722716649143c63f9fc054886d5b111c
GitTreeState: dirty
GoVersion: go1.11.4
Compiler: gc
Platform: linux/amd64
Ksonnet Version: 0.13.1

Have you thought about contributing a fix yourself?
Short-term workaround: run argocd app sync AppName from cron (sketched below).
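
A sketch of that workaround as a crontab entry; the interval and binary path are assumptions, and the argocd CLI session must already be logged in:

  # re-sync the app every 5 minutes until auto-sync covers cluster drift
  */5 * * * * /usr/local/bin/argocd app sync AppName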

bug

All 6 comments

Git is treated as the master for the state of your application. Therefore, auto-sync is triggered by changes to Git; I don't think it is triggered by changes to your cluster.

@alexmt thoughts?

Auto-sync is also supposed to be triggered if anything changes in the cluster. Some resources, like a mutating webhook, might become out of sync right after syncing. In order to avoid an infinite syncing loop, auto-sync refuses to sync the same revision twice, even if the app is in an out-of-sync state.

After discussion with @jessesuen we decided to improve that protection. Instead of giving up on syncing, we need to introduce an auto-sync rate limit, e.g. add a syncMaxRetries or a syncRetryTimeout to syncPolicy (https://argoproj.slack.com/archives/CASHNF6MS/p1560300424293100?thread_ts=1560298269.289500&cid=CASHNF6MS).
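
A hypothetical sketch of those knobs on the syncPolicy; the field names come from the proposal above, the values are made up, and neither field existed at the time of this issue:

  syncPolicy:
    automated:
      prune: true
    # proposed rate-limiting knobs, not implemented
    syncMaxRetries: 5
    syncRetryTimeout: 10m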

@alexmt this was a namespace that was deleted, not something I would expect in prod due to the large blast radius and unpredictable impact. Could that affect this?

I think in this case a service was deleted, wasn't it? ("now let's delete the service via kubectl -n NamespaceUsed delete service/guestbook-ui"). Even if a namespace in the target cluster got deleted, auto-sync should attempt to fix it.

Yes, when auto-sync was implemented, we erred on the side of caution and only auto-synced once per commit SHA. We did not want to take the approach of blindly syncing every time we were detected to be OutOfSync, because it could cause an infinite sync loop. So if we implement the feature, we definitely need some rate-limiting knobs here.

Also, keep in mind that sync hooks get invoked when an auto-sync occurs.

If we want to distinguish this from auto-sync, we could term it a self-healing feature, e.g.:

  syncPolicy:
    automated:
      prune: true
      selfHealing: true

Self-healing would react to cluster events (e.g. someone deleted a managed resource that should be running).
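
For illustration, a sketch of how that proposed policy might look in a full Application manifest; the selfHealing field is only a proposal in this thread, and the destination values are assumptions based on the guestbook example:

  apiVersion: argoproj.io/v1alpha1
  kind: Application
  metadata:
    name: guestbook
    namespace: argocd
  spec:
    project: default
    source:
      repoURL: https://github.com/argoproj/argocd-example-apps.git
      path: guestbook
    destination:
      server: https://kubernetes.default.svc
      namespace: default
    syncPolicy:
      automated:
        prune: true
        selfHealing: true  # proposed field, not implemented at the time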

A self-healing feature for out-of-sync resources together with PR #1794 would also fix my issue in #1646 :heart:
