Argo-cd: helm hooks are deleted right after creation, not after all hooks are created

Created on 18 Nov 2019  路  16Comments  路  Source: argoproj/argo-cd

Checklist:

  • [x] I've searched in the docs and FAQ for my answer: http://bit.ly/argocd-faq.
  • [x] I've included steps to reproduce the bug.
  • [x] I've pasted the output of argocd version.

Describe the bug

I am deploying prometheus-operator as helm dependency. It has helm hook resources including serviceaccount and job which uses this serviceaccount. I see that sa gets created and then deleted before job is created. As a result, job fails and sync fails.

To Reproduce

Try to deploy prometheus-operator as helm dependency with admission webhooks enabled in values:

prometheus-operator:
  prometheusOperator:
    admissionWebhooks:
      enabled: true

It will try to create serviceaccount hook https://github.com/helm/charts/blob/master/stable/prometheus-operator/templates/prometheus-operator/admission-webhooks/job-patch/serviceaccount.yaml then it will delete it, then it will try to create job https://github.com/helm/charts/blob/master/stable/prometheus-operator/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml and it will fail because serviceaccount have been deleted to this moment.

Expected behavior

Helm deletes hooks in one go after all hooks are done, this is not documented, but it's clear from code: https://github.com/helm/helm/blob/9b42702a4bced339ff424a78ad68dd6be6e1a80a/pkg/action/hooks.go#L98 - I expect the same behavior from argocd - to delete hook resources only all of them were processed (or after sync wave maybe). It makes sense because it's often needed to have hooks running in parallel or depending on each other.

Version

argocd: v1.3.0-rc1+8a43840
  BuildDate: 2019-10-16T21:44:49Z
  GitCommit: 8a43840f0baef50946d3eadc1d52d6a2abc162d5
  GitTreeState: clean
  GoVersion: go1.12.6
  Compiler: gc
  Platform: linux/amd64
argocd-server: v1.3.0+9f8608c
  BuildDate: 2019-11-13T01:51:00Z
  GitCommit: 9f8608c9fcb2a1d8dcc06eeadd57e5c0334c5800
  GitTreeState: clean
  GoVersion: go1.12.6
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: Version: {Version:kustomize/v3.2.1 GitCommit:d89b448c745937f0cf1936162f26a5aac688f840 BuildDate:2019-09-27T00:10:52Z GoOs:linux GoArch:amd64}
  Helm Version: v2.15.2
  Kubectl Version: v1.14.0
bug core workaround

Most helpful comment

I've hit the same issue as @wrdls and @estahn when attempting to deploy ingress-nginx with version 2.9.0. In my case, I'm not even using Helm but only kustomize, however due to the official manifest containing helm hook annotations, the sync process of ingress-nginx always failed.

As ingress-nginx itself does not need these helm hooks to work properly (regular deployment through kubectl is supported too), I managed to get it working by removing all helm hook annotations. Unfortunately ArgoCD does not allow disabling the parsing/handling of Helm hooks (even when just using kustomize in the first place) itself, so I had to strip them from the original manifest using a few patchesJson6902 statements within kustomize:


kustomization.yaml

patchesJson6902:
  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: ClusterRole
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: ClusterRoleBinding
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: batch
      version: v1
      kind: Job
      name: ingress-nginx-admission-create
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: batch
      version: v1
      kind: Job
      name: ingress-nginx-admission-patch
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: Role
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: RoleBinding
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      version: v1
      kind: ServiceAccount
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json


no-helm-hooks.json

[
  {
    "op": "remove",
    "path": "/metadata/annotations/helm.sh~1hook"
  },
  {
    "op": "remove",
    "path": "/metadata/annotations/helm.sh~1hook-delete-policy"
  }
]

Maybe this helps someone else as a temporary workaround to sync ingress-nginx when Helm is not being used.

EDIT: Just to clarify, I'm explicitly referring to the automatic mapping of Helm hooks to ArgoCD hooks as described in the manual, which seems to take place even if not using Helm at all.

All 16 comments

Thank you for submitting such a detailed bug report.

A PR for this won't be immediate for this, in the meantime do you have a work-around that we can share?

Alex

do you have a work-around that we can share?

Unfortunately no, I just disabled admissionsWebhooks and tlsProxy in prometheus-operator. One can manually render hooks during installation/upgrade, but this can be tricky.

You can choose one or more defined annotation values: * "hook-succeeded" specifies Helm should delete the hook after the hook is successfully executed. * "hook-failed" specifies Helm should delete the hook if the hook failed during execution. * "before-hook-creation" specifies Helm should delete the previous hook before the new hook is launched.

So we've implemented as per the Helm docs, but not as per their implementation. They don't delete hooks until the end of the hook's phase. So we should mimic that rather than just deleting right at the end.

Is there any progress on at least a temporary patch to match the expected behavior?

I'm seeing the same issue with ingress-nginx:

chart: ingress-nginx
chart version: 2.1.0
chart repo: https://kubernetes.github.io/ingress-nginx
app version: 3.2.0

I've hit the same issue as @wrdls and @estahn when attempting to deploy ingress-nginx with version 2.9.0. In my case, I'm not even using Helm but only kustomize, however due to the official manifest containing helm hook annotations, the sync process of ingress-nginx always failed.

As ingress-nginx itself does not need these helm hooks to work properly (regular deployment through kubectl is supported too), I managed to get it working by removing all helm hook annotations. Unfortunately ArgoCD does not allow disabling the parsing/handling of Helm hooks (even when just using kustomize in the first place) itself, so I had to strip them from the original manifest using a few patchesJson6902 statements within kustomize:


kustomization.yaml

patchesJson6902:
  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: ClusterRole
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: ClusterRoleBinding
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: batch
      version: v1
      kind: Job
      name: ingress-nginx-admission-create
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: batch
      version: v1
      kind: Job
      name: ingress-nginx-admission-patch
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: Role
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      group: rbac.authorization.k8s.io
      version: v1
      kind: RoleBinding
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json

  - target:
      version: v1
      kind: ServiceAccount
      name: ingress-nginx-admission
      namespace: ingress-nginx
    path: no-helm-hooks.json


no-helm-hooks.json

[
  {
    "op": "remove",
    "path": "/metadata/annotations/helm.sh~1hook"
  },
  {
    "op": "remove",
    "path": "/metadata/annotations/helm.sh~1hook-delete-policy"
  }
]

Maybe this helps someone else as a temporary workaround to sync ingress-nginx when Helm is not being used.

EDIT: Just to clarify, I'm explicitly referring to the automatic mapping of Helm hooks to ArgoCD hooks as described in the manual, which seems to take place even if not using Helm at all.

Any idea what release this will be included in?

@jessesuen Is there a timeline for this fix to be released? I am still running into this issue with prometheus-operator.

馃啓

Is there a timeline for this fix to be released? I am still running into this issue with ingress Nginx

It is 1.8 - we are working on testing it now. @mayzhang2000 do you think it is safe to cherry-pick fix into 1.7 as well?

I think it is safe to cherry-pick to 1.7. I will take care of that if you approve.

Thank you @mayzhang2000 ! Please, go ahead.

When I am trying to cherry-pick to 1.7, I realized there are too many merge conflicts. It is quite risky.

Was this page helpful?
0 / 5 - 0 ratings