Argo-cd: Deploying from a helm repo to ArgoCD in a proxy environment results in timeout as Argo adds the stable repo by default

Created on 6 Dec 2019  ·  26 Comments  ·  Source: argoproj/argo-cd

Checklist:

  • [x] I've searched in the docs and FAQ for my answer: http://bit.ly/argocd-faq.
  • [x] I've included steps to reproduce the bug.
  • [x] I've pasted the output of argocd version.

Describe the bug

Currently in a proxy environment, when I try to deploy an app from a private helm repo, ArgoCD times out and fails. Following are my findings as to why it does that:

The Argo CD repo-server runs helm init --client-only --skip-refresh. This adds the stable repo (https://kubernetes-charts.storage.googleapis.com) alongside the private repo I added to Argo CD, which is hosted in Artifactory. The private repo is reachable without a proxy, but because the repo-server pod has no proxy variables set, it cannot reach the stable repo when it runs helm repo update, so the command times out and returns an error.

Logs attached below from repo-server pod.

To Reproduce

In a proxy environment, deploy an app from ArgoCD UI or using a declarative approach (application resource) from a private helm repo that is accessible from the cluster without proxy.

Expected behavior

Argo repo-server shouldn't add the stable repo (https://kubernetes-charts.storage.googleapis.com) by default as in a proxy environment it won't have access to it and hence would time-out doing a helm repo update.

Version

$ argocd version
argocd: v1.3.0+9f8608c
  BuildDate: 2019-11-13T01:49:01Z
  GitCommit: 9f8608c9fcb2a1d8dcc06eeadd57e5c0334c5800
  GitTreeState: clean
  GoVersion: go1.12.6
  Compiler: gc
  Platform: linux/amd64

Logs

From repo-server when I try to create an app from a helm repo

time="2019-12-05T19:59:04Z" level=info msg="manifest cache miss: 0.1.43/&ApplicationSource{RepoURL:https://artifactory.xxxxxxxxxxxxxxx,Path:,TargetRevision:0.1.43,Helm:nil,Kustomize:nil,Ksonnet:nil,Directory:nil,Plugin:nil,Chart:catalog,}"
time="2019-12-05T19:59:12Z" level=error msg="`helm repo update` failed timeout after 1m30s" execID=KOsLx
time="2019-12-05T19:59:12Z" level=error msg="finished unary call with code Unknown" error="`helm repo update` failed timeout after 1m30s" grpc.code=Unknown grpc.method=GetAppDetails grpc.request.deadline="2019-12-05T19:58:42Z" grpc.service=repository.RepoServerService grpc.start_time="2019-12-05T19:57:42Z" grpc.time_ms=90061.29 span.kind=server system=grpc
time="2019-12-05T19:59:12Z" level=info msg="helm init --client-only --skip-refresh" dir="/tmp/https:__artifactory.xxxxxxxxxxx" execID=GQGCe
time="2019-12-05T19:59:12Z" level=info msg="helm repo update" dir="/tmp/https:__artifactory.xxxxxxxxxxx" execID=W41qf
time="2019-12-05T19:59:13Z" level=info msg="manifest cache hit: &ApplicationSource{RepoURL:http://git.xxxxxxxxxxxxxx/catalog,TargetRevision:HEAD,Helm:&ApplicationSourceHelm{ValueFiles:[values.yaml],Parameters:[{apiserver.storage.etcd.persistence.enabled false false} {image quay.io/kubernetes-service-catalog/service-catalog:v0.2.1 false} {apiserver.replicas 1 false}],ReleaseName:service-catalog,Values:,},Kustomize:nil,Ksonnet:nil,Directory:nil,Plugin:nil,Chart:,}/2c9adc2df1596483711655aa1e5186e6b737992d"
time="2019-12-05T19:59:13Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GenerateManifest grpc.request.deadline="2019-12-05T20:00:12Z" grpc.service=repository.RepoServerService grpc.start_time="2019-12-05T19:59:12Z" grpc.time_ms=791.33 span.kind=server system=grpc

Going into the repo server pod and running the same commands confirms this:

$ kubectl exec -it argocd-repo-server-6f4b75f4bb-q2f9x -n argocd  bash
$ helm init --client-only --skip-refresh
Creating /home/argocd/.helm
Creating /home/argocd/.helm/repository
Creating /home/argocd/.helm/repository/cache
Creating /home/argocd/.helm/repository/local
Creating /home/argocd/.helm/plugins
Creating /home/argocd/.helm/starters
Creating /home/argocd/.helm/cache/archive
Creating /home/argocd/.helm/repository/repositories.yaml
Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
Adding local repo with URL: http://127.0.0.1:8879/charts
$HELM_HOME has been configured at /home/argocd/.helm.
Not installing Tiller due to 'client-only' flag having been set
argocd@argocd-repo-server-6f4b75f4bb-q2f9x:~$ helm repo add <redacted>
"xx-helm-dev" has been added to your repositories
argocd@argocd-repo-server-6f4b75f4bb-q2f9x:~$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "xx-helm-dev" chart repository
...Unable to get an update from the "stable" chart repository (https://kubernetes-charts.storage.googleapis.com):
        Get https://kubernetes-charts.storage.googleapis.com/index.yaml: dial tcp 216.58.196.80:443: connect: connection timed out
Update Complete.

Labels: bug, config-management

All 26 comments

Would you like to submit a PR to fix this?

Hi @alexec. I haven't dived deep into the source code yet. I will need to take a look at it and will open a PR if I find a fix.

@alexec @afanrasool
I just ran into the same issue. I see the following possible solutions, based on this issue:
https://github.com/helm/helm/issues/3749

Set --stable-repo-url

Allow defining the --stable-repo-url argument when executing helm init in util/helm/cmd.go#L42, based on an entry in the argocd-cm ConfigMap. The ConfigMap entry could look like this:

data:
  helm.stable.repo.url: https://charts.example.tld # default: https://kubernetes-charts.storage.googleapis.com

If this entry is set, the helm init command would be executed with the argument --stable-repo-url $helm.stable.repo.url
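
A minimal sketch of the first proposal, for illustration only. The helper name helm_init_command is hypothetical, and helm.stable.repo.url is the ConfigMap key proposed in this comment, not an existing Argo CD setting. The --stable-repo-url flag itself is a Helm v2 helm init flag. The function only prints the command line it would run, so it is safe to try outside a cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints the `helm init` command line the repo-server
# would run under the proposal.
# $1: value of the proposed argocd-cm key helm.stable.repo.url
#     (pass an empty string when the key is unset).
helm_init_command() {
    stable_repo_url="${1:-}"
    cmd="helm init --client-only --skip-refresh"
    if [ -n "$stable_repo_url" ]; then
        # Point the "stable" repo at an internal mirror instead of the default URL.
        cmd="$cmd --stable-repo-url $stable_repo_url"
    fi
    printf '%s\n' "$cmd"
}
```

Calling helm_init_command "https://charts.example.tld" prints the init command with the mirror URL appended; calling it with an empty argument prints the command Argo CD runs today.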

Disabling stable repo

Add the possibility to remove the stable repo if the corresponding config is set in the argocd-cm ConfigMap. Such a ConfigMap entry could look like this:

data:
  helm.stable.repo.enabled: 'false' # default: true

To implement this, an additional function called removeStableRepo is added in util/helm/cmd.go.
The execution of this function would be more or less:

helm repo remove stable

This function would be called from util/helm/client.go#L116 when the config value helm.stable.repo.enabled is false.

If I have overlooked a possible solution, let me know. It would probably be best to implement both variants to provide the necessary flexibility for different circumstances (e.g. a required internal mirror of the stable repo, or complete exclusion of the stable repo). The config value helm.stable.repo.url would be ignored in the case of helm.stable.repo.enabled: 'false'.
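
The precedence between the two proposed settings can be sketched as follows. This is an illustration under stated assumptions, not Argo CD code: plan_helm_setup is a hypothetical name, and both helm.stable.repo.enabled and helm.stable.repo.url are the keys proposed in this comment. The function prints the helm commands in the order they would run rather than executing them.

```shell
#!/bin/sh
# Hypothetical planner: prints, one per line, the helm commands for a config.
# $1: proposed helm.stable.repo.enabled ("true" or "false"; empty = default true)
# $2: proposed helm.stable.repo.url (empty = Helm's built-in default URL)
plan_helm_setup() {
    enabled="${1:-true}"
    url="${2:-}"
    if [ "$enabled" = "false" ]; then
        # The URL setting is ignored when the stable repo is disabled:
        # init as usual, then drop the repo that `helm init` added.
        echo "helm init --client-only --skip-refresh"
        echo "helm repo remove stable"
    elif [ -n "$url" ]; then
        echo "helm init --client-only --skip-refresh --stable-repo-url $url"
    else
        echo "helm init --client-only --skip-refresh"
    fi
}
```

With enabled set to false the plan always ends in helm repo remove stable, regardless of the URL, which matches the rule stated above.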

What are your thoughts? If nothing speaks against my suggestions, I would prepare an appropriate PR.

@niiku That's exactly where my head was at for a potential fix, but I haven't had a chance to work on the PR. If you can do that, that would be awesome!

I was wondering. The Helm stable repository is going away (see https://github.com/helm/charts#deprecation-timeline) and therefore we should remove support for it. At the same time we should be upgrading to Helm 3 (see #2864). Should this be part of that work?

@alexec I think Argo CD shouldn't drop the Helm stable repository per default until Nov 13, 2020, since it normally doesn't hurt. Especially since the removal would break the default usage of the new feature introduced here https://github.com/argoproj/argo-cd/issues/1145

As for Helm v3, I think there should be a long time where Argo CD works with Helm v2 and v3 in parallel to support backwards compatibility. Helm v3 has no default repository anyway, as there's no helm init - so this issue shouldn't be relevant for Helm v3.

Please excuse me, of course it is up to you how long you want to support Helm v2 and the stable repository. With the circumstances, it makes more sense to implement only the helm repo remove stable function for Helm v2. I will try to suggest a corresponding commit.

@alexec
Unfortunately, I don't know exactly how I can best implement a feature flag for users currently using the helm stable repo without an entry in helm.repositories. I noticed that the reposerver does not access ConfigMaps directly and that this is not foreseen. I see two possibilities how such a flag could be implemented:

Extending ManifestRequest

To transfer the initially proposed ConfigMap entry helm.stable.repo.enabled from the argocd-server to the reposerver, the v1alpha1.ManifestRequest type could be extended by a field HelmOptions analogous to KustomizeOptions, which holds the corresponding configuration value.

Adding a cmd parameter

The cmd argocd-repo-server could be extended by a parameter --enable-helm-stable-repo or similar. If this is set to false, the helm stable repo is removed. The advantage of this solution would be that no api types have to be modified.

Did I miss an easier way? Or would it be acceptable to remove the helm stable repository every time helm init is run? Could you give me some input on how to proceed in fixing this issue? This would help me a lot to solve this issue in an appropriate way.

Hi @alexec, hope you had great holidays and a happy new year. Would it be possible that you could point me in the right direction with my previous questions? That would be really great :-)

I'd suggest that we add no repos by default. This is because the Helm stable repo is no longer the recommended way to distribute your app.

This should probably be done with the Helm v3 support (#2383).

@afanrasool I work around this in my setup by setting the Unix environment variables that most libraries normally pick up, so this is not a problem for me. Maybe we should have a documentation patch. Do you have time for it? The following goes into the Deployment resources of argocd-repo-server, argocd-application-controller, and argocd-server:

env:
- name: HTTP_PROXY
  value: http://XXX-dsi-proxy:3128
- name: HTTPS_PROXY
  value: http://XXX-dsi-proxy:3128
- name: NO_PROXY
  value: 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.cluster.local,argocd-repo-server,.test.dsp.XXX.de,.shared.dsp.XXX.de,.dst.kube
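
One way to apply these variables to all three deployments without editing each manifest by hand is kubectl set env. The sketch below only prints the commands so they can be reviewed first. The function name proxy_env_commands is hypothetical, and the argocd namespace default is an assumption taken from the kubectl exec example earlier in this issue.

```shell
#!/bin/sh
# Prints (does not run) one `kubectl set env` command per Argo CD deployment.
# $1: proxy URL, used for both HTTP_PROXY and HTTPS_PROXY
# $2: comma-separated NO_PROXY list
# $3: namespace (assumed default: argocd)
proxy_env_commands() {
    proxy_url="$1"
    no_proxy_list="$2"
    namespace="${3:-argocd}"
    for deploy in argocd-repo-server argocd-application-controller argocd-server; do
        printf 'kubectl -n %s set env deployment/%s HTTP_PROXY=%s HTTPS_PROXY=%s NO_PROXY=%s\n' \
            "$namespace" "$deploy" "$proxy_url" "$proxy_url" "$no_proxy_list"
    done
}
```

For example, proxy_env_commands "http://XXX-dsi-proxy:3128" ".cluster.local,argocd-repo-server" prints three commands; piping the output to sh would run them.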

@niiku I would simply close this ticket based on my comment above, have u tested it for ur setup?

@zeph Yes, I tested the proxy trick, but as we're doing a "man-in-the-middle" I would need to configure the certificate from our webproxy in the repo server. And there should be no reason for the repo server to have access to the internet (so people can't use other chart repositories than ours).

p.s. this issue applies also if using DEX to integrate Azure's SSO for example

@alexec So v1.5 is going to have some breaking changes I guess? Or are you going to support Helm v2 and v3 at the same time? If that's the case, a corresponding fix for Helm v2 would still make sense.

We are also behind an HTTP proxy and, as I understand the discussion, there is no workaround to get local Helm repos working? There is no way to disable the default stable Helm repo, and the http_proxy env variables aren't working either. Am I right? From my perspective, a feature to disable the default repo with a cmd parameter, as described above, would be perfect, because adding a no_proxy env for all local communication can be quite error-prone.

@jkleinlercher Thanks for the feedback. I'm going to submit a corresponding PR in the next couple of days.

@niiku hold the fire... I already provided a patch for https://github.com/argoproj/argo-cd/issues/3055

@jkleinlercher ah, u want a solution not based on env variables? uhmm

@zeph PR #3063 is of course very useful if you want to retrieve Helm charts from the internet via a forward proxy. However, imho it is also very useful to be able to disable the default Helm repo if you don't need it. Maintaining a no_proxy list is very error-prone in my experience, so I do not want to have one if I don't need it.

@zeph Apologies for the late reply..catching up on this now. I have set up env vars as a workaround but I would agree with @jkleinlercher above ^^.

@afanrasool at this point I fear I don't get the full complexity of your problem, nevermind
What I care about is the PR #3063 because that blocks me currently ...

I found this document https://argoproj.github.io/argo-cd/faq/#argo-cd-cannot-deploy-helm-chart-based-applications-without-internet-access-how-can-i-solve-it
seems like an easy workaround @afanrasool and @niiku

@jkleinlercher - I tried setting that up but it didn't seem to work for me.

I was writing up an example repo for my team on this and tried using the Off-the-shelf example, trying to get it to use our mirror of the wordpress chart.

argocd@argocd-repo-server-557bc7c877-chsrm:/tmp/https:__<git_REPO>/staging/OTS-example$ helm --debug dep up       
Hang tight while we grab the latest from your chart repositories...
...Unable to get an update from the "local" chart repository (http://127.0.0.1:8879/charts):
    Get http://127.0.0.1:8879/charts/index.yaml: dial tcp 127.0.0.1:8879: connect: connection refused
...Successfully got an update from the "stable" chart repository
Update Complete.
Saving 1 charts
Downloading wordpress from repo https://<internal-mirror>/kubernetes-charts-storage-googleapis-com
Save error occurred:  could not download https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: Get https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: dial tcp 172.217.163.144:443: connect: connection timed out
Deleting newly downloaded charts, restoring pre-update state
Error: could not download https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: Get https://kubernetes-charts.storage.googleapis.com/wordpress-1.0.9.tgz: dial tcp 172.217.163.144:443: connect: connection timed out

Deployed latest argo-cd with PR #3063. 🎉 I was able to add the bitnami helm repository (we have a MITM proxy) but not kubernetes-charts.storage.googleapis.com. This is where it gets strange.

I noticed that argo-cd deployed metrics-server somehow. Pod is running but argo-cd shows:

ComparisonError:
helm repo add --ca-file /app/config/tls/kubernetes-charts.storage.googleapis.com stable https://kubernetes-charts.storage.googleapis.com

Furthermore, I checked that I can neither deploy any application nor browse charts, due to x509: certificate signed by unknown authority. All of the above works with the bitnami helm repository. I think argo-cd handles the stable repository specially.

So this is an offline environment with a disconnected cluster, right?
Then:

  • Edit the CoreDNS ConfigMap so that it resolves kubernetes-charts.storage.googleapis.com to an existing IP (e.g. a node IP).
  • Run an app (e.g. nginx) whose ingress host is kubernetes-charts.storage.googleapis.com. Make sure that the app can serve /index.yaml.

apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
          ttl 30
        }
        file /etc/coredns/helm-repo-stable.db storage.googleapis.com
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
  helm-repo-stable.db: |
    storage.googleapis.com. 5 IN    SOA ns1.dns.storage.googleapis.com. hostmaster.storage.googleapis.com. (
            12345      ; serial
            14400      ; refresh (4 hours)
            3600       ; retry (1 hour)
            604800     ; expire (1 week)
            5          ; minimum (5 seconds)
            )
    storage.googleapis.com.     5 IN    NS ns1.dns.storage.googleapis.com.
    ns1.dns.storage.googleapis.com.  5 IN  A 10.96.0.10 ; Refer to IP of kube-dns.kube-system.svc.cluster.local.
    kubernetes-charts.storage.googleapis.com.    5   IN      A       192.168.19.12 ; Refer to one of nodes IP

In this example I resolved kubernetes-charts.storage.googleapis.com. to an existing IP, which is one of my node IPs (192.168.19.12).

Also, I've added this line to the Corefile above: file /etc/coredns/helm-repo-stable.db storage.googleapis.com
Notice also 10.96.0.10, which is the ClusterIP of the 'kube-dns' service (kube-dns.kube-system.svc.cluster.local.).
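
To check that the stub behind the overridden DNS name really serves a repository index, something like the following could be run inside the repo-server pod. This is a sketch under assumptions: looks_like_helm_index is a hypothetical helper, it only checks for the top-level apiVersion and entries keys that a Helm repository index.yaml contains, and the commented curl call assumes curl is available in the pod and that the stub's certificate is trusted there.

```shell
#!/bin/sh
# Hypothetical check: returns 0 if the given file looks like a Helm repository
# index, i.e. it has top-level "apiVersion:" and "entries:" keys.
# $1: path to a downloaded index.yaml
looks_like_helm_index() {
    grep -q '^apiVersion:' "$1" && grep -q '^entries:' "$1"
}

# Example, to be run manually inside the repo-server pod (assumes curl exists):
#   curl -fsS https://kubernetes-charts.storage.googleapis.com/index.yaml -o /tmp/index.yaml \
#     && looks_like_helm_index /tmp/index.yaml && echo "stub serves a Helm index"
```

If the check fails, helm repo update would still report the stable repo as unavailable, so it is worth running before retrying the deployment.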

Take care
