Terraform-provider-helm: Helm charts fail to deploy via Terraform but work when installed directly via the Helm CLI

Created on 16 Apr 2020 · 11 Comments · Source: hashicorp/terraform-provider-helm

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version and Provider Version

❯ terraform version 
Terraform v0.12.24
+ provider.azurerm v2.5.0
+ provider.null v2.1.2
+ provider.random v2.2.1
+ provider.tls v2.1.1

Provider Version

Affected Resource(s)

helm_release

I am seeing this behavior across a few charts, but it's a bit random.

Terraform Configuration Files

# Copy-paste your Terraform configurations here - for large Terraform configs,
# please use a service like Dropbox and share a link to the ZIP file. For
# security, you can also encrypt the files using our GPG public key.

Resource in question:

resource "helm_release" "mongodb-sharded" {
  name       = "mongodb-sharded"
  chart      = "mongodb-sharded"
  repository = "https://charts.bitnami.com/bitnami"
  timeout    = 600
}

Debug Output

https://gist.github.com/lukekhamilton/8e52b1e403a89557062796a4d25af24d

Panic Output

Expected Behavior

When I run the terraform apply I expect it to install the helm chart.

Actual Behavior

When I run terraform apply, this helm chart and others aren't installing and get stuck. However, when I delete the release and then install it manually, it works without issue.

❯ helm ls -a   
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
mongodb-sharded default         1               2020-04-16 02:46:09.166067202 +0000 UTC pending-install mongodb-sharded-1.1.2   4.2.5

Steps to Reproduce

terraform apply

Important Factoids

I am running this on a very standard AKS cluster.

References

GH-1234

acknowledged bug

Source

utx0

👍19

Most helpful comment

Experiencing this with the prometheus operator (as does this issue potentially https://github.com/helm/charts/issues/21913).

Same symptoms as above. Everything in kubectl is running, helm chart stuck on 'pending-install'. As soon as terraform times out, helm chart goes to 'failed'. Installing with helm manually works no problem.

arlyon on 19 Apr 2020

👍4

All 11 comments

Furthermore, a deployment I have running right now is showing me this:

❯ helm ls -a
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
mongodb-sharded default         1               2020-04-16 03:18:01.305438381 +0000 UTC pending-install mongodb-sharded-1.1.2   4.2.5

And also this:

❯ kubectl get pods
NAME                                           READY   STATUS    RESTARTS   AGE
mongodb-sharded-configsvr-0                    1/1     Running   0          3m38s
mongodb-sharded-mongos-6c8fb46c44-nml8b        0/1     Running   1          3m38s
mongodb-sharded-shard0-data-0                  0/1     Running   0          3m38s
mongodb-sharded-shard1-data-0                  0/1     Running   0          3m38s

utx0 on 16 Apr 2020

One thing that changed the behavior for me was increasing the VM size for the node pools; after that it worked without issue. However, for the life of me, I can't find any logs anywhere to help debug what is actually happening...

utx0 on 20 Apr 2020

👍1

I'm trying to reproduce this one, so far this is working for me:

resource "helm_release" "example" {
  name       = "example"
  repository = "https://kubernetes-charts.storage.googleapis.com"
  chart      = "prometheus-operator"
}

Is there more info you can share about your environments @lukekhamilton @arlyon ? A full tf config that reproduces this issue would be super helpful for me. We have test accounts on all the major cloud providers and even a bare metal cluster we can run on.

jrhouston on 20 Apr 2020

@jrhouston Given enough tries, it goes through fine, but it is quite inconsistent. I have in response to this problem split my terraform configs into two separate modules with isolated states (one for monitoring / plumbing and one for 'apps') for now.

You can find an example here: https://github.com/arlyon/infra-code (pre split). Note that this has expired cloudflare keys populated in some of the configs, but I don't think it'll cause problems.

arlyon on 21 Apr 2020

I'm experiencing a similar issue, but with Bitnami's nginx, as follows:

# Using helm provider ~> 1.1.1

data "helm_repository" "bitnami" {
  name = "bitnami"
  url  = "https://charts.bitnami.com/bitnami"
}

resource "helm_release" "nginx" {
  name       = "my-nginx"
  repository = data.helm_repository.bitnami.metadata[0].name
  chart      = "nginx"
  version    = "5.2.3"

  namespace = "default"

  timeout = 50
}

It keeps printing lines such as: helm_release.nginx: Still creating... [50s elapsed] until hitting the timeout.

Only once it hits the timeout does the release entry appear in helm:

$ helm ls --namespace my-namespace
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS  CHART           APP VERSION
my-nginx        default   1               2020-04-22 16:16:25.7100555 +0100 BST   failed  nginx-5.2.3     1.17.10

marked as failed.

However, the pods are up, ready and running, and the services have been created and work.

I'm sharing it here because I believe it may be related (it seems to involve how pod readiness is detected), and it's also a relatively quick scenario to spin up/down as necessary.

Also, a workaround for this use case is to add wait = false to the helm_release.
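
As a minimal sketch of that workaround, assuming the same chart and version as in the config above (the repository URL is passed directly here only to keep the fragment self-contained), the release could look roughly like this. Note that with wait = false, terraform apply would no longer confirm that the pods became ready, so readiness has to be checked separately.

```hcl
resource "helm_release" "nginx" {
  name       = "my-nginx"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "nginx"
  version    = "5.2.3"
  namespace  = "default"

  # Workaround from this comment: do not block until all resources
  # report ready. This avoids the timeout, but the provider will not
  # verify that the workload actually came up.
  wait = false
}
```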

lmserrano on 22 Apr 2020

👍3

We're going to work on this with low priority as we collect more data since the issue is hard to reproduce on demand.

alexsomesan on 3 Jun 2020

I have a similar problem on GKE.

My helm_release has a high timeout of 3000, and it sometimes fails with the message:
Kubernetes Cluster Unreachable
Yet most of the time the release has actually been deployed on the cluster.

srinathganesh1 on 5 Jun 2020

👍1

I've had the issue with kubedb, both on AKS and on a bare-metal cluster. They have this as an open issue: https://github.com/kubedb/project/issues/504

carlowouters on 9 Jun 2020

I'm also experiencing this when deploying the Gloo helm chart via terraform. In my case, I have a tf-modules repo that I import. Gloo deploys just fine, but terraform keeps waiting for it to finish and eventually times out, even though the deployment ended minutes ago...

CelsoSantos on 17 Jul 2020

I ran into this issue and solved it by deleting resources that can't be "recreated", such as Kubernetes Jobs that were created as part of a helm deployment. After that, the Terraform Helm module worked as expected for my use case.