Terraform-provider-helm: Can't use helm provider v1 with plan & apply on different machines

Created on 20 Feb 2020  ·  7 Comments  ·  Source: hashicorp/terraform-provider-helm

Hello

There is an issue with running plan and apply on different build agents in a CI/CD pipeline. It is documented here

The home key in the provider block worked fine with helm provider versions below 1.0. For example:

provider "helm" {
  debug           = true
  version         = "~> 0.10"
  namespace       = "kube-system"
  service_account = kubernetes_service_account.tiller_sa.metadata.0.name
  home            = "${abspath(path.root)}/.helm"

  kubernetes {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
    load_config_file       = false
  }
}

However, after upgrading the provider to version 1.0, the home key is no longer valid:

terraform plan
Acquiring state lock. This may take a few moments...

Error: Unsupported argument

  on init.tf line 40, in provider "helm":
  40:   home            = "${abspath(path.root)}/.helm"

An argument named "home" is not expected here.

This key can be removed, and Terraform works fine as long as plan and apply both run on the same machine (e.g. a developer's laptop).
But when plan and apply run on different machines (for example, during a CI/CD process), this breaks because the .helm folder doesn't exist on the agent that runs apply, causing the deployment of the helm release to fail. For example:

# plan phase

2020-02-20T11:55:00.2426615Z   # helm_release.phippyandfriends["parrot"] will be created
2020-02-20T11:55:00.2427141Z   + resource "helm_release" "phippyandfriends" {
2020-02-20T11:55:00.2427646Z       + atomic                = false
2020-02-20T11:55:00.2428208Z       + chart                 = "parrot"
2020-02-20T11:55:00.2428911Z       + cleanup_on_fail       = false
2020-02-20T11:55:00.2429613Z       + dependency_update     = false
2020-02-20T11:55:00.2430106Z       + disable_crd_hooks     = false
2020-02-20T11:55:00.2430586Z       + disable_webhooks      = false
2020-02-20T11:55:00.2431122Z       + force_update          = false
2020-02-20T11:55:00.2432099Z       + id                    = (known after apply)
2020-02-20T11:55:00.2432645Z       + max_history           = 0
2020-02-20T11:55:00.2436642Z       + metadata              = (known after apply)
2020-02-20T11:55:00.2437794Z       + name                  = "parrot"
2020-02-20T11:55:00.2440461Z       + namespace             = "phippyandfriends"
2020-02-20T11:55:00.2449192Z       + recreate_pods         = false
2020-02-20T11:55:00.2450215Z       + render_subchart_notes = true
2020-02-20T11:55:00.2450894Z       + replace               = false
2020-02-20T11:55:00.2451419Z       + repository            = "***"
2020-02-20T11:55:00.2451948Z       + reset_values          = false
2020-02-20T11:55:00.2452482Z       + reuse_values          = false
2020-02-20T11:55:00.2453060Z       + skip_crds             = false
2020-02-20T11:55:00.2453776Z       + status                = "deployed"
2020-02-20T11:55:00.2454491Z       + timeout               = 300
2020-02-20T11:55:00.2454986Z       + verify                = false
2020-02-20T11:55:00.2455494Z       + version               = "v0.5.0"
2020-02-20T11:55:00.2456021Z       + wait                  = true
2020-02-20T11:55:00.2456689Z 
2020-02-20T11:55:00.2468851Z       + set {
2020-02-20T11:55:00.2469468Z           + name  = "image.repository"
2020-02-20T11:55:00.2470036Z           + value = "***.azurecr.io/parrot"
2020-02-20T11:55:00.2470298Z         }
2020-02-20T11:55:00.2470677Z       + set {
2020-02-20T11:55:00.2471118Z           + name  = "ingress.alias"
2020-02-20T11:55:00.2472657Z           + value = "phippyandfriends.dvps.***.guru"
2020-02-20T11:55:00.2473543Z         }
2020-02-20T11:55:00.2474072Z       + set {
2020-02-20T11:55:00.2474823Z           + name  = "ingress.basedomain"
2020-02-20T11:55:00.2475892Z           + value = (known after apply)
2020-02-20T11:55:00.2476369Z         }
2020-02-20T11:55:00.2476593Z     }
...
2020-02-20T11:55:00.2559391Z Plan: 13 to add, 0 to change, 0 to destroy.
2020-02-20T11:55:00.2559598Z 
2020-02-20T11:55:00.2559986Z ------------------------------------------------------------------------
2020-02-20T11:55:00.2560232Z 
2020-02-20T11:55:00.2560652Z This plan was saved to: 503-dvps.plan
2020-02-20T11:55:00.2560853Z 
2020-02-20T11:55:00.2561070Z To perform exactly these actions, run the following command to apply:
2020-02-20T11:55:00.2561531Z     terraform apply "503-dvps.plan"
2020-02-20T11:55:00.2561765Z 
2020-02-20T11:55:00.3165350Z ##[section]Finishing: Terraform Dry Run (Plan)

Failure in the apply phase:

# apply phase

2020-02-20T12:21:11.9461719Z Error: repo *** not found
2020-02-20T12:21:11.9461981Z 
2020-02-20T12:21:11.9462420Z   on main.tf line 151, in resource "helm_release" "phippyandfriends":
2020-02-20T12:21:11.9462846Z  151: resource "helm_release" "phippyandfriends" {
2020-02-20T12:21:11.9463198Z 
2020-02-20T12:21:11.9463508Z 
2020-02-20T12:21:11.9473599Z 
2020-02-20T12:21:11.9474198Z Error: repo *** not found
2020-02-20T12:21:11.9474291Z 
2020-02-20T12:21:11.9476491Z   on main.tf line 151, in resource "helm_release" "phippyandfriends":
2020-02-20T12:21:11.9477223Z  151: resource "helm_release" "phippyandfriends" {
2020-02-20T12:21:11.9477766Z 
2020-02-20T12:21:11.9478336Z 
2020-02-20T12:21:11.9488647Z 
2020-02-20T12:21:11.9489190Z Error: repo *** not found
2020-02-20T12:21:11.9489271Z 
2020-02-20T12:21:11.9489671Z   on main.tf line 151, in resource "helm_release" "phippyandfriends":
2020-02-20T12:21:11.9490010Z  151: resource "helm_release" "phippyandfriends" {
2020-02-20T12:21:11.9490246Z 
2020-02-20T12:21:11.9490588Z 
2020-02-20T12:21:11.9500783Z 
2020-02-20T12:21:11.9505476Z Error: repo *** not found
2020-02-20T12:21:11.9505598Z 
2020-02-20T12:21:11.9505922Z   on main.tf line 151, in resource "helm_release" "phippyandfriends":
2020-02-20T12:21:11.9506204Z  151: resource "helm_release" "phippyandfriends" {
2020-02-20T12:21:11.9506420Z 
2020-02-20T12:21:11.9506611Z 
2020-02-20T12:21:12.0553033Z ##[error]Bash exited with code '1'.
2020-02-20T12:21:12.0564378Z ##[section]Finishing: Deploy (Terraform Apply)

Please note that this is not a problem with the configuration:

  • it works on a dev machine, where both plan and apply run on the same machine
  • it works with helm provider < 1.0, meaning there is no problem in the actual Terraform code
  • reverting to the old version fixes the problem without any other changes

How can I accomplish the same scenario (plan & apply on separate machines) using the new provider version (and finally remove tiller 😄)?

Omer

Most helpful comment

I was able to reproduce this problem locally with the following steps using this example:

provider "helm" {}

data "helm_repository" "stable" {
   name = "stable"
   url = "https://kubernetes-charts.storage.googleapis.com"
}

resource "helm_release" "example" {
   name = "example"
   repository = "stable"
   chart = "postgresql"
}

  1. terraform apply
  2. Make an edit to the release
  3. terraform plan -out tf.plan
  4. helm repo remove stable
  5. terraform apply tf.plan 💣 🔥

I was able to work around this by removing the helm_repository data source and configuring the release to explicitly use the URL of the repository, and the above steps succeeded:

provider "helm" {}

resource "helm_release" "example" {
   name = "example"
   repository = "https://kubernetes-charts.storage.googleapis.com"
   chart = "postgresql"
}
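
For setups like the one in the original report, where several releases are created from the same repository with for_each, the same workaround might look roughly like the sketch below. The variable chart_repo_url and the service names in local.services are hypothetical, since the real repository is masked as *** in the logs; for_each on resources also assumes Terraform 0.12.6 or later.

```hcl
# Sketch only: var.chart_repo_url and local.services are hypothetical names,
# not taken from the original configuration.
variable "chart_repo_url" {
  type        = string
  description = "Full URL of the Helm chart repository (not a repo name)"
}

locals {
  # Hypothetical list of charts to release.
  services = toset(["parrot", "captainkube"])
}

resource "helm_release" "phippyandfriends" {
  for_each = local.services

  name      = each.key
  chart     = each.key
  namespace = "phippyandfriends"

  # Passing the URL directly means apply does not depend on an entry in
  # the local repositories.yaml written during plan on another machine.
  repository = var.chart_repo_url
}
```

Because the URL travels with the configuration and the saved plan, nothing has to be recreated on the agent that runs apply.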

The problem seems to be that the helm_repository data source writes state to the file system: it creates an entry in helm's repositories.yaml file. When it comes to doing an install, helm expects an entry for the repository name to be in this file and throws the "failed to download" error reported in this thread if no entry exists for that name.

So when we output a plan and try to run it on a fresh machine, the helm_repository data source never gets refreshed and therefore the repo entry doesn't get created, causing the apply to fail. This calls into question the legitimacy of helm_repository as a data source, because it is in fact creating a piece of state that lives outside of terraform and is depended upon by another resource, not just querying for information.

There are a few paths forward here:

  1. Deprecate helm_repository entirely, and do the repository configuration at the helm_release level. I think the intent behind helm_repository is that you only have to configure the repo and its auth credentials once and reuse them, so this would create a bunch of repetition.

  2. Make helm_repository a resource. In the case where terraform is run fresh in CI, this would mean the resource would be created on every run, which I'm not sure makes a lot of sense. This data source was previously a resource, and I don't have full context for why it was changed. This is also confusing because the resource would not actually manage a _repository_ per se, but a RepoEntry in the repositories.yaml file on the machine where terraform is being executed.

  3. Find a way of storing this repository entry inside the terraform state and feeding it into helm at apply time. I haven't yet looked into how feasible this is. The provider defers locating the chart to helm, which then uses the ChartDownloader configured with a path to repositories.yaml.
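
To illustrate option 1 and the repetition concern, here is a rough sketch of keeping the repository settings in one place with locals and passing them to each helm_release. It assumes the provider version in use supports the repository_username and repository_password arguments on helm_release (worth checking against the provider docs for your version); the URL and variable names are made up for illustration.

```hcl
# Sketch only: the URL and variable names below are hypothetical.
variable "repo_username" {
  type = string
}

variable "repo_password" {
  type = string
}

locals {
  # Single place to define the repository, reused by every release.
  private_repo = {
    url      = "https://charts.example.com"
    username = var.repo_username
    password = var.repo_password
  }
}

resource "helm_release" "example" {
  name  = "example"
  chart = "postgresql"

  # Assumption: these arguments exist in the provider version being used.
  repository          = local.private_repo.url
  repository_username = local.private_repo.username
  repository_password = local.private_repo.password
}
```

Each additional release still repeats the three repository arguments, but the values themselves live in one local, which softens the repetition somewhat.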

Thoughts on the above would be much appreciated.

All 7 comments

I'm experiencing the same issue when running plan and then apply in what are basically separate Docker containers in my CI/CD pipeline:

Error: failed to download "traefik/traefik" (hint: running `helm repo update` may help)

(Even though it works locally.)

@jrhouston thank you for your suggestion, this has worked very well for me.
Since I'm using AKS, my provider config is a bit different; see below in case anyone needs it in the future. Specifically, load_config_file = false was a must for me:

provider "helm" {
  debug   = true
  version = "~> 1.0.0"

  kubernetes {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
    load_config_file       = false
  }
}

Closing this issue since it refers to a version based on Helm 2. If this is still valid on the master branch, please reopen it. Thanks.

No, this issue still exists, although it can be worked around by not using helm_repository and instead specifying the repo URL directly in the helm_release.

Still seeing the issue when specifying the incubator repo URL:

resource "helm_release" "alb_ingress" {
  name       = "aws-alb-ingress-controller"
  repository = "https://kubernetes-charts-incubator.storage.googleapis.com"
  chart      = "aws-alb-ingress-controller"
  namespace  = "kube-system"
  version    = "1.0.0"
}

Error: failed to download "https://kubernetes-charts-incubator.storage.googleapis.com/aws-alb-ingress-controller-1.0.0.tgz" (hint: running `helm repo update` may help)

Provider version 1.0
Terraform v0.12.24

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉, please reach out to my human friends 👉 [email protected]. Thanks!
