terraform-provider-kubernetes 🚀 - Provider 1.11.0 fails to initialize

The error message I was seeing was

Error: Failed to initialize config: invalid configuration: no configuration has been provided

  on kubernetes.tf line 1, in provider "kubernetes":
   1: provider "kubernetes" {

ankushagarwal on 10 Feb 2020

👍21

Can you please post provider configuration blocks here?
Redacted, of course.

alexsomesan on 10 Feb 2020

👍1

tf aws provider version : 2.48.0
tf version : v0.12.16

provider "kubernetes" {
  version                = "~> 1.10"
  host                   = aws_eks_cluster.mycluster.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.mycluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.mycluster.token
  load_config_file       = false
}

ankushagarwal on 10 Feb 2020

Got the same error as @bowczarek with Terraform v0.11.14

provider "kubernetes" {
    version = "~> 1.10.0"
    host                   = "${azurerm_kubernetes_cluster.k8scluster.kube_config.0.host}"
    client_certificate     = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.client_certificate)}"
    client_key             = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.client_key)}"
    cluster_ca_certificate = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.cluster_ca_certificate)}"
}

immae1 on 10 Feb 2020

Yeah, nothing special:

data "google_client_config" "current" {}

provider "kubernetes" {
  host                   = google_container_cluster.service_cluster.endpoint
  token                  = data.google_client_config.current.access_token
  client_certificate     = base64decode(google_container_cluster.service_cluster.master_auth.0.client_certificate)
  client_key             = base64decode(google_container_cluster.service_cluster.master_auth.0.client_key)
  cluster_ca_certificate = base64decode(google_container_cluster.service_cluster.master_auth.0.cluster_ca_certificate)
}

bowczarek on 10 Feb 2020

@bowczarek Does setting load_config_file = false fix your issue?
Explained here https://www.terraform.io/docs/providers/kubernetes/index.html#statically-defined-credentials

alexsomesan on 10 Feb 2020

Also, what's the reason for using both a token and client_certificate at the same time?

alexsomesan on 10 Feb 2020

I am using load_config_file = false and I am still running into an issue (which I hope has a similar root cause to the issue reported on the top)

ankushagarwal on 10 Feb 2020

@alexsomesan seems like adding load_config_file = false fixed my problem but some guys may still encounter it.

Still need to test applying some fake changes to see if it works there as well.

bowczarek on 10 Feb 2020

👍4

I have the same issue. Mine will initialize, but the plan fails because a file cannot be found in the config dir. I have many clusters and use kubectx. The error is "file not found"; I just forced a revert back to 1.9.0 and everything is working normally. All I am doing in the state is spitting out the kube config at the end.

Here is my provider:

provider "kubernetes" {
  version = "<= 1.9.0"
  host                   = module.eks-cluster.aws-eks-cluster-eks-cluster-endpoint
  cluster_ca_certificate = base64decode(module.eks-cluster.aws_eks_cluster-eks-cluster-certificate-authority-data)
  token                  = data.aws_eks_cluster_auth.cluster_auth.token
}

dubb-b on 10 Feb 2020

@dubb-b see my previous remark

alexsomesan on 10 Feb 2020

I'm having the same issue. In fact this broke my live demo 😄
I'm running terraform v0.12.19 kubectl v1.16.2
And this is my terraform hcl snippet

provider "kubernetes" {
  host = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token = data.aws_eks_cluster_auth.cluster.token
  load_config_file = "false"
  version = "~> 1.10"
}

ruizink on 10 Feb 2020

👍3

I tried with load_config_file = "false"

but now I get this error:

module.components.provider.kubernetes: Failed to initialize config: invalid configuration: no configuration has been provided

Again, my provider (I use TF version v0.11.14):

provider "kubernetes" {
    load_config_file = false
    host                   = "${azurerm_kubernetes_cluster.k8scluster.kube_config.0.host}"
    client_certificate     = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.client_certificate)}"
    client_key             = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.client_key)}"
    cluster_ca_certificate = "${base64decode(azurerm_kubernetes_cluster.k8scluster.kube_config.0.cluster_ca_certificate)}"

}

immae1 on 10 Feb 2020

👍11

I'm looking into the EKS case.

@ruizink Sorry about your live demo. Pinning provider versions to known good configurations is a best practice we encourage users to adopt. It would help avoid situations like this from happening in the future.
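
As an illustration of the pinning advice above, here is a minimal sketch in Terraform 0.12 syntax. It assumes 1.10.0 is the last release known to work for your setup, which is what several people in this thread report; adjust the constraint to whatever you have actually verified.

```hcl
# Sketch only: pin the kubernetes provider to one exact, known-good release.
# "= 1.10.0" is an assumption taken from the reports in this thread.
terraform {
  required_providers {
    kubernetes = "= 1.10.0"
  }
}
```

Setting version = "1.10.0" inside the provider "kubernetes" block, as others in this thread do, has the same effect. Either way, terraform init has to be run again before the pinned version is actually installed.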

alexsomesan on 10 Feb 2020

👍3

I just confirmed the following configuration to be valid and working, using minikube.

provider "kubernetes" {
  load_config_file = false

  host = "https://192.168.64.35:8443"
  client_certificate = file("/Users/alex/.minikube/client.crt")
  client_key = file("/Users/alex/.minikube/client.key")
  cluster_ca_certificate = file("/Users/alex/.minikube/ca.crt")
}

resource "kubernetes_namespace" "name" {
  metadata {
    name = "test"
  }
}

Please double-check that your interpolation expressions are resolving to valid values.
I'll double-check the EKS case next.
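
One possible way to check what the expressions resolve to is to expose them temporarily as outputs. This is only a sketch: the output names are hypothetical, and the resource address is borrowed from the EKS configuration posted earlier in this thread, so it has to be adapted to your own resources.

```hcl
# Hypothetical debugging outputs; remove them once the values are confirmed.
# aws_eks_cluster.mycluster is the resource name from the earlier EKS example.
output "k8s_host" {
  value = aws_eks_cluster.mycluster.endpoint
}

output "k8s_ca_present" {
  # true when the certificate authority data is a non-empty string
  value = length(aws_eks_cluster.mycluster.certificate_authority.0.data) > 0
}
```

If the cluster has not been created yet, these values are unknown at plan time, which is worth keeping in mind when reading the rest of this thread.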

alexsomesan on 10 Feb 2020

👍2

@immae1 I tested above configuration with TF 0.11.14 against minikube.
Works as expected:

~/test-creds » terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.


------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + kubernetes_namespace.name
      id:                          <computed>
      metadata.#:                  "1"
      metadata.0.generation:       <computed>
      metadata.0.name:             "test"
      metadata.0.resource_version: <computed>
      metadata.0.self_link:        <computed>
      metadata.0.uid:              <computed>


Plan: 1 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

------------------------------------------------------------
~/test-creds » terraform apply -auto-approve
kubernetes_namespace.name: Creating...
  metadata.#:                  "" => "1"
  metadata.0.generation:       "" => "<computed>"
  metadata.0.name:             "" => "test"
  metadata.0.resource_version: "" => "<computed>"
  metadata.0.self_link:        "" => "<computed>"
  metadata.0.uid:              "" => "<computed>"
kubernetes_namespace.name: Creation complete after 0s (ID: test)

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
------------------------------------------------------------
~/test-creds » terraform destroy -auto-approve
kubernetes_namespace.name: Refreshing state... (ID: test)
kubernetes_namespace.name: Destroying... (ID: test)
kubernetes_namespace.name: Destruction complete after 6s

Destroy complete! Resources: 1 destroyed.
------------------------------------------------------------

alexsomesan on 10 Feb 2020

It seems that PR #690 broke config init (initializeConfiguration)

Victorion on 10 Feb 2020

Example of broken config: (terraform-aws-eks)
https://github.com/terraform-aws-modules/terraform-aws-eks#usage-example

Victorion on 10 Feb 2020

👍8

logs:

2020-02-10T18:38:24.413Z [DEBUG] plugin.terraform-provider-kubernetes_v1.11.0_x4: 2020/02/10 18:38:24 [DEBUG] Trying to load configuration from file
2020-02-10T18:38:24.413Z [DEBUG] plugin.terraform-provider-kubernetes_v1.11.0_x4: 2020/02/10 18:38:24 [DEBUG] Configuration file is: /Users/username/.kube/config
2020/02/10 18:38:24 [ERROR] <root>: eval: *terraform.EvalConfigProvider, err: Failed to initialize config: invalid configuration: no configuration has been provided
2020/02/10 18:38:24 [ERROR] <root>: eval: *terraform.EvalSequence, err: Failed to initialize config: invalid configuration: no configuration has been provided
2020/02/10 18:38:24 [ERROR] <root>: eval: *terraform.EvalOpFilter, err: Failed to initialize config: invalid configuration: no configuration has been provided
2020/02/10 18:38:24 [ERROR] <root>: eval: *terraform.EvalSequence, err: Failed to initialize config: invalid configuration: no configuration has been provided

Victorion on 10 Feb 2020

Started having this issue today also.

Error: Failed to initialize config: invalid configuration: no configuration has been provided

  on modules/beta-private-cluster/auth.tf line 29, in provider "kubernetes":
  29: provider "kubernetes" {

Configuration looks like this:

provider "kubernetes" {
  load_config_file       = false
  host                   = "https://${local.cluster_endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(local.cluster_ca_certificate)
}

Markieta on 10 Feb 2020

Having the same issue with GKE:

data "google_client_config" "primary" {}

provider "kubernetes" {
  load_config_file       = false
  host                   = google_container_cluster.primary.endpoint
  token                  = data.google_client_config.primary.access_token
  client_certificate     = base64decode(google_container_cluster.primary.master_auth.0.client_certificate)
  client_key             = base64decode(google_container_cluster.primary.master_auth.0.client_key)
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}

v-morlock on 10 Feb 2020

Guys, please don't spam. Use version pinning:

version = "1.10.0" # Stable version

provider "kubernetes" {
  load_config_file       = false
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  version                = "1.10.0"

  alias                  = "override"
}

Victorion on 10 Feb 2020

👍12 🎉3

Pinning to v1.10 works for me, but it's not a solution. A minor release should not introduce breaking changes (not to imply it was intended).

adamrbennett on 10 Feb 2020

👍36 👎1

I'm having the same issue. I locked to 1.10.0 and it works now. This is my config:

data "aws_eks_cluster" "cluster" {
  name = module.eks-cluster.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks-cluster.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "= 1.10.0"
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
    load_config_file       = false
  }
  version = "~> 1.0.0"
}

xposix on 11 Feb 2020

👍3 👎1

I just wanted to drop a note that this has broken other documented provider configurations as well. Here's the example provider configuration from an Azure Kubernetes cluster:

https://www.terraform.io/docs/providers/azurerm/r/kubernetes_cluster.html#attributes-reference

Seems like there's an issue initializing if the Kubernetes provider is configured using dynamic values based on output from other terraform resources. In the example @alexsomesan provided, he is loading files that exist at init time from the filesystem, rather than other terraform values.

I'll take a look at that linked PR and see if I can help.

jmcshane on 11 Feb 2020

👍6

Works fine with AKS and the following config:

provider "kubernetes" {
  version          = "~>1.10"
  host             = azurerm_kubernetes_cluster.k8s.kube_config[0].host
  load_config_file = false
  client_certificate = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].client_certificate,
  )
  client_key = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].client_key
  )
  cluster_ca_certificate = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].cluster_ca_certificate,
  )
}

We only had to add "load_config_file = false" to get things working again.

Edit: the version constraint "~>1.10" matches any 1.x version greater than or equal to 1.10, so it IS working for us with the above config using v1.11 of the provider.
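
Since several configurations in this thread use "~> 1.10" while intending to stay on 1.10, the difference between the constraint forms is sketched below. The provider block is only a placeholder; the connection settings are whatever you already have.

```hcl
# "~> 1.10"   allows any 1.x release at or above 1.10, so 1.11.0 is selected.
# "~> 1.10.0" allows only 1.10.x patch releases, so 1.11.0 is excluded.
# "= 1.10.0"  (or plain "1.10.0") allows exactly that release.
provider "kubernetes" {
  version          = "~> 1.10.0"
  load_config_file = false
  # host and credentials as in your existing configuration
}
```

With "~> 1.10" a fresh terraform init picks up 1.11.0, which explains why some people who believed they were pinned to 1.10 were affected anyway.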

chreichert on 11 Feb 2020

yes, it appears the issue was introduced in the 1.11 version of the provider

jmcshane on 11 Feb 2020

Same issue with v 1.11.0 on EKS with following configuration:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

gertjangaillet on 11 Feb 2020

We have had the same issue with GCP since yesterday.

Downgrading to 1.10.0 and setting load_config_file = false did not work for us. Here is our updated config, which fails.

provider "kubernetes"{
  host     = ""
  username = ""
  password = ""
  client_certificate     = ""
  client_key             = ""
  cluster_ca_certificate = ""
  version                =  "1.10.0"
  load_config_file       =   false
}

provider "helm"{
   kubernetes {
    host     = ""
    username = ""
    password = ""
    client_certificate     = ""
    client_key             = ""
    cluster_ca_certificate = ""
    load_config_file       =   false
  }
  version = "~> 1.0.0"
}

Please let us know if you find a solution that works.
Thanks

msbond on 11 Feb 2020

Same issue for me. Pinning to 1.10.0 is working as a workaround. Using Terraform 0.12.

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "1.10.0"
}

jmorgan415 on 11 Feb 2020

I was able to reproduce on EKS. But only intermittently and only on destroy.
Still, getting closer to the cause.

alexsomesan on 11 Feb 2020

👍3

@alexsomesan I took a look at #690. It looks like the main change in behavior is what gets set on the cfg object. I'm stepping through the logic within a dynamic provider configuration. In the old version:

Line 199: try the config file (nil result, should be set to false anyway when statically defined)
Line 207: check if running in a cluster (again will be nil)
Line 210/211: fallback to standard cfg
Line 221-260: set values from provider config on cfg object

In the new version, there is no way to get this fallback to the standard cfg object provided by cfg = &restclient.Config{}, so when you skip loading the config file, it runs cc := clientcmd.NewNonInteractiveDeferredLoadingClientConfig(loader, overrides) based on the overrides from the config file, and it doesn't load these values correctly based on the dynamic input.

I haven't quite figured out the full scope of what the PR was attempting to address, but you may want to rework the PR to resolve just #679 before dropping in a full refactor of the provider code. I realize this was already merged, which creates a bit of a challenge, but there may be a simpler way to address that one bug without dropping support for defaulting the Kube client object and setting values directly on cfg from the provider.

jmcshane on 11 Feb 2020

👍1

Same issue with v 1.11.0 on EKS with following configuration:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

I only now noticed that the above configuration still works for existing environments; it only fails during terraform plan when creating a brand new environment from scratch, where the EKS cluster does not yet exist (it is initialized by another module). I assume that @jmcshane was on the right track saying

Seems like there's an issue initializing if the Kubernetes provider is configured using dynamic values based on output from other terraform resources. In the example @alexsomesan provided, he is loading files that exist at init time from the filesystem, rather than other terraform values.

gertjangaillet on 11 Feb 2020

👍1

Similar issue, but with a different usage pattern:

provider "kubernetes" {
  alias          = "minikube_cluster"
  config_path    = "~/.kube/minikube_config"
  config_context = "default-context"
}

Ends up with the following error, when we try to remove/destroy K8S resources that were previously created using the above provider

Error: Failed to initialize config: stat /root/.kube/config: no such file or directory

      on <empty> line 0:
      (source code not available)

sekarnaren on 12 Feb 2020

@pdecat I'm wondering what your thoughts are on the override config vs the fallback default restclient that sets values on the returned cfg object from the provider block. Looking through the PR, it seems like we're running into a bit of a mess with the downstream client more than anything else.

jmcshane on 12 Feb 2020

Could someone please confirm whether downgrading the kubernetes provider to 1.10.0 and setting load_config_file = false solved this for a Google container cluster?

It is failing at the apply stage.

It is not working for me.

msbond on 12 Feb 2020

Ran into this as well, not in the least part because I thought I had pinned to 1.10.*... but hadn't. Illustrates why pinning indeed _is_ good practice, as stated above.

Regardless of the actual fix, what could be improved in CI to prevent bugs like this? Given the variety of reports (AWS, GKE, Azure) this is not 'some edge-case'.

TBeijen on 12 Feb 2020

👍2

Could someone please confirm whether downgrading the kubernetes provider to 1.10.0 and setting load_config_file = false solved this for a Google container cluster?

It is failing at the apply stage.

It is not working for me.

Make sure you run terraform init again after pinning the version and validate the provider version with terraform version.

manfredlift on 12 Feb 2020

The client setup introduced in #690 is more strict and does more validation upfront. It is an improvement over the previous setup and it's very likely not the root cause of the problem.

I was able to reproduce this issue by using EKS's aws_eks_cluster data source to fetch credentials from an already present EKS cluster. Everything works as expected on apply and destroy, but a second destroy (with the now empty state) triggers a failure and the error message about failing to configure the provider, as reported above.

This is also likely why some users were previously reporting that the provider tries to connect to localhost. That was the result of falling back to the default K8s client configuration instead of bailing out with an error when a valid configuration was not available.

My primary working theory at the moment is that we have uncovered, or better said, stopped avoiding a bug in Terraform itself.

I'm going down that rabbit hole to confirm this is indeed the case and pinpoint the exact cause.

Please try to validate if a second destroy action does trigger the same behaviour in your cases.

alexsomesan on 12 Feb 2020

The use case I (and I assume more folks using terraform-aws-eks) run into is having cluster-creation being conditional.

So a cluster might or might not be there. Since Terraform does not allow conditional providers (afaik, please prove me wrong), it's actually very useful to have the kubernetes provider be a stub, meaning configured with blank strings, as it won't be invoked anyway.

Not having that ability prevents upgrading beyond 1.10. So if insisting on validation, at least make it opt out, something like validate_config which would default to true (1.11 behaviour), but can be set to false (<=1.10 behaviour).

(I think in general, the inability to have conditional providers, combined with enforcing 100% correct config is... tricky.)

TBeijen on 12 Feb 2020

👍2

alexsomesan on 12 Feb 2020

@TBeijen I'm tempted to disagree with you on the provider accepting invalid configurations. We have no way of telling whether it will actually need to do anything later on, and the downside of allowing that is the plethora of errors previously reported as connection attempts to localhost which are even more confusing for users.

alexsomesan on 12 Feb 2020

Hello,
I am having the same issue using provider v1.11.0 and Terraform 0.12.9.
The Kubernetes cluster is in GCP (GKE) and I am using an access token for authentication.
Here is my provider configuration:

data "google_client_config" "default" {
}

provider "kubernetes" {
  version = "v1.11.0"

  load_config_file = false

  host                   = var.gke_endpoint
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(var.gke_cluster_ca_certificate)

}

bbenlazreg on 12 Feb 2020

@TBeijen I'm tempted to disagree with you on the provider accepting invalid configurations. We have no way of telling whether it will actually need to do anything later on, and the downside of allowing that is the plethora of errors previously reported as connection attempts to localhost which are even more confusing for users.

@alexsomesan Hence the 'opt out' suggestion.

Since conditional resources are pretty common (not just terraform-aws-eks), the 'stub' provider looks like a pretty common use case. Current situation to me feels like replacing one problem, with a bigger problem: The inability to use the provider on conditional resources.

(The former being more of a nuisance really, I would assume it to fail anyway. And in the end, the root cause iiuc is a misconfiguration by the user).

Edit: Another approach to opt out might be keeping the 1.10 behaviour but improving the error message, indicating something might be off as it tries to connect to localhost. But in the end I think if you don't correctly configure the provider, you shouldn't use it and the other way round.

Edit2: Sample configuration (as displayed here) which illustrates the use case of conditional resources:

# In case of not creating the cluster, this will be an incompletely configured, unused provider, which poses no problem.
provider "kubernetes" {
  host                   = element(concat(data.aws_eks_cluster.cluster[*].endpoint, list("")), 0)
  cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.cluster[*].certificate_authority.0.data, list("")), 0))
  token                  = element(concat(data.aws_eks_cluster_auth.cluster[*].token, list("")), 0)
  load_config_file       = false
  version                = "~> 1.10"
}
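
For readers unfamiliar with the pattern, the splat expressions above presuppose data sources declared with count, so that they yield an empty list when the cluster is not created. A minimal sketch of that surrounding setup follows; the variables create_eks and cluster_name are hypothetical names chosen for illustration and do not come from this thread.

```hcl
# Hypothetical toggle: when false, both data sources below are skipped
# and the [*] splats in the provider block evaluate to empty lists.
variable "create_eks" {
  type    = bool
  default = true
}

# Hypothetical variable holding the cluster name.
variable "cluster_name" {
  type = string
}

data "aws_eks_cluster" "cluster" {
  count = var.create_eks ? 1 : 0
  name  = var.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  count = var.create_eks ? 1 : 0
  name  = var.cluster_name
}
```

With create_eks set to false, the provider ends up with empty strings for host, certificate and token, which is the stub configuration that 1.11.0 rejects.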

Edit3: While I assumed the problem to be related to the 'conditional' part, it's apparently also triggered by resources not yet being ready, as indicated by comment below and this terraform-aws-eks issue.

TBeijen on 12 Feb 2020

👍2

I am experiencing this issue when configuring the kubernetes provider with the output of an azure kubernetes cluster resource. The error message is as follows:

Error: Failed to initialize config: invalid configuration: no configuration has been provided

I have provided a minimal reproduction as a gist.

@alexsomesan You mentioned that you were able to only reproduce on destroy of an existing cluster. It appears that the issue stems from using the credentials of a cluster created in the same terraform configuration or from a data source.

dbalcomb on 12 Feb 2020

@alexsomesan I think there's still a very basic issue moving towards the strict configuration within the provider. It's not failing in the destroy phase, but rather the plan phase, because the cluster has not yet been provisioned:

resource "google_container_cluster" "primary" {
  name               = "container-cluster-tf-test"
  location           = "us-central1-a"
  initial_node_count = 3

  master_auth {
    username = ""
    password = ""

    client_certificate_config {
      issue_client_certificate = false
    }
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    metadata = {
      disable-legacy-endpoints = "true"
    }
  }

  timeouts {
    create = "30m"
    update = "40m"
  }
}

provider "kubernetes" {
  host                   = google_container_cluster.primary.endpoint
  client_certificate     = base64decode(google_container_cluster.primary.master_auth.0.client_certificate)
  client_key             = base64decode(google_container_cluster.primary.master_auth.0.client_key)
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}

resource "kubernetes_service" "example" {
  metadata {
    name = "terraform-example"
  }
  spec {
    selector = {
      app = "my-app"
    }
    port {
      port        = 8080
      target_port = 80
    }

    type = "ClusterIP"
  }
}

This fails in terraform plan because the values are empty when trying to provision the provider.

jmcshane on 12 Feb 2020

👍2

@dbalcomb Thanks for the gist, I'll give that a go next.

The use-case you described, where provider configuration is sourced from resources / data-sources created in the same apply run has been known to not be supported by Terraform for a while now. I've been trying for quite a while to raise awareness around it.

The underlying issue in Terraform is very well explained here: https://github.com/hashicorp/terraform/issues/4149 (albeit it requires a bit of knowledge of Terraform internals).
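
One common way to stay within that limitation is to keep the cluster and the Kubernetes resources in two separate configurations, each with its own state, and apply the second one only after the cluster exists. The following is a sketch of what the second configuration could look like for EKS, not something prescribed in this thread; the variable cluster_name is a hypothetical input.

```hcl
# Second configuration (separate state), applied after the cluster exists.
# cluster_name is a hypothetical variable naming the already created cluster.
variable "cluster_name" {
  type = string
}

data "aws_eks_cluster" "cluster" {
  name = var.cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = var.cluster_name
}

provider "kubernetes" {
  load_config_file       = false
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}
```

Because the data sources read a cluster that already exists, their values are known when the provider is configured, so the provider never sees empty settings during plan.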

alexsomesan on 12 Feb 2020

❤1

@alexsomesan This above was my exact use case. Thanks for linking this stuff ^^ I had no idea...

dubb-b on 12 Feb 2020

@dbalcomb Thanks for the gist, I'll give that a go next.

The use-case you described, where provider configuration is sourced from resources / data-sources created in the same apply run has been known to not be supported by Terraform for a while now. I've been trying for quite a while to raise awareness around it.

The underlying issue in Terraform is very well explained here: hashicorp/terraform#4149 (albeit it requires a bit of knowledge of Terraform internals).

Well, part of the problem is that most people aren't going to assume it's an issue when the documentation makes no mention of it being an issue:

https://www.terraform.io/docs/providers/aws/d/eks_cluster_auth.html

It's also a bit unreasonable to expect that someone isn't going to want to do this I think. One of the main benefits of Terraform is the ability to provision resources across multiple platforms all in one go. It kind of makes sense that someone is going to want to both provision an EKS cluster and the k8s deployments that go with it all in one go. Otherwise why not just use CloudFormation for AWS resources and Helm/Ansible for the k8s deployments if the workflow has to be split? Daisy chaining providers is one of the main reasons we chose Terraform for so much automation. When you couple AWS, K8S, and Rancher providers together in one project you get a run once, deploy everything project with tons of benefits.

wrsuarez on 12 Feb 2020

👍11

@alexsomesan there's another problem with the initialization of the provider in 1.11.0 that is triggered if the ~/.kube/config does not exist (relevant for automated deployments from within the cluster that use a ServiceAccount)

Error: Failed to initialize config: stat /home/atlantis/.kube/config: no such file or directory

  on main.tf line 1, in provider "kubernetes":
   1: provider "kubernetes" {

A quick look at the provider.go shows the problem, as it simply returns the error if it can't find the file https://github.com/terraform-providers/terraform-provider-kubernetes/blob/3572842358cc92d21074012f7111437d12774df5/kubernetes/provider.go#L236

The workaround is to either create the ~/.kube/config file empty or add load_config_file = false to the provider, but neither is an ideal solution here.

Looking at the AWS Provider for example, we can see that it initializes without any variables just fine for local users, as well as for automated Systems on AWS that make use of IAM permissions.

Ideally this part should be changed, so that the provider will work out of the box with a ServiceAccount, even if the ~/.kube/config does not exist

Bobonium on 13 Feb 2020

Works fine with AKS and the following config:
provider "kubernetes" {
  version          = "~>1.10"
  host             = azurerm_kubernetes_cluster.k8s.kube_config[0].host
  load_config_file = false
  client_certificate = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].client_certificate,
  )
  client_key = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].client_key
  )
  cluster_ca_certificate = base64decode(
    azurerm_kubernetes_cluster.k8s.kube_config[0].cluster_ca_certificate,
  )
}
We only had to add "load_config_file = false" to get things working again.

Edit: the version constraint "~>1.10" matches any 1.x version greater than or equal to 1.10, so it IS working for us with the above config using v1.11 of the provider.

this didn't work for me
pinning the version to 1.10.0 worked

jungopro on 13 Feb 2020

Can confirm, same issue here in EKS. It works fine for Kube clusters that are already created, but does not work for new ones (i.e. when running plan in a blank environment). Code I'm using:

provider "kubernetes" {
  host                   = aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

Pinning provider to version 1.10.0 worked.

traction-corgi on 14 Feb 2020

We are facing the same issue with TF 0.11.x and Kubernetes 1.11.x plugin

rajivreddy on 15 Feb 2020

any plan to fix this issue?

tigerwings on 15 Feb 2020

same issue here, got
Error: Failed to initialize config: invalid configuration: no configuration has been provided
Pin to 1.10.0 works.

elvis-cai on 16 Feb 2020

We get the same issue with TF v0.12.18 and Kubernetes provider v1.11.0. we can't use provider version v1.10.0, because we need the fix for "in-cluster authentication" and want to use the resource kubernetes_priority_class which was released in v1.10.0.

anmoel on 17 Feb 2020

@alexsomesan there's another problem with the initialization of the provider in 1.11.0 that is triggered if the ~/.kube/config does not exist (relevant for automated deployments from within the cluster that use a ServiceAccount)
Error: Failed to initialize config: stat /home/atlantis/.kube/config: no such file or directory

  on main.tf line 1, in provider "kubernetes":
   1: provider "kubernetes" {
A quick look at the provider.go shows the problem, as it simply returns the error if it can't find the file

https://github.com/terraform-providers/terraform-provider-kubernetes/blob/3572842358cc92d21074012f7111437d12774df5/kubernetes/provider.go#L236

The workaround is to either create the ~/.kube/config file empty or add load_config_file = false to the provider, but neither is an ideal solution here.

Looking at the AWS Provider for example, we can see that it initializes without any variables just fine for local users, as well as for automated Systems on AWS that make use of IAM permissions.

Ideally this part should be changed, so that the provider will work out of the box with a ServiceAccount, even if the ~/.kube/config does not exist

@Bobonium Where would the provider pick the details of the service account from when configured with no attributes?

alexsomesan on 17 Feb 2020

@wrsuarez I understand the temptation to pack every operation in one apply. It does bring a lot of convenience.

The AWS provider documentation that you referenced is rightfully instructing users to source credentials from the aws_eks_cluster and aws_eks_cluster_auth data sources. It does not imply that the cluster resource should be created in the same apply run.
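As a rough sketch of that pattern (not copied from the AWS docs): the configuration below assumes the cluster already exists, for example because a separate configuration manages it, and looks it up by name. The variable name cluster_name and the data source label "existing" are made up for illustration.

```hcl
# Hypothetical variable: name of an EKS cluster that already exists.
variable "cluster_name" {
  type = string
}

# Look up the existing cluster instead of referencing a resource
# that is created in the same apply run.
data "aws_eks_cluster" "existing" {
  name = var.cluster_name
}

data "aws_eks_cluster_auth" "existing" {
  name = var.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.existing.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.existing.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.existing.token
  load_config_file       = false
}
```

Because both data sources can be read during plan, the provider should receive known values at initialization time, unlike attributes of a cluster resource that does not exist yet.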

alexsomesan on 17 Feb 2020

@tigerwings the fix for this has to come from Terraform itself. Meanwhile, there is a PR (https://github.com/terraform-providers/terraform-provider-kubernetes/pull/767) that tries to alleviate the problem by "optimistic client initialization" ™️ , as was the case with the old provider configuration before 1.11.

The issue in Terraform is tracked here: https://github.com/hashicorp/terraform/issues/4149

alexsomesan on 17 Feb 2020

If anyone who is experiencing this issue has the ability to test with custom provider binaries, please test the changes in https://github.com/terraform-providers/terraform-provider-kubernetes/pull/767 and report if the issue persists. That would greatly help in validating those changes and releasing a new provider version.

To use a custom provider binary, check out the PR branch and run go build in the root directory of the provider codebase. A binary of the provider will be built in the same directory. Copy that provider binary to $HOME/.terraform.d/plugins and run terraform init again in your configuration directory.

alexsomesan on 17 Feb 2020

Can this issue be focused simply on fixing the regression in 1.11.0?

In 1.10.0, the bug was that the provider would always use the in-cluster config no matter what config was passed in.

Now in 1.11.0, it never uses a token I pass in through the aws_eks_cluster_auth data source.

I’d rather have a fix for the immediate bug, which breaks many use cases, than settle a rearchitecture discussion.

Edit: My use case (one run):

Create cluster
Null resource checks for EKS endpoint to be online
Then, I use data tag to create a token
I use the token to authenticate and spin up the provider
Finally I configure the cluster auth map to map my IAM roles to cluster roles.

Edit2: looks like @alexsomesan replied at the same time I did. I will try the custom binary.

Clete2 on 17 Feb 2020

First, I appreciate your work to quickly cut a stopgap patch (#767) for this.

The use-case you described, where provider configuration is sourced from resources / data-sources created in the same apply run has been known to not be supported by Terraform for a while now.

(https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759#issuecomment-585236941)

The documentation discussed above--demonstrating how to configure Terraform's kubernetes provider with the aws_eks_cluster_auth data source--also exists in the docs for the google_client_config data source and the azurerm_kubernetes_cluster data source:

https://www.terraform.io/docs/providers/google/d/datasource_client_config.html#example-usage-configure-kubernetes-provider-with-oauth2-access-token
https://www.terraform.io/docs/providers/azurerm/r/kubernetes_cluster.html#password

It's difficult to fathom a usage pattern given the above configurations which would not entail configuring a newly-provisioned K8s cluster in the same Terraform run.

I've been trying for quite a while to raise awareness around it.

(https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759#issuecomment-585236941)

Given three of the major cloud providers are all demonstrating this (apparently flawed) pattern, perhaps a fix narrower than https://github.com/hashicorp/terraform/issues/4149 (which appears to have become a long-lived dumping ground) is necessary.

As a Hashicorp insider, can you point us in the direction of any open issues that more narrowly focus on "chaining" a resource from provider A into provider B's configuration?

StephenWithPH on 17 Feb 2020

👍1

@StephenWithPH For what it's worth, the documentation for the Google and AWS providers demonstrates using a datasource to get the cluster credentials required by the Kubernetes provider configuration. This indeed works fine and does not imply that the cluster is created in the same apply operation. It was likely never considered to be a use-case since the Terraform documentation itself clearly advises against it here: https://www.terraform.io/docs/configuration/providers.html#provider-configuration

I will work with the maintainers of the Azure provider to amend the example for Azure which is indeed the only one that references a cluster resource directly.

As for your second question, it is explained in part in the same Terraform doc page I linked above. What is not mentioned there is that you could still "chain" a resource into a provider's configuration if the resource already exists in Terraform state when you run apply (as in it was created by a previous partial apply - see the -target parameter of terraform apply).

alexsomesan on 17 Feb 2020

👀1 👍1

One more thing, @StephenWithPH

Terraform issue https://github.com/hashicorp/terraform/issues/4149 is THE place that tracks the underlying issue at play here. That is where you should be adding feedback in order to help prioritize it. The fact that it is so old speaks to the fact that it's easily avoidable and mostly a nice to have. It thus fell victim to higher priority issues and the Terraform team's focus on more broadly-benefiting features. Please go upvote https://github.com/hashicorp/terraform/issues/4149 to help with prioritising work on it.

alexsomesan on 17 Feb 2020

👍3

@alexsomesan Thanks for your work on this. I have checked out your work-in-progress branch and it initially appears that I can create an Azure Kubernetes cluster and Kubernetes resources in the same apply as I could before when using version 1.10. I ran apply on both versions and both succeeded whereas 1.11 wouldn't even plan. I would like to double-check with a more advanced configuration though.

I understand the recommendations to not configure the provider from the output of a resource and that the documentation has done no favours in preventing users from doing so. However until there is better support in Terraform for such a scenario I do not see myself doing anything differently even if I have to forego new features and lock the provider to an older version.

The alternative choices appear to be resource targeting, which is not recommended according to the documentation, and multiple configurations which is less than ideal if only for the fact that it introduces extra overhead to manage.

dbalcomb on 18 Feb 2020

👍1

@alexsomesan , thanks for your quick work on this issue.

@tigerwings the fix for this has to come from Terraform itself. Meanwhile, there is a PR (#767) that tries to alleviate the problem by "optimistic client initialization" ™️ , as was the case with the old provider configuration before 1.11.

The issue in Terraform is tracked here: hashicorp/terraform#4149

tigerwings on 18 Feb 2020

@Bobonium Where would the provider pick the details of the service account from when configured with no attributes?

I'm not sure you understood me correctly, as this is already correctly done. Let me give you a minimalistic example of the problem at hand

~ # cat main.tf 
provider "kubernetes" {
  version                = "1.11.0"
  host                   = "https://127.0.0.1:6443"
  cluster_ca_certificate = "doesn't matter"
  token                  = "doesn't matter"
}


data "kubernetes_service" "example" {
  metadata {
    name = "kubernetes"
  }
}
~ # terraform apply

Error: Failed to initialize config: stat /root/.kube/config: no such file or directory

  on main.tf line 1, in provider "kubernetes":
   1: provider "kubernetes" {


~ # mkdir ~/.kube
~ # touch ~/.kube/config
~ # terraform apply
data.kubernetes_service.example: Refreshing state...

Error: Get https://127.0.0.1:6443/api/v1/namespaces/default/services/kubernetes: dial tcp 127.0.0.1:6443: connect: connection refused

  on main.tf line 9, in data "kubernetes_service" "example":
   9: data "kubernetes_service" "example" {

This was executed in a fresh hashicorp/terraform:0.12.20 container. Since I did not explicitly specify load_config_file = false it fails the first time I try to apply, because it can't access the ~/.kube/config file on disk. The second time it then fails correctly because of the incorrect configuration that has been supplied.
I'd expect the provider to simply not fail initializing if it tries to load the ~/.kube/config but can't access the file for whatever reason, then it would automatically work for ServiceAccounts as well.
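For comparison, here is a sketch of the same provider block with the load_config_file workaround mentioned earlier in the thread. The placeholder values are the ones from the example above; I have not re-run this exact block, so treat it as an illustration rather than a verified fix.

```hcl
provider "kubernetes" {
  version = "1.11.0"

  # Workaround from this thread: skip reading ~/.kube/config entirely,
  # so a missing file should no longer fail provider initialization.
  load_config_file = false

  # Placeholder values, as in the example above.
  host                   = "https://127.0.0.1:6443"
  cluster_ca_certificate = "doesn't matter"
  token                  = "doesn't matter"
}
```

With this setting the first apply would be expected to get as far as the connection attempt, the same point reached above after creating an empty ~/.kube/config.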

As a side note, since there were now changes to the configuration loading of the provider in 1.10.0 and 1.11.0 (and maybe once more if you consider my input), will you also adjust https://github.com/hashicorp/terraform/pull/19525 to have the same configuration lookup behavior before merging?

Bobonium on 18 Feb 2020

Hi, I can confirm that pinning the provider to version 1.10.0 worked.

flenoir on 20 Feb 2020

👍4

Regarding:

the documentation for the Google and AWS providers demonstrates using a datasource to get the cluster credentials required by the Kubernetes provider configuration ... This indeed works fine ... the Terraform documentation itself clearly advises against it

In my experience, it works for the most part. However, using variable data in provider blocks still causes problems, for example with the terraform import operation. See https://github.com/hashicorp/terraform/issues/13018

rehevkor5 on 22 Feb 2020

Where would the provider pick the details of the service account from when configured with no attributes?

Previously, I assume that the provider was lazy-loading the configuration from ~/.kube/config, only loading it when necessary for processing a resource. This allowed us to misbehave by creating the k8s cluster in the same stack.

So the order of resource creation (based on depends_on) in the stack was:

create k8s cluster
create kubeconfig file (null resource)
wait for k8s to become available (null resource)
create all kubernetes resources

If the provider is now eagerly loading the config, that approach no longer works.
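A rough sketch of that ordering, for illustration only. It assumes an EKS cluster declared elsewhere as aws_eks_cluster.cluster and the AWS CLI on the machine running Terraform; the resource labels and the namespace are made up. This is the pattern that relied on lazy loading, so it is not a recommendation.

```hcl
# Step 1 (assumed): aws_eks_cluster.cluster is defined elsewhere in the stack.

# Step 2: write a kubeconfig file once the cluster exists.
resource "null_resource" "kubeconfig" {
  depends_on = [aws_eks_cluster.cluster]

  provisioner "local-exec" {
    command = "aws eks update-kubeconfig --name ${aws_eks_cluster.cluster.name}"
  }
}

# Step 3: block until the API server answers.
# Note: this loop has no retry limit; a real setup would want one.
resource "null_resource" "wait_for_api" {
  depends_on = [null_resource.kubeconfig]

  provisioner "local-exec" {
    command = "until curl -sk --fail ${aws_eks_cluster.cluster.endpoint}/healthz; do sleep 10; done"
  }
}

# Step 4: Kubernetes resources wait for the previous steps.
resource "kubernetes_namespace" "example" {
  depends_on = [null_resource.wait_for_api]

  metadata {
    name = "example"
  }
}
```

The depends_on chain only orders resource creation. It does not delay provider initialization, which is why eager config loading breaks this approach.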

rehevkor5 on 22 Feb 2020

👍2

this is my provider block:

provider "kubernetes" {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
    alias                  = "aks"
    version                = "1.10.0"
    load_config_file       = false
}

and this didn't work without version pinning.

4c74356b41 on 22 Feb 2020

I too am getting this error. I have a module that will create an EKS cluster. It creates both the control plane and worker resources in one apply run.

My config looks like this

provider "kubernetes" {
  version                = "~>1.8"
  host                   = "${aws_eks_cluster.this.endpoint}"
  cluster_ca_certificate = "${base64decode(aws_eks_cluster.this.certificate_authority.0.data)}"
  token                  = "${data.aws_eks_cluster_auth.this.token}"
  load_config_file       = false
}

With the first plan, there is no control plane (so aws_eks_cluster does not exist), so my initial prediction was that this would not work, however it did work with 1.8, 1.9 & 1.10.

It fails with

module.example.provider.kubernetes: Failed to initialize config: invalid configuration: no configuration has been provided

I would appreciate clarity on if this is a reasonable approach or not. While super convenient, if this is no longer supported I will need to change my approach.

Version 1.11 does work if the above parameters reference an EKS data source of a currently running EKS cluster.

DefSol on 27 Feb 2020

👍1

I worked around this issue by doing this in the root module in the meantime:
(Not sure why i'm being downvoted?)

provider "kubernetes" {
    # fixed to 1.10.0 because of https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759
    version = "1.10.0"
    # use a real dependency instead of specifying a variable dependency in the submodule
    token = module.scaleway.token
}

hazcod on 27 Feb 2020

👎1

Not sure why i'm being downvoted?

because it doesn't work when the token state is unknown, e.g. on first apply

acuteaura on 28 Feb 2020

I have to admit it's gotten worse: it now expects you to have a real config in ~/.kube/config. I'm still testing, and it might not actually need a real config, but it's extremely weird at this point.

4c74356b41 on 28 Feb 2020

Provider version 1.11.1 was just released, which includes measures to circumvent this error.

Please test it and report if the solution works in your case.

alexsomesan on 28 Feb 2020

🎉4 ❤1

@alexsomesan I tried 1.11.1 and unfortunately I get the following error when using kubernetes_cluster_role_binding.

Error: serializer for text/html; charset=utf-8 doesn't exist

Reverting back to 1.10.0 and the above works.

alecor191 on 29 Feb 2020

@alecor191 Please post the complete kubernetes_cluster_role_binding resource template as well as the provider "kubernetes" block.

alexsomesan on 29 Feb 2020

@alexsomesan sure, here you go:

provider "kubernetes" {
  host                   = module.cluster.k8s_host
  client_certificate     = base64decode(module.cluster.k8s_client_certificate)
  client_key             = base64decode(module.cluster.k8s_client_key)
  cluster_ca_certificate = base64decode(module.cluster.k8s_cluster_ca_certificate)
  version                = "1.11.1"
}

resource "kubernetes_cluster_role_binding" "admins" {
  metadata {
    name = "aks-cluster-admins"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "cluster-admin"
  }

  subject {
    kind      = "Group"
    name      = var.admin_group
    api_group = "rbac.authorization.k8s.io"
  }

  subject {
    kind      = "ServiceAccount"
    name      = "default"
    namespace = "kube-system"
  }

  subject {
    kind      = "ServiceAccount"
    name      = "kubernetes-dashboard"
    namespace = "kube-system"
  }

  depends_on = [
    azurerm_kubernetes_cluster.aks
  ]
}

alecor191 on 29 Feb 2020

Thanks! If you set an output with the value of module.cluster.k8s_host, is it in a valid URL format? Does it start with https://?

output "api_host" {
  value = module.cluster.k8s_host
}
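If it helps, the check can also be expressed as a second output. This is only a sketch: the output name api_host_is_https is made up, and it assumes the regexall function is available, which it is in later Terraform 0.12 releases.

```hcl
# Hypothetical helper output: true when the host value begins with "https://".
output "api_host_is_https" {
  value = length(regexall("^https://", module.cluster.k8s_host)) > 0
}
```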

alexsomesan on 29 Feb 2020

Seems to work for me! https://github.com/ironPeakServices/infrastructure/runs/476517046?check_suite_focus=true

hazcod on 29 Feb 2020

@alexsomesan yes, that's correct: the output of

output "api_host" {
  value = module.cluster.k8s_host
}

is

api_host = https://rg5168-aks1-a0e50f13.hcp.westeurope.azmk8s.io:443

Essentially we first create an Azure AKS cluster using TF and then use the output of it (like the host name) to configure the K8S provider.

alecor191 on 29 Feb 2020

@hazcod Thanks for the confirmation.

@alecor191 I'll be trying to reproduce your case, but I'm baffled by one thing: if you wrap your cluster provisioning resources in a module, why is your depends_on clause referencing a top-level resource?
Also, which version of Terraform are you on?

alexsomesan on 1 Mar 2020

@alexsomesan sorry for the delay and not being more precise. I used TF v0.12.20.

Essentially what we have is setting up AKS cluster + assigning roles in a shared module that is being used by 3 "environments". In short, we have this setup:

/environments/cluster-test, /environments/cluster-qa, /environments/cluster-prod folders for our environments. tf apply is called from these folders. Each contains a main.tf with the following:

provider "azurerm" {
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}

provider "kubernetes" {
  ... like in my previous message
}

// reference to the "shared" module containing the actual resources
module "cluster" {
  source                  = "../../modules/aks"
  ...
}

And then we have a shared /modules/aks folder containing a main.tf with the following resources:

resource "azurerm_kubernetes_cluster" "aks" {
  ... create the AKS cluster first
}

resource "kubernetes_cluster_role_binding" "admins" {
  ... like in my previous message

  depends_on = [
    azurerm_kubernetes_cluster.aks
  ]
}

with outputs.tf as follows (some of them you can see being used in provider "kubernetes" in my previous message):

output "k8s_host" {
  value = azurerm_kubernetes_cluster.aks.kube_admin_config.0.host
}

output "k8s_client_certificate" {
  value     = azurerm_kubernetes_cluster.aks.kube_admin_config.0.client_certificate
  sensitive = true
}

output "k8s_client_key" {
  value     = azurerm_kubernetes_cluster.aks.kube_admin_config.0.client_key
  sensitive = true
}

output "k8s_cluster_ca_certificate" {
  value     = azurerm_kubernetes_cluster.aks.kube_admin_config.0.cluster_ca_certificate
  sensitive = true
}

output "k8s_kube_admin_config_raw" {
  value     = azurerm_kubernetes_cluster.aks.kube_admin_config_raw
  sensitive = true
}

I'm new to TF, so I may miss something obvious; like if it is an issue to reference outputs of the "shared-module main.tf" in our "environment-specific main.tf" provider "kubernetes", when having the actual resource "kubernetes_cluster_role_binding" in the "shared-module main.tf".

alecor191 on 4 Mar 2020

The 1.11.1 release notes specifically mention this issue as fixed, but it's still open?

Do I miss something?

Comradin on 5 Mar 2020

@Comradin you didn’t miss anything. I’m just keeping the issue open for a while to collect confirmations from users who reported here. This issue was only manifesting in peculiar setups which I cannot reproduce all of.

alexsomesan on 5 Mar 2020

I tested this with an existing cluster (and existing Terraform state) as well as on a new cluster (with no Terraform state) and it all appears to be working as it did before. Thanks @alexsomesan!

For reference, this is my provider config:

data "aws_eks_cluster_auth" "main" {
  name = aws_eks_cluster.main.name
}

provider "kubernetes" {
  version                = "~> 1.10"
  host                   = aws_eks_cluster.main.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.main.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.main.token
  load_config_file       = false
}

Also, I don't think it should matter for this issue, but I also have a local-exec provisioner on my cluster resource to wait for the k8s API:

resource "aws_eks_cluster" "main" {
  name = "my-cluster"

  ...

  # local-exec will execute after a resource is created
  # this provisioner waits until the k8s API is healthy

  # this way, any other resources that depend on this one
  # will not be created until the k8s API is operational

  # curl should wait up to 5 minutes (default) to connect,
  # so the loop/sleep logic will only apply if curl times out
  # this effectively means that once curl is able to connect
  # it will wait up to 60 seconds for a healthy response
  provisioner "local-exec" {
    command = <<EOF
RETRIES=0
until curl -sk --fail ${self.endpoint}/healthz || [ $RETRIES -eq 6 ]; do
  echo "Waiting for EKS..."
  sleep 10
  RETRIES=$(($RETRIES+1))
done
EOF
  }
}

adamrbennett on 5 Mar 2020

Moving to 1.11, fixed the issue.

provider "kubernetes" {
  version                = "~> 1.11"
  host                   = eks_endpoint
  cluster_ca_certificate = eks_certificate_authority
  token                  = eks_token.token
  load_config_file       = false
}

rajivreddy on 11 Mar 2020

👍1

Looks like 1.11.1 fixes my use case as well. Tried adding a new cluster to a project that already contained a cluster. Both 'from scratch' and existing worked without flaws.

I use terraform-aws-eks, conditionally creating all k8s resources via a variable (the ones managed by aforementioned module + some others). Similar to this example.

So I roll out new clusters in 2 steps (one Terraform project). The first pass creates just the cluster without 'touching' anything related to the kubernetes provider. The second pass sets up the aws-auth configmaps and other bootstrapping.
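A minimal sketch of the two-pass idea, assuming a boolean toggle. The variable names manage_k8s_resources and map_roles_yaml are hypothetical, and this does not reproduce how terraform-aws-eks implements it.

```hcl
# Hypothetical toggle: leave false for the first pass, set true for the second.
variable "manage_k8s_resources" {
  type    = bool
  default = false
}

# Hypothetical input holding the mapRoles YAML for the aws-auth config map.
variable "map_roles_yaml" {
  type    = string
  default = ""
}

resource "kubernetes_config_map" "aws_auth" {
  # Created only when the toggle is on, so the first pass
  # plans no Kubernetes resources at all.
  count = var.manage_k8s_resources ? 1 : 0

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = var.map_roles_yaml
  }
}
```

The first pass would be a plain terraform apply, and the second pass terraform apply -var manage_k8s_resources=true once the cluster is up.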

TBeijen on 13 Mar 2020

👍1

Sounds like this issue has been fixed. Does anyone have further reports of this still being an issue?

aareet on 8 Apr 2020

@aareet unfortunately I still have a 100% repro. I have an AKS cluster created with TF. I used kubernetes provider 1.10.0 in my TF files.

If I now just change the provider version to 1.11.1 and run terraform plan again, then I get the following:

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

module.appgateway.module.appgateway-frontoffice-api.data.azurerm_key_vault_secret.backend_https_certificate: Refreshing state...
module.appgateway.module.appgateway-backoffice-api.data.azurerm_key_vault_secret.backend_https_certificate: Refreshing state...
module.appgateway.module.appgateway-corporate-api.data.azurerm_key_vault_secret.backend_https_certificate: Refreshing state...
module.appgateway.module.appgateway-customer-api.data.azurerm_key_vault_secret.backend_https_certificate: Refreshing state...
azurerm_resource_group.aks: Refreshing state... 
module.appgateway.module.appgateway-customer-api.azurerm_public_ip.ag_public_ip: Refreshing state... 
module.cluster.azurerm_role_assignment.aks_rg_access: Refreshing state... 
module.appgateway.module.appgateway-frontoffice-api.azurerm_public_ip.ag_public_ip: Refreshing state... 
module.vnet.azurerm_virtual_network.cluster_net: Refreshing state... 
module.appgateway.module.appgateway-backoffice-api.azurerm_public_ip.ag_public_ip: Refreshing state... 
module.appgateway.module.appgateway-corporate-api.azurerm_public_ip.ag_public_ip: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_corporate_api_ag: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_k8s_pods_worker: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_k8s_pods_default: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_backoffice_ag: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_frontoffice_ag: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_customer_api_ag: Refreshing state... 
module.vnet.azurerm_subnet.cluster_net_lb: Refreshing state... 
module.cluster.azurerm_kubernetes_cluster.aks: Refreshing state... 
module.appgateway.module.appgateway-frontoffice-api.azurerm_application_gateway.ag: Refreshing state... 
module.appgateway.module.appgateway-backoffice-api.azurerm_application_gateway.ag: Refreshing state... 
module.appgateway.module.appgateway-customer-api.azurerm_application_gateway.ag: Refreshing state... 
module.cluster.azurerm_kubernetes_cluster_node_pool.aks_worker_node_pool: Refreshing state... 
module.cluster.kubernetes_cluster_role_binding.admins: Refreshing state... 
module.appgateway.module.appgateway-corporate-api.azurerm_application_gateway.ag: Refreshing state... 

Error: serializer for text/html; charset=utf-8 doesn't exist

Is there any way for me to provide some sort of diagnostic logs?

alecor191 on 8 Apr 2020

@alecor191 That's a weird error message. Can you do the same operation with the env var TF_LOG=TRACE set and share the output? It will be a quite hefty log so maybe put it in a gist instead of pasting.

alexsomesan on 8 Apr 2020

@alexsomesan thanks for the hint. I re-ran terraform plan against the existing AKS cluster using that env variable and stored the relevant logs in this gist. Let me know if you need the whole log.

From what I see, the issue is that we're getting redirected as the request is unauthorized (TF400813: The user '' is not authorized to access this resource.). However, the result of the redirect is an HTML page that TF can't deal with and thus it fails.

For context: I'm running the TF command as part of a CI pipeline in Azure DevOps Pipelines.

Update: I also ran terraform plan with trace logging enabled with provider version 1.10.0 and the difference is, that with 1.10.0 there is no redirect due to auth. Instead, the request succeeds right away.

Both runs were executed on the same CI pipeline. The only difference between the two runs is the TF Kubernetes provider version.

Right before that K8S API call, I noticed this diff between 1.10.0 and 1.11.1:

1.10.0

[INFO] Unable to load config file as it doesn't exist at "/root/.kube/config"
[DEBUG] Enabling HTTP requests/responses tracing

1.11.1

[DEBUG] Trying to load configuration from file
[DEBUG] Configuration file is: /root/.kube/config
[WARN] Invalid provider configuration was supplied. Provider operations likely to fail.
[DEBUG] Enabling HTTP requests/responses tracing

As a workaround, I tried to create the kube config file as part of the CI pipeline and re-ran using 1.11.1. This time it worked. However, it was OK because the AKS cluster already existed and I knew what to put in kube config. But what if I create the cluster from scratch using TF: before running TF there is no kube config I can set, as the cluster doesn't exist yet.

My understanding is that the Kubernetes provider should pick up the kube config from the AKS module (in my case):

provider "kubernetes" {
  host                   = module.cluster.k8s_host
  client_certificate     = base64decode(module.cluster.k8s_client_certificate)
  client_key             = base64decode(module.cluster.k8s_client_key)
  cluster_ca_certificate = base64decode(module.cluster.k8s_cluster_ca_certificate)
  version                = "1.11.1"
}

However, from the logs above it seems the K8S provider wasn't really able to use those creds. I may be missing something obvious here, so I would be grateful for any pointer you can provide.

alecor191 on 9 Apr 2020

👍1

Having the same issue with kubernetes version 1.11.0.

I cannot rollback to 1.10.0 because that version does not recognize resource type "kubernetes_priority_class"

Mahendrasiddappa on 1 Jun 2020

Error: Invalid resource type
on .terraform/modules/eks_cluster/main.tf line 27, in resource "kubernetes_priority_class" "DS-priority":
27: resource "kubernetes_priority_class" "DS-priority" {
The provider provider.kubernetes does not support resource type
"kubernetes_priority_class".

Mahendrasiddappa on 1 Jun 2020

@Mahendrasiddappa this issue was filed against 1.11.0, and resolved in 1.11.1 - can you retry with 1.11.1?

aareet on 2 Jun 2020

I'm running terraform in a docker container via Docker Desktop on Windows.

Restarting docker itself by restarting Docker Desktop fixed my issue, similar to the one from dubb_b

Edit for more info: I usually run my terraform container detached, and let it live across laptop hibernation.

thorlarholm on 29 Jun 2020

I used to encounter this issue and had to resort to v1.10. Just tried again in a new project, the issue seems to be fixed. See terraform version output below:

Terraform v0.12.29
+ provider.google v3.33.0
+ provider.google-beta v3.33.0
+ provider.kubernetes v1.12.0

coolersport on 11 Aug 2020

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

hashibot[bot] on 10 Oct 2020
