Terraform-provider-kubernetes: Call to http://localhost/version with configured host and credentials

Created on 13 Dec 2019 · 62 Comments · Source: hashicorp/terraform-provider-kubernetes

Terraform Version

Terraform v0.12.17

provider.azurerm v1.38.0
provider.kubernetes v1.10.0

Affected Resource(s)

Please list the resources as a list, for example:

kubernetes_persistent_volume

Terraform Configuration Files

provider "kubernetes" {
  version                = "~> 1.10.0"
  host                   = module.azurekubernetes.host
  username               = module.azurekubernetes.username
  password               = module.azurekubernetes.password
  client_certificate     = base64decode(module.azurekubernetes.client_certificate)
  client_key             = base64decode(module.azurekubernetes.client_key)
  cluster_ca_certificate = base64decode(module.azurekubernetes.cluster_ca_certificate)
}

resource "kubernetes_persistent_volume" "factfinder-pv" {
  metadata {
    name = "ff-nfs-client"
    labels = {
      type          = "factfinder"
      sub_type      = "nfs"
      instance_type = "pv"
    }
  }
  spec {
    access_modes = ["ReadWriteMany"]
    capacity = map("storage", "${var.shared_storage_size}Gi")

    persistent_volume_source {
      nfs {
        path   = "/"
        server = var.nfs_service_ip
      }
    }
    storage_class_name = "nfs"
  }
}

Debug Output

(The debug output is huge and I just pasted a relevant section of it. If you need more, I'll create a gist)

2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.factfinder.kubernetes_service.factfinder-fffui-service" references: []
2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.loadbalancer.kubernetes_config_map.tcp-services" references: []
2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.factfinder.kubernetes_deployment.factfinder-sftp" references: []
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: ---[ REQUEST ]---------------------------------------
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: GET /version?timeout=32s HTTP/1.1
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Host: localhost
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: User-Agent: HashiCorp/1.0 Terraform/0.12.17
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept: application/json, */*
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept-Encoding: gzip
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2019-12-13T09:45:42.986Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2019-12-13T09:45:42.986Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: -----------------------------------------------------

Expected Behavior

When running terraform in the hashicorp/terraform container, a terraform plan should run properly

Actual Behavior

The plan errors out with the following error:

Error: Get http://localhost/version?timeout=32s: dial tcp 127.0.0.1:80: connect: connection refused

  on ../modules/factfinder/factfinder-nfs-client-pv.tf line 6, in resource "kubernetes_persistent_volume" "factfinder-pv":
   6: resource "kubernetes_persistent_volume" "factfinder-pv" {

This only happens when running Terraform in the container. When run locally, everything is fine (even when the local .kube directory is removed).

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

terraform plan or terraform apply

Important Factoids

Running in Azure AKS
Running in a Docker container based on the hashicorp/terraform image

bug · documentation · needs investigation

Source

dploeger

👍4

Most helpful comment

I have the exact same problem as @jakexks and @hazcod. Everything was working when I had everything in the root module, but when I split it out into a separate module, it started giving errors saying "Error: invalid configuration: no configuration has been provided" as well as trying to connect using localhost.

brpaz on 16 Feb 2020

👍3

All 62 comments

Hm. Now I have somehow configured my local environment so that it happens there too. 🤷‍♂

dploeger on 16 Dec 2019

This happens to me as well. I changed a kubernetes secret's metadata name from a string to an interpolated value... which resolves to the same string. The original has no issue; the interpolated version connects to localhost...

resource "kubernetes_secret" "vault-gcp" {
  metadata {
    name = "${var.deployment_name}-gcp"
  }
...
}

When name is "vault-gcp", it's fine. In a new branch with the above code and the deployment name set to "vault", so that the interpolation results in "vault-gcp", this fails with a connection to localhost.

Seems like TF/the provider thinks this is some new/different instance of the resource, which somehow does not belong to the configured kube cluster, so it probably falls back to the default "localhost" address.

mtekel on 18 Dec 2019

👍2

I have no interpolated values in metadata, but in the spec. But I have that in all kubernetes resources and only the resource mentioned above has the problem (or it is the first one it comes across and then stops, could be as well).

Quite the phenomenon really. Any core developer around? 😉

dploeger on 18 Dec 2019

Tried a workaround with conditional:

name = var.legacy == "yes" ? "vault-gcp" : "${var.deployment_name}-gcp"

With this I wanted it to use a non-interpolated string directly in some cases, but it ended up with the same issue. My TF version is 0.12.18. I have the kubernetes provider configured with host and config:

provider "kubernetes" {
  host  = google_container_cluster.vault.endpoint
  token = data.google_client_config.current.access_token
  cluster_ca_certificate = base64decode(
    ....
  )
  load_config_file = false
}

Then I tried another workaround: defining two resources, one with interpolation and one with a plain string, and controlling which one actually gets deployed with

count = var.legacy == "yes" ? 1 : 0

But this ended up with new resources (addressed with [0]) even for the legacy deployment (where it is already deployed and I am trying to achieve 0 changes on TF apply).

So I would say the issue is somehow that the existing kubernetes provider config is not respected for new resources...

kubernetes_secret.vault-gcp[0]: Refreshing state... [id=default/vault-gcp]
...

Error: Get http://localhost/api/v1/namespaces/default/secrets/vault-gcp: dial tcp [::1]:80: connect: connection refused

mtekel on 18 Dec 2019

I think it's interesting that it even tries to call via HTTP and not HTTPS, which I would have expected to be the default.

dploeger on 18 Dec 2019

So it turns out that in my case, I was also pointing to a wrong location in the bucket, one that held no tfstate. As most of the resources in GCP have the same ID as their name, Terraform was able to find and refresh my whole stack even without state, except for the kube secrets, where it was connecting to localhost, as it had no state about where the cluster was...

In EC2, that would probably blow up sooner, as resource IDs are quite different from resource names, and if you lose state you have a lot of trouble finding where everything is...

mtekel on 19 Dec 2019
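
A minimal sketch of the misconfiguration described above, assuming a gcs backend (the comment only mentions "the bucket" and GCP). The bucket and prefix names are hypothetical and used for illustration only.

```hcl
terraform {
  backend "gcs" {
    # Hypothetical names, for illustration only.
    bucket = "example-tfstate-bucket"

    # If this prefix points at a path that holds no state file, Terraform
    # starts from an empty state and treats every resource as new.
    prefix = "vault/legacy"
  }
}
```

Running terraform state list before planning is a quick way to check that the state that was loaded actually contains the expected resources.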

Okay, I found the problem for my case. This line here:

https://github.com/terraform-providers/terraform-provider-kubernetes/blob/45d910a26f17f7b03d684221428b86f2f02b5be2/kubernetes/resource_kubernetes_persistent_volume.go#L40

If you remove the whole CustomizeDiff part, everything works fine. So I guess the correct server isn't carried through to that point. I'll try to dig deeper there.

dploeger on 19 Dec 2019

👀2

@alexsomesan @pdecat You added that line while refactoring the whole client handling. Can you think of any implications that could cause this behaviour? It seems as if the MainClientset isn't correctly configured when it reaches the CustomizeDiff function.

dploeger on 19 Dec 2019

~Hi @dploeger, I believe the initialization here occurs too early. The CustomizeDiff probably needs to be replaced by a CustomizeDiffFunc.~

pdecat on 19 Dec 2019

@pdecat You probably know how to do this. I just stumbled through the code. 😆 Are you able to provide a PR for that?

dploeger on 19 Dec 2019

Or can you point me to how to implement that? Just replacing CustomizeDiff with CustomizeDiffFunc didn't work, at least. :)

dploeger on 19 Dec 2019

Never mind, it won't work, CustomizeDiffFunc is the type of the CustomizeDiff field.

Let me think of something else.

pdecat on 19 Dec 2019

❤1

@dploeger Are you building the AKS resources from module.azurekubernetes in the same apply run as the kubernetes_persistent_volume ?

alexsomesan on 19 Dec 2019

Yes, I am. And that all worked until 12-9. I can’t really grasp what has changed then, because we didn’t update or change anything there.

dploeger on 19 Dec 2019

> @dploeger Are you building the AKS resources from module.azurekubernetes in the same apply run as the kubernetes_persistent_volume ?

Good point, that's the most frequent issue when localhost is involved. The configuration is not available at the time the kubernetes provider is initialized.
The point about removing CustomizeDiff fixing the issue made me think of something else, but it turns out the kubernetes client is only initialized once by the provider.

pdecat on 19 Dec 2019
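
To illustrate the point above about the configuration not being available when the provider is initialized, here is a minimal sketch, not a fix confirmed in this thread. It assumes the AKS cluster is created in a separate configuration and only looked up here through the azurerm_kubernetes_cluster data source; the variable names and the "existing" label are hypothetical.

```hcl
# Hypothetical inputs identifying an AKS cluster that already exists.
variable "cluster_name" {
  type = string
}

variable "resource_group_name" {
  type = string
}

# Look the cluster up instead of creating it in this configuration.
data "azurerm_kubernetes_cluster" "existing" {
  name                = var.cluster_name
  resource_group_name = var.resource_group_name
}

provider "kubernetes" {
  # Do not fall back to a local kubeconfig.
  load_config_file = false

  host                   = data.azurerm_kubernetes_cluster.existing.kube_config[0].host
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].cluster_ca_certificate)
}
```

Because the cluster already exists when this configuration is planned, the data source can be read during plan, so the provider is not configured from values that are still unknown. The cost is two separate apply runs.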

Further question: is this happening when running TF in a Pod on the cluster?

alexsomesan on 19 Dec 2019

👍1

Ummmm... I haven't tried that. Is that important? I'd have to set that up. I just tried locally. It also happens outside the container now.

dploeger on 19 Dec 2019

I'm experiencing this with a module that nests other modules, sometimes the child modules lose provider configuration and the terraform config becomes un-applyable, but also un-destroyable!

The parent creates a DigitalOcean Kubernetes cluster inside a module, then uses the output of the module to get a data source which configures the provider e.g.

module "e2etest_k8s" {
  source = "./infrastructure/kubernetes/do"
  providers = {
    digitalocean = digitalocean.e2etest
  }
}

data "digitalocean_kubernetes_cluster" "e2etest" {
  provider = digitalocean.e2etest
  name     = module.e2etest_k8s.cluster_name
}

provider "kubernetes" {
  alias            = "e2etest"
  load_config_file = false
  host             = data.digitalocean_kubernetes_cluster.e2etest.endpoint
  token            = data.digitalocean_kubernetes_cluster.e2etest.kube_config[0].token
  cluster_ca_certificate = base64decode(
    data.digitalocean_kubernetes_cluster.e2etest.kube_config[0].cluster_ca_certificate
  )
}

// This also contains submodules
module "<rest of infra>" {
  source = "./<folders>"
  providers = {
    kubernetes = kubernetes.e2etest
  }
}

This provider is then used for a bunch of modules (which also contain modules) that then exhibit the localhost behavior (sometimes, but it seems deterministic between runs).

jakexks on 30 Jan 2020

👍1

Any updates on this? I'm trying to upgrade from 7.0.1 to 8.2.0 of the EKS Terraform module (https://github.com/terraform-aws-modules/terraform-aws-eks). I'm able to get through the initial import of the aws-auth configmap by using a local kubeconfig the first time (overriding load_config_file to true for the import), but subsequent plans always fail with a call to localhost.

My provider config looks like this:

provider "kubernetes" {
  load_config_file       = var.load_config 
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  version                = "1.10.0" # Stable version??
}

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

Error: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused

module.eks.kubernetes_config_map.aws_auth[0]: Refreshing state... [id=kube-system/aws-auth]
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [INFO] Checking config map aws-auth
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [DEBUG] Kubernetes API Request Details:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: ---[ REQUEST ]---------------------------------------
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: GET /api/v1/namespaces/kube-system/configmaps/aws-auth HTTP/1.1
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Host: localhost
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: User-Agent: HashiCorp/1.0 Terraform/0.12.20
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept: application/json, */*
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept-Encoding: gzip
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: -----------------------------------------------------
2020-02-11T10:16:22.089-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [DEBUG] Received error: &url.Error{Op:"Get", URL:"http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth", Err:(*net.OpError)(0xc000976050)}
2020/02/11 10:16:22 [ERROR] module.eks: eval: *terraform.EvalRefresh, err: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused
2020/02/11 10:16:22 [ERROR] module.eks: eval: *terraform.EvalSequence, err: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused

I'm happy to provide further information/logs/tests to get this issue resolved ASAP. I have tried provider versions 1.8.1, 1.9.0, 1.10.0 and 1.11.0 (1.11.0 gives me a different error corresponding to issue 759). I'm using terraform 0.12.20

nothingofuse on 11 Feb 2020

👍1

Having the same issue where I use the scaleway kapsule provider kubeconfig output as input for my kubernetes terraform provider. Using local kubeconfig does not resolve the issue during terraform plan. https://github.com/ironPeakServices/infrastructure/runs/435886375?check_suite_focus=true

hazcod on 12 Feb 2020

brpaz on 16 Feb 2020

👍3

@brpaz : so it works if you run it from the root module?
Might be an overall terraform issue, since I had the issue that some terraform variables were not being set for submodules, making me have to set it in the root module too e.g.: https://github.com/ironPeakServices/infrastructure/blob/master/versions.tf#L20

hazcod on 17 Feb 2020

@hazcod yes, I had all my Terraform resources in main.tf in the root module. Everything was working.
Because the configs were growing, I created a module and split my main.tf into several files inside the module. After that change, running terraform apply started giving these errors.

But then I tried a fresh install (clean state and a new cluster provisioned from scratch) and it worked.
I think a conflict between what was persisted in the state file and the new Terraform declarations somehow resulted in Terraform picking a wrong config?

brpaz on 17 Feb 2020

This might be related to https://github.com/hashicorp/terraform/issues/24131#issuecomment-587144096

hazcod on 18 Feb 2020

After reaching out to Terraform core, the above issue seems to indicate that it's a kubernetes provider issue where it's not handling unknown variables well.

hazcod on 21 Feb 2020

👍1

I have drilled this down to the following: if a kubernetes provider is receiving unknown values (because of a dependency), it should go through with the plan because it would normally be fulfilled in the apply phase. I think that's a better approach than just erroring out now.

hazcod on 25 Feb 2020
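
One way to keep unknown values away from the provider, sketched under assumptions rather than taken from this thread: keep the cluster in its own configuration and read its outputs through the terraform_remote_state data source. The local backend, the state path and the output names (host, token, cluster_ca_certificate) are all hypothetical.

```hcl
# Read the outputs of a separate, already applied cluster configuration.
data "terraform_remote_state" "cluster" {
  backend = "local"

  config = {
    # Hypothetical path to the cluster configuration's state file.
    path = "../cluster/terraform.tfstate"
  }
}

provider "kubernetes" {
  load_config_file = false

  # These values come from stored state, so they are known at plan time.
  host                   = data.terraform_remote_state.cluster.outputs.host
  token                  = data.terraform_remote_state.cluster.outputs.token
  cluster_ca_certificate = base64decode(data.terraform_remote_state.cluster.outputs.cluster_ca_certificate)
}
```

The trade-off is that the cluster and the Kubernetes resources can no longer be created in a single apply run.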

This is really frustrating. If my scaleway provider cluster is removed, I have to take the following manual steps:

Comment out kubernetes/helm provider code
Trigger the pipeline deploy
Re-enable kubernetes/helm provider code
Hope everything goes well or start over again

hazcod on 26 Feb 2020

I circumvented this with:

provider "kubernetes" {
    # fixed to 1.10.0 because of https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759
    version = "1.10.0" 
    # set the variable in the root module or else we have a dependency issue
    token = module.scaleway.token
}

hazcod on 27 Feb 2020

👍1

Have this problem as well, running terraform in a container on gcp cloud build triggers. Since last month it is trying to connect to localhost and ignores the host set in the provider config.

davidq2q on 2 Mar 2020

@davidq2q Have you tried with v1.11.1?

hazcod on 2 Mar 2020

> @davidq2q Have you tried with v1.11.1?

Forgot to reply but after setting the version to 1.10.0 yesterday everything seems to work; all builds are green now.

davidq2q on 3 Mar 2020

Yes but the latest would supposedly fix that.

hazcod on 3 Mar 2020

1.11.1 does not fix the issue for me: https://github.com/ironPeakServices/infrastructure/runs/489796449

hazcod on 6 Mar 2020

> I circumvented this with:
>
> provider "kubernetes" {
>     # fixed to 1.10.0 because of https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759
>     version = "1.10.0"
>     # set the variable in the root module or else we have a dependency issue
>     token = module.scaleway.token
> }

1.11.1 also didn't fix the issue for me.
@hazcod What do you mean by "set the variable in the root module"?

I have this in my providers.tf on root module:

provider "kubernetes" {
  load_config_file = false
  host             = module.k8s_cluster.cluster_host
  token            = module.k8s_cluster.cluster_token
  cluster_ca_certificate = base64decode(
    module.k8s_cluster.cluster_ca_certificate
  )
}

I am thinking of starting to evaluate Pulumi as an alternative to Terraform if I continue to have issues like this.

brpaz on 7 Mar 2020

👍2

To those still facing this issue with version 1.11.1 of the kubernetes provider, could you please share the output of the terraform providers command?

I came across a similar issue and it was caused by a sub-module redefining a provider without load_config_file = false.

pdecat on 12 Mar 2020
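
A minimal sketch of the layout implied by the comment above, assuming the sub-module should reuse the root configuration instead of redefining the provider. The module name, source path and variable names are hypothetical.

```hcl
# Hypothetical inputs holding the cluster connection details.
variable "cluster_host" {
  type = string
}

variable "cluster_token" {
  type = string
}

variable "cluster_ca_certificate" {
  type = string
}

# Root module: the only place where the kubernetes provider is configured.
provider "kubernetes" {
  load_config_file       = false
  host                   = var.cluster_host
  token                  = var.cluster_token
  cluster_ca_certificate = base64decode(var.cluster_ca_certificate)
}

module "workloads" {
  # Hypothetical path. The child module declares kubernetes_* resources
  # but contains no provider "kubernetes" block of its own.
  source = "./modules/workloads"

  # Pass the root provider configuration down explicitly.
  providers = {
    kubernetes = kubernetes
  }
}
```

With this layout, terraform providers should list the child module's kubernetes provider as (inherited), which is what the outputs later in this thread are checked for.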

```
$ tf providers
.
├── provider.aws ~> 2.44.0
├── provider.kubernetes ~> 1.10
├── provider.terraform
└── module.cluster
    ├── provider.aws (inherited)
    ├── module.alb_ingress_controller_iam_policy
    │   └── provider.aws (inherited)
    ├── module.eks
    │   ├── provider.aws >= 2.38.0
    │   ├── provider.kubernetes >= 1.6.2
    │   ├── provider.local >= 1.2
    │   ├── provider.null >= 2.1
    │   ├── provider.random >= 2.1
    │   ├── provider.template >= 2.1
    │   └── module.node_groups
    │       ├── provider.aws (inherited)
    │       └── provider.random
    ├── module.external_dns_iam_policy
    │   └── provider.aws (inherited)
    ├── module.k8s_config
    │   ├── provider.aws
    │   ├── provider.helm
    │   ├── provider.kubernetes (inherited)
    │   └── module.metrics_server
    │       └── provider.kubernetes (inherited)
    └── module.model_bucket
        └── provider.aws (inherited)
```

liangyungong on 12 Mar 2020

@liangyungong are both your providers declared in the root module and eks sub-module defining load_config_file = false ?

pdecat on 12 Mar 2020

yes indeed.

rg 'provider.*kubernetes' -w ../../ --hidden --no-ignore --glob='*.tf' -A 5 | grep load_config_file
../../application/prd-0/environment.tf-  load_config_file       = false
../../application/prd-1/environment.tf-  load_config_file       = false
../../application/stg-0/environment.tf-  load_config_file       = false
../../application/prd-0/.terraform/modules/cluster.k8s_config.metrics_server/azure/stacks/aks_cluster/providers.tf-  load_config_file = false
../../application/prd-1/.terraform/modules/cluster.cluster_autoscaler/azure/stacks/aks_cluster/providers.tf-  load_config_file = false
../../application/prd-1/.terraform/modules/cluster.k8s_config.metrics_server/azure/stacks/aks_cluster/providers.tf-  load_config_file = false

liangyungong on 12 Mar 2020

I'm confused, your terraform providers output has eks and this grep output has aks.

pdecat on 12 Mar 2020

@pdecat : In my case, I encounter this issue when I fire off our kubernetes provider as a dependency on scaleway provider with a fresh cluster. During plan, the variables from the scaleway provider will be empty (since there is no cluster yet), so kubernetes will dial to the default values.
More specifically in my case the kubernetes provider variables are populated with the exported kubeconfig of scaleway provider.

hazcod on 12 Mar 2020

@hazcod I've looked into your case, but did not find any explanation yet. Maybe the issue is in the provider's configuration recorded in the existing state.

FWIW, this works in a single apply pass from scratch with v1.11.1 and GKE (I do not have a test scaleway account):

main.tf:

provider "google" {
  version = "3.12.0"
  region  = "us-west1"
  # Other provider settings provided via ENV variables
}

module gke {
  source = "./gke"
}

data "google_client_config" "default" {
}

provider "kubernetes" {
  version = "1.11.1"

  load_config_file       = false
  host                   = module.gke.endpoint
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.gke.cluster_ca_certificate)
}

module kubernetes {
  source = "./kubernetes"
}

output "cluster_name" {
  value = module.gke.cluster_name
}

output "location" {
  value = module.gke.location
}

output "endpoint" {
  value = module.gke.endpoint
}

gke/main.tf:

data "google_compute_zones" "available" {
}

resource "google_container_cluster" "primary" {
  name               = "terraform-example-cluster"
  location           = data.google_compute_zones.available.names[0]
  initial_node_count = 1

  min_master_version = "1.15.9-gke.22"
  node_version       = "1.15.9-gke.22"

  master_auth {
    username = ""
    password = ""
  }
}

output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "location" {
  value = google_container_cluster.primary.location
}

output "endpoint" {
  value = google_container_cluster.primary.endpoint
}

output "cluster_ca_certificate" {
  value = google_container_cluster.primary.master_auth[0].cluster_ca_certificate
}

kubernetes/main.tf:

resource "kubernetes_namespace" "example" {
  metadata {
    name = "terraform-example-namespace"
  }
}

Init:

# rm -rf .terraform/
# terraform init
Initializing modules...
- gke in gke
- kubernetes in kubernetes

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google" (hashicorp/google) 3.12.0...
- Downloading plugin for provider "kubernetes" (hashicorp/kubernetes) 1.11.1...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

# terraform providers
.
├── provider.google 3.12.0
├── provider.kubernetes 1.11.1
├── module.gke
│   └── provider.google (inherited)
└── module.kubernetes
    └── provider.kubernetes (inherited)

Apply:

# terraform apply -auto-approve
module.gke.data.google_compute_zones.available: Refreshing state...
data.google_client_config.default: Refreshing state...
module.gke.google_container_cluster.primary: Creating...
module.gke.google_container_cluster.primary: Still creating... [10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [30s elapsed]
module.gke.google_container_cluster.primary: Still creating... [40s elapsed]
module.gke.google_container_cluster.primary: Still creating... [50s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m0s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m30s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m40s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m50s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m0s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m30s elapsed]
module.gke.google_container_cluster.primary: Creation complete after 2m33s [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.kubernetes.kubernetes_namespace.example: Creating...
module.kubernetes.kubernetes_namespace.example: Creation complete after 1s [id=terraform-example-namespace]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

cluster_name = terraform-example-cluster
endpoint = 35.197.114.71
location = us-west1-a

# kubectl --context gke_myproject_us-west1-a_terraform-example-cluster get ns terraform-example-namespace
NAME                          STATUS   AGE
terraform-example-namespace   Active   3m9s

Destroy:

# terraform destroy -auto-approve
data.google_client_config.default: Refreshing state...
module.gke.data.google_compute_zones.available: Refreshing state...
module.gke.google_container_cluster.primary: Refreshing state... [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.kubernetes.kubernetes_namespace.example: Refreshing state... [id=terraform-example-namespace]
module.kubernetes.kubernetes_namespace.example: Destroying... [id=terraform-example-namespace]
module.kubernetes.kubernetes_namespace.example: Still destroying... [id=terraform-example-namespace, 10s elapsed]
module.kubernetes.kubernetes_namespace.example: Destruction complete after 15s
module.gke.google_container_cluster.primary: Destroying... [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 30s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 40s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 50s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m0s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m30s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m40s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m50s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m0s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m30s elapsed]
module.gke.google_container_cluster.primary: Destruction complete after 2m37s

Destroy complete! Resources: 2 destroyed.

pdecat on 12 Mar 2020

> I'm confused, your terraform providers output has eks and this grep output has aks.

They're irrelevant files; that's just how the modules are organised in the git repo. :)

liangyungong on 12 Mar 2020

@liangyungong I still do not get how you can have AWS resources in the terraform providers output and Azure resource in the grep output. They do not correspond to each other.

Your terraform providers output explicitly states that there's a kubernetes provider initialized in the AWS eks module that is not _inherited_ from the root module:

.
├── provider.aws ~> 2.44.0
├── provider.kubernetes ~> 1.10
├── provider.terraform
└── module.cluster
    ├── provider.aws (inherited)
    ├── module.alb_ingress_controller_iam_policy
    │   └── provider.aws (inherited)
    ├── module.eks
    │   ├── provider.aws >= 2.38.0
    │   ├── provider.kubernetes >= 1.6.2 # <-- HERE
[...]

That means there is a provider kubernetes block in there.

Can you check the content of that module?

pdecat on 12 Mar 2020

> @liangyungong I still do not get how you can have AWS resources in the terraform providers output and Azure resource in the grep output. They do not correspond to each other.
>
> Your terraform providers explicitly states that there's a kubernetes provider initialized in the AWS eks module that is not _inherited_ from the root module:
>
> .
> ├── provider.aws ~> 2.44.0
> ├── provider.kubernetes ~> 1.10
> ├── provider.terraform
> └── module.cluster
>     ├── provider.aws (inherited)
>     ├── module.alb_ingress_controller_iam_policy
>     │   └── provider.aws (inherited)
>     ├── module.eks
>     │   ├── provider.aws >= 2.38.0
>     │   ├── provider.kubernetes >= 1.6.2 # <-- HERE
> [...]
>
> That means there a provider kubernetes block in there.
>
> Can you check the content of that module?

There're many other modules in the same git repo, and they are irrelevant to the module that I use. Whenever I do terraform init, it clones the whole git repo.

liangyungong on 13 Mar 2020

> There're many other modules in the same git repo, and they are irrelevant to the module that I use. Whenever I do terraform init, it clones the whole git repo.

So the module.eks provider block does not have load_config_file = false.

pdecat on 13 Mar 2020

I'm hitting this problem, but not with any modules.

$ terraform providers
.
├── provider.google ~> 3.13
├── provider.google-beta ~> 3.13
├── provider.kubernetes.xxx ~> 1.11.1
└── provider.kubernetes.yyy ~> 1.11.1

(two separate kubernetes providers with aliases)

Is there a known workaround that doesn't involve winding back the kubernetes provider to 1.10? I need to be using 1.11 for other reasons.

dsymonds on 18 Mar 2020

Actually my setup has started working again after forcibly re-fetching credentials, though it was very confusing why it was trying to contact localhost when the creds were bad.

dsymonds on 18 Mar 2020

Not sure if this is the same problem, but just in case, I hit the following.

I had a kubernetes provider blob looking a bit like this.

provider "kubernetes" {
  version                = "1.11"
  host                   = var.credentials.host
  username               = var.credentials.username
  password               = var.credentials.password
  client_certificate     = var.credentials.client_certificate
  client_key             = var.credentials.client_key
  cluster_ca_certificate = var.credentials.cluster_ca_certificate
}

This failed in both 1.10 and 1.11. With 1.10, I got an error report explaining that I must set username and password or bearer token not both (fair enough). With 1.11, no error and it ignored host, contacting localhost.

If I removed username and password from the provider block, then it all worked (in both versions). That makes me think that a failure in validation in 1.11 might lead to it dropping through with the host still set to localhost.

plwhite on 19 Mar 2020

@plwhite The error you got in 1.10 was not wrong, but not exhaustive, since client certificates are also an equivalent form of authentication. Better validation was introduced in 1.11; that's why you are not seeing that error anymore. The rule is to have exactly one of: token, user/pass OR client certificates. Having two of these, as in your example, is not deterministic (which one should be used to authenticate you?) and it looks like that's not being validated - we'll work on fixing that.
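
To illustrate the rule, here is a minimal sketch of a provider block that uses only one authentication method (client certificates). It is adapted from the configuration posted above, with username and password removed. The var.credentials object is taken from that example; whether its certificate values need base64decode depends on how they are produced, so that part is left as an assumption.

```hcl
# Sketch only: a single authentication method (client certificates).
# var.credentials is the object from the example above; its shape is assumed.
provider "kubernetes" {
  version = "1.11"
  host    = var.credentials.host

  # username, password and token are intentionally left unset
  client_certificate     = var.credentials.client_certificate
  client_key             = var.credentials.client_key
  cluster_ca_certificate = var.credentials.cluster_ca_certificate
}
```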

However, the reason you're seeing the connection to localhost is likely because Terraform is unable to resolve the value for var.credentials.host at the right time. How is var.credentials being populated in your case?

alexsomesan on 19 Mar 2020

@alexsomesan I was populating var.credentials through variables set up by the azurerm provider creating an AKS cluster, which from memory did have host configured. I'm moderately sure that was set consistently but it's possible there was a transient error where it failed at about the same time as I hit this. Since moving to the more recent kubernetes provider I've seen no further issues, so quite happy to consider this fixed.

plwhite on 1 Apr 2020

The key aspect here is whether you are creating the azurerm cluster resource in the same apply run as the kubernetes resources?

alexsomesan on 1 Apr 2020

In the same apply run. Sometimes the azure cluster already existed, and sometimes not (and was created by the apply run).

plwhite on 2 Apr 2020

I experience similar issues with this setup:

terraform bootstraps virtual machines
terraform RKE provider sets up Kubernetes/RKE
I use the retrieved attributes from the RKE provider to set up this kubernetes provider

Interestingly, all works well if I run terraform apply locally/on any machine but once I run it in our CI (GitLab CI and/or Jenkins), I run into the same issue that this provider does not pick up the RKE configuration but instead dials localhost port 80.
For CI we use cytopia:terragrunt (clean run without any caches).

muhlba91 on 10 Apr 2020

fyi, my problem was also related to https://github.com/terraform-providers/terraform-provider-kubernetes/issues/708#issuecomment-598122673

My interesting observation was though:

on local machines we had already a kubeconfig present for different clusters than what we tried to create and all was perfectly fine without setting the load_config_file in the kubernetes and helm provider
on CI obviously no kubeconfig at all was already present and suddenly the provider tried to connect on localhost port 80; after setting load_config_file to false in both the kubernetes and helm provider, it was working

Could someone explain to me why in one case it's necessary to set load_config_file = false and in the other case, with an already existing kubeconfig file, it isn't? Furthermore, it seems as if the kubeconfig values get overwritten anyway.
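
For reference, the load_config_file = false setup described above might look roughly like the sketch below. It assumes the 1.x releases of both the kubernetes and helm providers, where this attribute exists. The var.cluster_* variables are hypothetical placeholders for whatever attributes the cluster resource (RKE in this case) exports.

```hcl
# Sketch only: var.cluster_host, var.cluster_client_cert, var.cluster_client_key
# and var.cluster_ca_cert are hypothetical names, not taken from the thread.
provider "kubernetes" {
  load_config_file = false # do not read any kubeconfig from disk

  host                   = var.cluster_host
  client_certificate     = var.cluster_client_cert
  client_key             = var.cluster_client_key
  cluster_ca_certificate = var.cluster_ca_cert
}

provider "helm" {
  # helm provider 1.x takes the same settings in a nested kubernetes block
  kubernetes {
    load_config_file = false

    host                   = var.cluster_host
    client_certificate     = var.cluster_client_cert
    client_key             = var.cluster_client_key
    cluster_ca_certificate = var.cluster_ca_cert
  }
}
```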

muhlba91 on 12 Apr 2020


I had the same issue with version "1.11.2". I solved it the following way:

Downgraded to 1.10.0 and received error:
Error: Failed to configure: username/password or bearer token may be set, but not both

Removed "username/password" and left only "client_certificate/client_key/cluster_ca_certificate"
Problem solved, and now everything works with "1.11.2".

Enjoy.

xp-vit on 7 May 2020

I have the issue with 1.11.4 on EKS. It's as if the provider is initialized with default settings even though my module uses the credentials from the EKS cluster. I found no workaround for the issue. This is really frustrating.

I validated that I do not have any other kubernetes provider set that could override it. I'm still unsure, but it could be related to the fact that I'm using terragrunt :shrug:

mbelang on 23 Jul 2020

I just tried reverting to 1.10.0 version of the provider. It worked. I managed to create the resources but next plan failed with:

Error: namespaces "my_namespace" is forbidden: User "system:anonymous" cannot get resource "namespaces" in API group "" in the namespace "my_namespace"

I guess it is related to EKS rbac but how is it possible to not use anonymous user without a kube config?

mbelang on 23 Jul 2020

I managed to make it work with

provider "kubernetes" {
  version                = "~> 1.11.0"
  load_config_file       = false
  host                   = aws_eks_cluster.eks.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.eks.certificate_authority.0.data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["token", "-i", aws_eks_cluster.eks.name, "-r", "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/MyRole"]
    command     = "aws-iam-authenticator"
  }
}

I think I understand what is happening here.

first plan no resources created so the token is not required
first apply, token is generated by

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

second plan, it doesn't work because the plan will not generate a token from data

Getting the token on each provider call, as in the solution above, works just fine.

mbelang on 23 Jul 2020


I have the same issue when using Kubernetes Provider > 1.10 (maybe related to https://github.com/hashicorp/terraform-provider-kubernetes/issues/759). Using Provider Version 1.10.0 works as expected. 1.11 and 1.12 do not work with the following config running inside a Kubernetes Cluster:

KUBE_LOAD_CONFIG_FILE=false
KUBERNETES_SERVICE_HOST=<k8s-host>
KUBERNETES_SERVICE_PORT=443

Steps to reproduce:

Create a Pod with the Environment Variables mentioned above (https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#in-cluster-service-account-token)
Use a minimal Terraform File with a Kubernetes Secret Resource (see below)
Run terraform apply

Results in Error: Post "http://localhost/api/v1/namespaces/default/secrets": dial tcp 127.0.0.1:80: connect: connection refused

provider "kubernetes" {
  version = "~> 1.11"
}

resource "kubernetes_secret" "test" {
  metadata {
    name = "test"
    namespace = "default"
  }

  data = {
    test = "data"
  }
}

I tried to configure the Kubernetes Provider using load_config_file and KUBE_LOAD_CONFIG_FILE. Enabling debug shows the following: [WARN] Invalid provider configuration was supplied. Provider operations likely to fail: invalid configuration: no configuration has been provided

etwillbefine on 18 Aug 2020

@etwillbefine I wasn't able to reproduce the issue with the configuration you provided.
I ran a test inside a Debian container in a Pod on a 1.18 cluster and it worked as expected for me. See the output below.

root@test-708:/test-708# env | grep KUBERNETES | sort
KUBERNETES_PORT=tcp://10.3.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.3.0.1:443
KUBERNETES_PORT_443_TCP_ADDR=10.3.0.1
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.3.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
root@test-708:/test-708# cat main.tf
provider "kubernetes" {
  version = "~> 1.11"
  load_config_file = "false"
}

resource "kubernetes_namespace" "test" {
  metadata {
    name = "test"
  }
}

root@test-708:/test-708# terraform init

Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/kubernetes versions matching "~> 1.11"...
- Installing hashicorp/kubernetes v1.13.1...
- Installed hashicorp/kubernetes v1.13.1 (signed by HashiCorp)

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
root@test-708:/test-708# terraform version
Terraform v0.13.2
+ provider registry.terraform.io/hashicorp/kubernetes v1.13.1
root@test-708:/test-708# terraform apply -auto-approve
kubernetes_namespace.test: Creating...

Error: namespaces is forbidden: User "system:serviceaccount:default:default" cannot create resource "namespaces" in API group "" at the cluster scope

  on main.tf line 6, in resource "kubernetes_namespace" "test":
   6: resource "kubernetes_namespace" "test" {

alexsomesan on 9 Sep 2020

I'm going to close this issue as it's become a catch-all for credentials misconfigurations.
Please open separate issues if you're having trouble with configuring credentials so we can address them specifically.

alexsomesan on 9 Sep 2020

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!