Terraform-provider-kubernetes: Call to http://localhost/version with configured host and credentials

Created on 13 Dec 2019 · 62 Comments · Source: hashicorp/terraform-provider-kubernetes

Terraform Version

Terraform v0.12.17

  • provider.azurerm v1.38.0
  • provider.kubernetes v1.10.0

Affected Resource(s)

  • kubernetes_persistent_volume

Terraform Configuration Files

provider "kubernetes" {
  version                = "~> 1.10.0"
  host                   = module.azurekubernetes.host
  username               = module.azurekubernetes.username
  password               = module.azurekubernetes.password
  client_certificate     = base64decode(module.azurekubernetes.client_certificate)
  client_key             = base64decode(module.azurekubernetes.client_key)
  cluster_ca_certificate = base64decode(module.azurekubernetes.cluster_ca_certificate)
}

resource "kubernetes_persistent_volume" "factfinder-pv" {
  metadata {
    name = "ff-nfs-client"
    labels = {
      type          = "factfinder"
      sub_type      = "nfs"
      instance_type = "pv"
    }
  }
  spec {
    access_modes = ["ReadWriteMany"]
    capacity = map("storage", "${var.shared_storage_size}Gi")

    persistent_volume_source {
      nfs {
        path   = "/"
        server = var.nfs_service_ip
      }
    }
    storage_class_name = "nfs"
  }
}

Debug Output

(The debug output is huge and I just pasted a relevant section of it. If you need more, I'll create a gist)

2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.factfinder.kubernetes_service.factfinder-fffui-service" references: []
2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.loadbalancer.kubernetes_config_map.tcp-services" references: []
2019/12/13 09:45:42 [DEBUG] ReferenceTransformer: "module.factfinder.kubernetes_deployment.factfinder-sftp" references: []
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: ---[ REQUEST ]---------------------------------------
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: GET /version?timeout=32s HTTP/1.1
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Host: localhost
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: User-Agent: HashiCorp/1.0 Terraform/0.12.17
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept: application/json, */*
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept-Encoding: gzip
2019-12-13T09:45:42.985Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2019-12-13T09:45:42.986Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2019-12-13T09:45:42.986Z [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: -----------------------------------------------------

Expected Behavior

When running terraform in the hashicorp/terraform container, a terraform plan should run properly

Actual Behavior

The plan errors out with the following error:

Error: Get http://localhost/version?timeout=32s: dial tcp 127.0.0.1:80: connect: connection refused

  on ../modules/factfinder/factfinder-nfs-client-pv.tf line 6, in resource "kubernetes_persistent_volume" "factfinder-pv":
   6: resource "kubernetes_persistent_volume" "factfinder-pv" {

This only happens when running terraform in the container. When run locally, everything is fine (even when the local .kube directory is removed).

Steps to Reproduce

  1. terraform plan or terraform apply

Important Factoids

  • Running in Azure AKS
  • Running in a Docker container based on the hashicorp/terraform image

Labels: bug, documentation, needs investigation

All 62 comments

Hm. Now I have somehow configured my local environment so that it happens there too. 🤷‍♂️

Happens to me as well. I changed a kubernetes secret's metadata name from a string literal to an interpolated value... which resolves to the same string. The original has no issue; the interpolated one connects to localhost...

resource "kubernetes_secret" "vault-gcp" {
  metadata {
    name = "${var.deployment_name}-gcp"
  }
...
}

When name is "vault-gcp", it's fine. In a new branch with the above code and the deployment name set to "vault", so that the interpolation results in "vault-gcp", this fails with a connection to localhost.

It seems like TF/the provider thinks this is some new/different instance of the resource, which somehow does not belong to the configured Kubernetes cluster, so it probably falls back to the default "localhost" address.

I have no interpolated values in metadata, but in the spec. But I have that in all kubernetes resources and only the resource mentioned above has the problem (or it is the first one it comes across and then stops, could be as well).

Quite the phenomenon really. Any core developer around? 😉

Tried a workaround with conditional:

name = var.legacy == "yes" ? "vault-gcp" : "${var.deployment_name}-gcp"

I wanted it to use the non-interpolated string directly in some cases, but this ended up with the same issue. My TF version is 0.12.18. I have the kubernetes provider configured as follows:

provider "kubernetes" {
  host  = google_container_cluster.vault.endpoint
  token = data.google_client_config.current.access_token
  cluster_ca_certificate = base64decode(
    ....
  )
  load_config_file = false
}

Then I tried another workaround: defining two resources, one with the interpolation and one with the plain string, and controlling which one actually gets deployed with

count =  var.legacy == "yes" ? 1 : 0

But this ended up planning a new resource at index [0] even for the legacy deployment (where it is already deployed and I am trying to achieve zero changes on terraform apply).

So I would say the issue is that the existing kubernetes provider config is somehow not respected for new resources...

kubernetes_secret.vault-gcp[0]: Refreshing state... [id=default/vault-gcp]
...

Error: Get http://localhost/api/v1/namespaces/default/secrets/vault-gcp: dial tcp [::1]:80: connect: connection refused

I think it's interesting that it even tries to call via HTTP and not HTTPS, which I would expect to be the default.

So it turns out that in my case, I was also pointing to a wrong location in the bucket, one that had no tfstate. As most resources in GCP have the same ID as their name, terraform was able to find and refresh my whole stack even without state, except for the Kubernetes secrets, where it was connecting to localhost because it had no state telling it where the cluster was...

In EC2, that would blow up probably sooner, as resource IDs are quite different from resource names and if you lose state you have a lot of trouble finding where everything is...

Okay, I found the problem for my case. This line here:

https://github.com/terraform-providers/terraform-provider-kubernetes/blob/45d910a26f17f7b03d684221428b86f2f02b5be2/kubernetes/resource_kubernetes_persistent_volume.go#L40

If you remove the whole CustomizeDiff part, everything works fine. So I guess the correct server isn't carried through to that point. I'll try to dig deeper there.

@alexsomesan @pdecat You added that line while refactoring the whole client handling. Can you think of any implications that could cause this behaviour? It seems as if the MainClientset isn't correctly configured when it reaches the CustomizeDiff function.

~Hi @dploeger, I believe the initialization here occurs too early. The CustomizeDiff probably needs to be replaced by a CustomizeDiffFunc.~

@pdecat You probably know how to do this. I just stumbled through the code. 😆 Are you able to provide a PR for that?

Or can you point me to how to implement that? Just replacing CustomizeDiff with CustomizeDiffFunc didn't work, at least. :)

Never mind, it won't work, CustomizeDiffFunc is the type of the CustomizeDiff field.

Let me think of something else.

@dploeger Are you building the AKS resources from module.azurekubernetes in the same apply run as the kubernetes_persistent_volume ?

Yes, I am. And that all worked until 12-9. I can’t really grasp what has changed then, because we didn’t update or change anything there.

Good point, that's the most frequent issue when localhost is involved. The configuration is not available at the time the kubernetes provider is initialized.
The point about removing CustomizeDiff fixing the issue made me think of something else, but it turns out the kubernetes client is only initialized once by the provider.
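
A commonly suggested way to avoid the "configuration not available at provider initialization" situation is to keep cluster creation and Kubernetes resources in separate root configurations, so the provider only ever reads a cluster that already exists. The following is a minimal sketch under that assumption, not something confirmed in this thread as the fix for this issue. The variable names and the data source label "existing" are hypothetical; the attribute names follow the azurerm_kubernetes_cluster data source.

```hcl
# Hypothetical second root configuration, applied only after the AKS
# cluster has been created by a separate configuration.
variable "cluster_name" {
  type = string
}

variable "resource_group_name" {
  type = string
}

# Look up the already-existing cluster, so every value below is known
# at plan time.
data "azurerm_kubernetes_cluster" "existing" {
  name                = var.cluster_name
  resource_group_name = var.resource_group_name
}

provider "kubernetes" {
  load_config_file       = false
  host                   = data.azurerm_kubernetes_cluster.existing.kube_config[0].host
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.existing.kube_config[0].cluster_ca_certificate)
}
```

The trade-off is two apply runs instead of one, in exchange for a provider whose host is never unknown during plan.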

Further question: is this happening when running TF in a Pod on the cluster?

Ummmm... I haven't tried that. Is that important? I'd have to set that up. I just tried locally. It also happens outside the container now.

I'm experiencing this with a module that nests other modules, sometimes the child modules lose provider configuration and the terraform config becomes un-applyable, but also un-destroyable!

The parent creates a DigitalOcean Kubernetes cluster inside a module, then uses the output of the module to get a data source which configures the provider e.g.

module "e2etest_k8s" {
  source = "./infrastructure/kubernetes/do"
  providers = {
    digitalocean = digitalocean.e2etest
  }
}

data "digitalocean_kubernetes_cluster" "e2etest" {
  provider = digitalocean.e2etest
  name     = module.e2etest_k8s.cluster_name
}

provider "kubernetes" {
  alias            = "e2etest"
  load_config_file = false
  host             = data.digitalocean_kubernetes_cluster.e2etest.endpoint
  token            = data.digitalocean_kubernetes_cluster.e2etest.kube_config[0].token
  cluster_ca_certificate = base64decode(
    data.digitalocean_kubernetes_cluster.e2etest.kube_config[0].cluster_ca_certificate
  )
}

// This also contains submodules
module "<rest of infra>" {
  source = "./<folders>"
  providers = {
    kubernetes = kubernetes.e2etest
  }
}

This provider is then used for a bunch of modules (which also contain modules) that then exhibit the localhost behavior (sometimes, but it seems deterministic between runs).

Any updates on this? I'm trying to upgrade from 7.0.1 to 8.2.0 of the EKS terraform module (https://github.com/terraform-aws-modules/terraform-aws-eks). I'm able to get through the initial import of the aws-auth configmap by using a local kubeconfig the first time (overriding load_config_file to true for the import), but subsequent plans always fail with a call to localhost.

my provider config looks like

provider "kubernetes" {
  load_config_file       = var.load_config 
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  version                = "1.10.0" # Stable version??
}

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

Error: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused
module.eks.kubernetes_config_map.aws_auth[0]: Refreshing state... [id=kube-system/aws-auth]
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [INFO] Checking config map aws-auth
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [DEBUG] Kubernetes API Request Details:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: ---[ REQUEST ]---------------------------------------
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: GET /api/v1/namespaces/kube-system/configmaps/aws-auth HTTP/1.1
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Host: localhost
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: User-Agent: HashiCorp/1.0 Terraform/0.12.20
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept: application/json, */*
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: Accept-Encoding: gzip
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4:
2020-02-11T10:16:22.087-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: -----------------------------------------------------
2020-02-11T10:16:22.089-0800 [DEBUG] plugin.terraform-provider-kubernetes_v1.10.0_x4: 2020/02/11 10:16:22 [DEBUG] Received error: &url.Error{Op:"Get", URL:"http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth", Err:(*net.OpError)(0xc000976050)}
2020/02/11 10:16:22 [ERROR] module.eks: eval: *terraform.EvalRefresh, err: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused
2020/02/11 10:16:22 [ERROR] module.eks: eval: *terraform.EvalSequence, err: Get http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth: dial tcp [::1]:80: connect: connection refused

I'm happy to provide further information/logs/tests to get this issue resolved ASAP. I have tried provider versions 1.8.1, 1.9.0, 1.10.0 and 1.11.0 (1.11.0 gives me a different error corresponding to issue 759). I'm using terraform 0.12.20

Having the same issue where I use the scaleway kapsule provider kubeconfig output as input for my kubernetes terraform provider. Using local kubeconfig does not resolve the issue during terraform plan. https://github.com/ironPeakServices/infrastructure/runs/435886375?check_suite_focus=true

I have the exact same problem as @jakexks and @hazcod. Everything was working when I had everything in the root module, but when I split it out into a separate module, it started giving errors saying "Error: invalid configuration: no configuration has been provided" as well as trying to connect using localhost.

@brpaz : so it works if you run it from the root module?
Might be an overall terraform issue, since I had the issue that some terraform variables were not being set for submodules, making me have to set it in the root module too e.g.: https://github.com/ironPeakServices/infrastructure/blob/master/versions.tf#L20

@hazcod yes, I had all my Terraform resources in main.tf in the root module. Everything was working.
Because the configs were growing, I created a module and split my main.tf into several files inside the module. After that change, running terraform apply started giving these errors.

But then I tried a fresh install (clean state and a new cluster provisioned from scratch) and it worked.
I think a conflict between what was persisted in the state file and the new Terraform declarations somehow caused Terraform to pick the wrong config?

After reaching out to terraform core, above issue seems to indicate that it's a kubernetes provider issue where it's not handling the unknown variables well.

I have drilled this down to the following: if a kubernetes provider is receiving unknown values (because of a dependency), it should go through with the plan because it would normally be fulfilled in the apply phase. I think that's a better approach than just erroring out now.

This is really frustrating: if my scaleway provider cluster is removed, I have to take the following manual steps:

  • Comment out kubernetes/helm provider code
  • Trigger the pipeline deploy
  • Re-enable kubernetes/helm provider code
  • Hope everything goes well or start over again

I circumvented this with:

provider "kubernetes" {
    # fixed to 1.10.0 because of https://github.com/terraform-providers/terraform-provider-kubernetes/issues/759
    version = "1.10.0" 
    # set the variable in the root module or else we have a dependency issue
    token = module.scaleway.token
}

Have this problem as well, running terraform in a container on gcp cloud build triggers. Since last month it is trying to connect to localhost and ignores the host set in the provider config.

@davidq2q Have you tried with v1.11.1?

Forgot to reply but after setting the version to 1.10.0 yesterday everything seems to work; all builds are green now.

Yes but the latest would supposedly fix that.

1.11.1 also didn't fix the issue for me.
@hazcod What do you mean by "set the variable in the root module"?

I have this in my providers.tf on root module:

provider "kubernetes" {
  load_config_file = false
  host             = module.k8s_cluster.cluster_host
  token            = module.k8s_cluster.cluster_token
  cluster_ca_certificate = base64decode(
    module.k8s_cluster.cluster_ca_certificate
  )
}

I am thinking of starting to evaluate Pulumi as an alternative to Terraform if I continue to have issues like this.

To those still facing this issue with version 1.11.1 of the kubernetes provider, could you please share the output of the terraform providers command?

I came across a similar issue and it was caused by a sub-module redefining a provider without load_config_file = false.
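
To illustrate the point above, here is a rough sketch of a layout in which the kubernetes provider is configured only once, in the root module, and handed to the child module explicitly. It assumes the child module contains no provider "kubernetes" block of its own. The module name k8s_config and its source path are hypothetical; the module.k8s_cluster outputs are borrowed from the configuration quoted earlier in this thread.

```hcl
# Root module: the only place where the kubernetes provider is configured.
provider "kubernetes" {
  load_config_file = false
  host             = module.k8s_cluster.cluster_host
  token            = module.k8s_cluster.cluster_token
  cluster_ca_certificate = base64decode(
    module.k8s_cluster.cluster_ca_certificate
  )
}

# Hypothetical child module. It declares kubernetes_* resources but no
# provider "kubernetes" block, so it cannot silently fall back to an
# unconfigured provider that loads a kubeconfig or dials localhost.
module "k8s_config" {
  source = "./modules/k8s_config" # hypothetical path

  providers = {
    kubernetes = kubernetes
  }
}
```

If a third-party module does ship its own provider block, checking that block for load_config_file = false is the equivalent test.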

```
$ tf providers
.
├── provider.aws ~> 2.44.0
├── provider.kubernetes ~> 1.10
├── provider.terraform
└── module.cluster
    ├── provider.aws (inherited)
    ├── module.alb_ingress_controller_iam_policy
    │   └── provider.aws (inherited)
    ├── module.eks
    │   ├── provider.aws >= 2.38.0
    │   ├── provider.kubernetes >= 1.6.2
    │   ├── provider.local >= 1.2
    │   ├── provider.null >= 2.1
    │   ├── provider.random >= 2.1
    │   ├── provider.template >= 2.1
    │   └── module.node_groups
    │       ├── provider.aws (inherited)
    │       └── provider.random
    ├── module.external_dns_iam_policy
    │   └── provider.aws (inherited)
    ├── module.k8s_config
    │   ├── provider.aws
    │   ├── provider.helm
    │   ├── provider.kubernetes (inherited)
    │   └── module.metrics_server
    │       └── provider.kubernetes (inherited)
    └── module.model_bucket
        └── provider.aws (inherited)
```

@liangyungong are both your providers declared in the root module and eks sub-module defining load_config_file = false ?

yes indeed.

rg 'provider.*kubernetes' -w ../../ --hidden --no-ignore --glob='*.tf' -A 5 | grep load_config_file
../../application/prd-0/environment.tf-  load_config_file       = false
../../application/prd-1/environment.tf-  load_config_file       = false
../../application/stg-0/environment.tf-  load_config_file       = false
../../application/prd-0/.terraform/modules/cluster.k8s_config.metrics_server/azure/stacks/aks_cluster/providers.tf-  load_config_file = false
../../application/prd-1/.terraform/modules/cluster.cluster_autoscaler/azure/stacks/aks_cluster/providers.tf-  load_config_file = false
../../application/prd-1/.terraform/modules/cluster.k8s_config.metrics_server/azure/stacks/aks_cluster/providers.tf-  load_config_file = false

I'm confused, your terraform providers output has eks and this grep output has aks.

@pdecat : In my case, I encounter this issue when I fire off our kubernetes provider as a dependency on scaleway provider with a fresh cluster. During plan, the variables from the scaleway provider will be empty (since there is no cluster yet), so kubernetes will dial to the default values.
More specifically in my case the kubernetes provider variables are populated with the exported kubeconfig of scaleway provider.

@hazcod I've looked into your case, but did not find any explanation yet. Maybe the issue is in the provider's configuration recorded in the existing state.

FWIW, this works in a single apply pass from scratch with v1.11.1 and GKE (I do not have a test scaleway account):

main.tf:

provider "google" {
  version = "3.12.0"
  region  = "us-west1"
  # Other provider settings provided via ENV variables
}

module gke {
  source = "./gke"
}

data "google_client_config" "default" {
}

provider "kubernetes" {
  version = "1.11.1"

  load_config_file       = false
  host                   = module.gke.endpoint
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.gke.cluster_ca_certificate)
}

module kubernetes {
  source = "./kubernetes"
}

output "cluster_name" {
  value = module.gke.cluster_name
}

output "location" {
  value = module.gke.location
}

output "endpoint" {
  value = module.gke.endpoint
}

gke/main.tf:

data "google_compute_zones" "available" {
}

resource "google_container_cluster" "primary" {
  name               = "terraform-example-cluster"
  location           = data.google_compute_zones.available.names[0]
  initial_node_count = 1

  min_master_version = "1.15.9-gke.22"
  node_version       = "1.15.9-gke.22"

  master_auth {
    username = ""
    password = ""
  }
}

output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "location" {
  value = google_container_cluster.primary.location
}

output "endpoint" {
  value = google_container_cluster.primary.endpoint
}

output "cluster_ca_certificate" {
  value = google_container_cluster.primary.master_auth[0].cluster_ca_certificate
}

kubernetes/main.tf:

resource "kubernetes_namespace" "example" {
  metadata {
    name = "terraform-example-namespace"
  }
}

Init:

# rm -rf .terraform/
# terraform init
Initializing modules...
- gke in gke
- kubernetes in kubernetes

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google" (hashicorp/google) 3.12.0...
- Downloading plugin for provider "kubernetes" (hashicorp/kubernetes) 1.11.1...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
# terraform providers
.
├── provider.google 3.12.0
├── provider.kubernetes 1.11.1
├── module.gke
│   └── provider.google (inherited)
└── module.kubernetes
    └── provider.kubernetes (inherited)

Apply:

# terraform apply -auto-approve
module.gke.data.google_compute_zones.available: Refreshing state...
data.google_client_config.default: Refreshing state...
module.gke.google_container_cluster.primary: Creating...
module.gke.google_container_cluster.primary: Still creating... [10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [30s elapsed]
module.gke.google_container_cluster.primary: Still creating... [40s elapsed]
module.gke.google_container_cluster.primary: Still creating... [50s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m0s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m30s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m40s elapsed]
module.gke.google_container_cluster.primary: Still creating... [1m50s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m0s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m10s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m20s elapsed]
module.gke.google_container_cluster.primary: Still creating... [2m30s elapsed]
module.gke.google_container_cluster.primary: Creation complete after 2m33s [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.kubernetes.kubernetes_namespace.example: Creating...
module.kubernetes.kubernetes_namespace.example: Creation complete after 1s [id=terraform-example-namespace]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

cluster_name = terraform-example-cluster
endpoint = 35.197.114.71
location = us-west1-a
# kubectl --context gke_myproject_us-west1-a_terraform-example-cluster get ns terraform-example-namespace
NAME                          STATUS   AGE
terraform-example-namespace   Active   3m9s

Destroy:

# terraform destroy -auto-approve
data.google_client_config.default: Refreshing state...
module.gke.data.google_compute_zones.available: Refreshing state...
module.gke.google_container_cluster.primary: Refreshing state... [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.kubernetes.kubernetes_namespace.example: Refreshing state... [id=terraform-example-namespace]
module.kubernetes.kubernetes_namespace.example: Destroying... [id=terraform-example-namespace]
module.kubernetes.kubernetes_namespace.example: Still destroying... [id=terraform-example-namespace, 10s elapsed]
module.kubernetes.kubernetes_namespace.example: Destruction complete after 15s
module.gke.google_container_cluster.primary: Destroying... [id=projects/myproject/locations/us-west1-a/clusters/terraform-example-cluster]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 30s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 40s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 50s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m0s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m30s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m40s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 1m50s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m0s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m10s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m20s elapsed]
module.gke.google_container_cluster.primary: Still destroying... [id=projects/myproject/locations/...1-a/clusters/terraform-example-cluster, 2m30s elapsed]
module.gke.google_container_cluster.primary: Destruction complete after 2m37s

Destroy complete! Resources: 2 destroyed.

I'm confused, your terraform providers output has eks and this grep output has aks.

they're irrelevant files, just how the modules are organised in git repo. :)

@liangyungong I still do not get how you can have AWS resources in the terraform providers output and Azure resources in the grep output. They do not correspond to each other.

Your terraform providers output explicitly states that there's a kubernetes provider initialized in the AWS eks module that is not inherited from the root module:

.
├── provider.aws ~> 2.44.0
├── provider.kubernetes ~> 1.10
├── provider.terraform
└── module.cluster
    ├── provider.aws (inherited)
    ├── module.alb_ingress_controller_iam_policy
    │   └── provider.aws (inherited)
    ├── module.eks
    │   ├── provider.aws >= 2.38.0
    │   ├── provider.kubernetes >= 1.6.2 # <-- HERE
[...]

That means there is a provider "kubernetes" block in there.

Can you check the content of that module?

There're many other modules in the same git repo, and they are irrelevant to the module that I use. Whenever I do terraform init, it clones the whole git repo.

So the module.eks provider block does not have load_config_file = false.

I'm hitting this problem, but not with any modules.

$ terraform providers
.
├── provider.google ~> 3.13
├── provider.google-beta ~> 3.13
├── provider.kubernetes.xxx ~> 1.11.1
└── provider.kubernetes.yyy ~> 1.11.1

(two separate kubernetes providers with aliases)

Is there a known workaround that doesn't involve winding back the kubernetes provider to 1.10? I need to be using 1.11 for other reasons.

Actually my setup has started working again after forcibly re-fetching credentials, though it was very confusing why it was trying to contact localhost when the creds were bad.

Not sure if this is the same problem, but just in case, I hit the following.

I had a kubernetes provider blob looking a bit like this.

provider "kubernetes" {
  version                = "1.11"
  host                   = var.credentials.host
  username               = var.credentials.username
  password               = var.credentials.password
  client_certificate     = var.credentials.client_certificate
  client_key             = var.credentials.client_key
  cluster_ca_certificate = var.credentials.cluster_ca_certificate
}

This failed in both 1.10 and 1.11. With 1.10, I got an error report explaining that I must set username and password or bearer token not both (fair enough). With 1.11, no error and it ignored host, contacting localhost.

If I removed username and password from the provider block, it all worked (in both versions). That makes me think that a validation failure in 1.11 might lead to it falling through with the host still set to localhost.
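
For reference, the working variant described above would presumably look like the sketch below: the same block with username and password dropped, leaving client certificates as the only form of authentication. The shape of var.credentials is an assumption carried over from the snippet above.

```hcl
provider "kubernetes" {
  version = "1.11"

  host = var.credentials.host

  # Exactly one authentication method: client certificates.
  # username and password are deliberately left out.
  client_certificate     = var.credentials.client_certificate
  client_key             = var.credentials.client_key
  cluster_ca_certificate = var.credentials.cluster_ca_certificate
}
```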

@plwhite The error you got in 1.10 was right, but not exhaustive, since client certificates are also an equivalent form of authentication. Better validation was introduced in 1.11, which is why you are not seeing that error anymore. The rule is to have exactly one of: token, user/pass OR client certificates. Having two of these, as in your example, is not deterministic (which one should be used to authenticate you?), and it looks like that's not being validated. We'll work on fixing that.

However, the reason you're seeing the connection to localhost is likely because Terraform is unable to resolve the value for var.credentials.host at the right time. How is var.credentials being populated in your case?

@alexsomesan I was populating var.credentials through variables set up by the azurerm provider creating an AKS cluster, which from memory did have host configured. I'm moderately sure that was set consistently but it's possible there was a transient error where it failed at about the same time as I hit this. Since moving to the more recent kubernetes provider I've seen no further issues, so quite happy to consider this fixed.

The key aspect here is whether you are creating the azurerm cluster resource in the same apply run as the kubernetes resources?

In the same apply run. Sometimes the azure cluster already existed, and sometimes not (and was created by the apply run).

I experience similar issues with this setup:

  • terraform bootstraps virtual machines
  • terraform RKE provider sets up Kubernetes/RKE
  • I use the retrieved attributes from the RKE provider to set up this kubernetes provider

Interestingly, all works well if I run terraform apply locally/on any machine but once I run it in our CI (GitLab CI and/or Jenkins), I run into the same issue that this provider does not pick up the RKE configuration but instead dials localhost port 80.
For CI we use cytopia:terragrunt (clean run without any caches).

fyi, my problem was also related to https://github.com/terraform-providers/terraform-provider-kubernetes/issues/708#issuecomment-598122673

My interesting observation was though:

  • on local machines we already had a kubeconfig present, for different clusters than the one we tried to create, and all was perfectly fine without setting load_config_file in the kubernetes and helm providers
  • on CI obviously no kubeconfig at all was already present and suddenly the provider tried to connect on localhost port 80; after setting load_config_file to false in both the kubernetes and helm provider, it was working

Could someone explain to me why it's necessary to set load_config_file = false in one case, but not in the other case where a kubeconfig file already exists? Furthermore, it seems as if the kubeconfig values get overwritten anyway.
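
For reference, a minimal sketch of what setting load_config_file to false in both providers could look like with the 1.x provider versions discussed in this thread. The four variables are hypothetical placeholders for whatever attributes the cluster-creating provider (RKE in this case) returns; they are not names from the comment above.

```hcl
# Hypothetical inputs standing in for the attributes returned by the
# provider that creates the cluster.
variable "cluster_host" {}
variable "client_certificate" {}
variable "client_key" {}
variable "cluster_ca_certificate" {}

provider "kubernetes" {
  # Do not read ~/.kube/config; use only the values given here.
  load_config_file       = false
  host                   = var.cluster_host
  client_certificate     = var.client_certificate
  client_key             = var.client_key
  cluster_ca_certificate = var.cluster_ca_certificate
}

provider "helm" {
  # In the 1.x helm provider the cluster settings live in a nested block.
  kubernetes {
    load_config_file       = false
    host                   = var.cluster_host
    client_certificate     = var.client_certificate
    client_key             = var.client_key
    cluster_ca_certificate = var.cluster_ca_certificate
  }
}
```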

Had the same issue with version "1.11.2". Solved it the following way:

  1. Downgraded to 1.10.0 and received the error:
    Error: Failed to configure: username/password or bearer token may be set, but not both
  2. Removed "username/password" and left only "client_certificate/client_key/cluster_ca_certificate".
  3. Problem solved, and now everything works with "1.11.2".

Enjoy.

I have the issue with 1.11.4 on EKS. It's like the provider is initialized with default settings even if in my module I'm using the credentials from the EKS cluster. I found no workaround to the issue. This is really frustrating.

I verified that I do not have any other kubernetes provider set that could override it. I'm still unsure, but it could be related to the fact that I'm using terragrunt :shrug:

I just tried reverting to version 1.10.0 of the provider. It worked. I managed to create the resources, but the next plan failed with:

Error: namespaces "my_namespace" is forbidden: User "system:anonymous" cannot get resource "namespaces" in API group "" in the namespace "my_namespace"

I guess it is related to EKS RBAC, but how is it possible to avoid the anonymous user without a kubeconfig?

I managed to make it work with

provider "kubernetes" {
  version                = "~> 1.11.0"
  load_config_file       = false
  host                   = aws_eks_cluster.eks.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.eks.certificate_authority.0.data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["token", "-i", aws_eks_cluster.eks.name, "-r", "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/MyRole"]
    command     = "aws-iam-authenticator"
  }
}

I think I understand what is happening here.

  1. First plan: no resources are created, so the token is not required.
  2. First apply: the token is generated by
data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}
  3. Second plan: it doesn't work, because the plan will not generate a token from the data source.

Getting the token on each provider call, as in the solution above, works just fine.
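
For contrast with the exec-based block, the data-source setup that the steps above describe presumably looked roughly like the sketch below. This is a reconstruction, not the commenter's actual configuration: module.eks.cluster_endpoint and module.eks.cluster_certificate_authority_data are assumed output names, and only the data source itself is taken from the comment.

```hcl
# Reconstruction for illustration: the token comes from a data source
# instead of being fetched by an exec plugin on every provider call.
data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  version          = "~> 1.11.0"
  load_config_file = false

  # Assumed module outputs; adjust to whatever the EKS module exposes.
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  # The token is only available once the data source has been read.
  token = data.aws_eks_cluster_auth.cluster.token
}
```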

I have the same issue when using Kubernetes Provider > 1.10 (maybe related to https://github.com/hashicorp/terraform-provider-kubernetes/issues/759). Using Provider Version 1.10.0 works as expected. 1.11 and 1.12 do not work with the following config running inside a Kubernetes Cluster:

KUBE_LOAD_CONFIG_FILE=false
KUBERNETES_SERVICE_HOST=<k8s-host>
KUBERNETES_SERVICE_PORT=443

Steps to reproduce:

  1. Create a Pod with the Environment Variables mentioned above (https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs#in-cluster-service-account-token)
  2. Use a minimal Terraform File with a Kubernetes Secret Resource (see below)
  3. Run terraform apply

Results in Error: Post "http://localhost/api/v1/namespaces/default/secrets": dial tcp 127.0.0.1:80: connect: connection refused

provider "kubernetes" {
  version = "~> 1.11"
}

resource "kubernetes_secret" "test" {
  metadata {
    name = "test"
    namespace = "default"
  }

  data = {
    test = "data"
  }
}

I tried to configure the Kubernetes Provider using load_config_file and KUBE_LOAD_CONFIG_FILE. Enabling debug shows the following: [WARN] Invalid provider configuration was supplied. Provider operations likely to fail: invalid configuration: no configuration has been provided
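
One way to rule out the in-cluster auto-detection as the cause would be to pass the service account credentials explicitly. The sketch below is an assumption, not a confirmed fix: var.kubernetes_host is a hypothetical variable, while the two file paths are the standard locations where Kubernetes mounts the service account token and CA certificate into a Pod.

```hcl
# Hypothetical variable; it could be supplied as
# TF_VAR_kubernetes_host="https://<k8s-host>:443".
variable "kubernetes_host" {}

provider "kubernetes" {
  version          = "~> 1.11"
  load_config_file = false
  host             = var.kubernetes_host

  # Standard mount paths of the Pod's service account credentials.
  token                  = file("/var/run/secrets/kubernetes.io/serviceaccount/token")
  cluster_ca_certificate = file("/var/run/secrets/kubernetes.io/serviceaccount/ca.crt")
}
```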

@etwillbefine I wasn't able to reproduce the issue with the configuration you provided.
I ran a test inside a Debian container in a Pod on a 1.18 cluster and it worked as expected for me. See the output below.

root@test-708:/test-708# env | grep KUBERNETES | sort
KUBERNETES_PORT=tcp://10.3.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.3.0.1:443
KUBERNETES_PORT_443_TCP_ADDR=10.3.0.1
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.3.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
root@test-708:/test-708# cat main.tf
provider "kubernetes" {
  version = "~> 1.11"
  load_config_file = "false"
}

resource "kubernetes_namespace" "test" {
  metadata {
    name = "test"
  }
}

root@test-708:/test-708# terraform init

Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/kubernetes versions matching "~> 1.11"...
- Installing hashicorp/kubernetes v1.13.1...
- Installed hashicorp/kubernetes v1.13.1 (signed by HashiCorp)

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
root@test-708:/test-708# terraform version
Terraform v0.13.2
+ provider registry.terraform.io/hashicorp/kubernetes v1.13.1
root@test-708:/test-708# terraform apply -auto-approve
kubernetes_namespace.test: Creating...

Error: namespaces is forbidden: User "system:serviceaccount:default:default" cannot create resource "namespaces" in API group "" at the cluster scope

  on main.tf line 6, in resource "kubernetes_namespace" "test":
   6: resource "kubernetes_namespace" "test" {

I'm going to close this issue as it's become a catch-all for credentials misconfigurations.
Please open separate issues if you're having trouble with configuring credentials so we can address them specifically.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉, please reach out to my human friends 👉 [email protected]. Thanks!