Terraform v0.11.8
resource "google_container_cluster" "primary" {
  name = "${var.cluster_name}"

  # If we want a regional cluster, should we be looking at https://cloud.google.com/kubernetes-engine/docs/concepts/regional-clusters#regional
  # region = "${var.region}"
  zone             = "${var.main_zone}"
  additional_zones = "${var.additional_zones}"

  # Node count for every region
  initial_node_count = 1

  project                  = "${var.project}"
  remove_default_node_pool = true
  enable_legacy_abac       = true

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_write",
      "https://www.googleapis.com/auth/sqlservice.admin",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }

  addons_config {
    horizontal_pod_autoscaling {
      disabled = false
    }
  }
}

resource "google_container_node_pool" "nodepool" {
  name       = "${var.cluster_name}nodepool"
  zone       = "${var.main_zone}"
  cluster    = "${google_container_cluster.primary.name}"
  node_count = "${var.node_count}"

  autoscaling {
    min_node_count = "${var.min_node_count}"
    max_node_count = "${var.max_node_count}"
  }
}
There's a lot of info in those logs to be sharing them openly. Is there any tool to anonymise them? I'm happy to share them if there's no sensitive data in them, but I couldn't find much info about it.
It does not crash.
Once applied successfully, if I run terraform plan again, no changes should be needed.
However, if I run terraform plan right after a successful apply, I get:
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement
Terraform will perform the following actions:
-/+ module.google.google_container_cluster.primary (new resource required)
id: "dev" => <computed> (forces new resource)
additional_zones.#: "1" => "1"
additional_zones.2873062354: "europe-west2-a" => "europe-west2-a"
addons_config.#: "1" => "1"
addons_config.0.horizontal_pod_autoscaling.#: "1" => "1"
addons_config.0.horizontal_pod_autoscaling.0.disabled: "false" => "false"
addons_config.0.http_load_balancing.#: "0" => <computed>
addons_config.0.kubernetes_dashboard.#: "0" => <computed>
addons_config.0.network_policy_config.#: "1" => <computed>
cluster_ipv4_cidr: "10.20.0.0/14" => <computed>
enable_binary_authorization: "false" => "false"
enable_kubernetes_alpha: "false" => "false"
enable_legacy_abac: "true" => "true"
endpoint: "****" => <computed>
initial_node_count: "1" => "1"
instance_group_urls.#: "2" => <computed>
logging_service: "logging.googleapis.com" => <computed>
master_auth.#: "1" => <computed>
master_version: "1.9.7-gke.6" => <computed>
monitoring_service: "monitoring.googleapis.com" => <computed>
name: "dev" => "dev"
network: "****" => "default"
network_policy.#: "1" => <computed>
node_config.#: "1" => "1"
node_config.0.disk_size_gb: "100" => <computed>
node_config.0.disk_type: "pd-standard" => <computed>
node_config.0.guest_accelerator.#: "0" => <computed>
node_config.0.image_type: "COS" => <computed>
node_config.0.local_ssd_count: "0" => <computed>
node_config.0.machine_type: "n1-standard-1" => <computed>
node_config.0.oauth_scopes.#: "6" => "6"
node_config.0.oauth_scopes.1277378754: "https://www.googleapis.com/auth/monitoring" => "https://www.googleapis.com/auth/monitoring"
node_config.0.oauth_scopes.1328717722: "" => "https://www.googleapis.com/auth/devstorage.read_write" (forces new resource)
node_config.0.oauth_scopes.1632638332: "https://www.googleapis.com/auth/devstorage.read_only" => "" (forces new resource)
node_config.0.oauth_scopes.172152165: "https://www.googleapis.com/auth/logging.write" => "https://www.googleapis.com/auth/logging.write"
node_config.0.oauth_scopes.1733087937: "" => "https://www.googleapis.com/auth/cloud-platform" (forces new resource)
node_config.0.oauth_scopes.299962681: "" => "https://www.googleapis.com/auth/compute" (forces new resource)
node_config.0.oauth_scopes.316356861: "https://www.googleapis.com/auth/service.management.readonly" => "" (forces new resource)
node_config.0.oauth_scopes.3663490875: "https://www.googleapis.com/auth/servicecontrol" => "" (forces new resource)
node_config.0.oauth_scopes.3859019814: "https://www.googleapis.com/auth/trace.append" => "" (forces new resource)
node_config.0.oauth_scopes.4205865871: "" => "https://www.googleapis.com/auth/sqlservice.admin" (forces new resource)
node_config.0.preemptible: "false" => "false"
node_config.0.service_account: "default" => <computed>
node_pool.#: "1" => <computed>
node_version: "1.9.7-gke.6" => <computed>
private_cluster: "false" => "false"
project: "***" => "***"
region: "" => <computed>
remove_default_node_pool: "true" => "true"
zone: "europe-west2-b" => "europe-west2-b"
Plan: 1 to add, 0 to change, 1 to destroy.
------------------------------------------------------------------------
This plan was saved to: devplan.tfplan
To perform exactly these actions, run the following command to apply:
terraform apply "devplan.tfplan"
To reproduce: terraform apply, then terraform apply again. This was not happening when using the default node pool. I started seeing the issue after using my own node pool instead, so I think it may be related to the node pool.
Maybe related to https://github.com/hashicorp/terraform/issues/18209 ?
Hmm, looking at that plan, what stands out to me is:
node_config.0.oauth_scopes.#: "6" => "6"
node_config.0.oauth_scopes.1277378754: "https://www.googleapis.com/auth/monitoring" => "https://www.googleapis.com/auth/monitoring"
node_config.0.oauth_scopes.1328717722: "" => "https://www.googleapis.com/auth/devstorage.read_write" (forces new resource)
node_config.0.oauth_scopes.1632638332: "https://www.googleapis.com/auth/devstorage.read_only" => "" (forces new resource)
node_config.0.oauth_scopes.172152165: "https://www.googleapis.com/auth/logging.write" => "https://www.googleapis.com/auth/logging.write"
node_config.0.oauth_scopes.1733087937: "" => "https://www.googleapis.com/auth/cloud-platform" (forces new resource)
node_config.0.oauth_scopes.299962681: "" => "https://www.googleapis.com/auth/compute" (forces new resource)
node_config.0.oauth_scopes.316356861: "https://www.googleapis.com/auth/service.management.readonly" => "" (forces new resource)
node_config.0.oauth_scopes.3663490875: "https://www.googleapis.com/auth/servicecontrol" => "" (forces new resource)
node_config.0.oauth_scopes.3859019814: "https://www.googleapis.com/auth/trace.append" => "" (forces new resource)
node_config.0.oauth_scopes.4205865871: "" => "https://www.googleapis.com/auth/sqlservice.admin" (forces new resource)
So here's what I think's happening: the node_config block on google_container_cluster only describes the default node pool, which you're deleting via remove_default_node_pool = true. After the apply, the refreshed state holds GKE's default oauth_scopes (devstorage.read_only, servicecontrol, trace.append, etc.) rather than the ones in your config, and since changes to oauth_scopes force a new resource, every subsequent plan wants to destroy and recreate the cluster.
It's not perfect, but I believe if you move the node_config block from container_cluster into the node_pool, that confusion will be resolved.
I'll investigate and see if we can't come up with a better solution for this to make it work intuitively.
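For illustration, here is a minimal sketch of that change against the Terraform 0.11 config above (variable and resource names reused from it; enable_legacy_abac and addons_config left out for brevity; untested): the node_config block moves off the cluster and onto the node pool, so the cluster no longer carries scope settings that only describe the default pool it deletes.

resource "google_container_cluster" "primary" {
  name                     = "${var.cluster_name}"
  zone                     = "${var.main_zone}"
  additional_zones         = "${var.additional_zones}"
  project                  = "${var.project}"
  initial_node_count       = 1
  remove_default_node_pool = true

  # no node_config here -- at the cluster level it would only describe the
  # default node pool, which is removed anyway
}

resource "google_container_node_pool" "nodepool" {
  name       = "${var.cluster_name}nodepool"
  zone       = "${var.main_zone}"
  cluster    = "${google_container_cluster.primary.name}"
  project    = "${var.project}"
  node_count = "${var.node_count}"

  node_config {
    # same scopes the cluster-level node_config requested above
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_write",
      "https://www.googleapis.com/auth/sqlservice.admin",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }

  autoscaling {
    min_node_count = "${var.min_node_count}"
    max_node_count = "${var.max_node_count}"
  }
}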
That actually makes a lot of sense. I didn't think about that.
I was just confused about the:
id: "europe-west2-b/dev/devnodepool" =>
Thank you very much! (yes, it does indeed fix the problem)
omg thanks for this, i've been banging my head with this for a few days ;)
So it sounds like we either have a documentation problem or a validation problem. I'm not 100% up to speed on why we have node_config at both the cluster and node pool levels, so I'm not confident I have all the use cases in mind to say what the ideal solution is here. We could improve this through documentation, by not letting the cluster set node_config, or by handling an empty node_config on a node pool better. I'll leave this open so we can investigate those options.
@paddycarver the answer to your question is that the node_config on the cluster corresponds to the default node pool. The ideal solution would be to have a default_node_pool block on the cluster, but alas, that's not what the API gives us to work with. In the meantime, we can probably solve this through documentation.
Wow, until this is resolved, a big fat warning should be added to the docs.
We advertise this as the recommended way to bootstrap a GKE cluster, yet it recreates the cluster on every terraform apply.
Flipped default and added warning in https://github.com/terraform-providers/terraform-provider-google/pull/3733.
Hey @flokli! Our recommendation is to use separately managed node pools and _not use the default node pool at all_.
If you specify a node_config block, you're telling Terraform you want to use the default node pool. That block was badly named by the API and, by extension, by the original implementation in Terraform: despite the name lacking a default_ prefix, it only applies to the default node pool.
As shown in the recommended example, both node_config and node_pool should be omitted from the cluster resource.
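As a rough sketch of what that looks like for the config in this issue (again Terraform 0.11 syntax, variable names borrowed from above, untested): the cluster keeps remove_default_node_pool = true plus the mandatory initial_node_count, with no node_config or inline node_pool blocks, and all node settings live on the separately managed google_container_node_pool resource shown earlier in the thread.

resource "google_container_cluster" "primary" {
  name    = "${var.cluster_name}"
  zone    = "${var.main_zone}"
  project = "${var.project}"

  # only used to create (and then delete) the default pool
  initial_node_count       = 1
  remove_default_node_pool = true

  # no node_config and no inline node_pool blocks on the cluster
}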
@rileykarson if I copy that exact example:
https://www.terraform.io/docs/providers/google/r/container_cluster.html#example-usage-with-a-separately-managed-node-pool-recommended-
and terraform apply a second time, it'll destroy and recreate the whole cluster.
I just tested this, and I can confirm that the 'recommended' example destroys itself on every run of terraform apply, even when not using the default pool.
The same is true of the other example using the default node pool, and neither is related to configuration of node pools. This is related to a breaking change from the GKE API where a default value was changed. Patching with https://github.com/GoogleCloudPlatform/magic-modules/pull/1844. See https://github.com/terraform-providers/terraform-provider-google/issues/3672 / #3369.
https://www.terraform.io/docs/providers/google/r/container_cluster.html#node_config is more clear about being used for the default node pool now. I don't think there's anything actionable to fix here, so I'm going to close this out. If anyone has anything unresolved and thinks this should be reopened, feel free to comment and I will.
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!