Terraform-provider-google: Not possible to use `google_container_node_pool` without the default node pool

Created on 21 Nov 2017  ·  13 comments  ·  Source: hashicorp/terraform-provider-google

Right now it is not possible to use google_container_node_pool without the default node pool created by google_container_cluster. It should be possible to create a google_container_cluster without a default node pool, so that all node pools can be managed exclusively through google_container_node_pool.

I have confirmed that it is possible in GKE to create a cluster without any node pool.

See #285 and #475
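
Concretely, the requested end state would be a configuration along these lines, where the cluster resource owns no nodes of its own and every pool is managed separately (a sketch of the requested behavior only; the provider does not support this shape today, and the resource names are illustrative):

# Sketch of the requested behavior: the cluster creates no default node pool.
resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
}

# All node pools are managed exclusively through google_container_node_pool.
resource "google_container_node_pool" "pool" {
  name       = "my-cluster-nodes"
  cluster    = "${google_container_cluster.cluster.name}"
  zone       = "us-west1-a"
  node_count = 3

  node_config {
    machine_type = "n1-standard-1"
  }
}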

enhancement


All 13 comments

Hey @rochdev, how did you create a cluster with no node pool? Did you do so by creating the cluster and then deleting the node pool, or some other way?

@danawillow I indeed created the cluster and then deleted the default node pool. It is unclear from the API documentation whether it is possible to create a cluster without any node pool directly, and I didn't try it.

Our current strategy is to create a cluster module that contains a google_container_cluster resource and a null_resource that deletes the default node pool. The outputs of this module depend on the null_resource to ensure that external modules can only create node pools after the default node pool has been deleted (to avoid possible conflicts).
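
A rough sketch of what such a module might look like, reusing the gcloud command shared later in this thread (illustrative only; whether an output accepts an explicit depends_on depends on the Terraform version, so treat that part as an assumption):

variable "project" {
  default = "my-project"
}

resource "google_container_cluster" "cluster" {
  name               = "my-cluster"
  zone               = "us-west1-a"
  initial_node_count = 1
}

# Deletes the default pool as soon as the cluster exists.
resource "null_resource" "default_pool_deleter" {
  provisioner "local-exec" {
    command = "gcloud container node-pools --project ${var.project} --quiet delete default-pool --cluster ${google_container_cluster.cluster.name} --zone us-west1-a"
  }
}

# External modules read the cluster name from this output, so they only see
# it after the default pool has been deleted.
# Assumption: this Terraform version accepts depends_on on outputs; if not,
# fold the deleter's id into the value expression instead.
output "cluster_name" {
  value      = "${google_container_cluster.cluster.name}"
  depends_on = ["null_resource.default_pool_deleter"]
}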

We are also struggling with this issue. We finally had to delete default-pool manually, and it seems like it is not tracked by Terraform at all, because the next plan run does not complain about any discrepancy.

The null_resource is the way to go for this, at least for now. In this case, Terraform follows the exact same behavior as gcloud and the console: first you create the cluster, then you delete the default node pool.

@burdiyan yup, if node pools aren't set in the cluster schema, then terraform won't show a diff.

Can anyone share some sample code of their implementation with null_resource? I have this exact problem to fix too!

Here's an example of a container cluster, a separately managed node pool (how it should work, IMO), and then a null_resource that deletes the default pool after the cluster is created.

resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
  initial_node_count = 1
}

resource "google_container_node_pool" "pool" {
  name = "my-cluster-nodes"
  node_count = "3"
  zone = "us-west1-a"
  cluster = "${google_container_cluster.cluster.name}"
  node_config {
    machine_type = "n1-standard-1"
  }
  # Delete the default node pool before spinning this one up
  depends_on = ["null_resource.default_cluster_deleter"]
}

resource "null_resource" "default_cluster_deleter" {
  provisioner "local-exec" {
    command = <<EOF
      gcloud container node-pools \
        --project my-project \
        --quiet \
        delete default-pool \
        --cluster ${google_container_cluster.cluster.name} \
        --zone us-west1-a
EOF
  }
}

I would recommend using the --project flag in the gcloud command in the local-exec provisioner to make sure that you aren't accidentally running against some other project configuration. The --quiet flag bypasses the confirmation prompt for deleting the node pool. For whatever reason, I could only get it to work by putting the --cluster argument after the delete command, so pardon the ugliness. The depends_on in the new node pool makes sure that the default one gets deleted first; otherwise you will see an error about trying to perform two operations on the cluster (deleting and creating) at the same time.

Thanks @mattdodge! A quick note that with #937 (which made it into our most recent release), the depends_on shouldn't be necessary anymore (though it also doesn't hurt at all to keep it).

Oh, interesting, glad to see that made it in. I think I would still have to tell terraform that the null resource is doing something to the cluster though, unless there's some super crazy magic going on behind the scenes. Is that right?

I'm running Terraform 0.11.2 and 1.5.0 of the google provider, and I needed to include it. I would rather mark the null resource as affecting the cluster than rely on the artificial depends_on, if that were possible, though.
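
There isn't a first-class way to tell Terraform that a null_resource mutates another resource; the closest approximation seems to be keying the deleter off the cluster via triggers so it re-runs if the cluster is ever recreated, while the node pool keeps the explicit depends_on. A rough sketch, not taken from a tested configuration:

resource "null_resource" "default_cluster_deleter" {
  # Re-run the deleter whenever the cluster is replaced (the endpoint
  # changes on recreation).
  triggers {
    cluster_endpoint = "${google_container_cluster.cluster.endpoint}"
  }

  provisioner "local-exec" {
    command = "gcloud container node-pools --project my-project --quiet delete default-pool --cluster ${google_container_cluster.cluster.name} --zone us-west1-a"
  }
}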

Oh nope you're totally right, sorry to mislead.

Here's a small enhancement to @mattdodge's workaround: when using the node_pool config in google_container_cluster, you can actually initialize the cluster with an empty node pool (it still creates a node pool, but the pool itself has 0 nodes). This makes deletion of the default node pool a bit faster: no intermediate VM is created, just the useless empty node pool.

To do so, instead of:

resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
  initial_node_count = 1
}

use:

resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
    node_pool = [{
    name = "default-pool"
    node_count= 0
  }]
}

The problem with this approach is that changing parameters of the node pool which are otherwise updatable results in a complete cluster rebuild.

Or at least it did the last time I tried.


@Stono

Well, you don't want to update that inline node_pool config, but you can just delete it. In my case, when I delete the node_pool config after the initial deployment, Terraform doesn't detect any changes and just ignores it:

resource "google_container_cluster" "cluster" {
  name = "my-cluster"
  zone = "us-west1-a"
}

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
