Hi there,
I've been reading through the issue history and found some related issues (https://github.com/terraform-providers/terraform-provider-google/issues/285, https://github.com/terraform-providers/terraform-provider-google/issues/408 and https://github.com/terraform-providers/terraform-provider-google/issues/474), but the issue I'm reporting is still not resolved. I'm also not quite sure whether it can be solved in Terraform, but I do know for sure that the node pool configuration of an existing cluster can be updated through the Google Cloud Console without recreating the cluster.
When using only the google_container_cluster resource (and not the node pool resource), the following actions all cause the cluster to be recreated. In practice, none of these requires recreating the whole cluster, since the Google Cloud Console allows all of these modifications:
In addition, changing these should also not recreate the cluster but only the node pool (which is the expected behaviour):
Google provider 0.15.0
We are not using any of the google_container_node_pool resources in addition to the google_container_cluster resource.
I'm not sure why this is marked as an enhancement; this is clearly a bug.
I added a second node pool, and further terraform plan runs want to delete the default node_pool and then delete and rebuild the cluster.
Chiming in to add more details. It seems any configuration change related to the node_config parameter triggers the cluster to be re-created. Image type was already mentioned in the original post, but, for example, any GCP setup that relies on network tags or taints (a beta feature, but still) will have to tear down the cluster completely when making changes in the google_container_cluster resource. A simple example:
resource "google_container_cluster" "foo" {
name = "bar"
initial_node_count = 3
node_config {
tags = ["baz"]
}
}
If I then add "xyzzy" as one of the tags, changing the value to tags = ["baz", "xyzzy"], Terraform will want to re-create the cluster:
-/+ google_container_cluster.foo (new resource required)
id: "bar" => <computed> (forces new resource)
...
node_config.0.tags.#: "1" => "2" (forces new resource)
node_config.0.tags.0: "baz" => "baz"
node_config.0.tags.1: "" => "xyzzy" (forces new resource)
Now, some of the fields in node_config can be updated through the projects.zones.clusters API (in both the v1 and v1beta1 versions). The API takes a ClusterUpdate object as a parameter. Network tags and taints are not supported in that API call, so there's no way around it currently.
The only workaround I can think of right now is not to use the google_container_cluster resource for any node-pool-related configuration. If node pools are versioned by naming them accordingly, like my-nodepool-v1, you can create a new node pool with a different node config and call it my-nodepool-v2, then use Kubernetes to move pods from one node pool to the other.
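As a rough sketch of that workaround (the pool name, zone, and node settings here are illustrative, and the cluster reference assumes the example cluster above):

resource "google_container_node_pool" "my_nodepool_v2" {
  name       = "my-nodepool-v2"
  zone       = "us-central1-a"
  cluster    = "${google_container_cluster.foo.name}"
  node_count = 3

  node_config {
    # The changed settings (e.g. the new tags) go on the new pool.
    tags = ["baz", "xyzzy"]
  }
}

Once the workloads have been moved over, the old my-nodepool-v1 resource can be removed from the configuration, which deletes only that pool.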
Hey all, sorry for the problems here, and for taking so long to respond. There are a few different things going on, which I'll try to explain here:
Node pools inside the google_container_cluster resource can't be added or removed without deleting the entire cluster.
This is something we've been aware of for a while, but the fix is actually pretty hard, and will almost certainly be a breaking change, so we're planning on fixing it for v2.0.0 of the provider. There's some more context in https://github.com/terraform-providers/terraform-provider-google/issues/780 and its linked issues.
Updating initial node count causes a recreate of the cluster.
This is WAI, as far as I understand it. My interpretation is that if you want to change the _initial_ node count of the cluster, that means you want to recreate it with a new node count at the beginning. If you want to change the node count of an existing node pool, you can use the node_count property.
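For example, on a separately-managed pool that might look like this (a minimal sketch; the names, zone, and count are illustrative):

resource "google_container_node_pool" "primary" {
  name    = "primary-pool"
  zone    = "us-central1-a"
  cluster = "${google_container_cluster.foo.name}"

  # node_count can be changed in place; initial_node_count on the cluster cannot.
  node_count = 5
}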
Changes to node pools inside the cluster recreate the cluster instead of just recreating the node pool.
This is interesting! Here's what's going on. At some point in time, we realized that, in order to meet the use cases of all our users, we did need to have two separate ways of configuring node pools: on the google_container_cluster resource, and as a separate google_container_node_pool resource. The schemas for the two of them eventually started falling out of sync, so we fixed this by using a shared schema between the two. Since some of the attributes of the node pool aren't updatable, we mark them ForceNew in the schema, which means "recreate the resource when this changes". This works great when setting those attributes on the node pool resource, but when they're set on a node pool inside the cluster resource, the ForceNew ends up referring to the full cluster instead of to the node pool inside it. This should definitely be fixed.
Some fields in the node pool are updatable via the API, but Terraform still wants to do a destroy/create for those.
This happens sometimes when GKE starts allowing a field to be updatable that wasn't before. I'd like to do a sweep at some point to check that we have the ability to update everything in TF that is updatable in GKE, but I just haven't gotten around to it yet. If you come across one, feel free to file an issue here and tag me in it.
So, what can you do right now? One option is to use the google_container_node_pool resource, since that already works the way it's supposed to (except for some potentially updatable fields that need to be marked as such). If you use that along with the remove_default_node_pool property on the cluster, you should be able to do everything you'd like to do. This is my recommendation. However, I know not everybody wants to use that, and we want to make sure that users that want to be able to say "I want these node pools and only these node pools on my cluster" can do so. In that case, I ask for your patience as we figure out the story for v2.0.0 of the provider, and try to balance fixing issues like these (that have workarounds, even though they may not be ideal) with some of the other features we're trying to add to the provider. I've gone ahead and assigned this to myself, and hopefully I'll have some time to work on a fix for point 3 above soon. I hope this helps, at least somewhat. Let me know if you have any questions!
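For reference, a minimal sketch of that recommended setup (names, zone, and counts are illustrative, and exact arguments may vary slightly between provider versions):

resource "google_container_cluster" "foo" {
  name               = "bar"
  zone               = "us-central1-a"
  initial_node_count = 1

  # The default pool is created and then deleted right after cluster creation;
  # all real node pools live in google_container_node_pool resources.
  remove_default_node_pool = true
}

resource "google_container_node_pool" "primary" {
  name       = "primary-pool"
  zone       = "us-central1-a"
  cluster    = "${google_container_cluster.foo.name}"
  node_count = 3

  node_config {
    # Changing a ForceNew field here recreates only this pool, not the cluster.
    tags = ["baz"]
  }
}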
Hi Dana,
Thanks for the extensive reply, very helpful.
I was not aware of the remove_default_node_pool parameter until now, so that allows us to use the google_container_node_pool resources.
If you have an existing cluster with existing Terraform state managed through google_container_cluster with node pools configured, is it possible to switch to a combination of google_container_cluster and google_container_node_pool without destroying the cluster? (These are in production already.)
Regards,
Roland
I think so! If you use terraform import, you should be able to get the new node pool resources into your state. Then, because node_pools are marked computed in the cluster resource (an implementation detail, not something you should necessarily need to know), you should be able to remove all the node pools from the cluster resource without any diffs. Of course, please run terraform plan to verify that it all worked, and try it out in a dev environment first.
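A rough outline of that migration, assuming a zonal cluster (the resource address and import ID below are illustrative; check the google_container_node_pool documentation for the exact import ID format your provider version expects):

# 1. Add a google_container_node_pool resource matching the existing pool's settings.
# 2. Import the existing pool into state:
terraform import google_container_node_pool.primary us-central1-a/my-cluster/my-pool
# 3. Remove the node_pool block(s) from the google_container_cluster resource.
# 4. Confirm that nothing will be destroyed:
terraform plan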
@danawillow thank you for the explanation and work around.
I confirmed that using remove_default_node_pool works.
Below is a two-step process for provisioning a new cluster that will allow you to add and remove node_pools without destroying the cluster (a configuration sketch follows the steps).
Step 1
Create a cluster with a node pool defined; do this with the google_container_cluster resource.
Set remove_default_node_pool to false.
Run terraform apply.
Step 2
Add a second node pool using the google_container_node_pool resource.
Comment out or remove the node_pool block from the google_container_cluster resource.
Set remove_default_node_pool to true.
Run terraform apply.
Feel free to add or remove node_pools; the cluster will not be destroyed.
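To make the steps concrete, here is a rough sketch of the step 1 cluster resource (names and values are illustrative); the comments mark what changes in step 2, and step 2 additionally adds a separate google_container_node_pool resource as described above:

resource "google_container_cluster" "foo" {
  name                     = "bar"
  zone                     = "us-central1-a"
  remove_default_node_pool = false   # step 2: change this to true

  node_pool {                        # step 2: comment out or remove this block
    name       = "pool-1"
    node_count = 3
  }
}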
Hi Dana,
Thanks a lot for the comments.
A lot of info has been added to this issue, and I think one of the main points from vtorhonen@ has fallen through the cracks.
One of the issues we noticed is that updating network tags on existing pools (regardless of how they were created) causes the whole cluster to be re-created.
Node network tags are updatable via the GCE API.
Any idea if this is something that can be fixed easily before v2 of the provider?
Yup, that's point 3 in my giant post above, and the reason I'm keeping this issue open. I'm hoping to work on it soon, but I can't make any guarantees. I'd say it's likely, though.
Oh no just kidding, it's not point 3. I don't see anything that lets tags be updated at https://cloud.google.com/kubernetes-engine/docs/reference/rest/, can you point me to the API you use to update them?
Thank you @danawillow for your constructive response.
I have a couple of queries.
I'm not actually a GKE expert, I'm just very familiar with their API, so I can't really answer question 1. I'd recommend looking around the GKE docs to see if it answers that question. Likewise, I don't have an opinion on question 2 since it's not my area of expertise, but it seems to be a thing many of our Terraform users are happy to have the option of doing.
Hello again everyone! I've looked into things some more and, at least with the way things are right now, I'm going to have to close this issue unsolved.
The main thing in this that was still open was that changing an attribute of a node pool that's defined inside the cluster resource recreates the entire cluster instead of that node pool. The way that we would fix this would be to call nodePools.delete() and then nodePools.add() for the changed node pool. However, the new node pool won't necessarily be created at the same index in the list that the old one was in. This means that the next time you run Terraform, it could show a diff because the node pools were reordered. This is surprising behavior, and I don't want people to run into it. Being able to reorder the node pools falls into bullet point 1 in my earlier reply, which I spent a fair bit of time trying to fix. Unfortunately, it looks like that won't be possible for a number of reasons, which I'll write up (probably in #780) once I have a bit more time.
The only time that this would be ok is if you only define a single node pool inside the cluster, since you can't reorder a list of size one. However, that's a lot of hacky logic to write into the resource. If there were absolutely no other workarounds, then I'd consider it. However, there is one: using the node pool resource.
So what I can do is update the documentation to make it much clearer what the limitations are on using the node_pool list inside of the cluster resource so people can make a more aware decision on whether to use that or the node_pool resource. I'll be doing that shortly. In the meantime, sorry for the less-than-ideal outcome of this issue.
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!