Terraform-provider-google: Add option to add taints to `google_container_node_pool`

Created on 3 Nov 2017  ·  15 Comments  ·  Source: hashicorp/terraform-provider-google

_This issue was originally opened by @eyalzek as hashicorp/terraform#16548. It was migrated here as a result of the provider split. The original body of the issue is below._


According to this doc:
https://cloud.google.com/container-engine/docs/node-taints#applying_taints_to_a_node_pool

it's possible to apply a taint to a node pool or cluster, but the taint has to be specified at creation time. This makes it impossible to manage the node pool with Terraform unless you create the pool with gcloud and then import it into Terraform. The problem with that is that many of the pool's configuration options cannot be altered after creation, so you have to translate almost all of the Terraform configuration into command-line arguments when creating the pool. Otherwise Terraform will attempt to recreate the pool and, as a result, drop the taint configuration.

After doing all of that and running terraform plan, Terraform still complains about two fields: initial_node_count (which cannot be configured using gcloud) and id. The only way to manage this with Terraform was to add a lifecycle block such as:

  lifecycle {
    ignore_changes = ["initial_node_count", "id"]
  }

All in all not ideal.

enhancement

All 15 comments

Just to update: with the latest google-cloud-sdk (178.0.0-0) there's a --num-nodes option that corresponds to initial_node_count. Also, Terraform no longer complains about the id attribute (https://github.com/terraform-providers/terraform-provider-google/commit/c9e2ce7f088a2de484079466afe49c1a32e24321), so the lifecycle rules I set up are unnecessary.

Hey @eyalzek, it looks like there are a few different issues you've encountered. The first one is to add the option for node taints, which we can absolutely do. Let's leave this issue open for that.

It also looks like you encountered some issues when running terraform import and terraform plan for node pools/clusters. Mind filing separate issues for those so one of us can investigate? If you could also include a sample config and debug logs, that would help us figure out what's not working correctly.

@danawillow I cannot edit the original ticket, but as I mentioned in the previous comment, the second problem I had was due to my out-of-date packages and not an actual problem with Terraform. So this ticket is about adding missing attributes to google_container_node_pool.

Duplicates #621, FWIW.

@danawillow This requires the container/v1beta1 API. I'm super curious how versioned API support has worked out elsewhere and whether or not that's something we want to do here, whether in the context of this feature or not. I know the team maintaining the Kubernetes provider has said they may support non-GA APIs once the dust settles over here and they've had a chance to confer. We can have the more general discussion elsewhere, if you think it's something worth exploring for the container API.

Side note: It is awesome that this team is pioneering versioned API support!

@eyalzek I know it's a bit hacky, but this is what I'm doing until this lands, if it's helpful to you:

node_pool/main.tf

resource "null_resource" "taint" {
  triggers {
    pool   = "${google_container_node_pool.pool.id}"
    taints = "${var.taints}"
  }

  provisioner "local-exec" {
    command = "${path.module}/bin/taint-node-pool ${var.cluster} ${var.name} '${var.taints}'"
  }
}

node_pool/bin/taint-node-pool

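#!/usr/bin/env bash
# Replace the taints on every existing node in a GKE node pool.
set -euo pipefail
if [ $# -lt 3 ]; then
    echo "Usage: taint-node-pool CLUSTER NODE_POOL TAINT..." >&2
    exit 1
fi

declare -r cluster=$1
shift
declare -r pool=$1
shift
# Remaining arguments are the taints, joined into one space-separated string.
declare -r taints="$*"

# utils.sh (not shown) provides wait-for-cluster and get-cluster-credentials.
source "$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/utils.sh"

wait-for-cluster
get-cluster-credentials

selector="cloud.google.com/gke-nodepool=$pool"
jsonpath='{ .items[].spec.taints[?(@.key != "CriticalAddonsOnly")].key }'
# Remove every existing taint key (except CriticalAddonsOnly) from the pool's nodes.
kubectl get nodes -l "$selector" -o jsonpath="$jsonpath" \
    | xargs -n1 \
    | sort -u \
    | xargs -n1 -I{} kubectl taint node -l "$selector" {}-

if [ -n "$taints" ]; then
    # Use the same GKE node-pool label as above (the original used "node-pool=$pool").
    # $taints is left unquoted on purpose so each taint becomes its own argument.
    kubectl taint node -l "$selector" $taints
fi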

@davidquarles thanks, but this doesn't exactly cover the case of auto-scaling node pools. Unfortunately tainting the entire node pool has to be done while creating it.

@eyalzek Yeah, you're totally right. I guess you'd have to copy the instance template directly, append to the kube-env metadata, and copy back, which has always seemed like a disgusting (though useful!) hack. It's a shame that isn't exposed as a mutable field after node pool creation.

Can you link to an example of that hack applied somewhere? :)

This is probably near-blasphemy to detail here, but either way, I started scripting an example for you last night, and it's much uglier programmatically than it is in the UI/console. The basic workflow, given a cluster named foo and a node pool named workers, is this:

  • copy instance template gke-foo-workers-<uid> to gke-foo-workers-<uid>-1, adding NODE_TAINTS: <your comma-separated taints> to the kube-env metadata before saving
  • update the instance group gke-foo-workers-<uid>-grp to use the new template (meaningless for existing nodes)
  • delete instance template gke-foo-workers-<uid>
  • copy instance template gke-foo-workers-<uid>-1 back to the original name, gke-foo-workers-<uid>
  • update the instance group, again, to use the new gke-foo-workers-<uid> template
  • delete the temporary gke-foo-workers-<uid>-1 template
  • taint existing nodes via kubectl taint nodes -l "cloud.google.com/gke-nodepool=workers" <your space-separated taints>

...at this point you have tainted nodes, taints will be added to new nodes by your instance template, and AFAIK Terraform isn't going to see this as a change or do anything destructive. This is all potentially achievable in Terraform, though I'd imagine it's a bit complex.

Is this on the road map?

@davidquarles could you post the utils.sh also?

The GKE API now allows specifying node taints when creating a node pool. It would be super cool to have this feature in the provider!

The above change only adds node taints to a GKE cluster, not to node pools. Please consider adding taints to node pools as well.

Hey @AswinKakarot, taints are part of the node_config block, which you can set on either the cluster or the node pool. You can see some examples of how to do this in the tests in the linked PR.
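
For anyone landing here later, a minimal sketch of what that looks like on a node pool. The resource names, machine type, and taint key/value are made up for illustration, and it assumes a google_container_cluster named "primary" is defined elsewhere in the same config. Depending on your provider version you may also need a zone or location argument, and the taint field may only be available in the beta provider, so check the docs for the version you run.

```hcl
# Hypothetical names throughout; adjust to your own cluster and pool.
resource "google_container_node_pool" "workers" {
  name       = "workers"
  # Assumes a google_container_cluster.primary resource defined elsewhere.
  cluster    = "${google_container_cluster.primary.name}"
  node_count = 1

  node_config {
    machine_type = "n1-standard-1"

    # One taint block per taint. effect is one of
    # NO_SCHEDULE, PREFER_NO_SCHEDULE, or NO_EXECUTE.
    taint {
      key    = "dedicated"
      value  = "workers"
      effect = "NO_SCHEDULE"
    }
  }
}
```

The same taint block can go inside the node_config of a google_container_cluster, where it should apply to the cluster's default node pool. Because the taint is part of the pool's creation-time config, nodes added later by the autoscaler should come up already tainted.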

