Terraform-provider-google: Add `enabled` field to the `workload_identity_config` block

Created on 18 Sep 2019 · 5 comments · Source: hashicorp/terraform-provider-google


Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

The only supported field in the workload_identity_config block is identity_namespace:
https://github.com/terraform-providers/terraform-provider-google-beta/blob/3b46dc33a3d0ed8df968546462fa0f4908597e7d/google-beta/resource_container_cluster.go#L704-L716

There is no enabled field within the workload_identity_config block to default to.

When instantiating one module per GKE cluster from the same source, where that source contains a "google_container_cluster" resource, we are unable to conditionally enable workload_identity_config for only some of the clusters.

The only way we can conditionally add the workload_identity_config block to the google_container_cluster resource is by using a dynamic block:


```hcl
  # Workload Identity allows Kubernetes service accounts to act as a
  # user-managed Google IAM service account.
  dynamic "workload_identity_config" {
    for_each = var.workload_identity_enabled ? list(var.cluster_project) : []

    content {
      # Currently, the only supported identity namespace is the project's default.
      identity_namespace = "${var.cluster_project}.svc.id.goog"
    }
  }
```
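
For reference, here is a minimal sketch of the input variables the block above assumes (the variable names come from the snippet; the types, defaults, and descriptions are assumptions for illustration):

```hcl
variable "workload_identity_enabled" {
  description = "Whether to enable Workload Identity on the cluster."
  type        = bool
  default     = false
}

variable "cluster_project" {
  description = "The ID of the project hosting the GKE cluster."
  type        = string
}
```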

New or Affected Resource(s)

workload_identity_config - https://www.terraform.io/docs/providers/google/r/container_cluster.html#workload_identity_config

provider "google-beta" with version >= 2.12.0

Potential Terraform Configuration

Add an enabled field to the workload_identity_config block.

            "workload_identity_config": {
                Type:     schema.TypeList,
                MaxItems: 1,
                Optional: true,
                Elem: &schema.Resource{
                    Schema: map[string]*schema.Schema{
                        "enabled": {
                            Type:     schema.TypeBool,
                            Required: true,
                        },
                        "identity_namespace": {
                            Type:     schema.TypeString,
                            Required: true,
                        },
                    },
                },
            },
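
With a schema like that, the desired state could be expressed directly instead of through a dynamic block. A sketch of what the resulting configuration might look like (this illustrates the proposal, not an existing provider feature; the resource and variable names are illustrative):

```hcl
resource "google_container_cluster" "cluster" {
  # ...

  workload_identity_config {
    # false would disable Workload Identity without removing the block,
    # avoiding the dynamic-block workaround shown above.
    enabled            = var.workload_identity_enabled
    identity_namespace = "${var.cluster_project}.svc.id.goog"
  }
}
```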
Labels: persistent-bug, size/s

All 5 comments

I too have been experimenting with the above approach to enabling workload identity in a shared module used to build multiple clusters and have noted some unexpected behaviour when testing various scenarios, which I thought might be helpful to post here.

  • To disable workload identity on a cluster on which it has previously been enabled, removing the workload_identity_config block entirely _does_ disable workload identity, however:

    • Terraform continually proposes changes on future plans

    • When refreshing the state from the cluster, Terraform establishes that identity_namespace = "" when workload identity has been switched off and attempts to remove the workload_identity_config block forevermore

  • Instead, we can ask Terraform to set identity_namespace = "" to disable workload identity on a cluster that has previously had it enabled; this seems to work, and Terraform proposes no further changes
  • However, trying to set identity_namespace = "" on a cluster that has _never_ had workload identity previously enabled fails with: Error: googleapi: Error 400: Must specify a field to update., badRequest
  • Hence, the workload_identity_config block must remain absent from the configuration for a cluster which has never had workload identity enabled

    • NB. Setting identity_namespace = null does not appear to be a valid approach either

So, in a common GKE cluster module, in order to support all state transitions, we appear to need a three-state approach:
| Desired workload identity configuration | Cluster has had workload identity enabled in the past? | Required config |
|---|---|---|
| Enabled | No | identity_namespace = "<namespace>" |
| Disabled | No | workload_identity_config block entirely absent |
| Enabled | Yes | identity_namespace = "<namespace>" |
| Disabled | Yes | identity_namespace = "" |

Which leads us to having something like the following dynamic block inside the google_container_cluster resource:

dynamic "workload_identity_config" {
  for_each = var.enable_workload_identity == null ? [] : [0]
  content {
    identity_namespace = var.enable_workload_identity == true ? "${var.project_id}.svc.id.goog" : ""
  }
}
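
For this to support all three states, enable_workload_identity needs to be a nullable boolean. A sketch of the accompanying variable declaration (the null default is the assumption that makes the block-absent state expressible):

```hcl
variable "enable_workload_identity" {
  description = "true = enable, false = explicitly disable, null = omit the workload_identity_config block entirely."
  type        = bool
  default     = null
}
```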

This has a number of pitfalls that must be documented for the unwary:

Cluster having not had workload identity previously enabled

| var.enable_workload_identity | Remarks |
|---|---|
| Set to null | ✅ No changes proposed; workload identity left disabled |
| null -> false | ❌ Fails with Error: googleapi: Error 400: Must specify a field to update., badRequest |
| null -> true | ✅ Enables workload identity as expected |

Cluster having had workload identity previously enabled

| var.enable_workload_identity | Remarks |
|---|---|
| true -> false | ✅ Disables workload identity |
| true -> null | ❌ Disables workload identity, but proposes change on every subsequent plan |
| null -> true | ✅ Enables workload identity as expected |
| null -> false | ✅ Makes the perpetual change proposals go away, leaves workload identity disabled |
| false -> true | ✅ Enables workload identity as expected |
| false -> null | ❌ Leaves workload identity disabled, but results in endless proposed changes |

This all appears to stem from the fact that, once workload identity has been disabled on a cluster, the API returns an empty workloadIdentityConfig section, whereas that section is absent entirely for a cluster that has never had the feature enabled.

Semi-related google_container_node_pool issues

Thought I'd note this here, although I guess this may warrant a separate issue?

Once workload identity is enabled in a cluster, new node pools, by default, have the GKE Metadata Server enabled.

So when creating a new node pool with the google_container_node_pool resource after workload identity has been enabled on the cluster, if no workload_metadata_config block is specified, the resulting node pool gets created with node_metadata = GKE_METADATA_SERVER anyway. When Terraform subsequently refreshes the state of the google_container_node_pool resource, it sees that the workload_metadata_config section is present and tries to remove it on every plan.

So, when deploying new node pools onto a cluster with workload identity enabled, the workload_metadata_config block and the node_metadata setting are effectively not optional; otherwise Terraform erroneously proposes changes.
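
A minimal sketch of what that looks like in practice, assuming the google-beta provider of this era, where workload_metadata_config sits under node_config (the resource names are illustrative):

```hcl
resource "google_container_node_pool" "pool" {
  name    = "example-pool"
  cluster = google_container_cluster.cluster.name

  node_config {
    # Effectively mandatory once the cluster has Workload Identity enabled;
    # omitting it causes Terraform to propose removing the block on every plan.
    workload_metadata_config {
      node_metadata = "GKE_METADATA_SERVER"
    }
  }
}
```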

In addition, using the node_metadata = "UNSPECIFIED" value always seems to result in Terraform repeatedly proposing changes, because the actual underlying nodeMetadata setting gets set to either EXPOSE or GKE_METADATA_SERVER depending on the cluster configuration.

The two failing cases under Cluster having had workload identity previously enabled seem like a clear-cut bug. I'll see if I can handle those next week.

That may involve exposing enabled, and it may not; if it doesn't, I can re-triage this issue as an enhancement.

Hi, any word on this? This is still a problem in version 3.46.0.

It's been a while, but this was a bigger problem than I expected, if I remember right. I'm triaging this as a persistent-bug, which means we'll pick it up in our triage process as if it was an enhancement request.
