Terraform-provider-google: google_storage_bucket_object data source missing content attribute

Created on 21 Feb 2019 · 7 comments · Source: hashicorp/terraform-provider-google


Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

I want the ability to retrieve the contents of a Google storage bucket object as a data source. The new google_storage_bucket_object data source introduced by #1157 has many content-related metadata fields, but does not include a content attribute.

My use case is that I want to store service account keys in storage buckets, encrypted with KMS. Furthermore, I want a process that rotates these keys without having to run the whole Terraform configuration. I have this working today by (a sketch of steps 2 and 3 follows this list):

  1. Occasionally using gsutil to download the KMS-encrypted Google Cloud Storage bucket objects to the local file system. This is the step I want to replace with google_storage_bucket_object.
  2. Using a first, aliased google provider with application default credentials to ask KMS to decrypt the contents of that local file (loaded with the file(...) function) via the google_kms_secret data source.
  3. Using that decrypted secret as the credentials for a second, non-aliased google provider, which all other resources use.
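
For concreteness, a minimal sketch of how steps 2 and 3 work today, assuming the downloaded file holds the base64-encoded ciphertext that google_kms_secret expects; the file name, bucket, and crypto key path are illustrative only, and the aliased provider is declared as in the fuller example further below:

# Step 1 (today, outside Terraform):
#   gsutil cp gs://my-secrets-bucket/svc-key.json.encrypted .

# Step 2: decrypt the locally downloaded ciphertext with KMS, reading the
# file with file(...).
data "google_kms_secret" "svc_cred" {
  provider   = "google.application-default"
  ciphertext = "${file("svc-key.json.encrypted")}"
  crypto_key = "projects/my-kms-project/locations/us/keyRings/my-keyring/cryptoKeys/my-svc-key"
}

# Step 3: the decrypted key becomes the credentials for the main provider.
provider "google" {
  version     = ">= 2.0.0"
  credentials = "${data.google_kms_secret.svc_cred.plaintext}"
}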

New or Affected Resource(s)

  • google_storage_bucket_object data source

Potential Terraform Configuration

A minimal Terraform configuration showing what I want:

provider "google" {
  alias   = "application-default"
  version = ">= 2.0.0"
}

data "google_storage_bucket_object" "encrypted_key" {
  provider = "google.application-default"
  bucket   = "my-secrets-bucket"
  name     = "svc-key.json.encrypted"
}

output "encrypted" {
  value = "${data.google_storage_bucket_object.encrypted_key.content}"
}

To provide more context, the more complete example of what I'm trying to do with KMS looks like this:

provider "google" {
  alias   = "application-default"
  version = ">= 2.0.0"
}

data "google_storage_bucket_object" "encrypted_key" {
  provider = "google.application-default"
  bucket   = "my-secrets-bucket"
  name     = "svc-key.json.encrypted"
}

data "google_kms_secret" "svc_cred" {
  provider   = "google.application-default"
  ciphertext = "${data.google_storage_bucket_object.encrypted_key.content}"
  crypto_key = "projects/my-kms-project/locations/us/keyRings/my-keyring/cryptoKeys/my-svc-key"
}

provider "google" {
  version     = ">= 2.0.0"
  credentials = "${data.google_kms_secret.svc_cred.plaintext}"
}

output "decrypted" {
  value = "${data.google_kms_secret.svc_cred.plaintext}"
}

Also, the documentation for the google_storage_bucket_object data source does not appear to be linked from the provider's documentation index, even though I found it via Google at https://www.terraform.io/docs/providers/google/d/datasource_google_cloud_bucket_object.html

References

  • #1157
Labels: enhancement, new-data-source, size/m

All 7 comments

I don't think it's advisable to do this as an attribute on the existing data source. It would require a second call to the API, since the API returns either metadata or content depending on the flags passed, and incurring that extra API call for every data source instance, for people who don't need the content, isn't a good experience. This is especially true for content such as images, videos, or other large files. I would much prefer that users be able to explicitly opt into downloading the contents of these objects.

Given that, I think this is better off being its own data source (google_storage_bucket_object_content, perhaps).

I agree that it would be best to avoid two calls to the API, but a content attribute would make the data source nicely symmetrical with the resource, which has a writable attribute called content. Perhaps another attribute could indicate what I'm interested in: metadata vs. content? And an attribute indicating whether to store the content in the content attribute or write it to a destination file? Is there any way to make the API call only when I access the content attribute?
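
Purely to illustrate the shape of that suggestion, a hypothetical configuration; fetch_content and destination_file are made-up names and do not exist in the provider:

# Hypothetical only: neither argument below exists today; they just sketch
# the opt-in behaviour described above.
data "google_storage_bucket_object" "encrypted_key" {
  bucket           = "my-secrets-bucket"
  name             = "svc-key.json.encrypted"
  fetch_content    = true                        # opt in to the extra API call
  destination_file = "./svc-key.json.encrypted"  # or write to disk rather than state
}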

I was actually surprised that this data source provides only metadata, since either source or content is a required attribute on the resource version. I understand it could be dangerous to read potentially large objects into memory through a content attribute, but if I know the object is not huge, I would like the option to load it into a content attribute. Putting it into another data source just seems odd to me.

I have no interest in this content attribute being stored in state. In fact, I would prefer it not be stored in state.

In order to use the contents in a config via interpolation, Terraform will need to store the contents in state. This isn't something we can manage in the provider, and even if we could, anybody with the ability to run the original Terraform configuration would have access to the credentials anyway.

For context, the original intent was to allow objects to be used by reference in other resources (e.g. by self_link) without needing to download the contents to a local disk. I do see that there is also a need to download those contents, and it would be nice to mimic the resource, but retrieval doesn't work quite the same as creation in the API, and we try to maintain consistency with the API as our first priority. In hindsight I would have had _metadata and _content data sources, but I don't think it's worth breaking users at this point.

For anyone landing here, my current workaround is:

data "google_client_config" "current" {}

data "google_storage_bucket_object" "object" {
  name   = "global/file"
  bucket = "bucket-name"
}

data "http" "object" {
  url = "${format("%s?alt=media", data.google_storage_bucket_object.object.self_link)}"

  # Optional request headers
  request_headers = {
    "Authorization" = "Bearer ${data.google_client_config.current.access_token}"
  }
}

output "object" {
  value = "${data.http.object.body}"
}

Thanks, @primeroz. That worked like a charm. I just needed to make sure the object was uploaded with a text/plain or application/json content type.

@bbrouwer FYI, we actually have a module that handles fetching GCS objects: https://github.com/terraform-google-modules/terraform-google-secret/tree/master/modules/gcs-object

The discussion here was a while ago, but it seems reasonable to me to support this as a google_storage_bucket_object_contents data source based on making the call with the alt=media parameter set. That avoids introducing (potentially secret) contents to users' state files, but provides a first-class way of getting at them.
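
If that were added, usage might look roughly like the sketch below, using the google_storage_bucket_object_content name suggested earlier in the thread; the data source and its content attribute are hypothetical here, not something the provider currently ships:

# Hypothetical usage of the proposed content data source.
data "google_storage_bucket_object_content" "encrypted_key" {
  bucket = "my-secrets-bucket"
  name   = "svc-key.json.encrypted"
}

data "google_kms_secret" "svc_cred" {
  ciphertext = "${data.google_storage_bucket_object_content.encrypted_key.content}"
  crypto_key = "projects/my-kms-project/locations/us/keyRings/my-keyring/cryptoKeys/my-svc-key"
}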
