Currently, google_project_services manages all services on the project and tries to be authoritative. google_project_service manages a single service, leaving the rest alone. The challenge is that each google_project_service creates a new entry in the graph and (more importantly) relies on individual API calls to enable services instead of batching them together.
I propose that we need a middle-ground resource like google_project_services_batch, or a way to configure the behavior of google_project_services such that it only manages the services listed in its configuration.
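For reference, here is what the two existing resources look like side by side (project name is a placeholder, and per the provider docs you shouldn't actually mix the two in one config):

```hcl
# Authoritative: this list is the complete set of enabled services; anything
# enabled outside this list is disabled on the next apply.
resource "google_project_services" "all" {
  project  = "my-project"
  services = ["iam.googleapis.com", "containerregistry.googleapis.com"]
}

# Non-authoritative: manages exactly one service and leaves the rest alone,
# but each instance is a separate graph node and a separate API call.
resource "google_project_service" "iam" {
  project = "my-project"
  service = "iam.googleapis.com"
}
```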
Proposals:
Naming tbd, but add a new resource google_project_services_batch with docs that describe the behavior:
resource "google_project_services_batch" "my-services" {
project = "my-project"
services = [
"iam.googleapis.com",
"containerregistry.googleapis.com",
]
}
Advantages
Disadvantages
google_project_services vs google_project_service; this would only make that worse

google_project_services to not manage all the services

Field name tbd: provide a boolean option to google_project_services such that it does not remove services that it was not previously managing:
resource "google_project_services" "my-services" {
manage = false
project = "my-project"
services = [
"iam.googleapis.com",
"containerregistry.googleapis.com",
]
}
Advantages
google_project_service (disable_on_destroy)

Disadvantages

google_project_service vs google_project_services

+1 to adding something which can do this, with a slight bias towards adding a boolean on google_project_services as it would also allow easier migration between managed and unmanaged services.
I'm thinking on this one, just because in the past (google_project, google_project_iam, etc.) we've yet to add an authoritative boolean we didn't regret later, so I'm hesitant. What's different here is that this is a presence resource, not a value resource, so we don't need to store an old value or "restore" anything on resource deletion.
Presumably:

When a non-authoritative resource becomes authoritative, any services not in the list are disabled. When an authoritative resource becomes non-authoritative, no API calls are generated. When an authoritative or non-authoritative resource is deleted, all the services in it are disabled.

What happens if two, non-authoritative google_services resources are in the same config? What happens if they overlap, or don't? What about an authoritative and non-authoritative?

I think the case where multiple instances of the resource are used in the same config needs to be thought through some more, but I'm struggling to come up with instances where this would bite us. I'm going to look through some of the historical issues and see if I can find anything that sticks out as relevant, too.
What happens if two, non-authoritative google_services resources are in the same config?
If two non-authoritative resources are present, they would both activate the APIs in their lists, just like having a list of multiple google_project_service resources.
What happens if they overlap, or don't?
If they overlap, I imagine they would both try to activate it, and deleting from one would deactivate it, similarly to two google_project_service resources (sketched below). This would be (naturally) unsupported and not recommended, like today.
What about an authoritative and non-authoritative?
I think this is unsupported and would lead to similar errors as using the current google_project_services together with google_project_service.
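To make the overlap scenario concrete, here's a sketch using the proposed (purely hypothetical) google_project_services_batch resource; nothing here ships in the provider today:

```hcl
resource "google_project_services_batch" "team-a" {
  project = "my-project"

  services = [
    "iam.googleapis.com", # also listed by team-b
    "containerregistry.googleapis.com",
  ]
}

resource "google_project_services_batch" "team-b" {
  project = "my-project"

  services = [
    # Removing this entry would disable the service on the next apply,
    # even though team-a still lists it.
    "iam.googleapis.com",
    "pubsub.googleapis.com",
  ]
}
```

This mirrors the failure mode described above: the two resources don't know about each other, so a deletion in one silently wins.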
At a high level, I would argue that the _current_ behavior Terraform is taking is a little... un-Terraform-ey. One of Terraform's promises is that it will only manage the resources it previously managed, so it feels a bit weird to have a resource that disables services previously enabled outside of Terraform.
I understand from a "policy enforcement" angle why this is important and necessary, but it feels wrong. It'd be the equivalent of Terraform deleting all the instances in a project, even if it wasn't previously managing them.
When a non-authoritative resource becomes authoritative, any services not in the list are disabled.
I think that makes sense, so long as the plan _shows_ that is the action that will be taken
When an authoritative resource becomes non-authoritative, no API calls are generated.
Iff there are no other changes. If I change to non-authoritative and add two new APIs, I expect two APIs to be enabled.
When an authoritative or non-authoritative resource is deleted, all the services in it are disabled.
Unless disable_on_destroy is set to false.
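For reference, that escape hatch already exists on google_project_service today:

```hcl
resource "google_project_service" "iam" {
  project = "my-project"
  service = "iam.googleapis.com"

  # On destroy, drop the resource from state but leave the service
  # enabled in the project.
  disable_on_destroy = false
}
```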
I think @morgante answered most of the other questions, but I would add one guiding principle that we can add in the docs: users should either use authoritative or not - don't mix them in the same set of configs.
I understand from a "policy enforcement" angle why this is important and necessary, but it feels wrong. It'd be the equivalent of Terraform deleting all the instances in a project, even if it wasn't previously managing them.
I think that's fair. Historically, I believe the decision was made when services was a field on a project, meaning they weren't a resource at all, so when we migrated to a resource, we kept that for a migration path. But it's fair to reconsider the API as it stands, not as it was implemented originally.
I think that makes sense, so long as the plan shows that is the action that will be taken
What does that look like? A list of services that are getting turned off?
I would add one guiding principle that we can add in the docs: users should either use authoritative or not - don't mix them in the same set of configs.
I'd agree, but sometimes people don't read the docs ahead of time, and so I want to at least understand the failure modes that will inevitably come up, and try to make them as least-bad as possible.
I think, in sum, I'm weakly for this. My main concern is just to think it through a _lot_ so we can make sure we get the UX we want that doesn't end in a confusing mess, the way authoritative historically has.
_In theory_, there should be no reason to do a "batch" non-authoritative resource. In practice, until batch requests are a thing in Terraform, I can see it being beneficial. For posterity, though, is there a use case that is made unreasonably difficult or slow through lack of batching support? Basically, what use case prompted this issue being opened?
What does that look like? A list of services that are getting turned off?
I think it's a diff of a string slice
For posterity, though, is there a use case that is made unreasonably difficult or slow through lack of batching support? Basically, what use case prompted this issue being opened?
There are a few, the main one being API quotas and rate limits. If I'm enabling 50 services on a project and then polling for all of them to be enabled, I'll hit rate limits on the standard quota, because that's 50 parallel Terraform requests. However, if we leverage the batchEnableServices API call, it's 1 request, significantly reducing the total. We've had a few customers who have been bitten by this, and, worse, you don't find out until you're in the middle of a Terraform run 😦
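For example, the 0.11-era config that triggers this looks roughly like the following; every instance is its own enable call plus its own polling loop:

```hcl
variable "services" {
  type = "list"
  # e.g. 50 entries: "iam.googleapis.com", "pubsub.googleapis.com", ...
}

# One resource instance -- and one enable request -- per service.
resource "google_project_service" "svc" {
  count   = "${length(var.services)}"
  project = "my-project"
  service = "${element(var.services, count.index)}"
}
```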
That all sounds reasonable to me. I think this will become moot in the future, because I've heard rumours of request batching functionality at the provider level, but I see nothing wrong with adding this, then deprecating and removing it if/when that ships. My understanding is it's not imminent and hasn't even been designed yet, so I definitely don't want to wait for it.
In the meantime, this seems like a pragmatic solution.
We now have batching functionality, and 0.12 makes specifying multiple google_project_service resources a lot easier, so I'm not sure this is necessary anymore. If users are still experiencing pain around this, please feel free to open a new issue and we can examine solutions. Thanks!
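For anyone landing here later, the 0.12-era shape of this (for_each on resources arrived in a 0.12 point release) looks something like:

```hcl
resource "google_project_service" "services" {
  # One lightweight instance per service; the batching functionality
  # mentioned above coalesces the underlying API calls.
  for_each = toset([
    "iam.googleapis.com",
    "containerregistry.googleapis.com",
  ])

  project = "my-project"
  service = each.value
}
```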