It would be nice to be able to mark a resource in a module as a singleton. When marked, a single instance of that resource would be shared across all instances of the module.
For instance, I wrote an elasticsearch module for AWS which requires a custom security group.
Currently, if I want to have multiple elasticsearch clusters in my deployment, I have two options:
1. Create a security group for each cluster.
2. Build the security group outside of the elasticsearch module and pass it in.
Neither of these is really ideal. I would rather define a single instance of the security group in the elasticsearch module. When applying, Terraform would check whether an instance of this resource had already been created and, if so, use the existing one.
I ran into another use case when playing around with the elasticbeanstalk feature branch.
It is recommended practice to use a single elasticbeanstalk application and then deploy multiple environments within that application for prod, development, qa, etc.
This would be easier to manage if I could identify the elasticbeanstalk application as a singleton so that I could manage the different environments separately, without manually creating and passing in the application.
I would use this all the time; it would be one of the most useful changes Terraform could make, IMO.
I'm surprised this issue doesn't have more discussion on it. Is there a duplicate of it somewhere else? Has the core team ever considered this feature?
Hi @oillio, @dcosson! Sorry this sat here unattended for so long.
With current Terraform best-practices we'd recommend the second option that was proposed in the original request: create the security group outside and pass it in.
I'd like to understand better what you both see as the disadvantages with this approach.
One piece of feedback we've received before is that this composition pattern is sometimes inconvenient: when a module needs many different attributes of the shared resource (the security group, in this case), each one must currently be passed as a separate argument, which gets unwieldy. In the case of security groups, though, usually just the id is sufficient, so accepting the security group id as an argument has worked well in several cases.
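For illustration, the id-only form of that pattern looks something like this (the module path and variable name here are hypothetical, not from this thread):

```hcl
# Passing only the security group id into the module; the module then
# declares a plain string variable "security_group_id".
resource "aws_security_group" "example" {
  # ...
}

module "elasticsearch" {
  source            = "./elasticsearch" # hypothetical path
  security_group_id = "${aws_security_group.example.id}"
}
```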
We've been looking recently at pain points with the configuration language and sketching out ideas to improve usability while remaining consistent with the current goals. One idea we're considering is being able to pass the data for an entire resource as a variable, like this:
# DRAFT DESIGN: not valid in Terraform today, and may change before implementation
resource "aws_security_group" "example" {
  # ...
}

module "elasticsearch" {
  source             = "./elasticsearch"
  aws_security_group = "${aws_security_group.example}" # the entire resource as an object
}
...and then in the elasticsearch module:
# DRAFT DESIGN: not valid in Terraform today, and may change before implementation
variable "aws_security_group" {
}

resource "aws_instance" "example" {
  # ...
  vpc_security_group_ids = ["${var.aws_security_group.id}"]
}
We've observed this dependency-injection-like pattern work well in many cases so this feature is intended to make the pattern easier to use; the recommendation then would be to compose a system out of several small modules, each of which deals with one aspect of the problem, and then compose those modules together in a root module, passing entire resources as objects where that's necessary/convenient.
My concern with the singleton idea as proposed is that it runs against the principle that each module is self-contained. We usually recommend a "top-down" design approach, with modules used primarily for organization rather than creating big abstractions, so that an operator can clearly understand the relationship between changes shown in the plan and resources in configuration. My hope is that, by making the "create it once and pass it in" approach more convenient, we can address this need without adding a new concept.
With that said, I'm curious to hear what you both think of the above as an alternative.
Thanks for the great response, @apparentlymart! That would be a good change too, and it makes a lot of sense to me. But IMO a singleton would still be useful even with that change.
The main motivation is to get rid of the extra dependency, and partly just code organization. In the security group example, that security group may only ever be relevant in the context of passing it to the elasticsearch cluster. If you create it outside of the module and pass it in, that's one extra variable the module has to take: everywhere you call the module you need access to that variable too, you have to copy it around and remember to pass it in, and potentially thread it through multiple layers of modules depending on where it's defined. It also makes the code harder to read; there's just more indirection in tracing the source of a variable.
IMO, singleton resources in Terraform would be roughly analogous to class methods in OO languages. You don't absolutely need them; you could put the code in top-level functions somewhere else. But if the method is closely related to the other methods in that class, it's nice to have it in the same place: you avoid polluting the global namespace, you don't have to import that code explicitly, and you have access to other constants already defined on the class.
Another idea I've had, along the same lines of reducing duplication of arguments, is that it would be very useful to bind variables to a module globally. Continuing with the same elasticsearch example, maybe you also have an "environment" variable to specify "staging" or "production". In every instance of this module, you now have two boilerplate vars to pass in, the security group and the environment, which are always the same across your whole project. In my experience the number of boilerplate vars commonly grows to 5 or 6 in moderately-sized modules.
If you could somehow bind a variable globally, such that the value you set effectively becomes the default value for that variable when no explicit value is passed in, you wouldn't have to pass around any of these boilerplate values. I'm imagining this would be scoped to the module you're currently in, or at the top level scoped to the whole project. I've wondered whether _override.tf files can get close to this behavior, but they wouldn't track dependencies as well as keeping all of this within the Terraform graph as a real feature.
As an example:
# NOT VALID TERRAFORM
resource "aws_security_group" "example" {
  # ...
}

bind "example_group_to_es_cluster" {
  module = "../modules/elasticsearch_cluster"
  vars {
    environment    = "${var.environment}"
    security_group = "${aws_security_group.example}"
  }
}

module "es1" {
  source = "../modules/elasticsearch_cluster"
  name   = "es1"
}

# with the bind, the above would be equivalent to:
module "es1" {
  source         = "../modules/elasticsearch_cluster"
  name           = "es1"
  environment    = "${var.environment}"
  security_group = "${aws_security_group.example}"
}
This feature would be useful in some cases where singletons wouldn't have helped (the environment variable above), and it would also do a lot to alleviate the need for singletons: you don't get the code-organization benefits, but you get the biggest benefit of not having to pass the variable in everywhere you instantiate the module.
@dcosson explained the issue well.
Due to the nature of how some functionality in AWS is set up, I need to configure some aspects of a particular sub-system outside of its module. This means the client code using the module needs to know a lot more about the internals of the module than I think it should. In the elasticsearch example, I have an elasticsearch module which is used by a number of different service modules.
With the current design, my root Terraform config needs to know about elasticsearch and its security group requirements (which it really shouldn't need to know about). It then needs to pass that SG into the different service modules that happen to have an elasticsearch cluster (it really shouldn't need to care which services use elasticsearch). In the end, I am passing a number of variables through many of my modules, which tends to make their configuration less readable.
Ideally, when a service needs elasticsearch, I would like to just be able to add the elasticsearch module and set whatever needed variables, which is not possible currently.
As @dcosson alluded to, I also have a half dozen standard variables I have ended up passing throughout all of my service modules. I currently pass them as a single map variable. I don't have a good answer to a solution for this, as I think global variables are even more antithetical to your design than singletons. It is an annoyance for large deployments, however.
@apparentlymart I have two terraform projects "staging" and "production". I would like to use my module which is named "global-security-groups" in a singleton manner in both projects "staging" and "production".
In other words, I would like to share my security groups between the two projects "staging" and "production" without Terraform trying to create the security groups again for each project. They are already created.
It seems the only current workaround is to create a separate project for the security groups and then read its Terraform state file into both staging and production, with output variables holding the security group names. The read takes place via the terraform_remote_state data source.
Not sure if there would be a better approach, but the current workaround is hectic in the sense that you need to keep refactoring your project with "terraform state mv" to move all singleton infrastructure out to separate Terraform state files.
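For reference, the workaround described above looks roughly like this in each consuming project (the bucket, key, region, and output name are hypothetical, and this assumes the shared state lives in S3):

```hcl
# Read the shared security groups' state; assumes the shared project
# defines an output named "default_sg_id".
data "terraform_remote_state" "security_groups" {
  backend = "s3"
  config {
    bucket = "my-terraform-state"                # hypothetical bucket
    key    = "security-groups/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

resource "aws_instance" "example" {
  # ...
  vpc_security_group_ids = ["${data.terraform_remote_state.security_groups.default_sg_id}"]
}
```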
A suggested approach would be having a "module_singleton" construct that would internally create the terraform state file inside the module's directory and export the outputs accordingly.
Example:
production/main.tf:

module_singleton "security_groups" {
  source = "../modules/singletons/security-groups"
}

staging/main.tf:

module_singleton "security_groups" {
  source = "../modules/singletons/security-groups"
}
For the above to work, module_singleton should not be allowed any input variables (i.e. no variables.tf): you can't have a singleton being passed different sets of input variables. However, the singleton module itself can receive inputs from other modules it uses.
Enhancement: rethinking the above, a module_singleton could be allowed input variables if Terraform generated a unique hash of the received input values to identify each singleton instance. The module would then instantiate only once per distinct set of input values, even when referenced from different projects, in which case shared_module would be a more sensible name than module_singleton.
This would resolve all shared infrastructure issues such as key pairs, security groups, even databases.
Hope that made sense.
Work around described here: https://stackoverflow.com/questions/45378635/managing-multiple-configurations-which-depend-on-a-singleton-shared-resource
Hi all,
Since the earlier comments from @apparentlymart, Terraform v0.12 has shipped with improvements that allow passing whole resource objects through input variables and output values, so that they can be shared more conveniently between modules rather than having to pass each required attribute separately.
This is intended to minimize the friction that previously made the dependency inversion approach inconvenient. The Terraform documentation now contains specific recommendations about using dependency inversion as part of decomposing a system into multiple modules.
Taking security groups as an example, a module that needs a security group might define an input variable like the following to take in the subset of aws_security_group attributes that it needs:
variable "security_group" {
  type = object({
    id     = string
    arn    = string
    vpc_id = string
  })
}
The caller of this module could then create a suitable security group and pass it in as a whole object:
resource "aws_security_group" "example" {
  # ...
}

module "example" {
  # ... (source and other arguments)
  security_group = aws_security_group.example
}
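Inside the module, the attributes of the passed object are then available on the variable; a minimal sketch, where the aws_instance usage is illustrative (mirroring the earlier draft example) rather than part of the documented recommendation:

```hcl
# Consuming the object-typed "security_group" variable declared above.
resource "aws_instance" "example" {
  # ...
  vpc_security_group_ids = [var.security_group.id]
}
```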
It is an intentional language design decision to require explicit dependency passing rather than implicit sharing, because we want to ensure that the data flow between modules in a configuration can be understood by reading only the calls to those modules, rather than having to study the details of the entire configuration tree. This also allows for the other common decomposition technique of splitting a system into multiple separate _configurations_ connected by data resources, whereas a singleton mechanism could not support that without somehow giving Terraform a global view of the entire infrastructure.
With all of that said, we're going to close this issue out now. We realize that the solution offered here is not the one that was proposed, but after carefully evaluating the tradeoffs we've concluded that supporting this sort of implicit/automatic object sharing would conflict with the goals of Terraform modules and make a configuration using modules harder to read and understand. Thank you for the great discussion here, and hopefully the Module Composition guide provides some helpful alternative techniques for decomposition.
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.