Terraform: Module-aware explicit dependencies

Created on 13 Jan 2018 · 16 Comments · Source: hashicorp/terraform

Terraform currently allows the declaration of explicit inter-resource dependencies using depends_on:

resource "example" "example1" {
}

resource "example" "example2" {
  depends_on = ["example.example1"]
}

The presence of the depends_on in the above example causes the graph builder to create a dependency edge from example2 to example1, which ensures that example1 is visited first during any graph traversal.
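To make the distinction concrete, the sketch below contrasts an implicit dependency (created by referencing another resource's attribute) with an explicit one. It is only an illustration: the resource names are hypothetical, null_resource is assumed to be available from the null provider, and the quoted reference style follows the example above.

```hcl
# Hypothetical names, used only for illustration.
resource "null_resource" "first" {
}

# Implicit dependency: the interpolated reference to null_resource.first
# is enough for the graph builder to add the edge, so depends_on is not needed.
resource "null_resource" "implicit" {
  triggers = {
    first_id = "${null_resource.first.id}"
  }
}

# Explicit dependency: nothing here reads a value from null_resource.first,
# so depends_on is the only way to request the same ordering.
resource "null_resource" "explicit" {
  depends_on = ["null_resource.first"]
}
```

In both cases null_resource.first should be visited before the dependent resource; depends_on matters only when no attribute reference already expresses the relationship.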

This mechanism does not generalize to other constructs within Terraform. In particular, it doesn't generalize to modules, since a module is not represented as a single node in the graph. Instead, each individual variable and output in a module is its own graph node, which allows us to optimize our parallelism by getting started on _some_ aspects of a module before all of the input variables are ready, and to begin processing resources that _depend_ on a module before all of its outputs are complete. Even though variables and outputs _are_ in the graph, we do not currently support referring to them in depends_on.

The following proposal describes a generalization of the depends_on mechanism to apply to both resources and modules, with the goal of satisfying the use-cases discussed in #10462, allowing explicit dependencies on module variables and outputs, along with a syntax that creates the _effect_ of an entire-module dependency.

New addressing forms for `depends_on`

We currently allow references to managed and data resources in depends_on. To support dependencies with modules, we must extend this to support the following forms:

aws_instance.example - managed resource dependency, as today
aws_instance.another_example[2] - a particular instance of a managed resource with count set
data.template_file.example - data resource dependency, as today
var.foo - dependency on an input variable passed by a parent module
module.example.foo - dependency on an output of a named child module
module.example - dependency on an entire module

Our improved configuration language parser (which, at the time of writing, is in the process of being integrated into Terraform Core) allows us to improve the depends_on syntax through direct use of expressions, rather than requiring these references to be inside quoted strings:

# DESIGN SKETCH: not yet implemented and may change before release

resource "example" "example2" {
  depends_on = [
    aws_instance.example,
    aws_instance.another_example[2],
    data.template_file.example,
    var.foo,
    module.example.foo,
    module.example,
  ]
}

This syntax will be used for the examples in the remainder of this proposal.

Support `depends_on` as a `module` block argument

The above allows modules to be used as explicit dependencies, but we need to additionally support depends_on inside module blocks in order to allow _modules_ to have dependencies:

# DESIGN SKETCH: not yet implemented and may change before release

module "example" {
  depends_on = [
    aws_instance.example,
  ]
}

Depending on a Module Variable

At first glance, an explicit dependency on a var.foo expression feels a little strange: variables don't have externally-visible side-effects, so it's strange to want to depend on them without using their result.

However, allowing explicit dependencies on variables creates a mechanism for the author of a more-complex reusable module to create custom depends_on-like attributes that serve to block _subsets_ of the functionality of the module. For example:

# DESIGN SKETCH: not yet implemented and may change before release

### in root module

module "database" {
}

module "app" {
  ami_id = "ami-1234"
  app_server_depends_on = [
    module.database,
  ]
}

### in module "app"

variable "app_server_depends_on" {
  default = []
}

resource "aws_security_group" "foo" {
  # Work on _this_ resource can begin immediately
  # ...
}


resource "aws_instance" "app_server" {
  ami = var.ami_id
  # ...

  # We can't create this resource until the caller tells us that it's
  # prepared some hidden dependencies.
  depends_on = [
    var.app_server_depends_on,
  ]
}

This makes it possible to create a re-usable module for deploying arbitrary applications (parameterized by an AMI to deploy, etc), which can immediately create supporting resources like the security group in this example, but defer creating the actual compute resources until some arbitrary, caller-defined dependencies have been dealt with. The caller knows that ami-1234 expects to have a database available to it on boot, while the re-usable module has no direct knowledge of that database.

The _value_ of app_server_depends_on in the above example is not significant. Instead, we effectively pass the _dependencies_ of that expression through to the module by creating a transitive dependency relationship in the graph.
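For comparison, a similar transitive effect can be approximated with ordinary references, without depending on a variable directly. The following is a rough sketch and not part of this proposal: the gate resource app_server_gate is a hypothetical helper, and the variable is assumed to be a list of strings that the caller fills with values from the objects to wait for (for example, output values of the database module).

```hcl
### in module "app" (hypothetical variant using only references)

variable "app_server_depends_on" {
  type    = list(string)
  default = []
}

# Assumed helper: reading the variable makes this resource depend on
# whatever the caller's expression depended on.
resource "null_resource" "app_server_gate" {
  triggers = {
    dependencies = join(",", var.app_server_depends_on)
  }
}

resource "aws_instance" "app_server" {
  ami = var.ami_id
  # ...

  # The instance waits for the gate, and therefore for the caller's
  # hidden dependencies, without using the variable's value itself.
  depends_on = [
    null_resource.app_server_gate,
  ]
}
```

One tradeoff of this approximation is that the passed values do become significant to the gate resource: a change to them causes the null_resource to be replaced, which the proposed direct dependency on var.app_server_depends_on would avoid.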

Depending on a Whole Module

As noted above, modules are not represented directly by graph nodes today, so whole-module dependencies (either as dependencies or dependents) require some new graph-building functionality.

The most likely user intent for a dependency of the form module.example is to wait until _everything_ in the module has completed before continuing. This behavior would have a severe impact on Terraform's ability to achieve parallelism though, and so this proposal suggests a compromise for when depends_on references a whole module: treat this as an alias for depending on each of the module's outputs, but not on any resources or nested modules.

[Figure: Terraform graph where a nested module called "example" has two resources, example1 and example2, where only example1 is a dependency of the module's outputs]

The biggest consequence of this compromise is that in the above example null_resource.example will block until module.example.null_resource.example2 is complete, but will not wait for module.example.null_resource.example3 because none of the module's outputs depend on that resource.
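The configuration that this paragraph refers to is not reproduced here. A configuration consistent with the description might look like the sketch below; the resource names are taken from the paragraph above, while the output name "id" and the overall layout are assumptions.

```hcl
# DESIGN SKETCH: hypothetical reconstruction, not taken from the proposal

### root module

module "example" {
  # source and arguments omitted, as in the other sketches
}

resource "null_resource" "example" {
  # Treated as an alias for depending on every output of module.example.
  depends_on = [
    module.example,
  ]
}

### module "example"

resource "null_resource" "example2" {
}

# No output refers to this resource, so callers do not wait for it.
resource "null_resource" "example3" {
}

output "id" {
  value = null_resource.example2.id
}
```

Under the proposed rule, null_resource.example depends on the output "id", which in turn depends on example2 only, so example3 can still be processed concurrently.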

This consequence gives a measure of flexibility and control for the module author, however: if the author knows that the module performs a time-consuming operation but that this operation does not block access to the objects that the caller will depend on then this can be expressed by making that operation _not_ be a dependency of the outputs. From the module _caller's_ perspective, the module can still be thought of as a black box, with the module author designing it such that all significant effects of the module are referenced in an output. In effect, the module author uses output blocks to define what it means for the module to be considered "complete".

The improved configuration language, whose integration is in progress as we write this, allows passing the result of an entire module as a value into another module:

# DESIGN SKETCH: not yet implemented and may change before release

### root module
module "example1" {
}
module "example2" {
  example1 = module.example1
}

### module example1

output "id" {
  value = "placeholder-id"
}

### module example2

variable "example1" {
}

resource "null_resource" "example" {
  triggers = {
    example1_id = var.example1.id
  }
}

This new usage creates an _implicit_ dependency between module.example2.var.example1 and all of the outputs of module.example1, since they must all be complete before the language runtime can construct the value of module.example1 to assign. This implicit usage further reinforces the idea that only the outputs are dependencies in this case, because that is what is necessary to construct the object value returned by module.example1.

Whole-module `depends_on`

Using depends_on in a module block will also limit parallelism, but the impact is less severe in this case because the effect is under the direct control of the _caller_ module, and so its author can make a tradeoff to decide at what point the limited parallelism hurts enough to warrant more precise dependency handling:

# DESIGN SKETCH: not yet implemented and may change before release

### root module

variable "baz" {
}

resource "null_resource" "example1" {
  triggers = {
    example = "hello"
  }
}

module "example" {
  foo = var.baz

  depends_on = [
    null_resource.example1,
  ]
}

### module "example"

variable "foo" {
}

resource "null_resource" "example2" {
  triggers = {
    foo = var.foo
  }
}

resource "null_resource" "example3" {
}

module "example2" {
}

### module "example2"

resource "null_resource" "example4" {
}

Dependencies _away_ from the module require the creation of a new "begin" graph node for the module that declares depends_on, which must then be a dependency of every resource in the module _and_ of any downstream modules. To reduce the number of graph edges, a "begin" node will be created for each of the downstream modules too, so that only one additional edge needs to be added _between_ the modules (to connect the "begin" nodes).

A "begin" graph node takes no action when visited during a walk and so just serves as an aggregation point to reduce the number of dependency edges. For a module block without depends_on the "begin" graph node can be safely optimized away, along with its incoming dependency edges, during graph construction.

`depends_on` in other contexts

depends_on can be useful for any Terraform construct that causes externally-visible side-effects, as a means to influence the ordering of those side-effects.

Provider initialization also sometimes has side effects, such as reaching out to an external network service to begin a session or to validate credentials. depends_on could therefore also be useful in provider blocks, as described in #2430. However, providers are special in that they need to be instantiated in _all_ phases of Terraform's operation, and thus it is not always possible to force an ordering for provider initialization relative to resource creation as described in #4149. Implementation of depends_on for modules should not block on the implementation of "partial apply", but we should reserve the depends_on argument for provider blocks as part of implementing _this_ proposal to minimize the risk that a provider in the wild will introduce its own depends_on configuration argument that would then be in conflict.

output, variable and locals blocks do not have any externally-visible side-effects and so depends_on would not serve any useful purpose for these blocks; it is always safe to evaluate the corresponding graph nodes as soon as their implicit dependencies become ready.

provisioner blocks within managed resources are not currently represented as separate graph nodes, and so they are processed as part of a create action for their parent resource node.

Labels: config, enhancement, proposal, thinking

apparentlymart

👍140 ❤51 👀4

Most helpful comment

Hi all,

We've been laying some groundwork for this during the v0.12 release development cycle, but won't be able to get it all done before v0.12.0 final due to scope.

The portions of this that should be included in v0.12.0 will be dependency edges from resources to modules, as opposed to the other way around or between modules. In other words, it will be possible to write something like this, with the behavior described in the proposal above:

resource "example" "example" {
  depends_on = [module.foo]

  # ...
}

This comes along with the ability to use an object value representing all of the outputs of a module together as an expression, which builds on the same mechanism:

module "a" {
  # ...
}
module "b" {
  # ...

  # Pass an object representing _all_ of the outputs of module a, which
  # then implicitly depends on all of those outputs.
  a_result = module.a
}

The main thing we were not able to include for v0.12 was the ability for a module as a whole to depend on something else, as opposed to individual variables of that module. This is more complex because it requires Terraform to generate a different shape of graph than it traditionally has and so we want to introduce that change separately from the various other configuration language changes in v0.12 so it can be more effectively tested in isolation. Since we use an iterative planning style we don't know yet when that follow on work would begin, but we'll post more updates here when we have them.

apparentlymart on 23 Oct 2018

👍36

All 16 comments

At the time of writing the Terraform Core team at HashiCorp is focused on integrating the improved configuration language parser/interpreter, so we are not yet able to begin prototyping and implementation of this proposal.

However, I'm sharing this now because some parts of this proposal overlap with the configuration language improvements and so we'd like to lay some groundwork during our current project to ease the later implementation of this feature.

apparentlymart on 13 Jan 2018

Hi, I’m interested in this proposal. Is there any update on this idea?

herry13 on 15 Oct 2018

Hi Folks,

Any update on this feature?

jayudhandha on 23 Oct 2018

@apparentlymart
Appreciate your effort! 👍

jayudhandha on 25 Oct 2018

Any update on this?

flmmartins on 22 Mar 2019

👍14

Still no updates? :( I'm in pain due to this lack of determinism. :(

carct on 12 Jun 2019

👎1

Hi folks!

I know this is a long-awaited feature, but "+1" and "any update" comments create noise for others watching the issue and ultimately don't influence our prioritization.

Instead, please react to the original issue comment with 👍, which we can and do report on during prioritization.

mildwonkey on 21 Jun 2019

👍16 ❤4

Would it help push this feature if an official feature request was logged for this functionality?

Edit: Or is this a feature request? 👀

tscully49 on 23 Aug 2019

Hi @tscully49, good question! This issue is labeled "enhancement", which we use to track feature requests.

mildwonkey on 26 Aug 2019

It took me a while to figure out Terraform resource dependencies. My first instinct was to try module dependencies as this feature request suggests. If that had worked, I wouldn't have had to struggle so much getting _something like module dependencies_ to work.

Since this was such a pain, I documented everything I learned about Terraform dependencies. This includes:

Resource dependencies (with depends_on)
Module dependencies (by hooking outputs into inputs)
A more explicit work-around using a special module_dependency input and module_complete output.

So far, so good. I'm able to make deterministic dependencies between modules and resources.

The docs and working code examples can be found here:

https://github.com/guitarmanvt/terraform-dependencies-explained

I hope this helps someone. :)

guitarmanvt on 30 Aug 2019

🎉1

No news on this?

peturgq on 10 Feb 2020

Hi everyone! any update here?

martincastrocm on 25 Mar 2020

@martincastrocm @peturgq my update to "depends_on cannot be used in a module" is relevant here. We are planning on adding module depends_on during the 0.13.x lifecycle. We're not 100% sure yet if this will make it into 0.13.0 or in a later 0.13.x release.

danieldreier on 25 Mar 2020

👍2

I'm very excited to announce that beta 1 of Terraform 0.13.0 will be available on June 3rd, and will include module depends_on. I've pinned an issue with more details about the beta program, and posted a discuss thread for folks who want to talk about it more.

danieldreier on 22 May 2020

🎉11

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.