Terraform: 0.9.5 regression: index out of range when passing a list of <computed> values to a module

Created on 15 May 2017 · 10Comments · Source: hashicorp/terraform

Terraform Version

This is a 0.9.5 regression, still occurring in master (as of f5056b7e63cf42eec7bb7b87b1d9b3bc943dd9ec)

I am guessing that this has something to do with https://github.com/hashicorp/terraform/pull/14135 but I could be lying.

Affected Resource(s)

Not resource specific. Seems like a core issue.

Terraform Configuration Files

bash-3.2$ tree
.
├── consumer
│   └── main.tf
└── main.tf

1 directory, 2 files

main.tf

variable "count" {}

resource "null_resource" "source" {
  count = "${var.count}"
}

module "consumer" {
  source = "./consumer"
  count  = "${var.count}"
  list   = ["${null_resource.source.*.id}"]
}

consumer/main.tf

variable "count" {}

variable "list" {
  type = "list"
}

resource "null_resource" "consumer" {
  count = "${var.count}"

  triggers {
    trig = "${var.list[count.index]}"
  }
}

Expected Behavior

Should work for any value of var.count; as is the case in 0.9.4:

bash-3.2$ brew switch terraform 0.9.4
Cleaning /usr/local/Cellar/terraform/0.9.4
Cleaning /usr/local/Cellar/terraform/0.9.5
2 links created for /usr/local/Cellar/terraform/0.9.4
bash-3.2$ terraform --version
Terraform v0.9.4

Your version of Terraform is out of date! The latest version
is 0.9.5. You can update by downloading from www.terraform.io
bash-3.2$ terraform plan
var.count
  Enter a value: 1

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ null_resource.source

+ module.consumer.null_resource.consumer
    triggers.%: "<computed>"


Plan: 2 to add, 0 to change, 0 to destroy.
bash-3.2$ terraform plan
var.count
  Enter a value: 2

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ null_resource.source.0

+ null_resource.source.1

+ module.consumer.null_resource.consumer.0
    triggers.%: "<computed>"

+ module.consumer.null_resource.consumer.1
    triggers.%: "<computed>"


Plan: 4 to add, 0 to change, 0 to destroy.
bash-3.2$ 👍

Actual Behavior

Errors for any var.count > 1:

bash-3.2$ brew switch terraform 0.9.5
Cleaning /usr/local/Cellar/terraform/0.9.4
Cleaning /usr/local/Cellar/terraform/0.9.5
2 links created for /usr/local/Cellar/terraform/0.9.5
bash-3.2$ terraform --version
Terraform v0.9.5

bash-3.2$ terraform plan
var.count
  Enter a value: 1

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ null_resource.source

+ module.consumer.null_resource.consumer
    triggers.%: "<computed>"


Plan: 2 to add, 0 to change, 0 to destroy.
bash-3.2$ terraform plan
var.count
  Enter a value: 2

1 error(s) occurred:

* module.consumer.null_resource.consumer: 1 error(s) occurred:

* module.consumer.null_resource.consumer[1]: index 1 out of range for list var.list (max 1) in:

${var.list[count.index]}
bash-3.2$ 😭

Steps to Reproduce

Use the provided code
Run terraform plan
Give var.count any value > 1
Cry

Important Factoids

This only happens when passing the list of items to a module. If we took the contents of the consumer module and put it straight in the root module, everything would work fine.
This only happens with the [idx] operator. Switching to element() works correctly:

bash-3.2$ cat consumer/main.tf
variable "count" {}

variable "list" {
  type = "list"
}

resource "null_resource" "consumer" {
  count = "${var.count}"

  triggers {
    # trig = "${var.list[count.index]}"
    trig = "${element(var.list, count.index)}"
  }
}

bash-3.2$ terraform plan
var.count
  Enter a value: 2

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ null_resource.source.0

+ null_resource.source.1

+ module.consumer.null_resource.consumer.0
    triggers.%: "<computed>"

+ module.consumer.null_resource.consumer.1
    triggers.%: "<computed>"


Plan: 4 to add, 0 to change, 0 to destroy.
bash-3.2$ terraform apply
var.count
  Enter a value: 2

null_resource.source.0: Creating...
null_resource.source.1: Creating...
null_resource.source.0: Creation complete (ID: 1544203862132868412)
null_resource.source.1: Creation complete (ID: 5067203570795836621)
module.consumer.null_resource.consumer.0: Creating...
  triggers.%:    "" => "1"
  triggers.trig: "" => "1544203862132868412"
module.consumer.null_resource.consumer.1: Creating...
  triggers.%:    "" => "1"
  triggers.trig: "" => "5067203570795836621"
module.consumer.null_resource.consumer.0: Creation complete (ID: 8210043061277706153)
module.consumer.null_resource.consumer.1: Creation complete (ID: 3511070126733757179)

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path:
bash-3.2$

Note the different triggers; element did not cause results to simply wrap. It actually did work; however, if I raise the count:

bash-3.2$ terraform plan
var.count
  Enter a value: 3

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

null_resource.source.0: Refreshing state... (ID: 1544203862132868412)
null_resource.source.1: Refreshing state... (ID: 5067203570795836621)
null_resource.consumer.1: Refreshing state... (ID: 3511070126733757179)
null_resource.consumer.0: Refreshing state... (ID: 8210043061277706153)
The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ null_resource.source.2

-/+ module.consumer.null_resource.consumer.0
    triggers.%:    "1" => "<computed>" (forces new resource)
    triggers.trig: "1544203862132868412" => "" (forces new resource)

-/+ module.consumer.null_resource.consumer.1
    triggers.%:    "1" => "<computed>" (forces new resource)
    triggers.trig: "5067203570795836621" => "" (forces new resource)

+ module.consumer.null_resource.consumer.2
    triggers.%: "<computed>"


Plan: 4 to add, 0 to change, 2 to destroy.

It wants to destroy the already-existing resources. So the element() workaround kinda voids the whole point of https://github.com/hashicorp/terraform/pull/14135 but could be useful for scenarios where this doesn't matter.

bug core regression

Source

kreisys

Most helpful comment

Hi everyone! Sorry for the silence here.

After trying a few different approaches, we ended up choosing to make my originally-suggested workaround work. Removing the requirement to surround splat expressions with brackets has been on the docket for a while anyway, since it was a holdover from old versions that doesn't really make sense any longer, so in #14737 (which will be included in 0.9.6) the following becomes true:

The surrounding brackets are _optional_ for resource attributes, and no longer recommended. We're continuing to support both forms here in the short term for compatibility, but it creates some ambiguity in the configuration language and so we're planning to phase this out in some future version.
When partially-unknown splats are used in other contexts such as child module variables, the surrounding brackets are _not_ supported, and only the bare string form will work. We accepted this compromise because directly passing partially-unknown splats to modules is a rarer case (it didn't even really work until pretty recently) and so we're going to ask those who are doing it to drop the surrounding brackets to make it work.

This outcome was a bit of a compromise due to some historical baggage in Terraform's core, and the desire not to do any drastic re-architecture in a point release. We're in the early stages of planning a set of holistic improvements to the configuration language that will undoubtedly cause some more disruptive reorganization of Terraform's core, at which point we hope to be able to address this more robustly and phase out the redundant surrounding brackets pattern altogether.

Sorry again for this regression and thanks to everyone for their patience while we worked through this.

apparentlymart on 23 May 2017

👍2 🎉1

All 10 comments

Thanks for digging in to this, @kreisys! I'll take a look soon.

apparentlymart on 15 May 2017

👍2

Hi @kreisys!

After staring at this for a while I think I've figured out what's going on here.

The key is this:

 list   = ["${null_resource.source.*.id}"]

By wrapping this in brackets, what you created here is actually a list of lists. When the string inside is evaluated, it produces an unknown value (because partially-unknown lists flatten to entirely-unknown values when exiting an interpolation sequence), and then that result is itself wrapped in a list by the outer brackets. The outer list is not visible to the interpolation language, since it's outside of the string that it dealt with.

I think the reason this worked before was that we formerly had a "fixup" that would happen where any partially-unknown list would get flattened to a scalar unknown on _entry_ into the interpolation language, as part of reading a variable. That was removed in #14135 for the express purpose of allowing this sort of thing to work, but it was intended to only apply to lists generated by the "splat syntax" since I'd expected that the "no partial unknowns on exit" rule would prevent them from being introduced any other way.

The attribute in module blocks are handled in a slightly different way than attributes in resource blocks, since there's some extra processing layers for resources that adjust for certain quirks of configuration representation, including this old habit we inherited from pre-0.7 of wrapping splats in a single level of list. That's why the expression appears to work when you use it in a resource attribute directly, rather than passing it over the module boundary.

If I change your config to just have a single level of list then it works as expected:

 list = "${null_resource.source.*.id}"

This inconsistency is annoying, so I'm going to spend a little time trying to work out where it stems from but arguably the module attribute behavior is the more correct behavior, and this extra list wrapping for resource splats is a hold-over from before Terraform had proper list support.

apparentlymart on 16 May 2017

Thanks for looking into it! What version are you testing on? I'm not getting the same results as you... When I remove the square brackets (either 0.9.5 or 0.9.6-dev/master), I get this error:

bash-3.2$ terraform plan
var.count
  Enter a value: 1

1 error(s) occurred:

* module root: 1 error(s) occurred:

* module 'consumer': use of the splat ('*') operator must be wrapped in a list declaration

kreisys on 16 May 2017

Hmm... curious. I didn't have time today to get all the way to proving what I stated earlier, but I did successfully run a test without the brackets on latest master. Evidently something else is going on here. I will investigate further tomorrow.

Thanks again!

apparentlymart on 16 May 2017

Hey folks - just chiming in that we hit this same issue in an internal config and I can confirm that the removal of the surrounding braces yields the same must be wrapped message that @kreisys reports.

Kudos on a thorough report @kreisys! ❤️ @apparentlymart feel free to ping me if you could use any further detail from my repro case.

phinze on 18 May 2017

Sorry for the confusion here. I did dig into this some more since my last comment, but didn't reach any conclusion yet so I didn't report back.

But I did learn something which I will say here to reduce confusion with my earlier comments:

There remains a special case in the config validator that the "splat variable" syntax _in particular_ requires this bracket wrapping. This is a weird inconsistency resulting from before Terraform had proper list support. Unfortunately this seems to interact poorly with some different assumptions made when dealing with module config, since that doesn't have the benefit of all the config normalization that gets done for resources.

I am still looking into this and hope to have something more concrete to report soon. But for now, it may be possible to work around this by introducing some indirection into the expression, such as concat() around the splat variable.

apparentlymart on 18 May 2017

Okay... finally got my head around how all this fits together.

As I expected, the weirdness here dates back to <0.7 when Terraform's idea of lists was just strings with a magic delimiter between items. In those days, a splat-variable expanded to a flat, delimited string and then the interpolating code (the part that walks the config structure looking for strings containing HIL templates) would do a just-in-time fixup of any list of strings found in the configuration, splitting the strings inside on that magic delimiter to expand out into a proper list. The now-silly need to wrap expressions involving splats in an extra level of list was the signal that this expansion should be tried.

In 0.7 we got the ability to represent proper lists (in the form of Go slices) but -- perhaps due to an oversight -- the "splitting" code, which previously split these strings to make lists, was retained as a now-rather-odd codepath that just peels off one level of list from any list of lists found in the config.

Given an expression like this:

foo = ["${split(" ", "hello world")}"]

... the initial result of this is []interface{}{ []interface{}{ "hello", "world" } } and so the "split" function detects that this is a list of lists and flattens it to []interface{}{ "hello", "world" }.

The problem arises when the interpolation expression has a value that's unknown. In that case, due to the rule that HIL _outputs_ are always either wholly known or wholly unknown, the result of this is []interface{}{ UnknownValue }. This _isn't_ a list of lists, so the split function leaves it untouched.

Previously we recovered from this oversight because HIL itself had a rule that if a variable value was partially-unknown then it was replaced with a full unknown on entry into HIL. Thus although this erroneous value made its way through most of Terraform core, it was eventually caught at the last moment when HIL evaluated the splat variable.

Now that HIL intentionally _allows_ partially-unknown values, this last-moment fix no longer applies, and so we end up with a one-element list whose single element is unknown.

I think the fix here is to deal with the "list of a single unknown" problem in the interpolater, but my initial quick attempt at that caused other tests to fail, so I'm going to pause my work on this for today and pick it up again tomorrow when I've got more time to dig in and understand whether these test failures are legitimate (my change has broken something) or if these tests were just accepting the previously-broken behavior and need to be updated.

Sorry to everyone who is affected by this problem. I'm still looking at this and hope to have a fix of some sort very soon.

apparentlymart on 19 May 2017

Hi everyone! Sorry for the silence here.

The surrounding brackets are _optional_ for resource attributes, and no longer recommended. We're continuing to support both forms here in the short term for compatibility, but it creates some ambiguity in the configuration language and so we're planning to phase this out in some future version.
When partially-unknown splats are used in other contexts such as child module variables, the surrounding brackets are _not_ supported, and only the bare string form will work. We accepted this compromise because directly passing partially-unknown splats to modules is a rarer case (it didn't even really work until pretty recently) and so we're going to ask those who are doing it to drop the surrounding brackets to make it work.

Sorry again for this regression and thanks to everyone for their patience while we worked through this.

apparentlymart on 23 May 2017

👍2 🎉1

That's really interesting. I only started using Terraform seriously during 0.8.x so I didn't have the context to understand the rationale behind these brackets. They did puzzle me and feel "wrong"; especially because the docs indicated that they're required while in practice omitting them worked. Sometimes. Anyways I'm not going to be mourning them. I hope 🙃

kreisys on 23 May 2017

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.