Terraform version: 0.10.4
data "external" "foo" {
  depends_on = ["null_resource.test"]
  program    = ["bash", "${path.module}/test.sh"]
}
resource "null_resource" "test" {
}
output "foo" {
  value = "${data.external.foo.result["foo"]}"
}
(test.sh)
#!/bin/bash
jq -n --arg foo "bar" '{"foo":$foo}'
terraform plan -detailed-exitcode should return 0 in this case.
Instead, it's returning 2:
> terraform plan -detailed-exitcode
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
null_resource.test: Refreshing state... (ID: 910628614195011957)
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
<= read (data resources)
Terraform will perform the following actions:
<= data.external.foo
id: <computed>
program.#: "2"
program.0: "bash"
program.1: "/Users/slee1/tmp/tf/ext-prov-detailed-exitcode/test.sh"
result.%: <computed>
Plan: 0 to add, 0 to change, 0 to destroy.
------------------------------------------------------------------------
Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
> echo $?
2
1) terraform init
2) terraform apply
3) terraform plan -detailed-exitcode
4) echo $?
If depends_on is removed from the external data source stanza, the exit code becomes 0. However, that is not a viable workaround when depends_on is actually necessary.
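For anyone scripting around this, here is a minimal sketch of how the -detailed-exitcode statuses can be interpreted. Terraform documents 0 as "no changes", 1 as "error", and 2 as "changes present". The helper name describe_plan_exit is hypothetical, used only for illustration, and is not part of Terraform.

```shell
#!/bin/bash
# Map a `terraform plan -detailed-exitcode` exit status to a short description.
# describe_plan_exit is a hypothetical helper name, not part of Terraform.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    2) echo "changes present" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Typical use (needs terraform on PATH, so it is left commented out here):
#   terraform plan -detailed-exitcode >/dev/null
#   describe_plan_exit "$?"
```

With the configuration above, the helper would report "changes present" after step 3, even though the plan summary says 0 to add, 0 to change, 0 to destroy.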
Hi @sjwl!
At present using depends_on with data resources is problematic since it always forces the source to be refreshed during the apply phase to make sure it's "after" its dependency. To get better behavior here you can write the configuration like this:
data "external" "foo" {
  program = ["bash", "${path.module}/test.sh"]
  query = {
    # useless extra argument to create the implicit dependency
    dep = "${null_resource.test.id}"
  }
}
resource "null_resource" "test" {
}
output "foo" {
  value = "${data.external.foo.result["foo"]}"
}
Creating that _implicit_ dependency via the dep query argument gives Terraform more information: it can tell that if the null resource already exists (and, as a consequence, has an id set) then it's fine to refresh the data source during the plan step, rather than during apply.
This has the unfortunate side-effect of passing a useless extra query argument to your external program, which it must then ignore.
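As a rough sketch of what "ignoring" the extra argument can look like: the external data source passes the query to the program as a JSON object on stdin and expects a JSON object of string values on stdout. The function name emit_result is hypothetical, and explicitly draining stdin is one possible approach rather than a requirement (the original test.sh, which uses jq -n, never reads stdin at all).

```shell
#!/bin/bash
# Sketch of an external program body that ignores its query input.
# emit_result is a hypothetical function name used for illustration.
emit_result() {
  # Drain the JSON query object that Terraform writes to stdin. Its "dep" key
  # exists only to create the implicit dependency and is deliberately unused.
  cat >/dev/null
  # Emit the result as a JSON object whose values are all strings.
  printf '{"foo":"%s"}\n' "bar"
}
```

In a real test.sh the last line would simply call emit_result. This version also avoids the dependency on jq, at the cost of hand-writing the JSON.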
With all of that said, this is of course not intuitive and what you've encountered here is the intersection of a few different things:
depends_on for data sources should be smarter and defer the refresh to the apply step _only_ if one of the dependencies also has an action in the diff.
-detailed-exitcode return 0 if not.
The reason for Terraform exiting with code 2 here is a subtle one: since the result of a data source refresh is kept in the state, a data source read _is_ actually an action, and so it _does_ need to be applied in order to actually take effect in the state. Most of the time this subtlety makes no difference, but in cases like your example -- where the data source result is used in an output -- the change to the output is sometimes actually the main side-effect we're looking for. For example, if the result is being consumed elsewhere using terraform_remote_state then updating the output is a change in its own right, even though no resources are actually being altered. (#15419 is mainly about making this subtlety more explicit in the UI, and improving its behavior.)
So what Terraform is saying here is: you need to run terraform apply in order for the result from this data source to be reflected in your state.
The problem arises because Terraform currently treats data source updates in an imprecise way... it can't distinguish cases where refreshing the data source produces the same result, and the depends_on handling is too conservative.
Resolving both #11806 and #15419 would have the side-effect of curing the weird behavior you saw here, and I don't think there's any immediate action we can take here to address this specific issue... adding a special case for this scenario would break the situation where we _do_ need to apply a data source update. I hope for now the above workaround gives you a way past this bug until we're able to address these other two deeper issues. Sorry again for the weirdness here.
thanks @apparentlymart your workaround is working for me!
Hello! :robot:
This issue seems to be covering the same problem or request as #11806 and #15419, so we're going to close it just to consolidate the discussion over there. Thanks!
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.