Terraform: data sources are not always evaluated before providers that depend on them

Created on 18 Jan 2017 · 23 Comments · Source: hashicorp/terraform

I'm currently writing a provider to store secrets in pass (https://www.passwordstore.org/) (https://github.com/camptocamp/terraform-provider-pass). It sometimes works, and sometimes it looks like the data source is evaluated after the creation of a provider that uses it.

Terraform Version

v0.8.4

Affected Resource(s)

provider

Terraform Configuration Files

data "pass_password" "rancher" {
  path = "terraform/rancher"
}

provider "rancher" {
  api_url    = "${data.pass_password.rancher.data["RANCHER_URL"]}"
  access_key = "${data.pass_password.rancher.data["RANCHER_ACCESS_KEY"]}"
  secret_key = "${data.pass_password.rancher.data["RANCHER_SECRET_KEY"]}"
}

resource "rancher_environment" "test" {
  name = "test"
}

Debug Output

https://gist.github.com/mcanevet/4eaecde59467b19a1f20ee1142e44af8

Expected Behavior

Rancher provider should not have empty attributes

Actual Behavior

rancher_environment resource fails with Get /v1: unsupported protocol scheme "" because api_url is empty.

I added debugging in https://github.com/hashicorp/terraform/blob/master/builtin/providers/rancher/provider.go#L56 and I can see that api_url, access_key, and secret_key are empty.

Steps to Reproduce

  1. terraform plan
Labels: bug, core

All 23 comments

This is odd.

In the debug log the ordering is correct, so the original thesis of the issue appears incorrect: data sources are evaluated in the proper order. Perhaps, then, this is an issue with interpolation. Unfortunately, I'm going to need a reproduction to move forward, since I can't seem to reproduce this myself.

Do you have reproduction steps + config I can follow? Thanks!

@mcanevet out of curiosity, do the problem or its symptoms go away if you run with -input=false? My theory is that this error occurs when Terraform instantiates the provider for either the validation or input passes, which happen before data sources can be evaluated.

Unfortunately, interpolated provider arguments are still a bit tricky, even with data sources.

@mitchellh to reproduce you can just:

  • compile the pass plugin I wrote (https://github.com/camptocamp/terraform-provider-pass)
  • initialize a pass repository if you don't already have one

    • pass init "ZX2C4 Password Storage Key" where ZX2C4 Password Storage Key is the ID of your GPG key

  • Create an entry in the password store:

    • pass edit terraform/openstack

{
  "auth_url": "https://auth.cloud.example.com/v2.0",
  "tenant_name": "example",
  "user_name": "johndoe",
  "password": "secret"
}

and try with this manifest:

provider "pass" {
}

data "pass_password" "openstack" {
  path = "terraform/openstack"
}

provider "openstack" {
  user_name = "${data.pass_password.openstack.data["user_name"]}"
  tenant_name = "${data.pass_password.openstack.data["tenant_name"]}"
  password = "${data.pass_password.openstack.data["password"]}"
  auth_url = "${data.pass_password.openstack.data["auth_url"]}"
}

resource "openstack_compute_keypair_v2" "foo" {
  name       = "foo"
  public_key = "ssh-rsa DEADBEEF"
}

What is weird is that it sometimes works. I don't know if we have the same issue with the Vault provider...

@apparentlymart with -input=false it looks like it works...

@mcanevet it working with -input=false is a good clue here.

This suggests that your configureFunc is running during the "input" pass, which happens before data sources are loaded. (Data sources are loaded in the "refresh" pass.)

It seems weird to me that configureFunc would run on the input pass, since the purpose of the input pass is to gather data needed to complete the configuration. But assuming that is intended behavior, perhaps a reasonable compromise would be for Terraform to skip doing input on a provider if its configuration contains unresolvable interpolations, thus allowing input to still work for the simple case but disabling it in this tricky case.

This compromise seems valid to me because the input pass is mainly there as a new user UX, to help them get started with the tool using a minimal configuration. By the time someone is using one provider to configure another I think they are arguably far beyond "getting started" and it's reasonable for them to be properly configuring their providers within config files by this point.

@apparentlymart if I understand correctly you are referring to the configureFunc of the providers using the data sources, so in my case the rancher provider (https://github.com/hashicorp/terraform/blob/master/builtin/providers/rancher/provider.go#L56-L66) and the openstack provider (https://github.com/hashicorp/terraform/blob/master/builtin/providers/openstack/provider.go#L214-L238), not the configureFunc function of the provider that provides the data source (https://github.com/camptocamp/terraform-provider-pass/blob/master/pass/provider.go#L29-L33).

In that case I guess we'll have the same issue with the Vault provider and it's a bug (or a feature) in the core? @mitchellh could you please give your opinion about that?

During the input walk Terraform will instantiate both the "pass" provider and the "rancher" provider in order to prompt for user input on both of them.

I would call it a "design flaw" rather than a "bug", since things are working as designed but the design does not consider this situation. But that's just quibbling over semantics. :grinning:

A workaround for now is to always use -input=false when interpolating variables into provider arguments, which then manually achieves what in my earlier comment I was suggesting that Terraform do automatically. You're right that this is also an issue for other situations, including Vault as you said and also anyone using consul_keys to populate provider attributes.

OK, now it's more clear. Thanks a lot for the explanation.
My provider (https://github.com/hashicorp/terraform/pull/11381) works fine as long as I use -input=false.

This should be fixed in master now. Please let me know if I'm wrong!

Hey @mitchellh,

I'm running on master (Terraform v0.9.0-dev (eab2104c5896da8a136fe75209430ce1b1a0323a)) and still seeing a pretty similar error with the Vault provider.

Config:

provider "vault" {}

data "vault_generic_secret" "gcp" {
  path = "tf/gcp"
}

provider "google" {
  credentials = "${data.vault_generic_secret.gcp.data["credentials"]}"
  project     = "${data.vault_generic_secret.gcp.data["project"]}"
  region      = "${data.vault_generic_secret.gcp.data["region"]}"
}

Vault setup:

  • tf is mounted as a generic backend.
  • vault write tf/gcp credentials=@GOOGLE_CREDENTIALS project=@GOOGLE_PROJECT region=@GOOGLE_REGION

    • (GOOGLE_CREDENTIALS is the json file downloaded from GCP, GOOGLE_PROJECT is the project name, and GOOGLE_REGION is the region name. Ping me on Slack if you need to know where any of those can be found.)

Commands to reproduce, on any config that uses the google provider:

  • VAULT_TOKEN=xxxx terraform apply: you'll get an error that project isn't defined
  • terraform show: you'll see that all the vault data is now in state
  • VAULT_TOKEN=xxxx terraform apply: no errors this time

I think this is probably the same issue? Using -input=false fixes the issue and works on first run.

I'll take a look thanks!

@mitchellh I just tested with Terraform 0.9.0 and it looks like it is not fixed.

@paddyforan After spending some time trying to repro with variants of your config, I believe the issue you're seeing is #12393 rather than this one, since the error is occurring during the plan rather than the input walk. I'm going to close this one to consolidate the discussion over there.

@apparentlymart didn't you close this issue by mistake?

@mcanevet no. It looks like this issue was fixed, but there is another similar issue that is already tracked, so I would like to consolidate discussion over there.

@apparentlymart the initial issue is still not fixed (at least in 0.9.1, I didn't have time to check with 0.9.2 yet)

Okay, I will reopen this to take another look, but I've not yet been able to repro. So @mcanevet, if you have some time to put together a minimal reproduction config (ideally using only builtin providers, since that's less complexity to reproduce), that would go a long way toward helping me narrow down what exactly is going on here.

@apparentlymart maybe this one can help? https://github.com/hashicorp/terraform/issues/12775
That one uses the AWS and MySQL providers.

Thanks for the reminder about that other issue, @MattiasGees.

Here's the configuration that allowed me to repro, based on what was in that other issue:

variable "aws_region" {
  default = "eu-west-1"
}

provider "aws" {
  region = "${var.aws_region}"
}

data "terraform_remote_state" "static" {
  backend = "local"

  config {
    path = "scratch.tfstate"
  }
}

provider "mysql" {
  endpoint = "127.0.0.1:3306"
  username = "test"
  password = "${data.terraform_remote_state.static.aurora_root_password}"
}

resource "mysql_database" "foo" {
  name = "foo"
}

I made a fake state in scratch.tfstate to give the remote state data source something to chew on:

{
    "version": 3,
    "terraform_version": "0.9.3",
    "serial": 9,
    "lineage": "8aee55fe-03d9-4b88-b9af-51f6968c5270",
    "modules": [
        {
            "path": [
                "root"
            ],
            "outputs": {
                "aurora_root_password": {
                    "value": "test"
                }
            },
            "resources": {},
            "depends_on": []
        }
    ]
}

This then did indeed confirm the original issue:

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.terraform_remote_state.static: Refreshing state...
Error running plan: 1 error(s) occurred:

* provider.mysql: Received #1045 error from MySQL server: "Access denied for user 'test'@'localhost' (using password: NO)"

$ terraform plan -input=false
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.terraform_remote_state.static: Refreshing state...
The Terraform execution plan has been generated and is shown below.
Resources are shown in alphabetical order for quick scanning. Green resources
will be created (or destroyed and then created if an existing resource
exists), yellow resources are being changed in-place, and red resources
will be destroyed. Cyan entries are data sources to be read.

Note: You didn't specify an "-out" parameter to save this plan, so when
"apply" is called, Terraform can't guarantee this is what will execute.

+ mysql_database.foo
    default_character_set: "utf8"
    default_collation:     "utf8_general_ci"
    name:                  "foo"


Plan: 1 to add, 0 to change, 0 to destroy.

Thanks for the repro help! I will get into trying to fix this now.

Having spent the majority of the day drilling into this, I just wanted to make some notes for my future self before I get on a plane:

Although it's true that -input=false prevents this bug, the root cause is not what I initially guessed. We don't actually configure the provider during the input walk, and in fact we are successfully getting past both the input and validate walks and failing during refresh.

The config data structure in the context gets shared between all of the different walks we do, but it gets mutated along the way by the various actions the different walks take. It looks like when the input walk interpolates the config, it correctly finds that password is computed (in my example above) and moves past it, but by the time we interpolate for refresh, the password value has become the unknown value UUID and is not detected as a computed key.

My assumption, based on the fact that -input=false makes this go away, is that we're doing something bad to the provider config during input that's breaking the computed values for later walks. My next task will be to dig into the input code to figure out what all changes it's making to the config structure.

#13264 is my fix for this. It was a pretty insidious issue in the end... details in the PR if you're interested. I ran out of time to complete this today, but I'll finish this up on Monday. Thanks for your help in debugging this, everyone!

Is this fixed? I'm using the latest version of terraform, 0.10.6 and I'm getting the same error. It works fine when I run it with -input=false but this doesn't work for us since we're using Atlantis + Terraform.

Here's my code:

# Variables
variable "db-admin-encrypted-password" {
  default = "ENCRYPTED_PASSWORD"
}

# Decrypt passwords
data "external" "db-admin-password" {
  program = ["bash", "../../scripts/decrypt.sh", "${var.db-admin-encrypted-password}"]
}

# MySQL connection
provider "mysql" {
  endpoint = "db-server.example.com:3306"
  username = "mysqladmin"
  password = "${data.external.db-admin-password.result.password}"
}
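For context, Terraform's external data source requires the program to print a flat JSON object of string values on stdout (it also receives a JSON query on stdin). A minimal sketch of what a script like decrypt.sh must produce, with base64 -d standing in for the real decryption step (an assumption, since the actual script isn't shown in the comment):

```shell
#!/bin/sh
# Hypothetical decrypt.sh: receives the ciphertext as $1 and prints the
# flat JSON object of strings that Terraform's "external" data source
# expects on stdout. base64 stands in for the real decryption command.
decrypt_to_json() {
  plaintext=$(printf '%s' "$1" | base64 -d)
  printf '{"password":"%s"}\n' "$plaintext"
}

decrypt_to_json "$1"
```

Terraform then exposes the keys of that JSON object via data.external.db-admin-password.result, which is what the provider block above interpolates.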

And the error:

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.external.db-admin-password: Refreshing state...
Error refreshing state: 1 error(s) occurred:

* provider.mysql: Received #1045 error from MySQL server: "Access denied for user 'mysqladmin'@'ec2-51-232-124-143.eu-west-1.compute.amazonaws.com' (using password: NO)"

Thanks

I'd like to reiterate what @damascenorakuten reported a year ago with 0.10.6: this is still apparent in 0.11.10, even with --input=false. The use case here is feeding a Terraform-provisioned RDS database's address into a postgresql provider. The failure is clear when the postgresql provider attempts to connect to an empty string at port 5432.

$ terraform --version
Terraform v0.11.10

$ cd datamart && terraform output
datamart_address = someurl.rds.amazonaws.com

$ cd ../datamart-databases && terraform plan --input=false
...
provider.postgresql: Error initializing PostgreSQL client: error detecting capabilities: error PostgreSQL version: dial tcp :5432: getsockopt: connection refused
...
$ cat main.tf
...
provider "postgresql" {
  alias    = "datamart"
  host     = "${data.terraform_remote_state.datamart.datamart_address}"
  port     = "5432"
...
data "terraform_remote_state" "datamart" {
...

With a trivial test, the datamart_address output interpolates fine

$ cat main.tf
data "terraform_remote_state" "datamart" {
...
resource "null_resource" "whatever" {
  provisioner "local-exec" {
    command = "echo '${data.terraform_remote_state.datamart.datamart_address}'"
  }
}

$ terraform apply
...
null_resource.whatever (local-exec): Executing: ["/bin/sh" "-c" "echo 'someurl.us-west-2.rds.amazonaws.com'"]
null_resource.whatever (local-exec): someurl.us-west-2.rds.amazonaws.com
...

All this to implement the remote state workaround described in #4149, but it appears that workaround is nonfunctional in this case.
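For anyone blocked on this, one way to avoid interpolating a data source into a provider block at all is to pass the value in as a plain input variable at plan time. A sketch under the assumption that a variable named datamart_address fits your setup (this sidesteps the ordering problem rather than fixing it):

```hcl
# Hypothetical workaround: the provider reads a plain variable, so no
# data source has to be resolved before provider configuration.
variable "datamart_address" {}

provider "postgresql" {
  alias = "datamart"
  host  = "${var.datamart_address}"
  port  = "5432"
}
```

The value can then be supplied on the command line, e.g. terraform plan -var "datamart_address=$(cd ../datamart && terraform output datamart_address)".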

I'm going to lock this issue because it has been closed for _30 days_. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
