Terraform: Allow configuring remote storage via config file

Created on 14 May 2015 · 61 Comments · Source: hashicorp/terraform

https://www.terraform.io/docs/commands/remote-config.html

It's fine to configure storage via the CLI once as one member of the team, but it's a pain to do this repeatedly and to share it with team members in the form of a README file that says "do the following":

terraform remote config -backend=S3 -backend-config="bucket=terraform-state" -backend-config="key=tf-state" -backend-config="region=us-east-1"
Labels: core, enhancement

Most helpful comment

We currently plan to support this in 0.9.

All 61 comments

I have almost exactly that sprinkled in README files all over our codebases. :)

Just wanted to add that it'd also be useful to be able to use interpolations in a remote config specified inline, so that variables can affect the final location.

Interpolating actual resource attributes (rather than just input variables) would create the interesting situation where the remote state itself depends on some of the resources, but I think supporting interpolation of variables only would be sufficient for my use case. (which is: re-using the same terraform config for a number development environments that differ only in name, while maintaining a separate state for each)

Perhaps even a .terraform.rc file would work. Then you could override on the command line which rc file to use so your directory could look like:

dir/
  .dev.rc
  .prod.rc
  env.tf
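For illustration only, a hypothetical .dev.rc might hold nothing but the backend settings. No such file format exists in Terraform today; the keys below simply mirror the CLI flags from the example at the top:

# .dev.rc (hypothetical) - remote state settings for the dev environment
backend = "S3"

backend_config {
  bucket = "terraform-state"
  key    = "dev/tf-state"
  region = "us-east-1"
}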

@apparentlymart

Just wanted to add that it'd also be useful to be able to use interpolations in a remote config specified inline, so that variables can affect the final location.

I was thinking about this as well, mostly because of a single variable for the _environment name_. Is there any other use case you'd have here?

I'm becoming more and more convinced that Terraform just needs something like terraform apply -env=env_name, terraform destroy -env=env_name, etc. Then we could have interpolation limited to this one special core variable, which you wouldn't be able to define in the code, but only as a CLI flag or an environment variable (i.e. export TF_env=...).

All that btw. brings me back to my older issue: https://github.com/hashicorp/terraform/issues/1295

@johnrengelman

Perhaps even a .terraform.rc file would work.

I believe these are mostly project-level settings, not user-level settings. That said, there's no harm in supporting this.

Lots of tools support an rc-type file at the project level: NPM and Bower are just two off the top of my head. Call it what you like, but I was just thinking of a config-type file in the project base dir.

@johnrengelman Ah, right, I see what you mean. Sure, that could be a solution as well.

I actually have two use-cases, which are related and could probably both be forced into the single-variable solution:

  • We have a number of different "environments" that we identify by a domain name where a Consul cluster can be found. Our developer environments are all created from a single config, parameterized by the developer's name. When creating the environment, we pass in a parameter "username" and it creates an environment username.dev.ourdomain.com. Environment configs then get written into a special S3 bucket, with the key being username.dev.ourdomain.com.
  • When we deploy our applications, there is an input variable for environment, which can be the hostname of any environment. For example, it could be username.dev.ourdomain.com as above, or it could be a shared environment like qa.ourdomain.com or production.ourdomain.com. Application _deploy_ configs get written into the Consul key-value store accessible at that hostname.

So our invocations for terraform for these two cases are a little different, but both involve just one parameter:

  • terraform apply -var="username=apparentlymart" (on the shared "developer environment" config) to create a Consul cluster at apparentlymart.dev.ourdomain.com. The full environment name then becomes the "key" argument to the S3 remote state backend, with all of the other arguments hard-coded.
  • terraform apply -var="target_env=apparentlymart.dev.ourdomain.com" (on an app-specific config) to deploy a particular app to the created environment. The target_env value then becomes the "address" argument to the Consul remote state backend, with all of the other arguments hard-coded.

We're currently working around this mainly by having Jenkins run Terraform for us, so we can bake all of the necessary remote setup into a reusable job. In some cases our developers end up running terraform locally and so we end up putting terraform remote config invocations in README files to help with that, but mistakes can still be made.

I've been putting some thought into this as well, and I'm toying with the idea of making a provider called something like "remote-state". Providers in terraform already have fencing in place to ensure they're only called once, and since remote state config is global (to a project), it'd be configured in a similar fashion to, say, the AWS provider. I'm envisioning something like:

provider "remote-state" {
  backend = "HTTP"
  config = {
    address = "https://foo.bar.example/state/"
  }
}

Commandline options would then easily map to provider configs, e.g.

provider "remote-state" {
  backend = "S3"
  config = {
    bucket = "terraform-state"
    key = "tf-state"
    region = "us-east-1"
  }
}

which maps to terraform remote config -backend=S3 -backend-config="bucket=terraform-state" -backend-config="key=tf-state" -backend-config="region=us-east-1" given in the example at the top.

I imagine then we could take advantage of interpolation, variables, etc. That would help solve @apparentlymart 's scenario as well.

Any thoughts?

Also, could use feedback ( @mitchellh ? @radeksimko ?) on if this would be best as a built-in provider, or a third-party provider hosted on my github. I suppose it could easily be folded in later as a built-in if it works and makes sense.

@LeftyBC I think this may be a bit more difficult to handle as a provider, since most providers (more likely all) are just designed to provide a connection and anything that resources need - i.e. providers themselves don't do any work.

Either way I agree with you on the syntax as it's similar to terraform_remote_state.

Another question to be solved is the right ordering/lifecycle - i.e. we need to make sure that such code gets evaluated/executed first, before anything else, before any other providers/resources. This cannot run in parallel with other things, because we may want to stop provisioning if the state backend returns an error (i.e. I may not want to provision if I can't save the state).

I think it's more realistic to implement this similarly to atlas:
https://terraform.io/docs/configuration/atlas.html

i.e. It could look like this:

remote_state {
  backend = "S3"
  config = {
    bucket = "terraform-state"
    key = "tf-state"
    region = "us-east-1"
  }
}

@radeksimko I like your suggestion, as long as the config properties support variable interpolation. :-)
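For instance, something like this (hypothetical syntax; var.environment is just an illustrative variable name, not something the proposal defines):

remote_state {
  backend = "S3"
  config = {
    bucket = "terraform-state"
    key    = "tf-state-${var.environment}"
    region = "us-east-1"
  }
}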

This would effectively prevent issues like https://github.com/hashicorp/terraform/issues/2549

:+1: This is exactly the kind of thing that would help us implement Terraform :smile:

I like the remote_state block that @radeksimko posted but I'd note that having both this and the terraform_remote_state resource is likely to be confusing, since they both sound like they are the same thing. If a breaking change is acceptable then I'd probably rename the resource to something like terraform_upstream_state, to make it clearer that we're talking about the state for some _other_ configuration, vs the remote_state block that is about _this_ configuration.

Conversely, without a breaking change the new block could be called something like state_storage. Not awesomely clear but at least it uses a distinct terminology from the resource so it's easier to talk about the differences.

@apparentlymart Agreed :+1: my suggested naming would be very confusing in combination with terraform_remote_state, but after all I don't think that naming convention is something that would prevent us from implementing it :)

Finding the right place in the lifecycle/graph might be a bit challenging.
I do have some other (subjectively) more important things in TF I want to work on in the following weeks/months, even though this one is tempting since the current state is painful.

If anyone wants to submit a PR, feel free :smiley:

I guess I was expecting this would be handled well before we get into the graph stage, inside the command.Meta.Context method similarly to how the atlas block is handled inside the terraform push command. Other than a quick tour of that bit of code I've not looked too closely so I don't know if I'm on the right track here, but it seems plausible from what I've seen so far.

Sadly I too have other stuff going on that's more important to me and my organization, so I can't take this one on right now, but I keep getting tempted every time I write another README with a "terraform remote config" command in it for users to copy-paste.

:+1: +1 for separating the remote state config from the state - we had the same issue at my company when dealing with remote state. For now I just hacked together a bash script which we use when executing terraform commands - it takes arguments, configures the remote state, and then executes the desired terraform command. A native solution would be so much better! :-)

@antonosmond Is it possible to get a copy of your script?

@gamename Here you go. It's a simplified version of what I use at work. I'm actually in the process of refactoring it to use getopts for the args, but for now it does the job - in our real one we have lots more args!
Basically, instead of running terraform apply or terraform destroy or whatever, we run this script and give it arguments, e.g.
/bin/bash terraform.sh /some_directory/some_terraform_module/ test plan
It handles the remote state setup for us (we have a standard structure for how we store our state files in our S3 bucket), so in the example here it'd assume the state is at your_s3_bucket_name/some_terraform_module/test.tfstate,
but obviously that changes based on the arguments you provide.

#!/bin/bash
set -e

terraform_bucket_region='your_s3_bucket_region'
terraform_bucket_name='your_s3_bucket_name'

function usage() {
  echo "Usage: terraform.sh [module_path] [environment] [action]"
  echo
  echo "module_path:"
  echo " - the path of the terraform module"
  echo
  echo "environment:"
  echo " - dev"
  echo " - test"
  echo " - prod"
  echo
  echo "action:"
  echo " - plan"
  echo " - apply"
  echo " - plan-destroy"
  echo " - destroy"
}

# Ensure script console output is separated by blank line at top and bottom to improve readability
trap echo EXIT
echo

# Validate the input arguments
if [ "$#" -ne 3 ]; then
  usage
  exit 1
fi

module_path="$1"
environment="$2"
action="$3"

# Get the absolute path to the module
if [[ "$module_path" != /* ]]; then
  module_path=$(cd "$(pwd)/$module_path" && pwd)
else
  module_path=$(cd "$module_path" && pwd)
fi

case "$action" in
  plan) ;;
  apply) ;;
  plan-destroy) ;;
  destroy) ;;
  *)
    usage
    exit 1
    ;;
esac

if [ "$action" == "plan-destroy" ]; then
  action="plan"
  destroy="-destroy"
fi

if [ "$action" == "destroy" ]; then
  destroy='-destroy'
  force='-force'
fi

# Clear the .terraform directory (we want to pull the state from the remote)
rm -rf "$module_path/.terraform"

# Make sure we're running in the module directory
cd "$module_path"

# Configure remote state storage
terraform remote config \
  -backend=S3 \
  -backend-config="region=$terraform_bucket_region" \
  -backend-config="bucket=$terraform_bucket_name" \
  -backend-config="key=$(basename $module_path)/$environment.tfstate"

terraform get

# Plan
if [ "$action" == "plan" ]; then
  # Output a plan
  terraform plan \
    -input=false \
    -refresh=true \
    -module-depth=-1 \
    $destroy
  exit 0
fi

# Execute the terraform action
terraform "$action" \
  -input=false \
  -refresh=true \
  $force

It just sucks that we have to do this, but until the remote config is separated from the state file this seemed like the easiest way to handle it. In our environments at work this is all automated, so most of it is hidden from our developers and they don't have to worry about it.

Put me down as a +1. Was looking at the Makefile solution from http://karlcode.owtelse.com/blog/2015/09/01/working-with-terraform-remote-statefile/ , but the script above would handle our use cases better (distinct dev/prod environments with different state).

I wrote https://github.com/nadnerb/terraform_exec to do this.

It is written in go but only calls terraform on the command line under the covers using exec (I might fix that in the future). This syncs state with s3 and works at an environment level.

This works for me, I am happy to improve it if others find it useful.

I was looking at the code yesterday to see what the effort would be to get it in as a config section on the level of atlas, providers, variables etc.

I would imagine efforts would have to be focused on:

  • Adding remote state config support in config/config.go and other related files
  • Running remote state config sometime after variable read-in to allow for interpolation
  • Ensuring remote state does not pre-push local state (as generally happens with remote config)

I stopped at whether or not it _should_ be done, but it sounds like it might be worth looking into further. If anyone has any insight into the where/when/how config happens and how to hook in remote config in a run, I'm willing to do the work.

@mitchellh had this to say on #4546, which I just closed as a duplicate:


For the future, when we do this, we should put it in a terraform block that can be used for meta-configurations to configure Terraform itself rather than what Terraform is doing. I'd prefer this:

terraform {
    remote_state { ... }
}

I can imagine other things that would go in there some day (minimum version, required plugins, etc.), and I'd like to avoid polluting the top level namespace with too many keywords.

+1

This is pretty critical; hope it's on the near-future roadmap...

I had a thought about a step-wise way to get here, that might allow us to get the most important parts of this functionality faster:

All of this assumes a new configuration structure like this:

terraform {
    remote_state {
        backend = "consul"
        config = {
            address = "consul.example.com:80"
            // Only "var" interpolations are permitted
            path = "environments/${var.envname}"
        }
    }
}

With the config language supporting the above syntax, this could then be implemented in three stages that each require progressively more internal architecture work inside Terraform but could be merged in isolation:

  1. Terraform checks for the configuration block shown above, interpolates variables into it, and then fails with an error if the current remote config doesn't match what the config file says. This still requires the manual setup but it allows Terraform to catch the case where you configured it wrong, preventing the user from accidentally working with the wrong state and thus causing havoc. If the block isn't present then Terraform's behavior is unchanged.
  2. Add a terraform remote auto-config command, which works like terraform remote config except that the configuration is taken from the config file rather than from the command line arguments. Change the error message added in the previous change to recommend the use of this command when the current remote config doesn't match what the configuration file says. This makes the UX slightly better by doing some work for the user, but still requires manual intervention from the user and thus still leaves the original Terraform behavior unchanged.
  3. Re-work how Terraform handles remote config so that it's something that happens on a per-operation basis rather than requiring a separate setup step. This is a more significant architecture change that probably requires e.g. rethinking how the local state cache works. At this point there would be a breaking change to Terraform's remote state handling and users would be required to switch to this new workflow, since the previous model of stateful setup would no longer make sense.

Personally I'd love to have even the first of these steps right now, since me and others on my team have on more than one occasion accidentally messed up the remote config and ended up doing something strange, like overwriting one state with another... I'd love a way for Terraform to just tell me I'm wrong, even if it doesn't immediately help me fix it.

@apparentlymart Great ideas! As a relative newcomer to Terraform from heavy CloudFormation usage, I agree that all of these steps would be useful.

While I understand the usefulness of the first option as a sanity check (and would welcome it), it doesn't solve the issue that "remote state management isn't intuitive" for newcomers. Now that I understand a bit more of the state lifecycle it would be useful, but it doesn't solve the initial sticker shock of remote state file management.

The second option would be a huge help. We have adopted a strategy similar to those that people have outlined above, but merging that process into the tool itself would be immensely useful and would also significantly reduce the friction for newcomers using terraform_remote_state (which in my opinion is one of the killer features of Terraform).

:+1: I screw this up all the time even just by myself between machines, haha

Hi all, is there any update on this feature? It would really be great if it was available.

@mikljohansson, how far have you progressed with your remote state config commits?

@alexclifford I created a proof of concept for configuring remote state from tf files and warning if the actual setup mismatches, but got stuck on the unit testing. Have a look at https://github.com/mikljohansson/terraform/compare/master...mikljohansson:remote_state_config if you're interested in picking it up. Unfortunately it seems unlikely I'll be able to prioritize working more on it in the short term.

@mitchellh / @armon Is there any chance this (or something similar) will be part of the 0.7.0 release? This has bit us pretty badly again and is a much needed feature.

This is really a necessary feature. The current situation of mixing config and data in the same file is problematic.

+1

What if I commit a minimal state file right into version control? For example .terraform/terraform.tfstate with

{
    "remote": {
        "type": "consul",
        "config": {
            "address": "consul:8500",
            "path": "terraform/service"
        }
    }
}

terraform show fails with a "No state." error, but terraform plan and terraform apply seem to work.
Do you think this workaround is reliable?

@mkuzmin This is what I was thinking about: in the beginning, run terraform remote config once and commit the state file with only the remote block.

I've been using this hack for several weeks in my projects, and it works OK.
Two issues so far:

But remember the Tao of Vagrant? In any project on any machine I could just run vagrant up and be sure all the technical details are hidden in configuration files.

I wish to have the same workflow in Terraform.

Sure, that's why I was totally confused when I added the terraform_remote_state resource and figured out that it's for informational purposes only o_O

From reading the various discussions related to this issue, it seems like there are three main options discussed as a solution to the issue of shared remote state configuration.

  1. A terraform section in the tf files
  2. A .terraformrc file
  3. Leave remote state config in tfstate, move locally cached state to a tfcache file (suggested in https://github.com/hashicorp/terraform/issues/2549#issuecomment-122722920)

Option 1 does seem the most appealing, and I'd go so far as to say that it may be possible to implement any of these options in a backwards-compatible way. That is, a version of Terraform with this functionality could offer to create a new file with this configuration when it detects a tfstate file with the remote block but no matching configuration otherwise. In the case of Option 3, this could even happen automatically.

If there's a consensus about one of the solutions (this thread seems to point to Option 1), I'll take a stab at an implementation.

I agree on 1
Also, it could be left up to users whether to ignore it in their VCS, but this could work exactly like a 'provider'.
Thanks to everyone involved for looking into this

Just to confirm the pain in my use case: I am using STS to get temporary credentials for access to the remote state S3 bucket backend. What happens for me is:

  1. Configure remote state
  2. Terraform creates empty state file except for the bucket credentials (and therefore token)
  3. Terraform pulls the remote state and overwrites the bucket credentials with the ones put in the tfstate file when it was created
  4. Subsequent pull or push uses the out of date token and fails.

For my case, I would want the credentials simply to be considered a local value and not stored in the remote copy when it is pushed, so there's nothing to overwrite on the pull and merge. Given that I had to configure the bucket credentials to get access to the state file, the credentials being present in the remote copy is pointless and a security issue (one that we are obviously partly trying to mitigate with STS).

The main thing I want to ensure in a solution for this is that I can dynamically set the credentials, because with STS the token will change frequently. I am happy with there being a "terraform" provider in the tf files, as I can define the credentials as a run-time variable which can then be interpolated in the terraform provider.
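A rough sketch of what that might look like, combining the terraform/remote_state block proposed earlier in this thread with interpolated credentials (entirely hypothetical syntax; the variable names are invented for illustration):

terraform {
  remote_state {
    backend = "s3"
    config = {
      bucket     = "terraform-state"
      key        = "tf-state"
      region     = "us-east-1"
      # hypothetical variables holding short-lived STS credentials
      access_key = "${var.sts_access_key}"
      secret_key = "${var.sts_secret_key}"
      token      = "${var.sts_token}"
    }
  }
}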

However, that solution is quite a rearchitecting of the way terraform backend config works. For the short term I would be comfortable with either "remote push" not pushing the credentials to the remote state, or the remote pull merge process not overwriting the local credentials config.

@Zordrak Terraform honors environment variables and ~/.aws. I would recommend using those methods instead of directly configuring terraform to use the specific STS credentials. That way when the credentials expire and you get new credentials changes do not have to be made to the terraform configuration.

However, does that not then get in the way of there being different credentials for the plan/apply? The state S3 bucket isn't in the same AWS account or accessed with the same permissions as those used for managing the infrastructure.

The AWS provider allows us to specify an instance profile as the role to assume, but there's no terraform provider to let us do the same for the bucket credentials. And if I specify the role to assume in environment variables (assuming I can), doesn't that supersede the details in the aws provider, i.e. come first in the search path?

_edit_
Continuing to investigate, I might have this the wrong way round and be able to define an assumable role in the environment and then override it in the provider.

@Zordrak something that may help you - I wrote aws-runas originally to wrap TF calls in pre-fetched STS credentials before TF had the ability to assume roles itself. We use it now to wrap all of our TF and packer calls (and more!)

What you can do is wrap the calls to terraform remote config and terraform apply with it, using the same invocation (this is what we do). That way, you can expect apply to use the same credentials that remote config used. --no-role will also get a token for you but won't assume a role.

Let me know if it's useful for you!

PS: When you use this tool, don't specify any credentials when doing terraform remote config or running terraform apply, i.e. in your provider config. You won't need to, as the creds will be set in the environment. I understand this is not a 100% fix, but it's been good for us so far - I wasn't even aware remote state creds were pushed!

Thanks @vancluever - ironically (or coincidentally) I am already working inside my own "tfwrapper" in which I am working on most of this.

It looks like I have two options.

  1. Define [profile foo] with role_arn = ... in ~/.aws/config, and then pass AWS_PROFILE=foo
    or
  2. Still do the STS call manually, but use the AWS_ACCESS_KEY etc. environment variables to pass the temporary credentials into terraform for the remote bucket config.

In both cases, the hope is to override the credentials defined in the environment with a role_arn in the aws provider block in the terraform code.

@Zordrak In that case you can use different AWS profiles, as the terraform aws provider supports profile.

  • Store both the STS credentials and the non-STS credentials under different profiles and specify those profiles for the aws provider (see the sketch below).
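For example (a minimal sketch; the profile name is a placeholder for one defined in ~/.aws/credentials or ~/.aws/config):

provider "aws" {
  region  = "us-east-1"
  # placeholder profile name; point this at whichever profile holds the right credentials
  profile = "infra-prod"
}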

@sstarcher However, "profiles" cannot be configured via environment variables; they have to be configured on disk, which means that everything that runs Terraform then needs ~/.aws/config hard-configured in order to work that way - which, with Jenkins being the main place it's run from, is sub-optimal to say the least.

The path I'm looking at, doing the STS call in bash and then passing the returned creds into the terraform remote state config, is still not ideal, I realise, because when it comes to apply time and terraform tries to assume the role given in the aws provider block, it will try to use the bucket credentials fed in to assume the role, when actually I need it to fall back to the instance profile. But since the boto search order for creds reaches the instance profile last, I would have to unset the environment variables first.

@Zordrak if you don't mind setting all that stuff in the environment, you can skip passing credentials to both terraform remote config and terraform apply. Don't specify any credentials or anything credential-related in the provider aws block, as it's not necessary. The AWS Go SDK (which TF uses, of course) is set up to auto-detect those credentials via the standard AWS environment variables.

If you are using one bucket for state for several accounts, what you might need to do is make sure the remote state bucket exists in the same AWS account you are deploying to. This isn't so bad - you just need a bucket in each AWS account TF touches, which can be controlled via tooling. Have your toolchain switch remote state to the bucket for the account you are deploying to and then do your terraform apply, and just ensure your configs are granular to the account they are deploying to.

@vancluever That would be ideal, except the instance profile doesn't have permission to do everything; it has permission to assume a set of roles depending on what it is doing. So when applying production, it has permission to assume the role that allows it to read and write the central state bucket, and permission to assume a role that lets it read and write prod infrastructure. When a different slave in a different place, with different permissions for the roles it may assume, wants to run apply on the dev environment, it assumes the central-bucket role for the state file, and on apply assumes a role that lets it read and write dev infrastructure.

Hope that's not too confusing.

Maybe I can come up with a compromise where the Jenkins instance profile has permission to read and write the state bucket on its own, but assumes another role for planning/applying infra. The trouble is, I'm not sure S3 allows that kind of usage when working cross-account.

@Zordrak it sounds like your problem _may_ be fixed by moving the bucket permissions into the dev role (ie: not having that workflow use 2 different permission sets for state and apply).

Our ultimate issue was that (pre 0.7) without a way to assume roles in the provider, we were reliant on external tooling to assume the roles and run TF. I actually did have a central bucket architected for us at one point in time, but encountered other issues such as object ACLs being written in ways that removed access for future TF runs, breaking them.

Rather than deal with the whole cross-account mess, I just decided it was easier to just have a different state bucket per account. So one for prod, and one for dev, with each assumed role having get/put/delete access to their respective one. The instance profile does not get permission at all. This actually proved useful to us in different ways - now we can lock down production, and developers can use their team buckets using the same toolchain that is used when they take their stuff to production.

Regarding AWS creds, I managed it just by using the following in my tf files:

provider "aws" {
  region              = "us-east-1"
  allowed_account_ids = ["${var.aws_account_id}"]

  assume_role {
    role_arn = "arn:aws:iam::${var.aws_account_id}:role/terraform"
  }
}

We just give all the relevant people the ability to assume this role, and it works like a charm.

I thought this was the purpose of data "terraform_remote_state", but that's not sufficient.

The docs at https://www.terraform.io/docs/state/remote/s3.html could do with improving IMO - I'm not sure how to read "example referencing" vs. "example usage" - do we need both? Why the duplication?

@OJFord "example usage" is how to configure your project to use the S3 backend to store its state. "example referencing" is how you would configure your project to consume the state of another project (which is stored in S3) so that you can access its outputs as values in your project.
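To illustrate the "referencing" case, the consuming project declares a terraform_remote_state data source along these lines (a sketch only; the bucket, key, and output names are placeholders, and the exact syntax varies between Terraform versions):

data "terraform_remote_state" "network" {
  backend = "s3"
  config {
    bucket = "terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# the other project's outputs are then available, e.g.
# ${data.terraform_remote_state.network.vpc_id}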

This issue is about making the configuration for remote state storage for your project to be a 1st class citizen of the project, as opposed to the current "do configuration via the command line".

"example referencing" is how you would configure your project to consume the state of another project (which is stored in S3) so that you can access it's outputs as values in your project

Ah! Got it, thanks.

This issue is about making the configuration for remote state storage for your project to be a 1st class citizen of the project

Yep, I want this too. I just misunderstood the referencing to be the implementation, and found this issue when it didn't behave as I expected.

Hi, just wanted to touch base and see if there has been any progress on this concept? Requiring a wrapper script or assuming a user is going to run the remote state commands properly has been a hassle for us.

Thanks,
Mike

We currently plan to support this in 0.9.

+1

@apparentlymart coming back to your earlier remark about confusion between remote_state and terraform_remote_state: I guess with the introduction of data sources, that should no longer be an issue.

  • A terraform remote state resource could be the resource used to store the remote state for the project at hand.
  • A terraform remote state data source can be used to access state from other projects.

/cc @mitchellh

Implemented in https://github.com/hashicorp/terraform/pull/11286 which is already part of 0.9.0-beta1.
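For reference, the configuration that shipped in 0.9 declares the backend inside the terraform block, along these lines (the bucket, key, and region values here are placeholders):

terraform {
  backend "s3" {
    bucket = "terraform-state"
    key    = "tf-state"
    region = "us-east-1"
  }
}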

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
