terraform init doesn't honor input or automation flags for invalid workspace

Created on 23 May 2019 · 20 Comments · Source: hashicorp/terraform

Terraform still produces a prompt to select a workspace when terraform init is run with either -input=false or TF_IN_AUTOMATION=1. I'm using the remote backend.

It appears there is presently no way to detect or list which workspaces are available before a terraform init. If the remote backend is used with the multi-workspace prefix attribute, init will therefore always produce a blocking user input prompt, which makes automation difficult.

terraform init -help states:

-input=true Ask for input if necessary. If false, will error if input was required.

That isn't being followed.

Terraform Version

0.11.14

Expected Behavior

Init should fail with an error message about the workspace being invalid rather than producing a prompt that will block indefinitely.

Actual Behavior

$ TF_IN_AUTOMATION=true terraform init

Initializing the backend...
The currently selected workspace (QAF) does not exist.
  This is expected behavior when the selected workspace did not have an
  existing non-empty state. Please enter a number to select a workspace:

  1. Prod
  2. QA

  Enter a value: ^C
Error asking to select workspace: interrupted

$ terraform init -input=false

Initializing the backend...
The currently selected workspace (QAF) does not exist.
  This is expected behavior when the selected workspace did not have an
  existing non-empty state. Please enter a number to select a workspace:

  1. Prod
  2. QA

  Enter a value: ^C
Error asking to select workspace: interrupted

Steps to Reproduce

  1. Init a project with a remote backend configured for multi-workspace and no .terraform/environment present or an invalid workspace specified in .terraform/environment.

Either flag will have no effect on prompt behavior.
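
Until init fails on its own, a CI job can at least avoid hanging forever. The following is a minimal sketch, assuming GNU coreutils `timeout` is available on the build agent; `run_with_deadline` is a hypothetical helper name and not part of Terraform. It does not fix the prompt, it only bounds how long the job can wait on it.

```shell
# Sketch: bound how long a possibly-interactive command may run in CI.
# Assumes GNU coreutils `timeout`; run_with_deadline is a hypothetical helper.
run_with_deadline() {
  secs=$1
  shift
  # stdin is /dev/null so nothing waits on a terminal; timeout exits with
  # status 124 if the deadline is reached before the command finishes.
  timeout "$secs" "$@" </dev/null
}

# Example call (not executed here):
#   run_with_deadline 300 terraform init -input=false
```

A status of 124 then tells the pipeline that the command was cut off rather than failing by itself.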

backend/remote · bug · v0.11

All 20 comments

Looks like the recently merged https://github.com/hashicorp/terraform/pull/21234 doesn't take this into account either. I think if either of these "no prompt" flags were specified, it might be reasonable to select workspace 1 (index 0) automatically so that subsequent commands to list/select workspaces work properly.

Thanks for reporting this, @mattlqx!

An interactive prompt is clearly not appropriate for non-interactive automated scenarios, so we should indeed find a different way to address this for those.

Automatically selecting an implied workspace feels a little too magical/risky to me, since the worst case there is applying changes intended for one workspace to another workspace. I feel like the baseline here is that the command should fail outright if it doesn't have enough information to proceed and can't obtain it via user input, which then brings us to another question: how can we give it enough information to proceed, in non-interactive mode?

One way to address this in automation scenarios would be to make the workspace selection respect the TF_WORKSPACE environment variable that can already be used to force a particular workspace for other commands. That's intended to make it easier to set a workspace in an automated scenario where state on disk persisting between commands tends to be inconvenient. In that case, the sequence of steps would be something approximately equivalent to the following:

  • export TF_IN_AUTOMATION=true
  • export TF_WORKSPACE=QA
  • terraform init -input=false
  • terraform plan -input=false -out=tfplan
  • terraform apply -input=false tfplan

If TF_WORKSPACE contains a string that doesn't correspond to an available workspace, then that would cause terraform init -input=false to fail immediately, rather than presenting the prompt it currently uses.
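
A minimal sketch of the calling side of that sequence, assuming the proposed behavior (init failing fast on an unknown TF_WORKSPACE) rather than anything a given Terraform release is confirmed to do. `run_in_workspace` is a hypothetical wrapper name.

```shell
# Sketch of the proposed non-interactive sequence.
# run_in_workspace is a hypothetical helper; it refuses to start when no
# workspace name is given and stops at the first failing terraform command.
run_in_workspace() {
  workspace=$1
  if [ -z "$workspace" ]; then
    echo "error: no workspace name given" >&2
    return 2
  fi
  export TF_IN_AUTOMATION=true
  export TF_WORKSPACE="$workspace"
  terraform init -input=false &&
    terraform plan -input=false -out=tfplan &&
    terraform apply -input=false tfplan
}

# Example call (not executed here):
#   run_in_workspace QA
```

The exports stay set in the calling shell after the function returns, so a script that later runs terraform workspace subcommands may want to unset TF_WORKSPACE first.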

How does that sound, @mattlqx?

I hear you on the point of not defaulting. If we're going to outright fail the init, then there still needs to be some lightweight way for determining what workspaces ARE available by automation. There's a bit of chicken and egg here on init and workspace so I'm not sure what's feasible there.

Is it possible to query my Terraform Cloud account to determine which workspaces are available for a project, even if it's out-of-band, so automation can determine what needs to be done (e.g. creating a new workspace before trying to select it in the project)? Would adding a -create-missing-workspace flag to init also be too dangerous?

adding a -create-missing-workspace

... seems like an excellent solution to allow advanced usages without behind-the-scenes magic that might hurt new users.

We use Terraform (but not Terraform Cloud). Transient, on-demand workspaces created by the build server are quite useful. With versions < 0.11.14, TF_IN_AUTOMATION=not-empty TF_WORKSPACE=<some workspace which does not exist> terraform init -input=false worked just fine.

Hi @mattlqx! Thanks for the additional context.

In the automation scenarios I've seen before, the automation itself is choosing a specific workspace to work with, and so it doesn't need to know anything other than the single specific workspace name it's been configured with.

It sounds like you have something different in mind. If you'd be willing, I'd love to hear more about the use-case. If you had a way to get a machine-readable list of workspaces to choose from, what exactly would your automation do with it? Your original suggestion seemed to lean towards just picking the first item from the list, but then I wonder how the automation would ever work with a workspace other than the one that happens to sort first?

Terraform itself is hitting an API on Terraform Cloud to enumerate the workspaces matching the given prefix, so in principle you could mimic that logic, though I'd like to understand better what exactly you're trying to achieve.

Sure. The reason why I'm asking for such a thing mostly is for migration of states from what we're currently using today (stored via the consul backend, no workspaces) to remote backend with multi-workspaces. We have the functionality of workspaces today by manipulating part of the consul path. Our usage predates environments/workspaces functionality in Terraform. Now seems like a great time to shift to the official workspaces as part of this migration.

Long story short, I have a bash script that handles the backend block generation among other things (grabbing other config data from various places and setting it via TF_VAR env vars, installing third-party plugins, etc), runs init and the desired terraform subcommand with the appropriate switches. I'm trying to have that script handle the migration if a TF_BACKEND variable is set to remote. Through some juggling of backend configs, I can grab the old state from Consul and have Terraform do its state copy magic to TF Cloud.

The real trick is determining whether the current Consul state, which represents a single environment (a workspace, effectively default), has already been copied over to Terraform Cloud. I could verify that easily if I could just list the workspaces for a project before running init and hitting this prompt. This only applies if the project has no existing .terraform directory (as is the case in CI, or on a workstation that has never run the project before) and at least one of its environments has already been migrated. I still have to deal with another input prompt for that first migration, where it asks what you'd like the new workspace to be called. I hacked around that with expect since there are no switches available to provide that input.

I've gotten my desired behavior today with another expect workaround such that:

  1. I see that the remote backend is specified by my project.
  2. I run init which sees there is no workspace selected, but some exist in Cloud.
  3. expect just selects 1 which completes init.
  4. I try a workspace select for the workspace name and check the return code to determine if it already exists. If it does, great, it can move on with the actual run. If not, then I can do the backend juggling to migrate the state over.

This probably sounds hilariously overcomplicated, but it started out with the best of intentions and, as always, snowballed into something else. :) Once all my projects and environments are migrated over (precisely 170 of them), I agree I wouldn't have much need to list workspaces prior to init. My expect approach is brittle, but it's the only tool I have if I want to use terraform as an interface for mass-migration. I think even if I wanted to do this via terraform state pull/push, I'd still need to deal with the prompts in init before I could create additional workspaces.

Anyway, thanks for the interest, I really appreciate the effort to understand.

Thanks for all of that great additional context, @mattlqx. Migrating from a world with only default workspaces into a world with non-default workspaces is indeed an interesting challenge, since Terraform doesn't really have enough information to help fully, but I can see now the places where Terraform is getting in the way:

  • You have to run terraform init before you can run terraform workspace select, because otherwise the working directory is either uninitialized or pointed at the old backend.
  • ...but terraform init can't complete without knowing a specific workspace to initially select.

In principle, setting TF_WORKSPACE to the same workspace name you would've tried in step 4 could do something here, but I can see that the number of possible failure modes of terraform init is much larger than terraform workspace select, so it would be risky to say "If terraform init fails then I need to do backend juggling": the failure could be for any number of different reasons.

One option available to you today -- though I realize not as convenient as just having Terraform figure this out itself -- is to hit the Organization Workspaces operation on the Terraform Cloud API, which can tell you which workspaces exist for your entire organization in one go, which you could then filter down to just the ones relevant for your current configuration.
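
A minimal sketch of that lookup, assuming curl and jq are installed and an API token is exported as TFC_TOKEN. The variable name and both function names are hypothetical. The API paginates its results and this sketch reads only the first page.

```shell
# Sketch: list the workspace suffixes that exist for one configuration.
# TFC_TOKEN and the helper names are assumptions, not part of Terraform.

# Print the workspace names of an organization, one per line (first page only).
list_org_workspaces() {
  org=$1
  curl -s \
    -H "Authorization: Bearer ${TFC_TOKEN}" \
    -H "Content-Type: application/vnd.api+json" \
    "https://app.terraform.io/api/v2/organizations/${org}/workspaces" |
    jq -r '.data[].attributes.name'
}

# Read names on stdin, keep those starting with the prefix, print them
# with the prefix removed.
strip_workspace_prefix() {
  prefix=$1
  while IFS= read -r name; do
    case $name in
      "$prefix"*) printf '%s\n' "${name#"$prefix"}" ;;
    esac
  done
}

# Example call (not executed here):
#   list_org_workspaces my-org | strip_workspace_prefix my-workspace-prefix-
```

The output is the list of names that terraform workspace select would accept for a backend configured with that prefix.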

When I did something similar in a previous job (before I joined HashiCorp) I ended up writing a script that didn't really use Terraform CLI at all, and instead just interacted with the different APIs directly. I suggest this only to note that it is possible, understanding that it's not particularly convenient, but an API-only path here could be:

  • Use the Consul Key/Value Store API to enumerate all of your relevant state keys in Consul, using some custom logic to understand whatever naming scheme you established in Consul for differentiating state snapshots by project and environment.
  • Use the Terraform Cloud "Create a Workspace" operation to ensure that a workspace exists for each project/environment pair you found in the first step.
  • Fetch the state snapshot bytes for each project/environment from Consul (using the Key/Value Store API again) and write it into the corresponding Terraform Cloud workspace using the "Create a State Version" operation.

I preferred an approach like this because it allowed me to get the whole migration done in one shot, rather than gradually over time. Of course, it being one-shot could be considered a disadvantage in another context, so again I'm suggesting this only to describe what is possible right now, in case one of these answers feels like enough to get you unblocked for your immediate work, rather than waiting for changes to Terraform CLI itself.
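
The first two steps of that API-only path might be sketched as follows. Everything specific here is an assumption: state keys are taken to look like <root>/<project>/<environment>, CONSUL_URL and TFC_TOKEN are hypothetical variables set by the caller, and the function names are made up for illustration. The state upload step is left out because it needs more request fields than this thread describes.

```shell
# Sketch of enumerating Consul state keys and creating matching workspaces.
# Assumed key layout: <root>/<project>/<environment>. Needs curl and jq.

# List every key under a Consul KV prefix, one per line.
list_consul_state_keys() {
  root=$1
  curl -s "${CONSUL_URL}/v1/kv/${root}/?keys=true" | jq -r '.[]'
}

# Map "<root>/<project>/<environment>" to "<project>-<environment>".
consul_key_to_workspace() {
  key=$1
  env_name=${key##*/}   # last path segment
  rest=${key%/*}        # everything before the last segment
  project=${rest##*/}   # second-to-last path segment
  printf '%s-%s\n' "$project" "$env_name"
}

# Create a Terraform Cloud workspace with the given name. This does not
# check whether the workspace already exists.
create_workspace() {
  org=$1
  name=$2
  jq -n --arg name "$name" \
    '{ data: { type: "workspaces", attributes: { name: $name } } }' |
    curl -s -X POST \
      -H "Authorization: Bearer ${TFC_TOKEN}" \
      -H "Content-Type: application/vnd.api+json" \
      -d @- \
      "https://app.terraform.io/api/v2/organizations/${org}/workspaces"
}

# Example loop (not executed here):
#   list_consul_state_keys terraform | while IFS= read -r key; do
#     create_workspace my-org "$(consul_key_to_workspace "$key")"
#   done
```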

With all of that said, I'll pass this issue on to the Terraform Cloud team (who are primarily responsible for this part of Terraform CLI) so they can think about what might be best here.

Setting TF_WORKSPACE worked for me in Azure Devops and using the remote backend against TFE free.

This is what I've had to do to get around this issue...there might be something easier, so please let me know.

Before running anything in an automated environment (in my case, Bitbucket Pipelines), you MUST create at least 1 workspace manually. This is unfortunate, but it only has to be done once.

Then, the following works for all new workspaces:

$ mkdir .terraform && echo "manually_created_workspace_name" > .terraform/environment
$ terraform init
$ terraform workspace select $WORKSPACE || terraform workspace new $WORKSPACE
$ terraform apply -auto-approve

Essentially, by making the .terraform/environment file before I run terraform init, I can default the workspace to one that for sure exists. Then I try to either select a workspace or create it if it doesn't exist. You can then plan/apply.
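
The same workaround can be wrapped in two small functions. This is a sketch only; `seed_workspace_file` and `ensure_workspace` are hypothetical names, and the seeded workspace is assumed to already exist in the backend.

```shell
# Sketch of the workaround above as reusable functions.

# Point the working directory at a workspace known to exist, before init.
seed_workspace_file() {
  mkdir -p .terraform && printf '%s\n' "$1" > .terraform/environment
}

# Select the workspace, creating it only if the select fails.
ensure_workspace() {
  workspace=$1
  terraform workspace select "$workspace" 2>/dev/null ||
    terraform workspace new "$workspace"
}

# Example sequence (not executed here):
#   seed_workspace_file manually_created_workspace_name
#   terraform init
#   ensure_workspace "$WORKSPACE"
```

Using mkdir -p means the seeding step also works when .terraform is already present. Note that a failed select can have causes other than a missing workspace, so the fallback to workspace new is a guess, not a diagnosis.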

@adback03 Thanks for your workaround. It almost worked for me with a few tweaks.

I am using the S3 backend. With that backend, there always appears to be a "default" workspace even if no state for one exists in S3. With that, I was able to get the following to work without a manually created workspace:

env TF_WORKSPACE=default terraform init
terraform workspace select $WORKSPACE || terraform workspace new $WORKSPACE
terraform apply

We set the workspace via the TF_WORKSPACE environment variable (in a larger set of Python scripts), and that caused a few issues with the workspace select command. I had to ensure this variable was not set when running those commands.
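
One way to keep TF_WORKSPACE exported for the rest of a script while hiding it from workspace subcommands is a subshell. This is a minimal sketch; `workspace_cmd` is a hypothetical helper name.

```shell
# Sketch: run `terraform workspace ...` without TF_WORKSPACE in its environment.
# The unset happens inside a subshell, so the caller's value is untouched.
workspace_cmd() {
  ( unset TF_WORKSPACE; terraform workspace "$@" )
}

# Example calls (not executed here):
#   workspace_cmd select "$WORKSPACE" || workspace_cmd new "$WORKSPACE"
```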

This bug seems to be fixed in the terraform 0.12.1 release.

The interactive prompt still happens for me with 0.12.1. I've exported TF_IN_AUTOMATION=1 and am calling terraform init -input=false, but I still get an interactive prompt.

$ export TF_IN_AUTOMATION=1
$ terraform init -input=false

Initializing the backend...

The currently selected workspace (devops) does not exist.
  This is expected behavior when the selected workspace did not have an
  existing non-empty state. Please enter a number to select a workspace:

  1. default

  Enter a value: 

Error: Failed to select workspace: interrupted

$ terraform version
Terraform v0.12.1
+ provider.aws v2.13.0

I took some inspiration from @apparentlymart's mentions of using the Terraform Enterprise API, and came up with what I think is a decent solution to this for me, that doesn't require any manual pre-creation of workspaces. This feels really over-engineered. It would be great if some kind of logic could be introduced to handle running in automation + Terraform Cloud workspaces. The suggestion of a -create-missing-workspace flag seems like it would help with my use case. Anyway, this is what I have configured to run things in Azure DevOps in case it helps anyone.

backend.tf:

terraform {
  required_version = "~> 0.12.0"
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "my-org"
    workspaces {
      prefix = "my-workspace-prefix-"
    }
  }
}

Other .tf files can have whatever, or nothing.

I'm using the YAML schema to define my pipeline in Azure DevOps. This is azure-pipelines.yml in the same repo as my Terraform templates:

trigger:
- master

resources:
  repositories:
  - repository: templates
    type: git
    name: my-project/azure-devops-templates

variables:
  workspacePrefix: my-workspace-prefix-
  organization: my-org

stages:
- stage: terraformPlan
  displayName: Terraform Plan
  jobs:
  - template: terraform-plan-job.yml@templates
    parameters:
      jobName: terraformPlanDev
      workspace: dev
      workspacePrefix: $(workspacePrefix)
      organization: $(organization)
  - template: terraform-plan-job.yml@templates
    parameters:
      jobName: terraformPlanProd
      workspace: prod
      workspacePrefix: $(workspacePrefix)
      organization: $(organization)

The azure-devops-templates repository under the my-project project in Azure DevOps repos contains the following files. terraform-plan-job.yml is the main job that's called from my pipeline. It uses terraform-docker-steps.yml to run the Terraform image from Docker Hub, and terraform-workspace-steps.yml is the piece that helped me get workspaces pre-created in my organization via the API.

terraform-plan-job.yml:

parameters:
  jobName: 'terraformPlan'
  workspace: ''
  workspacePrefix: ''
  organization: ''
  gitSshUrl: 'ssh.dev.azure.com'
  terraformCloudUrl: 'app.terraform.io'
  terraformRequiredVersion: '~> 0.12.0'
  sshKeyFile: 'id_rsa'
  terraformRcFile: 'terraformrc'

jobs:  
  - job: ${{ parameters.jobName }}
    displayName: terraform plan (${{ parameters.workspace }})
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    - bash: |
        if [ -z "$WORKSPACE"  ] || [ -z "$WORKSPACE_PREFIX" ] || [ -z "$ORGANIZATION" ]; then
          echo "##vso[task.logissue type=error;]Missing required parameters"
          echo "##vso[task.complete result=Failed;]"
        fi
      env:
        WORKSPACE: ${{ parameters.workspace }}
        WORKSPACE_PREFIX: ${{ parameters.workspacePrefix }}
        ORGANIZATION: ${{ parameters.organization }}
      displayName: Check for required parameters
    - task: DownloadSecureFile@1
      displayName: Download read-only SSH key used to source modules
      name: sshPrivateKey
      inputs:
        secureFile: ${{ parameters.sshKeyFile }}
    - task: DownloadSecureFile@1
      displayName: Download terraformrc that contains API token
      name: terraformRC
      inputs:
        secureFile: ${{ parameters.terraformRcFile }}
    - bash: |
        mkdir "${AGENT_HOMEDIRECTORY}/.ssh"
        cp $(sshPrivateKey.secureFilePath) "${AGENT_HOMEDIRECTORY}/.ssh/id_rsa"
        chmod 400 "${AGENT_HOMEDIRECTORY}/.ssh/id_rsa"
        ssh-keyscan -t rsa ${{ parameters.gitSshUrl }} > "${AGENT_HOMEDIRECTORY}/.ssh/known_hosts"
        cp $(terraformRC.secureFilePath) "${AGENT_HOMEDIRECTORY}/.terraformrc"
      displayName: Copy read-only SSH key and terraformrc into $AGENT_HOMEDIRECTORY
    - template: terraform-workspace-steps.yml
      parameters:
        workspace: ${{ parameters.workspace }}
        workspacePrefix: ${{ parameters.workspacePrefix }}
        organization: ${{ parameters.organization }}
        terraformRcPath: $(Agent.HomeDirectory)/.terraformrc
    - template: terraform-docker-steps.yml
      parameters:
        command: init -input=false
        workspace: ${{ parameters.workspace }}

terraform-docker-steps.yml:

parameters:
  command: ''
  workspace: ''
  terraformImageTag: '0.12.3'

steps:
    - bash: |
        if [ -z "$COMMAND"  ] || [ -z "$WORKSPACE" ]; then
          echo "##vso[task.logissue type=error;]Missing required parameters"
          echo "##vso[task.complete result=Failed;]"
        fi
      env:
        COMMAND: ${{ parameters.command }}
        WORKSPACE: ${{ parameters.workspace }}
      displayName: Check for required parameters
    - bash: docker run -e TF_WORKSPACE -v "$AGENT_HOMEDIRECTORY":/root -v "$BUILD_SOURCESDIRECTORY":/src -w /src hashicorp/terraform:${{ parameters.terraformImageTag }} ${{ parameters.command }}
      env:
        TF_WORKSPACE: ${{ parameters.workspace }}
      displayName: terraform ${{ parameters.command }}

terraform-workspace-steps.yml:

parameters:
  workspace: ''
  workspacePrefix: ''
  organization: ''
  apiUrl: 'https://app.terraform.io/api/v2'
  terraformRcPath: ''

steps:
    - bash: |
        if [ -z "$WORKSPACE"  ] || [ -z "$WORKSPACE_PREFIX" ] || [ -z "$ORGANIZATION" ] || [ -z "$TERRAFORM_RC_PATH" ]; then
          echo "##vso[task.logissue type=error;]Missing required parameters"
          echo "##vso[task.complete result=Failed;]"
        fi
      env:
        WORKSPACE: ${{ parameters.workspace }}
        WORKSPACE_PREFIX: ${{ parameters.workspacePrefix }}
        ORGANIZATION: ${{ parameters.organization }}
        TERRAFORM_RC_PATH: ${{ parameters.terraformRcPath }}
      displayName: Check for required parameters
    - bash: |
        api_token=$(cat ${{ parameters.terraformRcPath }} | grep 'token' | sed 's/^.*token = //' | sed 's/"//g')
        response=$(curl -s -X GET -H "Content-Type: application/vnd.api+json" -H "Authorization: Bearer ${api_token}" "${{ parameters.apiUrl }}/organizations/${{ parameters.organization }}/workspaces")
        workspaces=$(printf "%s" "$response" | jq -r '.data[] | .attributes.name')
        # If grep exits 0, the exact workspace name was found, so do nothing. Otherwise create the workspace.
        # -F and -x match the full name literally, so "dev" does not match "dev2".
        if printf "%s" "$workspaces" | grep -Fxq -- '${{ parameters.workspacePrefix }}${{ parameters.workspace }}'; then
          echo "##[section]Workspace ${{ parameters.workspace }} with prefix ${{ parameters.workspacePrefix }} found. No action needed."
        else
          echo "##[warning]Workspace ${{ parameters.workspace }} with prefix ${{ parameters.workspacePrefix }} not found. Creating..."
          post_data=$(jq -n --arg workspaceName "${{ parameters.workspacePrefix }}${{ parameters.workspace }}" '{ data: { attributes: { name: $workspaceName }, type: "workspaces" } }')
          echo "##[section]Posting the following data to create the new workspace:"
          echo "$post_data"
          printf "%s" "$post_data" > tmp_post_data.json
          curl -s -X POST -H "Content-Type: application/vnd.api+json" -H "Authorization: Bearer ${api_token}" -d @tmp_post_data.json "${{ parameters.apiUrl }}/organizations/${{ parameters.organization }}/workspaces"
          rm tmp_post_data.json
        fi
      displayName: Create the workspace if it doesn't already exist

Many thanks, everyone, for the interesting ideas.
I still wasn't satisfied, as I wanted the code to work in all use cases,
and specifically when the remote state doesn't exist yet.
I'm using Terraform Enterprise, aka Terraform Cloud (https://app.terraform.io), for it, which even in the Free tier lets you manage state and many other things.
It conveniently creates a remote workspace for me, and I suppose it won't be a problem for other remote backends like S3, where you create the bucket in a separate preparation step.

So here is my ugly code that seems to work
(CI_ENVIRONMENT_SLUG is the short environment name, coming from GitLab CI):

    rm -rf .terraform

    echo "Initializing terraform ..."
    TEMP_FILE=/tmp/terraform-init-output
    set +e
    if TF_WORKSPACE=${CI_ENVIRONMENT_SLUG} terraform init -no-color 2>${TEMP_FILE}
    then
        echo "Selecting terraform workspace ..."
        terraform workspace select ${CI_ENVIRONMENT_SLUG}
    else
        if grep -q 'Error: No existing workspaces.' "${TEMP_FILE}"
        then
            echo "Creating terraform workspace ..."
            terraform workspace new ${CI_ENVIRONMENT_SLUG}

            echo "Rerunning init as it failed before ..."
            terraform init
        else
            cat ${TEMP_FILE}
            exit 1
        fi
    fi
    set -e

@adback03 You can't do this if you're using a prefix with the remote backend; at least, I could not get it to work.

terraform {
  backend "remote" {
    organization = "my-org"

    workspaces {
      prefix = "my-infra"
    }
  }
}

@kassyuz You can fix it by feeding 1 to terraform init:

echo '1' | TF_WORKSPACE=$non_existing_workspace_suffix terraform init

I noticed some interesting behavior of terraform init: when at least one workspace with a given prefix already exists (on app.terraform.io), I can just run:

echo '1' | TF_WORKSPACE=new-workspace terraform init

and it automatically creates "new-workspace" on app.terraform.io without me ever running terraform workspace new!

This can be demonstrated with a more complete example:

terraform {
  backend "remote" {
    organization = "my-org"

    workspaces {
      prefix = "my-prefix-"
    }
  }
}

# Assumes there aren't any workspaces on app.terraform.io with prefix "my-prefix-" yet
rm -rf ./.terraform

TF_WORKSPACE=first-workspace terraform init
# Returns Error "No existing workspaces.", expected

# We create our first workspace
terraform workspace new first-workspace 

# Interestingly, when we run this, terraform automatically creates
# "my-prefix-second-workspace" on app.terraform.io
echo '1' | TF_WORKSPACE=second-workspace terraform init

# ...but doesn't select it locally
terraform workspace show
# Returns "first-workspace"

# But we can select it manually now
terraform workspace select second-workspace
# Switched to workspace "second-workspace".

Thanks @ilyasotkov, I did exactly that, and it works.

This worked for me:

terraform init -backend-config="conn_str=$PG_DATABASE_URL" || terraform workspace new $TERRAFORM_WORKSPACE

I was able to get it working by writing "default" to .terraform/environment before running terraform init:

echo default > .terraform/environment
