Terraform-provider-aws: AWS assume role not working (regression?)

Created on 23 Nov 2018 · 9 Comments · Source: hashicorp/terraform-provider-aws

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

$ terraform -v
Terraform v0.11.10
+ provider.aws v1.46.0

Affected Resource(s)

  • aws_XXXXX

Terraform Configuration Files

Sourced from https://github.com/terraform-providers/terraform-provider-aws/issues/472#issuecomment-308311071

# Grab the ARN of the current logged in user
data "aws_caller_identity" "current" {}

# create a role which allows the current user to assume it
resource "aws_iam_role" "terraform_11270" {
  name = "terraform_11270"
  path = "/test/"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "${data.aws_caller_identity.current.arn}"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "terraform_11270" {
  name = "terraform_11270"
  role = "${aws_iam_role.terraform_11270.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:*",
        "ec2:*"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# configure this provider alias to only use the IAM Role created above
provider "aws" {
  alias = "iamrole"

  assume_role {
    role_arn = "${aws_iam_role.terraform_11270.arn}"
  }
}

resource "aws_security_group" "primary" {
  name = "primary"
}

# Create a security group with the above IAM Role assumed
resource "aws_security_group" "secondary" {
  provider = "aws.iamrole"
  name     = "secondary"
}

Expected Behavior

Security group secondary should have been created.

Actual Behavior

Error thrown when trying to assume created role:

Error: Error applying plan:

1 error(s) occurred:

* provider.aws.iamrole: The role "arn:aws:iam::<account>:role/test/terraform_11270" cannot be assumed.

  There are a number of possible causes of this - the most common are:
    * The credentials used in order to assume the role are invalid
    * The credentials do not have appropriate permission to assume the role
    * The role ARN is not valid

Replaying the plan (after ~10 seconds) succeeds in creating the security group:

$ terraform apply
aws_security_group.primary: Refreshing state... (ID: sg-<primary_id>)
data.aws_caller_identity.current: Refreshing state...
aws_iam_role.terraform_11270: Refreshing state... (ID: terraform_11270)
aws_iam_role_policy.terraform_11270: Refreshing state... (ID: terraform_11270:terraform_11270)

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + aws_security_group.secondary
      id:                     <computed>
      arn:                    <computed>
      description:            "Managed by Terraform"
      egress.#:               <computed>
      ingress.#:              <computed>
      name:                   "secondary"
      owner_id:               <computed>
      revoke_rules_on_delete: "false"
      vpc_id:                 <computed>


Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_security_group.secondary: Creating...
  arn:                    "" => "<computed>"
  description:            "" => "Managed by Terraform"
  egress.#:               "" => "<computed>"
  ingress.#:              "" => "<computed>"
  name:                   "" => "secondary"
  owner_id:               "" => "<computed>"
  revoke_rules_on_delete: "" => "false"
  vpc_id:                 "" => "<computed>"
aws_security_group.secondary: Creation complete after 1s (ID: sg-<secondary_id>)

Steps to Reproduce

  1. terraform apply

Important Factoids

References

  • #472
bug provider

All 9 comments

For others running into the same issue, I worked around this by using an external data provider to supply STS credentials:

#!/usr/bin/env python3

import json
import os
import select
import sys
from time import sleep

import boto3
import botocore.exceptions


def error(message):
    """
    Errors must create non-zero status codes and human-readable, ideally one-line, messages on stderr.
    """
    print(message, file=sys.stderr)
    sys.exit(1)


def validate(data):
    """
    Query data and result data must have keys whose values are strings.
    """
    if not isinstance(data, dict):
        error('Data must be a dictionary.')
    for value in data.values():
        if not isinstance(value, str):
            error('Values must be strings.')


def assume_role():
    if not select.select([sys.stdin,], [], [], 0.0)[0]:
        error("No stdin data.")

    query = json.loads(sys.stdin.read())

    if not isinstance(query, dict):
        error("Data must be a dictionary.")

    validate(query)

    if "role_arn" not in query:
        error("Data parameter must define 'role_arn'.")

    session = boto3.Session()
    if "access_key" in query and "secret_key" in query:
        session = boto3.Session(
            aws_access_key_id=query["access_key"],
            aws_secret_access_key=query["secret_key"],
        )

    if "wait" in query:
        sleep(int(query["wait"]))

    sts = session.client("sts")
    response = {}
    try:
        response = sts.assume_role(RoleArn=query["role_arn"], RoleSessionName=os.path.basename(sys.argv[0]))
    except botocore.exceptions.ClientError as e:
        error(f"Error from AWS API: {e.response['Error']['Message']}")

    sys.stdout.write(json.dumps({
        "access_key": response["Credentials"]["AccessKeyId"],
        "secret_key": response["Credentials"]["SecretAccessKey"],
        "token": response["Credentials"]["SessionToken"],
    }))


if __name__ == '__main__':
    assume_role()

And the following HCL configuration:

data "external" "aws_assume_role" {
  program = ["python3", "terraform_aws_assume_role.py"]
  query {
    role_arn = "${aws_iam_role.terraform_11270.arn}"
    wait     = 10
  }
  depends_on = ["aws_iam_role.terraform_11270", "aws_iam_role_policy.terraform_11270"]
}

# configure this provider alias to only use the IAM Role created above
provider "aws" {
  alias = "iamrole"

  access_key = "${data.external.aws_assume_role.result["access_key"]}"
  secret_key = "${data.external.aws_assume_role.result["secret_key"]}"
  token      = "${data.external.aws_assume_role.result["token"]}"
}
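
The contract between the HCL above and the script can be checked without calling AWS. What follows is only a minimal sketch of that contract, not the script itself: `to_result`, `run_query` and the `assume` callable are hypothetical names introduced here, and `assume` stands in for the real `sts.assume_role` call. It reads one JSON query object, requires `role_arn`, and writes back a flat object whose values are all strings, which is what the `result["access_key"]` lookups in the provider block rely on.

```python
import json


def to_result(sts_response):
    """Flatten an STS AssumeRole-shaped response into the string-only result object."""
    creds = sts_response["Credentials"]
    return {
        "access_key": creds["AccessKeyId"],
        "secret_key": creds["SecretAccessKey"],
        "token": creds["SessionToken"],
    }


def run_query(stdin, stdout, assume):
    """Read one JSON query from stdin and write one JSON result to stdout.

    `assume` is a hypothetical callable (an assumption of this sketch) that
    takes a role ARN and returns a response shaped like STS AssumeRole's.
    """
    query = json.load(stdin)
    if not isinstance(query, dict) or "role_arn" not in query:
        raise ValueError("query must be a JSON object that defines 'role_arn'")
    result = to_result(assume(query["role_arn"]))
    # The external data source only accepts string values in the result.
    if not all(isinstance(value, str) for value in result.values()):
        raise ValueError("result values must be strings")
    json.dump(result, stdout)
    return result
```

For a dry run, pass `io.StringIO` objects as `stdin` and `stdout` and a function returning canned credentials as `assume`.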

I hit the same issue with these versions:

/terraform-plan/dev/application # terraform -v
Terraform v0.11.11
+ provider.aws v1.59.0

However, I do not see the behavior described as "Replaying the plan (after ~10 seconds) succeeds in creating the security group"; for me the error always occurs.

I _believe_ this is resulting from the same bug addressed here: hashicorp/aws-sdk-go-base#5

I have had success using the Python program provided by @markchalloner, thank you :) I use profiles to choose which user to assume the role as, so I added the following check for query["profile"] before the default call to boto3.Session():

    if "profile" in query:
        session = boto3.Session(profile_name=query["profile"])

Seems to work for me with the following HCL configuration:

data "external" "aws_assume_role" {
  program = ["python3", "terraform_aws_assume_role.py"]
  query {
    role_arn = "<insert role_arn here>"
    profile = "<insert profile name to assume role with here>"
    wait = 3
  }
}

provider "aws" {
  access_key = "${data.external.aws_assume_role.result["access_key"]}"
  secret_key = "${data.external.aws_assume_role.result["secret_key"]}"
  token = "${data.external.aws_assume_role.result["token"]}"
}
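
One way to keep the profile check and the script's existing access key handling from overwriting each other is to compute the `boto3.Session` arguments in a single place. This is a sketch only: `session_kwargs` is a hypothetical helper, and the precedence it applies (profile first, then explicit keys, then the default credential chain) is an assumption, not something the thread specifies.

```python
def session_kwargs(query):
    """Pick keyword arguments for boto3.Session from the external data query.

    Assumed precedence: a named profile wins, then an explicit key pair,
    otherwise an empty dict so boto3 falls back to its default credentials.
    """
    if "profile" in query:
        return {"profile_name": query["profile"]}
    if "access_key" in query and "secret_key" in query:
        return {
            "aws_access_key_id": query["access_key"],
            "aws_secret_access_key": query["secret_key"],
        }
    return {}
```

In the script this would replace the separate assignments with one call: `session = boto3.Session(**session_kwargs(query))`.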

@aeschright @bflad I've reproduced this issue. It results from eventual consistency. After the creation of a role, it cannot be assumed for 10-30 seconds.

I messed with a wait state for this (see my branch) but the IAM role goes through 2 states before being ready. For 10-20 seconds, the API returns AccessDenied and then UnauthorizedOperation and finally you can successfully assume the role.

@markchalloner An easy, ugly workaround for this is to use a local-exec provisioner with a sleep (timeout on Windows): see a reproducible test of the workaround

resource "aws_iam_role" "tf-test-6d3868d9bed3" {
  name = var.role_name
  path = "/test/"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "${data.aws_caller_identity.current.arn}"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

  provisioner "local-exec" {
    command = "sleep 30"
  }
}
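
For anyone using the Python script from earlier in the thread, an alternative to a fixed sleep is to retry the assume-role call while the role is still propagating. The sketch below is hedged accordingly: `assume_with_retry`, `error_code` and `RETRYABLE_CODES` are hypothetical names, the two error codes are simply the ones reported above, and the attempt count and delay are guesses. It relies only on the `response["Error"]["Code"]` field that botocore's `ClientError` carries.

```python
import time

# Error codes reported in this thread while a new role is still propagating.
RETRYABLE_CODES = {"AccessDenied", "UnauthorizedOperation"}


def error_code(exc):
    """Return the AWS error code carried by a ClientError-style exception, if any."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def assume_with_retry(assume, attempts=10, delay=3.0, sleep=time.sleep):
    """Call assume() until it succeeds, retrying only on the codes above.

    Any other error, or a retryable error on the last attempt, is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return assume()
        except Exception as exc:
            if error_code(exc) not in RETRYABLE_CODES or attempt == attempts:
                raise
            sleep(delay)
```

With boto3 the callable could be `lambda: sts.assume_role(RoleArn=query["role_arn"], RoleSessionName="terraform")`, where the session name is a placeholder. The defaults wait up to about 30 seconds in total, in line with the 10-30 seconds mentioned above.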

I've created a repo with tests to easily reproduce credential-related issues. Visit and contribute. The test to reproduce this issue is here: https://github.com/YakDriver/terraform-cred-tests/tree/master/tests/assume_after_create

Is this still an issue? Do you have a link to the Python program provided by Mark? It would be of great use! Thank you.

It's higher up in the comments 😂
https://github.com/terraform-providers/terraform-provider-aws/issues/6566#issuecomment-441253343

Unsure if it's still an issue

Sorry, so it is, haha! Anyway, yes, it appears to be an issue for me.
