Terraform-provider-aws: MalformedPolicyDocument: Invalid principal in policy: "AWS" [Only when Principal is a ROLE.]

Created on 7 Jun 2019 · 12 Comments · Source: hashicorp/terraform-provider-aws

https://github.com/terraform-providers/terraform-provider-aws/issues/1388

The documentation specifically says this is allowed:
https://www.terraform.io/docs/providers/aws/d/iam_policy_document.html#example-with-multiple-principals

data "aws_caller_identity" "caller_identity" {}

data "aws_iam_policy_document" "trust-assume-role-policy" {
  statement {
    effect = "Allow"

    actions = [
      "sts:AssumeRole",
    ]

    principals {
      type = "Service"

      identifiers = [
        "ec2.amazonaws.com",
      ]
    }

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.caller_identity.account_id}:role/nodes.${var.env}"]
    }
  }
}

Terraform message:
Assume Role Policy: MalformedPolicyDocument: Invalid principal in policy

Labels: documentation, good first issue, service/iam

All 12 comments

I don't think this is an issue with Terraform or the AWS provider; creating this role in the AWS console would likely produce the same error. I ran into the same problem and ended up here while searching for a solution. What I ultimately discovered is that you get this error if the role you are referencing doesn't actually exist. The reason is that the role ARN is translated to the role's underlying unique ID when the policy is saved. See https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html, which says:

"If your Principal element in a role trust policy contains an ARN that points to a specific IAM role, then that ARN is transformed to the role's unique principal ID when the policy is saved."

I created the referenced role just to test, and this error went away.
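
A minimal sketch of what this implies for the configuration in the original report, assuming the referenced role is managed in the same Terraform configuration. Referencing the role resource's arn attribute, instead of building the ARN string by hand, gives Terraform an explicit dependency, so the role is created before the trust policy that names it. The resource names (nodes, consumer, ec2_trust) and the role names are hypothetical and not taken from the report.

```hcl
# Hypothetical role that will be named as a principal elsewhere.
resource "aws_iam_role" "nodes" {
  name               = "nodes.example"
  assume_role_policy = data.aws_iam_policy_document.ec2_trust.json
}

# Trust policy for the hypothetical "nodes" role itself.
data "aws_iam_policy_document" "ec2_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

# Trust policy that names the role as a principal.
data "aws_iam_policy_document" "trust_assume_role_policy" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }

    principals {
      type = "AWS"
      # Attribute reference instead of a hand-built ARN string:
      # Terraform now knows the role must exist first.
      identifiers = [aws_iam_role.nodes.arn]
    }
  }
}

# Hypothetical role that uses the trust policy above.
resource "aws_iam_role" "consumer" {
  name               = "consumer.example"
  assume_role_policy = data.aws_iam_policy_document.trust_assume_role_policy.json
}
```

Correct ordering alone may not be sufficient: other comments in this thread report the same error immediately after the principal was created, which they attribute to IAM eventual consistency.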

Could you please try adding the policy as inline JSON in the role itself? I was getting the same error, tried this, and it worked.
Something like this:

resource "aws_iam_role" "test_role" {
  name = "test_role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com",
        "AWS": "${resource}"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF

  tags = {
    tag-key = "tag-value"
  }
}

I get the same error when creating an aws_iam_policy_document that references an aws_iam_user in its principals. The output is "MalformedPolicyDocumentException: Policy contains an invalid principal". I tried using "depends_on" to force the resource dependency, but the same error occurs.

I just encountered this error when the name of the IAM user whose ARN I use as the Principal in the "assume role policy" contains characters that are valid in an IAM identifier but invalid in an ARN identifier (e.g. @ or .).

For example, this thing triggers the error:

resource "aws_iam_user" "github" {
  name = "github.myapp"
  path = "/github/"
}

// Define who can assume the role, where this policy is attached
data "aws_iam_policy_document" "assume_role_policy" {
  statement {
    actions = [
      "sts:AssumeRole"
    ]

    principals {
      type = "AWS"
      identifiers = [
        aws_iam_user.github.arn
      ]
    }
  }
}
// policy and role creation omitted for readability...

If the "name" attribute of the "aws_iam_user" contains only simple alphanumeric characters, it works.

Another workaround (better in my opinion):
Don't refer to the ARN (aws_iam_user.github.arn) when defining the Principal trust relation.
Instead, refer to the unique ID of the IAM user: aws_iam_user.github.unique_id.

This helped resolve the issue on my end, allowing me to keep using characters like @ and . as IAM usernames.

Terraform v0.12.21
provider.aws v2.51.0
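
The workaround described above could look roughly like the following. This is only a sketch of that comment's suggestion, reusing its resource names; whether IAM accepts a unique ID as a principal in every situation has not been verified here.

```hcl
resource "aws_iam_user" "github" {
  name = "github.myapp"
  path = "/github/"
}

# Define who can assume the role this policy is attached to.
data "aws_iam_policy_document" "assume_role_policy" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type = "AWS"
      # The user's unique ID is used here instead of
      # aws_iam_user.github.arn, as the comment suggests.
      identifiers = [aws_iam_user.github.unique_id]
    }
  }
}
```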

For me this also happens when I use an account instead of a role.

The following aws_iam_policy_document worked perfectly fine for weeks

data "aws_iam_policy_document" "assume_role_policy_readonly" {
  statement {
    effect = "Allow"

    actions = [
      "sts:AssumeRole"
    ]

    condition {
      test = "Bool"
      variable = "aws:MultiFactorAuthPresent"
      values = [
        "true"
      ]
    }

    principals {
      type = "AWS"

      identifiers = [
        "arn:aws:iam::${var.aws_users_account_id}:root"
      ]
    }
  }
}

When I tried to update the role a few days ago I just got:

Error Updating IAM Role (readonly) Assume Role Policy: MalformedPolicyDocument: Invalid principal in policy: "AWS":"arn:aws:iam::###########:root" status code: 400

In the diff of the terraform plan it looks like terraform wants to remove the type:

Principal = {
  ~ AWS = "arn:aws:iam::#########:root" -> "arn:aws:iam::#########:root"
}

I completely removed the role and tried to create it from scratch. This resulted in the same error message.

Then I tried to use the account id directly in order to recreate the role.

identifiers = [
  var.aws_users_account_id
]

This resulted in the same error message, again.

At last I used inline JSON and tried to recreate the role:

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Condition": {
        "Bool": {
          "aws:MultiFactorAuthPresent": "true"
        }
      },
      "Principal": {
        "AWS": "arn:aws:iam::${var.aws_users_account_id}:root"
      }
    }
  ]
}
EOF

This actually worked. The role was created successfully, but as soon as I ran terraform again (still using inline JSON), terraform tried to get rid of the type again:

Principal = {
  ~ AWS = "arn:aws:iam::#########:root" -> "arn:aws:iam::#########:root"
}

and resulted in Error Updating IAM Role (readonly) Assume Role Policy: MalformedPolicyDocument: Invalid principal in policy: "AWS":"arn:aws:iam::###########:root" status code: 400

I also tried to set the aws provider to a previous version without success.

EDIT:
We use variables for the account IDs. When we changed the type of those variables to number, the behaviour above was the result. The reason is that account IDs can have leading zeros, which a number does not preserve. Using string instead of number as the type for the account ID variables fixed the problem for us.

I encountered this issue after one of our IAM users was removed. He resigned and we urgently deleted his IAM user. We successfully removed him from most of our user configs, but forgot to remove him from a hardcoded list of users in our Terraform vars.

Same issue here.
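
A minimal sketch of the fix described in the EDIT above, assuming the variable is declared roughly like this. The description text and the validation block are illustrative additions, not part of the original comment, and validation blocks need Terraform 0.13 or later.

```hcl
variable "aws_users_account_id" {
  # Must be a string: as a number, an ID such as 012345678901
  # would lose its leading zero and no longer be 12 digits long.
  type        = string
  description = "12-digit ID of the AWS account that holds the IAM users"

  # Optional guard (Terraform 0.13+): reject anything that is
  # not exactly 12 digits.
  validation {
    condition     = can(regex("^[0-9]{12}$", var.aws_users_account_id))
    error_message = "The account ID must be exactly 12 digits, including any leading zeros."
  }
}
```

When setting the value, quote it (for example in a tfvars file) so that it is written as a string rather than a numeric literal.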
Creating a Secret whose policy contains reference to a role (role has an assume role policy).

Error: setting Secrets Manager Secret
policy: MalformedPolicyDocumentException: This resource policy contains an unsupported principal.
on secrets_create.tf line 23,
in resource "aws_secretsmanager_secret"
resource "aws_secretsmanager_secret" "my_secret"

From the apply output, I see that the role was completed before the secret was reached

2020-09-29T18:16:07.9115331Z aws_iam_role.my_role: Creation complete after 2s [id=SomeRole]
2020-09-29T18:16:13.4780358Z aws_secretsmanager_secret.my_secret: Creating..
2020-09-29T18:21:30.2262084Z Error: error setting Secrets Manager Secret

As with previous commenters, if I simply run the apply a second time, everything succeeds - but that is not an acceptable solution.
Have tried various depends_on workarounds, to no avail.

resource "aws_secretsmanager_secret" "my_secret" {
  name                    = "env/db/abc"
  policy                  = data.aws_iam_policy_document.my_secret_admin_policy.json
  recovery_window_in_days = 0
}

data "aws_iam_policy_document" "my_secret_admin_policy" {
  statement {
    effect = "Allow"
    principals {
      type        = "AWS"
      identifiers = [aws_iam_role.my_role.arn]
    }
    actions = [
      "secretsmanager:GetSecret",
      "secretsmanager:GetSecretValue",
      "secretsmanager:ListSecrets"
    ]
    resources = ["*"]
  }
}

resource "aws_iam_role" "my_role" {
  name = "SomeRole"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${local.account_id}:xxxxxxxxxxxx"
      },
      "Action": "sts:AssumeRoleWithSAML",
      "Condition": {
        "StringEquals": {
          "SAML:aud": "https://signin.aws.amazon.com/saml"
        }
      }
    }
  ]
}
EOF
}

I encountered this today when I created a user and added that user's ARN to the trust policy of an existing role. I was able to reproduce it consistently. The error I got was:

Error: Error Updating IAM Role (test_cert) Assume Role Policy: MalformedPolicyDocument: Invalid principal in policy: "AWS":"arn:aws:iam::xxx:user/test_user"

To work around it I added a local-exec provisioner to the user creation (thankfully I have a library module that we use to create all users). Note that I can safely use the Linux "sleep" command because all our Terraform runs inside a Linux container. Others may want to use the Terraform time_sleep resource: https://registry.terraform.io/providers/hashicorp/time/latest/docs/resources/sleep

resource "aws_iam_user" "my_user" {
  name          = "svc_terraform_${var.name}"
  force_destroy = var.force_destroy
  tags          = var.tags

  provisioner "local-exec" {
    # This is because when we create a user and immediately add it into the trust policy of a role
    # it can fail. IAM resources are all handled via us-east-1 and have to get replicated/cached etc
    # see https://docs.aws.amazon.com/IAM/latest/UserGuide/troubleshoot_general.html#troubleshoot_general_eventual-consistency
    # I thought that waiting for the user to exist would fix this, but it doesn't
    # command = "aws iam wait user-exists --user-name ${aws_iam_user.my_user.name}"
    #
    # I also tried using an explicit AWS provider for us-east-1, but that didn't seem to have any effect.
    # Sleeping seems to work, but I'm sure if IAM is delayed then it could break. Rerunning the plan solves this,
    # but I wanted to find something that makes this a non-problem most of the time. 

    command = "sleep 10s"
  }
}

Theoretically this could happen on other IAM resources (roles, policies etc) but I've only experienced it with users so far.
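
For anyone who cannot rely on a shell "sleep", the time_sleep alternative mentioned above might look like the sketch below. It assumes the hashicorp/time provider is available; the resource names, the role name, and the 10s duration are illustrative (10 seconds is simply the value reported to work in this thread, not a guaranteed bound).

```hcl
resource "aws_iam_user" "my_user" {
  name = "svc_terraform_example"
}

# Pause after the user is created so IAM has time to propagate it.
resource "time_sleep" "wait_for_iam_user" {
  depends_on      = [aws_iam_user.my_user]
  create_duration = "10s"
}

data "aws_iam_policy_document" "trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = [aws_iam_user.my_user.arn]
    }
  }
}

resource "aws_iam_role" "example" {
  name               = "example_role"
  assume_role_policy = data.aws_iam_policy_document.trust.json

  # The trust policy is only sent to IAM after the pause has elapsed.
  depends_on = [time_sleep.wait_for_iam_user]
}
```

Like the local-exec version, this only makes the failure less likely; if IAM propagation takes longer than the pause, the apply can still fail and need a rerun.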

What @rsheldon recommended worked great for me.
Thanks!

@yanirj .. it works, but using sleep arrangements is not really a 'production' level solution to fill anyone with confidence.
Hope someone fixes this ...

@rsheldon and I was trying 5 seconds and still failing... 10 was enough...
