When creating a new KMS key with the AWS provider, if the key policy references an IAM role that was just created by the same Terraform configuration, the AWS API returns a generic MalformedPolicyDocumentException and the operation fails. Waiting a minute and running apply again succeeds. There seems to be a short delay between the time an IAM entity is created (a role in my case, but I'd bet it's the same for a user) and the time KMS can see it and use it in a key policy. Terraform should probably wait and retry the operation a few times, as it does for other resources that take time to complete.
Unfortunately, the KMS key creation API uses this generic MalformedPolicyDocumentException for any error related to the contents of the key policy (a syntax error in the policy or a malformed ARN results in the same exception), so it may not be possible to distinguish this case from a real failure that will not succeed on a later attempt. The exception includes no message or any other information beyond its name.
Terraform version: 0.6.15
resource "aws_iam_role" "example" {
  name               = "example"
  assume_role_policy = "<redacted>"
}

resource "aws_kms_key" "example" {
  description = "example"
  policy      = "${file("kms-policy.json")}"
}
The policy looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Enable IAM User Permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::redacted:root"
        ]
      },
      "Action": "kms:*",
      "Resource": "*"
    },
    {
      "Sid": "Allow access for Key Administrators",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::redacted:user/alice",
          "arn:aws:iam::redacted:user/bob"
        ]
      },
      "Action": [
        "kms:Create*",
        "kms:Describe*",
        "kms:Enable*",
        "kms:List*",
        "kms:Put*",
        "kms:Update*",
        "kms:Revoke*",
        "kms:Disable*",
        "kms:Get*",
        "kms:Delete*",
        "kms:ScheduleKeyDeletion",
        "kms:CancelKeyDeletion"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Allow use of the key",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::redacted:role/example",
          "arn:aws:iam::redacted:user/alice",
          "arn:aws:iam::redacted:user/bob"
        ]
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Allow attachment of persistent resources",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::redacted:role/example",
          "arn:aws:iam::redacted:user/alice",
          "arn:aws:iam::redacted:user/bob"
        ]
      },
      "Action": [
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:RevokeGrant"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "kms:GrantIsForAWSResource": true
        }
      }
    }
  ]
}
Note that the "example" IAM role is created by Terraform, but the IAM users "alice" and "bob" already exist and are not managed by Terraform.
The key should be created and terraform apply should complete successfully.
The following error is produced, which halts the terraform apply:
* aws_kms_key.pki: MalformedPolicyDocumentException:
status code: 400, request id: <redacted>
Waiting a minute and running terraform apply again successfully creates the key.
Steps to reproduce: terraform apply
Hey @jimmycuadra,
This is happening because Terraform creates resources in parallel when it can't see any dependency or relationship between them. In your example, the role and the key are not related, so Terraform creates both at roughly the same time, which results in the error you've seen. The policy you're using does not derive any of its contents from the role you're creating, so Terraform doesn't know that it needs to create the role _first_, and then the key.
So, the creation graph of your example looks like so:
[Image: dependency graph]
After the provider is established, the two remaining resources can be created in parallel.
To explicitly declare a dependency, you can use the depends_on attribute, as I demonstrate here:
resource "aws_iam_role" "example" {
  name               = "example"
  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {"AWS": "*"},
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_kms_key" "example" {
  description = "example"
  depends_on  = ["aws_iam_role.example"]
  policy      = "${file("kms_policy.json")}"
}
This explicitly declares a dependency, and the graph for creating your infrastructure becomes this:
[Image: dependency graph]
Here you can see that the role must be created before the key can be created.
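The earlier point, that the policy does not derive any of its contents from the role, suggests an alternative to depends_on: interpolate the role's ARN into the policy so that Terraform infers the ordering on its own. The following is a minimal sketch, assuming the policy is inlined as a heredoc rather than read with file(), and trimmed to two statements for brevity. The trimmed policy is illustrative only, not the reporter's full policy.

```hcl
resource "aws_kms_key" "example" {
  description = "example"

  # Referencing aws_iam_role.example.arn inside the policy creates an
  # implicit dependency, so the role is created before the key.
  # Illustrative policy: shortened from the one in the original report.
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Enable IAM User Permissions",
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam::redacted:root"]},
      "Action": "kms:*",
      "Resource": "*"
    },
    {
      "Sid": "Allow use of the key",
      "Effect": "Allow",
      "Principal": {"AWS": ["${aws_iam_role.example.arn}"]},
      "Action": ["kms:Encrypt", "kms:Decrypt", "kms:DescribeKey"],
      "Resource": "*"
    }
  ]
}
EOF
}
```

Like depends_on, this only controls the order in which the two resources are created. It does not make Terraform wait for a newly created role to become visible to KMS, so it may not help if the failure is caused by a propagation delay.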
Hope that helps!
I did as you described but still got the same behavior as before: MalformedPolicyDocumentException on the first run, success on the second. Can you reopen the issue or do you have any other ideas how to address this?
Any chance of this being reopened? The issue still happens after adding depends_on.
Likely fixed in 0.7. See https://github.com/hashicorp/terraform/issues/4709
This is still happening on 0.7.3.
I added the depends_on with no luck. Same behavior: Fails the first run, passes the second run.
Could a maintainer please reopen this issue?
I believe this can still happen even with a direct dependency between aws_kms_key and the IAM role/user, simply because IAM is eventually consistent.
The best solution would probably be similar to https://github.com/hashicorp/terraform/pull/7324, which addressed EC2 instances and the IAM role itself (possibly referencing other IAM identities).
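Until something like that exists in the provider, one way to approximate a wait from configuration alone is a local-exec provisioner on the role. This is a rough workaround sketch, not the provider-side retry proposed above. It assumes a Unix-like machine where the sleep command is available, and the 30-second pause is an arbitrary guess rather than a documented bound on IAM propagation.

```hcl
resource "aws_iam_role" "example" {
  name               = "example"
  assume_role_policy = "<redacted>"

  # Assumption: a fixed pause gives IAM time to propagate the new role.
  # 30 seconds is an arbitrary value; adjust it to what works for you.
  provisioner "local-exec" {
    command = "sleep 30"
  }
}

resource "aws_kms_key" "example" {
  description = "example"
  policy      = "${file("kms-policy.json")}"

  # The key waits for the role, including its provisioner, to finish.
  depends_on = ["aws_iam_role.example"]
}
```

The provisioner runs only when the role is first created, so later applies are not slowed down. The dependency on the role is still required; without it the key would be created in parallel and the pause would have no effect.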
Just had the same issue with TF version 8.7.
The issue is still there. It works 3 out of 5 times at the moment and otherwise fails with the exception below:
MalformedPolicyDocumentException:
15:21:16 status code: 400, request id: 8dc35a50-f45b-11e6-adb0-35c782e9b387
I think the retry isn't happening.
Do we have a solution for this? Can we reopen this issue?
This issue is still present in Terraform version 0.11.14.4 using AWS provider 2.39.0. Can we please reopen the issue?
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.