Terraform 0.9.6 - 0.10.8
I upgraded Terraform from 0.9.5 to 0.9.6 and now get the following error when I run a Jenkins job on a build slave with IAM permissions attached:
terraform096 apply -var db_snap_stamp=171120171217 -var db_snapshot=rds-dev-13102017 -var-file=env.tfvars -no-color
Error loading state: AccessDenied: Access Denied
status code: 403, request id: 288766CE5CCA24A0, host id: FOOBAR
The Jenkins job does run terraform init beforehand, and on my local test server I am not seeing the error. On the local test server I am using an AWS credentials file.
I have had a look through the release notes for 0.9.6 but I can't see which of the changes could be causing this ( https://github.com/hashicorp/terraform/issues/14423 maybe?).
Any ideas?
UPDATE
I turned on Terraform debug logging and found that the 403 was returned by an S3 ListObjects call. The IAM role in use allows this in 0.9.5 but NOT in 0.9.6 to 0.10.8. I tried giving the role admin access, but nothing changed:
-----------------------------------------------------
2017/11/17 15:01:47 [DEBUG] [aws-sdk-go] DEBUG: Response s3/ListObjects
Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 403 Forbidden
Connection: close
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Fri, 17 Nov 2017 15:01:47 GMT
Server: AmazonS3
X-Amz-Bucket-Region: eu-west-2
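For anyone retracing this step: a trace like the one above is written when Terraform runs with TF_LOG=DEBUG (optionally with TF_LOG_PATH pointing at a file). The following is a rough sketch for pulling the denied operations out of such a log. The function name failed_s3_calls and the log file name are made up for illustration, and it assumes the layout seen in the excerpt, where a "Response s3/<Operation>" line precedes the HTTP status line.

```shell
# Print each aws-sdk-go operation whose response status was 403.
# Assumption: the log has a "[aws-sdk-go] DEBUG: Response <service>/<op>"
# line, followed some lines later by "HTTP/1.x 403 ...", as in the excerpt.
failed_s3_calls() {
  awk '
    /\[aws-sdk-go\] DEBUG: Response / {
      # Remember the token that follows the word "Response".
      for (i = 1; i < NF; i++) if ($i == "Response") op = $(i + 1)
    }
    /^HTTP\/1\.[01] 403/ && op != "" { print op; op = "" }
  ' "$1"
}

# Usage (hypothetical file name):
#   TF_LOG=DEBUG TF_LOG_PATH=tf-debug.log terraform apply -var-file=env.tfvars
#   failed_s3_calls tf-debug.log
```

This only narrows down which API call was denied; it does not say which policy denied it.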
The S3 bucket in question does use KMS encryption but all that is set up in the init run prior:
terraform096 init -backend=true -get=true -input=false -backend-config="bucket=${BUCKET}" -backend-config="key=${ENV}.tfstate" -backend-config="region=eu-west-2" -backend-config="profile=${AWS_PROFILE}" -backend-config="encrypt=true" -backend-config="kms_key_id=${KMS}"
I can get versions above 0.9.6 working locally, where S3 endpoints are not in use. On Jenkins build slaves in a VPC with private subnets and S3 endpoints, 0.9.5 works but later versions fail with this error.
Hi @SnazzyBootMan! Sorry for this weird behavior.
It looks like this extra ListObjects call was introduced by b279b1abb57885bb74d8318cf4d10edfc8014b3a, which uses it to recognize whether it's creating a new workspace or writing an existing one.
You mentioned that you tried giving the role "admin access"; what permissions exactly does that imply? If you can share the effective permissions both before and after applying admin access that may help to figure out what exactly is failing here. Unfortunately it's not always obvious specifically which actions and resource strings apply to each operation, but Terraform here is running ListObjects with a prefix argument of the given environment key prefix, and that key prefix may be adding an extra hurdle that must be contended with in the policy.
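One way to test this outside Terraform is to issue a similar prefixed listing with the AWS CLI, using the same credentials as the failing job. This is only a sketch: the function name list_workspace_prefix is hypothetical, the default prefix "env:" is an assumption about what the S3 backend used at the time, and the AWS_CLI override exists only so the command can be dry-run.

```shell
# Issue a ListObjects request restricted to a key prefix, roughly like the
# call Terraform makes when enumerating workspaces.
# Assumption: "env:" is the workspace key prefix; pass another as $2 if not.
list_workspace_prefix() {
  bucket="$1"
  prefix="${2:-env:}"
  # AWS_CLI can be set to "echo" to print the command instead of running it.
  "${AWS_CLI:-aws}" s3api list-objects --bucket "$bucket" --prefix "$prefix/"
}

# Usage (bucket name is a placeholder):
#   list_workspace_prefix my-state-bucket
```

If this call is denied from the build slave but succeeds from a machine outside the VPC, that would point at the network path or its policies rather than at the role itself.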
Hi @apparentlymart, the "admin access" role was a stab in the dark and is actually a wildcard that gives full access to all services. Before I applied this, the role had wildcards by service, e.g. ec2:*, s3:*, kms:* and some others. Not really much of a difference, but it was before I turned on debug and I thought I had added something new. I can be more specific when I get back into the office in the morning.
@apparentlymart This is the role that was in place:
{
"Statement": [
{
"Sid": "Stmt1312295543082",
"Action": [
"ec2:*",
"autoscaling:*",
"route53:*",
"elasticloadbalancing:*",
"s3:*",
"sns:*",
"cloudwatch:*",
"rds:*",
"elasticache:*",
"events:*",
"acm:ListCertificates",
"iam:GetRole",
"iam:PassRole",
"iam:RemoveRoleFromInstanceProfile",
"iam:GetRolePolicy",
"iam:DeleteRolePolicy",
"iam:ListInstanceProfilesForRole",
"iam:DeleteInstanceProfile",
"iam:CreateRole",
"iam:CreateInstanceProfile",
"iam:GetInstanceProfile",
"iam:PutRolePolicy",
"iam:AddRoleToInstanceProfile",
"iam:DeleteRole",
"kms:UpdateKeyDescription",
"kms:CreateKey",
"kms:DescribeKey",
"kms:GetKeyPolicy",
"kms:GetKeyRotationStatus",
"kms:ListResourceTags",
"lambda:*",
"logs:*"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
This is the role that I added to rule out permissions issues:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "*",
"Resource": "*"
}
]
}
What is the update on this? I am getting the same error with v0.11.0.
I'm sorry I didn't respond here before... at the moment I don't have any leads as to what's going on here, and haven't been able to reproduce it myself.
I have a few ideas but I'm not sure if any apply to your respective configurations:
I'm not sure why s3:ListObjects (which deals only with object metadata) would need to do any KMS operations (which are, AFAIK, concerned with encrypting the _body_ of each object).
It may be possible to gather some additional information using IAM Policy Simulator; you can use the detailed request debug information from Terraform's log to see what actions are being performed and try them with the policy simulator to see which policy statements are affecting each operation.
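The policy simulator can also be driven from the AWS CLI. The sketch below assumes the three S3 actions a state backend typically needs (list, get, put); the function name simulate_state_access, the role ARN, and the bucket and key names are placeholders, and the AWS_CLI override is there only so the command can be dry-run.

```shell
# Ask the IAM policy simulator whether a role may list the state bucket and
# read/write the state object. Each action is evaluated against each ARN.
simulate_state_access() {
  role_arn="$1"; bucket="$2"; key="$3"
  # AWS_CLI can be set to "echo" to print the command instead of running it.
  "${AWS_CLI:-aws}" iam simulate-principal-policy \
    --policy-source-arn "$role_arn" \
    --action-names s3:ListBucket s3:GetObject s3:PutObject \
    --resource-arns "arn:aws:s3:::$bucket" "arn:aws:s3:::$bucket/$key"
}

# Usage (all values are placeholders):
#   simulate_state_access arn:aws:iam::123456789012:role/jenkins-slave my-state-bucket dev.tfstate
```

As far as I know, the simulator evaluates the policies attached to the principal and does not take a VPC endpoint policy into account, so an "allowed" result here does not rule out a denial further along the request path.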
Thanks - just back from Christmas Holidays so I will take a look and see what I can find.
I stumbled upon this thread while looking for a solution to my problem. I have a Makefile that looks like this:
.PHONY: all vpc instances destroy_all destroy_vpc destroy_instances
all: vpc instances
vpc:
	cd vpc && terraform plan -out=create_vpc && terraform apply "create_vpc" && cd -
instances:
	cd instances && terraform plan -out=terraform.tfplan && terraform apply terraform.tfplan && cd -
destroy_instances:
	cd instances && terraform destroy && cd -
destroy_vpc:
	cd vpc && terraform destroy && cd -
destroy_all:
	cd instances && terraform destroy && cd - && cd vpc && terraform destroy && cd -
If I run "make vpc" it creates the create_vpc plan, but when it tries to apply the plan (the next step in the "vpc" target) it fails with:
Failed to load backend: Error reading state: AccessDenied: Access Denied
status code: 403, request id: 032613A5DE265353, host id:
If I run each of those commands from the command line, it all works fine.
My AWS creds are in the ~/.aws/credentials file and I have a profile called "colmac".
Is this related? Any ideas?
Thanks
-cm3
After upgrading to 0.11.1, my S3 backend is working again. This AccessDenied error is strange.
hmm, i am running Terraform v0.11.1 but have the error.
I still get the error - the only testing I have been able to do so far is upgrading all my Jenkins slaves to v0.11.1.
Switching back to v0.9.5 fixes.
I discovered that if I type the command over and over, it will at some point run!
colmac$ terraform plan --out=createCNAME
Error: Error loading state: AccessDenied: Access Denied
status code: 403, request id: blah, host id: blah
If i type it again, (maybe it will be a few times) it will run at some point:
colmac$ terraform plan --out=createCNAME
var.user_name
Enter a value: colmac
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
Plan: 1 to add, 2 to change, 0 to destroy.
This plan was saved to: createCNAME
To perform exactly these actions, run the following command to apply:
terraform apply "createCNAME"
after entering the same command a few times it did run?? very strange
also. even the apply does the same thing...
colmac$ terraform apply "createCNAME"
Failed to load backend: Error reading state: AccessDenied: Access Denied
status code: 403, request id: blah, host id: blah
colmac$ terraform apply "createCNAME"
Error: Error loading state: AccessDenied: Access Denied
status code: 403, request id: blah, host id: blah
colmac$ terraform apply "createCNAME"
aws_route53_record.blah: Creation complete after 47s (ID: blah_blah_CNAME)
Apply complete! Resources: 1 added, 2 changed, 0 destroyed.
Releasing state lock. This may take a few moments...
it ran after 3 tries
Okay - so I have finally got back to testing this and found that it is related to the S3 Endpoint IAM permissions. The required permissions after v0.9.5 have changed (not sure where exactly as I haven't had time to investigate).
Initially my S3 Endpoint IAM permissions for "aws_vpc_endpoint" were:
"s3:ListObjects",
"s3:AbortMultipartUpload",
"s3:GetBucketAcl",
"s3:GetBucketLocation",
"s3:GetObject",
"s3:GetObjectAcl",
"s3:ListBucketMultipartUploads",
"s3:PutObject",
"s3:PutObjectAcl"
I tried a very lazy test and changed this to:
"s3:*"
This has allowed Terraform v0.11.1 to run an apply. "ListObjects" was included as an allowed permission, so I am not sure why it complained the way it did. Now I just need to find out which extra permissions have been required from v0.9.6 onwards so that I can tighten up the IAM permissions again.
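A possible explanation, offered as a reading of the policy above rather than a confirmed diagnosis for this setup: the ListObjects API call is authorized by the IAM action s3:ListBucket on the bucket ARN, and s3:ListObjects is not an action name that grants it. The original endpoint policy therefore never allowed listing, while "s3:*" does. A narrower endpoint policy might look like the sketch below. The function name endpoint_policy and the bucket name are placeholders, and the action list covers only the state backend, so anything else that uses the endpoint would need its own statements.

```shell
# Print a VPC endpoint policy document that allows listing the state bucket
# and reading/writing objects in it.
# Assumption: these three actions are enough for the S3 state backend.
endpoint_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      "Resource": ["arn:aws:s3:::${bucket}", "arn:aws:s3:::${bucket}/*"]
    }
  ]
}
EOF
}

# Usage (bucket name is a placeholder):
#   endpoint_policy my-state-bucket > endpoint-policy.json
```

Note that s3:ListBucket applies to the bucket ARN, whereas the object actions apply to the "bucket/*" ARN, which is why both resources are listed.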
I had this issue when running terraform init initially. The command would always fail with:
2018/07/21 15:16:21 [DEBUG] [aws-sdk-go] DEBUG: Validate Response s3/ListObjects failed, not retrying, error AccessDenied: Access Denied
When I ran:
aws s3 ls s3://my-bucket-name
This always worked. In the end I deleted the tfstate file:
rm .terraform/terraform.tfstate
When I re-ran terraform init the init completed successfully. It may have been that terraform was using the wrong creds. When I first ran terraform init I was missing some env vars and so terraform was (I suspect) using some incorrect creds from my ~/.aws/credentials file. I guess some credential information got cached in the tfstate?
"I guess some credential information got cached in the tfstate?"
I'm pretty sure that should never happen...
But I wonder if there was perhaps an old, incorrect bucket name (or bucket object) referenced in your state, and AWS was returning Access Denied rather than Not Found?
It's hard to say now because it's fixed, but perhaps more specific error messages could help avoid any confusion in this situation.
rm .terraform/terraform.tfstate also worked for me. I had been fiddling around with the s3 backend bucket names/keys previously so I assume it's something to do with that.
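For context, .terraform/terraform.tfstate is Terraform's local record of the backend settings chosen at the last init (bucket, key, region and so on), not the remote state itself, which would fit the stale bucket/key theory above. A slightly more cautious variant of the fix is to move the file aside rather than delete it, so the old settings can still be inspected. The function name reset_backend_cache is made up for this sketch.

```shell
# Move the locally cached backend configuration out of the way so that the
# next "terraform init" re-reads the backend settings from scratch.
reset_backend_cache() {
  dir="${1:-.}"
  cache="$dir/.terraform/terraform.tfstate"
  if [ -f "$cache" ]; then
    mv "$cache" "$cache.bak" && echo "moved $cache to $cache.bak"
  else
    echo "no cached backend config at $cache"
  fi
}

# Usage, from the directory holding the configuration:
#   reset_backend_cache && terraform init
```

Looking at the "backend" section of the .bak file afterwards should show which bucket and key the failing runs were actually pointed at.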
I am having the same error with terraform init with our Jenkins CI setup and fixed temporarily by running it in an EC2 instance without an ec2 instance role attached. It looks like terraform is using the ec2 instance role when calling STS even when the provider is set to use profile. Is this by design or is there a flag to make sure terraform will use the AWS profile instead of the EC2 role?
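One thing that may be relevant here: the S3 backend has its own profile setting, separate from the provider's, so a profile set only on the provider does not apply to the state bucket access that happens during init. The init command earlier in this thread passes it with -backend-config="profile=...". To compare identities, the sketch below asks STS who a given profile resolves to. The function name whoami_for_profile is hypothetical, and the AWS_CLI override is there only so the command can be dry-run.

```shell
# Show which AWS identity a named profile resolves to on this machine.
# Running "aws sts get-caller-identity" with no profile on the same host
# shows the default identity (for example an instance role) for comparison.
whoami_for_profile() {
  # AWS_CLI can be set to "echo" to print the command instead of running it.
  "${AWS_CLI:-aws}" sts get-caller-identity --profile "$1"
}

# Usage (profile name is a placeholder):
#   whoami_for_profile my-profile
```

If the two identities differ and the backend is not given the profile explicitly, the instance role would be the one attempting the S3 calls.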
"rm .terraform/terraform.tfstate also worked for me. I had been fiddling around with the s3 backend bucket names/keys previously so I assume it's something to do with that."
Thank you @LittleMikeDev , it helped me