Terraform v0.11.10
data "aws_iam_policy_document" "emr_assume_role" {
  statement {
    effect = "Allow"
    principals {
      type = "Service"
      identifiers = ["elasticmapreduce.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}
resource "aws_iam_role" "emr_service_role" {
  name = "cjf-example-service-role"
  assume_role_policy = "${data.aws_iam_policy_document.emr_assume_role.json}"
}
resource "aws_iam_role_policy_attachment" "emr_service_role" {
  role = "${aws_iam_role.emr_service_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
}
data "aws_iam_policy_document" "ec2_assume_role" {
  statement {
    effect = "Allow"
    principals {
      type = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}
resource "aws_iam_role" "emr_ec2_instance_profile" {
  name = "cjf-JobFlowInstanceProfile"
  assume_role_policy = "${data.aws_iam_policy_document.ec2_assume_role.json}"
}
resource "aws_iam_role_policy_attachment" "emr_ec2_instance_profile" {
  role = "${aws_iam_role.emr_ec2_instance_profile.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}
resource "aws_iam_instance_profile" "emr_ec2_instance_profile" {
  name = "${aws_iam_role.emr_ec2_instance_profile.name}"
  role = "${aws_iam_role.emr_ec2_instance_profile.name}"
}
resource "aws_security_group" "sg" {
  revoke_rules_on_delete = true
  tags = {
    created_by = "Charles.Ferguson"
  }
}
resource "aws_emr_cluster" "cluster" {
  name = "emr-test-arn"
  release_label = "emr-5.13.0"
  applications = ["Spark"]
  termination_protection = false
  keep_job_flow_alive_when_no_steps = true
  ec2_attributes {
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group = "${aws_security_group.sg.id}"
    instance_profile = "${aws_iam_instance_profile.emr_ec2_instance_profile.arn}"
  }
  master_instance_group {
    instance_type = "m5.xlarge"
  }
  core_instance_group {
    instance_type = "m5.xlarge"
    instance_count = 1
  }
  tags = {
    created_by = "Charles.Ferguson"
  }
  bootstrap_action = [
    {
      path = "s3://elasticmapreduce/bootstrap-actions/run-if"
      name = "runif-1"
      args = ["instance.isMaster=true", "echo running on master node"]
    },
    {
      path = "s3://elasticmapreduce/bootstrap-actions/run-if"
      name = "runif-2"
      args = ["instance.isMaster=true", "echo REALLY running on master node"]
    },
    {
      path = "s3://elasticmapreduce/bootstrap-actions/run-if"
      name = "runif-3"
      args = ["instance.isMaster=true", "echo HONESTLY running on master node"]
    }
  ]
  service_role = "geospock-emr-service-role"
}
Expected: the 3 bootstrap actions are executed in the order runif-1, runif-2, runif-3.
Actual: they're executed in a random order.
terraform apply
Ignore the fact that this reports VALIDATION_ERROR: The security group sg-025fab96aef45ece4 for the master instance i-0371167ca92c75dc2 does not have any egress rule. That error occurs after the cluster has deployed and is not part of the problem described here.
Visit the AWS console and observe the order of the bootstrap actions on the cluster tab.
Observe that they're not in the order given (unless you were lucky - in which case repeat, and you'll see that they change).
If you wish, use the AWS CLI to list the bootstrap actions on the cluster:
aws emr describe-cluster --cluster-id CLUSTER-ID-HERE | jq .Cluster.BootstrapActions
Which in my case gave me:
~/example_emr (develop)$ aws emr describe-cluster --cluster-id j-2LC74XPITNXHM | jq .Cluster.BootstrapActions
[
  {
    "Name": "runif-3",
    "ScriptPath": "s3://elasticmapreduce/bootstrap-actions/run-if",
    "Args": [
      "instance.isMaster=true",
      "echo HONESTLY running on master node"
    ]
  },
  {
    "Name": "runif-2",
    "ScriptPath": "s3://elasticmapreduce/bootstrap-actions/run-if",
    "Args": [
      "instance.isMaster=true",
      "echo REALLY running on master node"
    ]
  },
  {
    "Name": "runif-1",
    "ScriptPath": "s3://elasticmapreduce/bootstrap-actions/run-if",
    "Args": [
      "instance.isMaster=true",
      "echo running on master node"
    ]
  }
]
The following line in the provider schema defines bootstrap_action as a Set. It should be a List; otherwise the actions come out in an arbitrary order. This is REALLY important if you're creating EMR clusters, because you need to be able to guarantee the order in which the operations you define will run. Having them run randomly is ... unhelpful.
"bootstrap_action": {
	Type:     schema.TypeSet,
	Optional: true,
	ForceNew: true,
Recommend changing the Type to schema.TypeList.
Any updates on this issue? We need to convert CloudFormation templates for EMR over to Terraform, and our clusters won't start unless the bootstrap actions are executed in the correct order. This is for a major enterprise application, and this issue has halted the conversion to Terraform.
thanks!
Please thumbs up the PR! https://github.com/terraform-providers/terraform-provider-aws/pull/12389
cc: @karl-cardenas-coding, @sabarivr, @Meestafan, @nitincm, @arajhub, @icycle77, @salikov1809, @tom-geospock, @Zanvork, @fpenim, @gareth625, @tridrummer, @dantonwhittier, @lyonsden
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!