### Terraform Version

```
$ terraform version
Terraform v0.11.11
```
### Affected Resource(s)
* aws_emr_cluster
### Terraform Configuration Files
```hcl
data "template_file" "emr_configuration_json" {
  template = <<EOF
[
  {
    "classification": "spark-log4j",
    "configurations": [],
    "properties": {
      "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
      "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
      "log4j.logger.org.apache.spark.streaming": "WARN",
      "log4j.rootCategory": "WARN, console"
    }
  },
  {
    "classification": "spark-env",
    "configurations": [
      {
        "classification": "export",
        "configurations": [],
        "properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3.6"
        }
      }
    ],
    "properties": {}
  }
]
EOF
}

resource "aws_emr_cluster" "data_emr_519" {
  name          = "data-emr-519"
  release_label = "emr-5.19.0"
  applications  = ["Spark", "Hadoop"]
  count         = "${lookup(var.emr_count, var.env)}"

  ec2_attributes {
    subnet_id                         = "${var.subnet_id}"
    emr_managed_master_security_group = "${var.master_security_group_id}"
    emr_managed_slave_security_group  = "${var.slave_security_group_id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
    key_name                          = "${var.data_karl_marx_deploy_key_name}"
    additional_master_security_groups = "${var.sg_emr_gocd_id}"
    additional_slave_security_groups  = "${var.sg_emr_gocd_id}"
  }

  instance_group {
    instance_role  = "MASTER"
    instance_count = 1
    instance_type  = "${lookup(var.emr_master_instance_type, var.env)}"
  }

  instance_group {
    instance_role  = "CORE"
    instance_count = "${lookup(var.emr_core_instances_count, var.env)}"
    instance_type  = "${lookup(var.emr_core_instances_type, var.env)}"
  }

  tags {
    description    = "Managed by Terraform"
    "sf:costGroup" = "data"
    "sf:env"       = "${var.env}"
    "sf:team"      = "data"
  }

  configurations_json = "${data.template_file.emr_configuration_json.rendered}"
  service_role        = "${aws_iam_role.iam_emr_service_role.arn}"

  bootstrap_action {
    path = "s3://${aws_s3_bucket.data_emr_bootstrap_actions.bucket}/bootstrap.sh"
    name = "bootstrap"
  }

  depends_on = ["aws_s3_bucket.data_emr_bootstrap_actions", "aws_s3_bucket_object.bootstrap"]
}
```
### Debug Output

https://gist.github.com/l13t/cca0c1c195c34dc275bdae51cc413cfe
### Expected Behavior

No cluster recreation.
### Actual Behavior

The cluster is recreated because of a difference in configuration.
Trying to fix the issue, I copied the config from the AWS console, so `configurations_json` is:
```json
[
  {
    "classification": "spark-log4j",
    "configurations": [],
    "properties": {
      "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
      "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
      "log4j.logger.org.apache.spark.streaming": "WARN",
      "log4j.rootCategory": "WARN, console"
    }
  },
  {
    "classification": "spark-env",
    "configurations": [
      {
        "classification": "export",
        "configurations": [],
        "properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3.6"
        }
      }
    ],
    "properties": {}
  }
]
```
I agree that I need to remove `"configurations": []` from the config. But the real root cause of the issue is that Terraform fetches the config back from AWS in the form shown in the next code block:
```json
[
  {
    "Classification": "spark-log4j",
    "Properties": {
      "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
      "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
      "log4j.logger.org.apache.spark.streaming": "WARN",
      "log4j.rootCategory": "WARN, console"
    }
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3.6"
        }
      }
    ],
    "Properties": {}
  }
]
```
So there is some magic happening between the actual config shown in the AWS console and the config Terraform downloads to my machine.
### Steps to Reproduce

`terraform plan` / `terraform apply`
I'm having this same issue, and I did a diff on mine as well. I suspect that the last block of JSON is getting capitalized JSON keys?
I had the same issue, and it was solved by capitalizing the JSON keys to match AWS's convention. Also, a note on `configurations_json`: if the `Configurations` value is empty, you should omit the `Configurations` field entirely instead of providing an empty list as the value (`"Configurations": []`).
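
For reference, applying that advice to the template from this issue would look roughly like the following. This is an untested sketch: the only changes from the original data source are the capitalized keys (matching the form the EMR API returns, shown in the code block above) and the dropped empty `configurations` lists.

```hcl
# Sketch: keys capitalized to match the EMR API's normalized form and the
# empty "Configurations" lists omitted, so the rendered JSON should match
# what Terraform reads back from AWS and no spurious diff is produced.
data "template_file" "emr_configuration_json" {
  template = <<EOF
[
  {
    "Classification": "spark-log4j",
    "Properties": {
      "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
      "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
      "log4j.logger.org.apache.spark.streaming": "WARN",
      "log4j.rootCategory": "WARN, console"
    }
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3.6"
        }
      }
    ],
    "Properties": {}
  }
]
EOF
}
```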