Terraform-provider-aws: aws_emr_cluster recreates cluster when there are no changes in configurations_json

Created on 10 Jan 2019  ·  2 Comments  ·  Source: hashicorp/terraform-provider-aws

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

```
$ terraform version
Terraform v0.11.11

+ provider.aws v1.54.0
+ provider.external v1.0.0
+ provider.postgresql v0.1.3
+ provider.template v1.0.0
```

Affected Resource(s)

  • aws_emr_cluster

Terraform Configuration Files

```hcl
data "template_file" "emr_configuration_json" {
  template = <<EOF
[
    {
        "classification": "spark-log4j",
        "configurations": [],
        "properties": {
            "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
            "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
            "log4j.logger.org.apache.spark.streaming": "WARN",
            "log4j.rootCategory": "WARN, console"
        }
    },
    {
        "classification": "spark-env",
        "configurations": [
            {
                "classification": "export",
                "configurations": [],
                "properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python3.6"
                }
            }
        ],
        "properties": {}
    }
]
EOF
}

resource "aws_emr_cluster" "data_emr_519" {
  name          = "data-emr-519"
  release_label = "emr-5.19.0"
  applications  = ["Spark", "Hadoop"]
  count         = "${lookup(var.emr_count, var.env)}"

  ec2_attributes {
    subnet_id                         = "${var.subnet_id}"
    emr_managed_master_security_group = "${var.master_security_group_id}"
    emr_managed_slave_security_group  = "${var.slave_security_group_id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
    key_name                          = "${var.data_karl_marx_deploy_key_name}"
    additional_master_security_groups = "${var.sg_emr_gocd_id}"
    additional_slave_security_groups  = "${var.sg_emr_gocd_id}"
  }

  instance_group {
    instance_role  = "MASTER"
    instance_count = 1
    instance_type  = "${lookup(var.emr_master_instance_type, var.env)}"
  }

  instance_group {
    instance_role  = "CORE"
    instance_count = "${lookup(var.emr_core_instances_count, var.env)}"
    instance_type  = "${lookup(var.emr_core_instances_type, var.env)}"
  }

  tags {
    description    = "Managed by Terraform"
    "sf:costGroup" = "data"
    "sf:env"       = "${var.env}"
    "sf:team"      = "data"
  }

  configurations_json = "${data.template_file.emr_configuration_json.rendered}"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"

  bootstrap_action {
    path = "s3://${aws_s3_bucket.data_emr_bootstrap_actions.bucket}/bootstrap.sh"
    name = "bootstrap"
  }

  depends_on = ["aws_s3_bucket.data_emr_bootstrap_actions", "aws_s3_bucket_object.bootstrap"]
}
```

Debug Output

https://gist.github.com/l13t/cca0c1c195c34dc275bdae51cc413cfe

Expected Behavior

No cluster recreation.

Actual Behavior

Cluster is recreated because of difference in configuration.

While trying to fix the issue I copied the config from the AWS console, so configurations_json is:

```json
[
    {
        "classification": "spark-log4j",
        "configurations": [],
        "properties": {
            "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
            "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
            "log4j.logger.org.apache.spark.streaming": "WARN",
            "log4j.rootCategory": "WARN, console"
        }
    },
    {
        "classification": "spark-env",
        "configurations": [
            {
                "classification": "export",
                "configurations": [],
                "properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python3.6"
                }
            }
        ],
        "properties": {}
    }
]
```

I agree that I should remove "configurations": [] from the config. But the real root cause of the issue is that Terraform reads the configuration back from AWS in the following form:

```json
[
    {
        "Classification": "spark-log4j",
        "Properties": {
            "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
            "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
            "log4j.logger.org.apache.spark.streaming": "WARN",
            "log4j.rootCategory": "WARN, console"
        }
    },
    {
        "Classification": "spark-env",
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python3.6"
                }
            }
        ],
        "Properties": {}
    }
]
```

So something transforms the config between what the AWS console shows and what Terraform downloads to my machine: the keys come back capitalized and the empty "configurations" lists are gone, so the string comparison always produces a diff.
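A quick sketch (not from the issue) illustrating the point: once key casing is normalized and empty configurations lists are dropped, the locally rendered JSON and the AWS-returned JSON are semantically identical, so the forced recreation is purely a formatting diff.

```python
import json

def normalize(node):
    """Recursively lower-case keys and drop empty 'configurations' lists
    so the locally rendered JSON and the AWS-returned JSON can be compared."""
    if isinstance(node, dict):
        return {
            k.lower(): normalize(v)
            for k, v in node.items()
            if not (k.lower() == "configurations" and v == [])
        }
    if isinstance(node, list):
        return [normalize(item) for item in node]
    return node

# Minimal versions of the two documents shown above.
local = json.loads('[{"classification": "spark-env", "configurations": [], "properties": {}}]')
remote = json.loads('[{"Classification": "spark-env", "Properties": {}}]')

print(normalize(local) == normalize(remote))  # prints True
```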

Steps to Reproduce

  1. terraform plan/apply

bug service/emr

Most helpful comment

I had the same issue, and it was solved by capitalizing the JSON keys to match AWS's convention. Also note on configurations_json: if the Configurations value is empty, skip the Configurations field entirely instead of providing an empty list as the value ("Configurations": []).
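Applying that advice to the template from this issue, a sketch of the corrected data block looks like the following: keys are capitalized to match the form AWS returns, and the empty "configurations": [] entries are omitted, so the rendered string matches the remote state and no diff is produced.

```hcl
# Sketch: same configuration, written the way AWS hands it back.
data "template_file" "emr_configuration_json" {
  template = <<EOF
[
    {
        "Classification": "spark-log4j",
        "Properties": {
            "log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter": "WARN",
            "log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper": "WARN",
            "log4j.logger.org.apache.spark.streaming": "WARN",
            "log4j.rootCategory": "WARN, console"
        }
    },
    {
        "Classification": "spark-env",
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python3.6"
                }
            }
        ],
        "Properties": {}
    }
]
EOF
}
```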

All 2 comments

I'm having the same issue and I did a diff on mine as well. I suspect the last box of JSON code is coming back with capitalized JSON keys.

