Terraform-provider-aws: Issue destroying EMR on AWS

Created on 21 Feb 2018 · 9 Comments · Source: hashicorp/terraform-provider-aws

_This issue was originally opened by @miloup as hashicorp/terraform#17330. It was migrated here as a result of the provider split. The original body of the issue is below._


Hi,

Many of us, across many companies, are facing an issue while trying to destroy an EMR infrastructure on AWS.
The Terraform folder has an emr-requirements.tf file which contains the Security Groups, Security Configuration, etc., and an emr.tf file which creates the cluster using the configuration defined in emr-requirements.tf.

When running "_terraform apply_", the infrastructure is created successfully, but when running "_terraform destroy_", Terraform does not seem to wait for the EMR cluster to terminate before destroying the remaining resources, which leads to a failure (timeout) because of their dependencies on this cluster. The only way to get a clean destroy is to make sure the EMR cluster has terminated (by checking the AWS console, for instance) and then run "_terraform destroy_" again; at that point, all the remaining resources are destroyed.

Would you please fix this bug?

Thanks

bug service/emr

All 9 comments

I've observed this behavior as well.

We're experiencing the same issue, currently only with a security configuration. It seems like Terraform thinks the EMR cluster has been removed and continues to remove the other resources that the EMR cluster depended on, but the EMR cluster isn't actually removed yet, which causes the destroy of the security configuration to fail because it's still in use.

We are facing the same issue as well; for the time being, our "solution" has been:

  1. Run a targeted destroy on the EMR cluster only
  2. Wait until the EMR cluster is reported as terminated in the console
  3. Run a normal destroy to delete the remaining resources

This gives an execution flow without errors, but it may be hard to fully automate with a script: step 2 would require a CLI command to determine whether the EMR cluster is terminated, and I believe that CLI command is the same bugged one that reports back to Terraform prematurely.
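
If you still want to script this flow, here is a minimal sketch of the targeted-destroy approach. It assumes the cluster resource address is aws_emr_cluster.cluster, the AWS CLI is configured, and the EMR cluster ID is passed in as an argument; all of these are assumptions about your setup, so adjust the names accordingly.

#!/bin/bash
set -e

# Assumed inputs: the Terraform resource address below and the EMR cluster ID
# (j-XXXXXXXXXXXXX) passed as the first argument.
CLUSTER_ID="$1"

# 1. Targeted destroy of the EMR cluster only.
terraform destroy -target=aws_emr_cluster.cluster

# 2. Block until AWS reports the cluster as terminated
#    (the waiter polls DescribeCluster for the TERMINATED state).
aws emr wait cluster-terminated --cluster-id "$CLUSTER_ID"

# 3. Normal destroy for the remaining resources.
terraform destroy

The "aws emr wait cluster-terminated" waiter polls DescribeCluster directly, so step 2 should only return once the cluster has actually reached the TERMINATED state.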

👍

Alternatively, you can wrap the destroy in a shell script with a retry loop, since terraform returns a "0" exit code once it finishes without errors:

#!/bin/bash

terraform destroy
RETURN_CODE=$?
LOOP_COUNT=10      # retry up to 10 more times (11 attempts in total)

# Keep retrying while the destroy fails and retries remain; a failed run
# usually just needs the EMR cluster to finish terminating.
while [ "$RETURN_CODE" -ne 0 ] && [ "$LOOP_COUNT" -gt 0 ]; do
  LOOP_COUNT=$((LOOP_COUNT-1))
  echo "Failed to destroy the cluster. Retrying..."
  terraform destroy
  RETURN_CODE=$?
done

# Display an error message on the terminal if the destroy still fails after all attempts.
if [ "$RETURN_CODE" -ne 0 ]; then
  echo "Failed to destroy the cluster after 11 attempts. Exiting."
  exit 1
fi

Hi,

Any update on when this will be fixed?
My cluster gets recreated every time :(

I'm having this issue as well when I change the security configuration. Terraform tries to destroy the old one and AWS reports it's still in use; if I wait 5 seconds and run again, all is well.

We have been facing this issue as well while destroying EMR with Terraform.
On the first destroy we get errors such as:
aws_emr_security_configuration.security_config: InvalidRequestException: Security configuration 'clustter-name-security-config' cannot be deleted because it is in use by active clusters
If we destroy the cluster first and then run another destroy at the end, after the cluster has been terminated, that produces even more module-level errors:
* module.stack_module.output.master_lb_dns: Resource 'aws_lb.master_lb' does not have attribute 'dns_name' for variable 'aws_lb.master_lb.dns_name'
* module.stack_module.output.master_public_dns: Resource 'aws_emr_cluster.cluster' does not have attribute 'master_public_dns' for variable 'aws_emr_cluster.cluster.master_public_dns'
* module.stack_module.output.master_private_ip: Resource 'data.aws_instance.master_node' does not have attribute 'private_ip' for variable 'data.aws_instance.master_node.private_ip'
* module.stack_module.output.lb_security_group_id: Resource 'aws_security_group.emr_lb_security_group' does not have attribute 'id' for variable 'aws_security_group.emr_lb_security_group.id'
* module.stack_module.output.name: Resource 'aws_emr_cluster.cluster' does not have attribute 'name' for variable 'aws_emr_cluster.cluster.name'
* module.stack_module.output.master_url: Resource 'aws_route53_record.lb_cname' does not have attribute 'fqdn' for variable 'aws_route53_record.lb_cname.fqdn'
* module.stack_module.output.id: Resource 'aws_emr_cluster.cluster' does not have attribute 'id' for variable 'aws_emr_cluster.cluster.id'

So we can never get a completely error-free workflow. The cluster does get destroyed, though, so it isn't blocking our work.

We are using the following Terraform version:
Terraform v0.11.14

  • provider.aws v2.51.0
  • provider.template v2.1.2

Is this issue fixed in Terraform 0.12? If not, is there a timeline for when it will be fixed, or any workarounds we can use in the meantime?

Any updates on this?
