_This issue was originally opened by @miloup as hashicorp/terraform#17330. It was migrated here as a result of the provider split. The original body of the issue is below._
Hi,
Many of us, across many companies, are facing an issue when trying to destroy an EMR infrastructure on AWS.
The Terraform folder has an emr-requirements.tf file, which contains the Security Groups, Security Configuration, etc., and an emr.tf file, which creates the cluster using the configuration in emr-requirements.tf.
When running "_terraform apply_", the infrastructure is successfully created, but when running "_terraform destroy_", it seems that Terraform does not wait for the EMR cluster to terminate before destroying the remaining resources, which leads to a failure (timeout) because of their dependencies on this cluster. The only way to get a clean "destroy" is to make sure the EMR cluster has terminated (by checking the AWS console, for instance) and then run "_terraform destroy_" again; at that point, all the remaining resources are destroyed (see the sketch below).
Would you please fix this bug?
Thanks
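For reference, a minimal sketch of that manual workaround in script form, using the AWS CLI instead of the console to wait for termination (the cluster ID below is a placeholder):

# Sketch of the manual workaround described above; assumes the AWS CLI is
# configured and "j-XXXXXXXXXXXXX" is replaced with the real cluster ID.
terraform destroy                                             # first pass: fails on the dependent resources
aws emr wait cluster-terminated --cluster-id j-XXXXXXXXXXXXX  # block until EMR reports the cluster terminated
terraform destroy                                             # second pass: destroys the remaining resources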
I've observed this behavior as well.
We're experiencing the same issue, currently only with a security configuration. It seems like Terraform thinks the EMR cluster has been removed and continues to remove the other resources that the EMR cluster depended on, but the EMR cluster isn't actually removed yet, which causes the destroy of the security configuration to fail because it's still in use.
We are facing the same issue as well. For the time being, our "solution" has been:
1. Run "_terraform destroy_" (it fails).
2. Wait until the EMR cluster has actually terminated.
3. Run "_terraform destroy_" again.
This gives an execution flow without errors, but it may be hard to fully automate with a script, as step 2 above would require a CLI command to determine whether the EMR cluster has terminated, and I believe that CLI command is the same bugged command that is reporting back to Terraform prematurely.
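For what it's worth, the state check in step 2 can be done directly against the EMR API rather than through Terraform; a rough sketch, where CLUSTER_ID is a placeholder for the real cluster ID:

# Sketch: query the live cluster state from the EMR API (CLUSTER_ID is a placeholder).
STATE=$(aws emr describe-cluster --cluster-id "$CLUSTER_ID" \
  --query 'Cluster.Status.State' --output text)
echo "Cluster state: $STATE"   # re-running destroy should be safe once this reports
                               # TERMINATED or TERMINATED_WITH_ERRORS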
👍
Or, another way to do it is to use a shell script with a retry loop, relying on the fact that terraform returns a "0" exit code when it finishes without errors.
#!/bin/bash
# Retry "terraform destroy" until it succeeds, up to 11 attempts in total
# (the 1st attempt + 10 retries). Add -force (or -auto-approve on newer
# Terraform versions) if this needs to run non-interactively.
MAX_RETRIES=10
ATTEMPT=1

terraform destroy
RETURN_CODE=$?

while [ "$RETURN_CODE" -ne 0 ] && [ "$ATTEMPT" -le "$MAX_RETRIES" ]; do
  echo "Failed to destroy the cluster. Retrying..."
  terraform destroy
  RETURN_CODE=$?
  ATTEMPT=$((ATTEMPT+1))
done

# Display an error message on the terminal if the destruction still fails
# after all attempts.
if [ "$RETURN_CODE" -ne 0 ]; then
  echo "Failed to destroy the cluster after $ATTEMPT attempts. Exiting."
  exit 1
fi
Hi,
Any update on when this will be fixed?
My cluster gets recreated every time :(
I'm having this issue as well when I change the security configuration. It tries to destroy the old one and AWS reports it's still in use; if I wait 5 seconds and run again, all is well.
We have been facing this issue as well while destroying EMR with Terraform. On the first destroy we get errors like:
aws_emr_security_configuration.security_config: InvalidRequestException: Security configuration 'clustter-name-security-config' cannot be deleted because it is in use by active clusters
If we destroy the cluster first and then run another destroy at the end, after the cluster has been destroyed (sketched below), that creates even more module-level errors:
* module.stack_module.output.master_lb_dns: Resource 'aws_lb.master_lb' does not have attribute 'dns_name' for variable 'aws_lb.master_lb.dns_name'
* module.stack_module.output.master_public_dns: Resource 'aws_emr_cluster.cluster' does not have attribute 'master_public_dns' for variable 'aws_emr_cluster.cluster.master_public_dns'
* module.stack_module.output.master_private_ip: Resource 'data.aws_instance.master_node' does not have attribute 'private_ip' for variable 'data.aws_instance.master_node.private_ip'
* module.stack_module.output.lb_security_group_id: Resource 'aws_security_group.emr_lb_security_group' does not have attribute 'id' for variable 'aws_security_group.emr_lb_security_group.id'
* module.stack_module.output.name: Resource 'aws_emr_cluster.cluster' does not have attribute 'name' for variable 'aws_emr_cluster.cluster.name'
* module.stack_module.output.master_url: Resource 'aws_route53_record.lb_cname' does not have attribute 'fqdn' for variable 'aws_route53_record.lb_cname.fqdn'
* module.stack_module.output.id: Resource 'aws_emr_cluster.cluster' does not have attribute 'id' for variable 'aws_emr_cluster.cluster.id'
So we can never get an error-free workflow. The cluster does get destroyed, and it's not hampering our work.
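For clarity, the two-pass flow described above looks roughly like this (the resource address is an assumption based on the module name in the error output, and the cluster ID is a placeholder):

# Sketch of the two-pass destroy described above.
terraform destroy -target=module.stack_module.aws_emr_cluster.cluster   # destroy the cluster first
aws emr wait cluster-terminated --cluster-id j-XXXXXXXXXXXXX             # wait for actual termination
terraform destroy                                                        # second pass for everything else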
We use the following Terraform version:
Terraform v0.11.14
Is this issue fixed in Terraform version 0.12? Is there a timeline on when it will be fixed, or any workarounds we can use to fix this issue?
Any updates on this?