Terraform-provider-aws: Cannot destroy an Aurora RDS cluster when it was built with a `replication_source_identifier` value

Created on 7 Dec 2018 · 8 comments · Source: hashicorp/terraform-provider-aws

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.11.10
+ provider.aws v1.50.0

Affected Resource(s)

  • provider.aws v1.50.0

Terraform Configuration Files

module "rds-cluster-vpc-1" {
  source = "../../modules/rds_cluster"
  name = "${var.db_name}-rds-cluster-${var.user}"
  user = "${var.user}"
  availability_zones = ["${data.aws_availability_zones.vpc-1-azs.names[0]}",
                        "${data.aws_availability_zones.vpc-1-azs.names[1]}",
                        "${data.aws_availability_zones.vpc-1-azs.names[2]}"
                       ]
  rds_final_snapshot_id   = "${var.db_name}-final-snapshot-${var.user}"
  skip_final_rds_snapshot = true
  vpc_id                  = "${module.vpc-1.vpc_id}"
  aws_subnet_ids          = ["${module.vpc-1.database_subnets}"]
  rds_access_sg              = ["${module.vpc-1-jump.security_group_id}"]
  providers = {
    "aws" = "aws.us-east-1"
  }
  db_name = "${var.db_name}"
  rds_admin_user = "${var.rds_admin_user}"
  rds_admin_password = "${var.rds_admin_password}"
  port = "${var.port}"
  tags = "${local.tags}"
  sox_compliant = "${var.sox_compliant}"
}

module "rds-cluster-vpc-2" {

  source = "../../modules/rds_cluster"
  name = "${var.db_name}-rds-cluster-${var.user}"
  user = "${var.user}"
  availability_zones = ["${data.aws_availability_zones.vpc-2-azs.names[0]}",
                        "${data.aws_availability_zones.vpc-2-azs.names[1]}",
                        "${data.aws_availability_zones.vpc-2-azs.names[2]}"
                       ]
  replication_source_identifier = "${module.rds-cluster-vpc-1.rds_cluster_arn}"
  rds_final_snapshot_id   = "${var.db_name}-final-snapshot-${var.user}"
  skip_final_rds_snapshot = true
  vpc_id                  = "${module.vpc-2.vpc_id}"
  aws_subnet_ids          = ["${module.vpc-2.database_subnets}"]
  rds_access_sg              = ["${module.vpc-2-jump.security_group_id}"]
  providers = {
    "aws" = "aws.us-west-2"
  }
  db_name = "${var.db_name}"
  rds_admin_user = "${var.rds_admin_user}"
  rds_admin_password = "${var.rds_admin_password}"
  port = "${var.port}"
  tags = "${local.tags}"
  sox_compliant = "${var.sox_compliant}"
}

Expected Behavior

Running terraform destroy should destroy everything, including both RDS clusters and their VPCs.

Actual Behavior

Destroy works on the primary cluster but fails on the secondary cluster:

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

module.rds-cluster-vpc-2.aws_rds_cluster_instance.rds_cluster_instance[2]: Destroying... (ID: rdstest-2-dev)
Releasing state lock. This may take a few moments...

Error: Error applying plan:

3 error(s) occurred:

* module.vpc-1-jump.output.public_ip: element: element() may not be used with an empty list in:

${var.create == false ? "" : element(aws_instance.jumpbox.*.public_ip,0)}
* module.vpc-2-jump.output.public_ip: element: element() may not be used with an empty list in:

${var.create == false ? "" : element(aws_instance.jumpbox.*.public_ip,0)}
* module.rds-cluster-vpc-2.aws_rds_cluster_instance.rds_cluster_instance[2] (destroy): 1 error(s) occurred:

* aws_rds_cluster_instance.rds_cluster_instance.2: InvalidDBClusterStateFault: Cannot delete the last instance of the read replica DB cluster. Promote the DB cluster to a standalone DB cluster in order to delete it.
        status code: 400, request id: 456e5bf2-656a-4e22-84d7-49565db5976c

Steps to Reproduce

  1. terraform apply
  2. terraform destroy

References

See https://github.com/terraform-providers/terraform-provider-aws/issues/6672 for a related issue about managing cross-region Aurora replica clusters with Terraform.

bug service/rds

All 8 comments

The output of terraform destroy clearly shows that it destroys the primary cluster and its VPC first, when it should start with the secondary cluster and VPC to avoid this orphaned-replica situation. Output is in this gist

In fact, even if I force a destroy order using phased targets, the secondary cluster still does not go away cleanly; it fails with the same error.

This is expected behavior, if I am not wrong. It is a deliberate safeguard put in place by AWS in the upstream API to prevent accidental deletion, and it would not seem appropriate for Terraform to override it. https://aws.amazon.com/premiumsupport/knowledge-center/rds-error-delete-aurora-cluster/

If I am asking Terraform to destroy, it should destroy. There are already guardrails in Terraform: it lists everything that will be destroyed and asks for approval.

Ah, this is at the database level. If you promote the read replica to a standalone cluster, then your destroy should go through.

But how do we promote the cluster to standalone?

resource "aws_rds_cluster" "replica" {
  cluster_identifier              = "db1-replica"
  database_name                   = "..."
  master_username                 = "..."
  master_password                 = "..."
  final_snapshot_identifier       = "..."
  skip_final_snapshot             = "true"
  backup_retention_period         = "7"
  preferred_backup_window         = "..."
  preferred_maintenance_window    = "..."
  port                            = "3306"
  vpc_security_group_ids          = ["..."]
  storage_encrypted               = "false"
  kms_key_id                      = ""
  apply_immediately               = "true"
  db_subnet_group_name            = "..."
  db_cluster_parameter_group_name = "..."
  replication_source_identifier   = "arn:aws:rds:..."
  engine                          = "aurora-mysql"
  engine_version                  = "5.7.mysql_aurora.2.04.5"
  source_region                   = "<cross region...>"

  lifecycle {
    prevent_destroy = false
  }
}

resource "aws_rds_cluster_instance" "replica" {
  count                        = "1"
  identifier                   = "db1-replica-0"
  cluster_identifier           = "${aws_rds_cluster.replica.id}"
  instance_class               = "db.t3.small"
  db_subnet_group_name         = "..."
  preferred_maintenance_window = "..."
  apply_immediately            = "true"
  db_parameter_group_name      = "..."
  auto_minor_version_upgrade   = "true"
  monitoring_interval          = "0"
  monitoring_role_arn          = ""
  engine                       = "aurora-mysql"
  engine_version               = "5.7.mysql_aurora.2.04.5"

  lifecycle {
    prevent_destroy = false
  }
}

This is the definition of our cluster. If I blank out replication_source_identifier = "", the apply passes, but it does nothing to the actual read replica; it stays as it is.

aws_rds_cluster.replica: Modifying... (ID: db1-replica) replication_source_identifier: "arn:aws:rds:..." => ""

@SaravanRaman - as @pbeaumontQc mentioned - is it possible to promote the cluster to standalone using terraform?

For anyone else coming across this, promotion can be done using aws(1): aws rds promote-read-replica-db-cluster --db-cluster-identifier <identifier>
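A destroy-time provisioner can wire that promotion into Terraform itself, so the replica is promoted just before Terraform tries to delete it. The following is a minimal, untested sketch in 0.11 syntax to match the configurations above: the null_resource name is hypothetical, it assumes the aws CLI and credentials are available wherever Terraform runs, and it piggybacks on the aws_rds_cluster.replica / aws_rds_cluster_instance.replica resources from the earlier comment. Because the triggers reference both the cluster and its instances, Terraform destroys this helper first, running the promote command before any deletion is attempted.

# Hypothetical helper: promote the replica cluster to standalone on
# `terraform destroy`, before the cluster and its instances are deleted.
resource "null_resource" "promote_replica_on_destroy" {
  triggers = {
    cluster_id   = "${aws_rds_cluster.replica.id}"
    instance_ids = "${join(",", aws_rds_cluster_instance.replica.*.id)}"
  }

  provisioner "local-exec" {
    when    = "destroy"
    command = "aws rds promote-read-replica-db-cluster --db-cluster-identifier ${self.triggers.cluster_id}"
  }

  # Promotion is asynchronous; wait for it to finish so the deletions
  # that follow do not hit InvalidDBClusterStateFault. The
  # `db-cluster-available` waiter needs a reasonably recent AWS CLI;
  # older versions can poll `aws rds describe-db-clusters` instead.
  provisioner "local-exec" {
    when    = "destroy"
    command = "aws rds wait db-cluster-available --db-cluster-identifier ${self.triggers.cluster_id}"
  }
}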

Pinging this issue so it stays alive; unfortunately, I landed on this while building an HA and disaster-recovery setup. Here is what we had to script for single-instance DB clusters:

aws rds promote-read-replica --db-instance-identifier mysql-xxxx-ro --region <region> (authenticate with --profile <profile> or an IAM instance profile)

It takes a few minutes, but everything stays up: you can keep working on the DB, and within a few minutes you can write to it with the same credentials.
CLI details: https://docs.aws.amazon.com/cli/latest/reference/rds/promote-read-replica.html
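The same destroy-time provisioner pattern sketched in the comment above can wrap this instance-level command for a Terraform-driven teardown. Again a hedged, untested sketch: aws_db_instance.replica is a hypothetical stand-in for your real replica resource, and aws rds wait db-instance-available (a long-standing CLI waiter) blocks until the promotion has completed.

# Hypothetical helper for single-instance replicas: promote, then wait
# until the instance reports available before any deletion proceeds.
resource "null_resource" "promote_db_replica_on_destroy" {
  triggers = {
    replica_id = "${aws_db_instance.replica.id}"
  }

  provisioner "local-exec" {
    when    = "destroy"
    command = "aws rds promote-read-replica --db-instance-identifier ${self.triggers.replica_id} && aws rds wait db-instance-available --db-instance-identifier ${self.triggers.replica_id}"
  }
}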
