Terraform v0.12.3
resource "aws_rds_cluster" "main_postgresql" {
cluster_identifier = "aurora-cluster-main"
deletion_protection = false
availability_zones = ['us-east-1a', 'us-east-1b', 'us-east-1c']
database_name = "pcs"
skip_final_snapshot = true
backup_retention_period = 5
preferred_backup_window = "03:00-05:00"
preferred_maintenance_window = "Mon:05:00-Mon:06:00"
vpc_security_group_ids = [aws_security_group.main_postgresql.id]
storage_encrypted = true
# Careful, the below property need to be in sync between the cluster and the instances
engine = "aurora-postgresql"
engine_version = "10.7"
db_subnet_group_name = aws_db_subnet_group.main.name
master_username = "admin"
master_password = "duMMY$123"
apply_immediately = true
}
resource "aws_rds_cluster_instance" "main_postgresql_instances" {
count = 2
identifier_prefix = "aurora-cluster-main-instance-"
cluster_identifier = aws_rds_cluster.main_postgresql.id
publicly_accessible = true
instance_class = var.db_instance_type_per_env[terraform.workspace]
auto_minor_version_upgrade = false
# Careful, the below property need to be in sync between the cluster and the instances
engine = local.engine
engine_version = local.engine_version
db_subnet_group_name = aws_db_subnet_group.main.name
apply_immediately = true
}
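For reference, the local.engine values referenced by the instance resource are not shown in the thread; a minimal sketch of how they might be defined so the cluster and instances stay in sync (the values here are assumptions mirroring the cluster's literals):

locals {
  # Single source of truth for the engine settings that must match
  # between the cluster and its instances (values assumed for illustration).
  engine         = "aurora-postgresql"
  engine_version = "10.7"
}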
https://gist.github.com/gbataille/9c7b6084614b1b6c022342c48dbb80f7
Expected behavior: the DB cluster and DB instances are upgraded in place, just as if you did it through the AWS console. From the AWS console, the cluster and the instances are put into an upgrading status, a dump is taken, pg_upgrade is run live, the instances are rebooted (~10s), and everything is back up.
Actual behavior: the instances are destroyed and new ones with the new minor version are re-created
--> it takes way longer
--> the downtime is way longer.
Luckily, since it's Aurora and the data layer is separate from the engine, no data was lost.
Steps to reproduce:
1. terraform apply with an RDS Aurora cluster specifying postgresql 10.6
2. terraform apply with an RDS Aurora cluster specifying postgresql 10.7

Hmm, looks like that uses the same API as aws_db_instance but has different settings on that parameter :thinking:
$ git diff aws/resource_aws_rds_cluster_instance.go
diff --git a/aws/resource_aws_rds_cluster_instance.go b/aws/resource_aws_rds_cluster_instance.go
index 02e8e94a9..1ca54e226 100644
--- a/aws/resource_aws_rds_cluster_instance.go
+++ b/aws/resource_aws_rds_cluster_instance.go
@@ -99,10 +99,10 @@ func resourceAwsRDSClusterInstance() *schema.Resource {
},
"engine_version": {
- Type: schema.TypeString,
- Optional: true,
- ForceNew: true,
- Computed: true,
+ Type: schema.TypeString,
+ Optional: true,
+ Computed: true,
+ DiffSuppressFunc: suppressAwsDbEngineVersionDiffs,
},
"db_parameter_group_name": {
The same happens for me while upgrading the minor version of an Aurora MySQL database.
See below for "# forces replacement".
The only workaround is to manually update the version via the AWS console and afterwards update/align the Terraform source files -> very fragile!
# aws_rds_cluster_instance.customerscoring_unittest_rds_cluster_instance must be replaced
-/+ resource "aws_rds_cluster_instance" "customerscoring_unittest_rds_cluster_instance" {
apply_immediately = true
~ arn = "arn:aws:rds:eu-central-1:XXXXXXXXXXX:db:customerscoringunittest" -> (known after apply)
auto_minor_version_upgrade = true
~ availability_zone = "eu-central-1b" -> (known after apply)
cluster_identifier = "customerscoringunittest-cluster"
copy_tags_to_snapshot = true
db_parameter_group_name = "customerscoringqa-aurora-mysql57"
db_subnet_group_name = "privat"
~ dbi_resource_id = "db-H54JTW27MWJTJTPUJVNLTXEH7I" -> (known after apply)
~ endpoint = "customerscoringunittest.co4pdundcaoq.eu-central-1.rds.amazonaws.com" -> (known after apply)
engine = "aurora-mysql"
~ engine_version = "5.7.mysql_aurora.2.04.6" -> "5.7.mysql_aurora.2.05.0" # forces replacement
~ id = "customerscoringunittest" -> (known after apply)
identifier = "customerscoringunittest"
+ identifier_prefix = (known after apply)
instance_class = "db.t3.medium"
~ kms_key_id = "arn:aws:kms:eu-central-1:XXXXXXXXXX:key/0000000-1111-acbb-80e5-1fb4254b6666" -> (known after apply)
monitoring_interval = 0
+ monitoring_role_arn = (known after apply)
~ performance_insights_enabled = false -> (known after apply)
+ performance_insights_kms_key_id = (known after apply)
~ port = 3306 -> (known after apply)
~ preferred_backup_window = "21:43-22:43" -> (known after apply)
~ preferred_maintenance_window = "mon:02:32-mon:03:02" -> (known after apply)
promotion_tier = 1
publicly_accessible = false
~ storage_encrypted = true -> (known after apply)
~ writer = true -> (known after apply)
}
Try ignoring changes to the engine version on the aws_rds_cluster_instance resource.
resource "aws_rds_cluster" "main" {
apply_immediately = true
cluster_identifier = "my-cluster"
engine = "aurora-postgresql"
engine_version = "10.7"
# other attributes omitted
}
resource "aws_rds_cluster_instance" "cluster_instance" {
apply_immediately = true
identifier_prefix = "my-instance"
cluster_identifier = aws_rds_cluster.main.id
engine = aws_rds_cluster.main.engine
engine_version = aws_rds_cluster.main.engine_version
# other attributes omitted
lifecycle {
create_before_destroy = true
ignore_changes = [engine_version]
}
}
My simple experimentation shows that when you terraform apply an engine version change on the cluster resource, AWS upgrades the cluster instances at the same time, thus negating the need to update the cluster instances with Terraform.
Note that I've also marked the cluster instances as create_before_destroy, so that if Terraform does insist on replacing the instance, it will spin up a replacement instance first and this will minimize downtime.
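One note on the design choice above: the sketch uses identifier_prefix rather than a fixed identifier. With create_before_destroy, the replacement instance briefly exists alongside the old one, and a fixed instance identifier would collide; a prefix lets Terraform generate a unique name for each replacement.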
AWS Provider: 2.38
Terraform: 0.12.13
We have the same issue with Aurora, but once the instances are destroyed they cannot be recreated:
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Destroying... [id=abc-dev-0]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Destroying... [id=abc-dev-1]
module.abc-eks-customer-quality.aws_launch_configuration.workers[1]: Creating...
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifying... [id=abc-dev]
module.abc-eks-customer-quality.aws_launch_configuration.workers[0]: Creating...
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 30s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 40s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 50s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m0s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Still modifying... [id=abc-dev, 1m10s elapsed]
module.abc-aurora-dev.aws_rds_cluster.cluster_with_encryption_provisioned[0]: Modifications complete after 1m14s [id=abc-dev]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[0]: Still destroying... [id=abc-dev-0, 1m20s elapsed]
module.abc-aurora-dev.aws_rds_cluster_instance.cluster_instances[1]: Still destroying... [id=abc-dev-1, 1m20s elapsed]
Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
status code: 400, request id: 6b737240-b302-4ac6-b632-ae9a1e632960
on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
335: resource "aws_rds_cluster_instance" "cluster_instances" {
Error: error creating RDS DB Instance: InvalidParameterCombination: The engine version that you requested for your DB instance (5.7.mysql_aurora.2.05.0) does not match the engine version of your DB cluster (5.7.mysql_aurora.2.04.6).
status code: 400, request id: 4c6ee853-ad2f-4706-ade7-d2a8968d4f98
on .terraform/modules/abc-aurora-dev/main.tf line 335, in resource "aws_rds_cluster_instance" "cluster_instances":
335: resource "aws_rds_cluster_instance" "cluster_instances" {
As @jcarlson already wrote, the solution is to set the engine_version on the cluster only and leave the engine_version off the cluster_instance, since it is optional. When doing so, Terraform does an in-place upgrade of the cluster, and AWS RDS upgrades the cluster instances itself. Terraform then sees no difference on the cluster_instance and does nothing.
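A minimal sketch of that arrangement (resource names and values are illustrative, not from the thread):

resource "aws_rds_cluster" "example" {
  cluster_identifier = "example-cluster"
  engine             = "aurora-mysql"
  engine_version     = "5.7.mysql_aurora.2.05.0" # bump the version only here
  # other attributes omitted
}

resource "aws_rds_cluster_instance" "example" {
  identifier         = "example-instance"
  cluster_identifier = aws_rds_cluster.example.id
  instance_class     = "db.t3.medium"
  engine             = aws_rds_cluster.example.engine
  # engine_version deliberately omitted: it is optional, and RDS
  # upgrades the instances together with the cluster.
}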
@pioneer2k have you tried that with global_clusters? I tried, but it seems like on rds_global_cluster we need to specify the same version as the rds_clusters.
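For the global case, one way to keep the versions aligned is to declare the version once on the global cluster and reference it from the member clusters. A hedged sketch (names and values are assumptions; whether this avoids the forced replacement is untested in this thread):

resource "aws_rds_global_cluster" "example" {
  global_cluster_identifier = "example-global"
  engine                    = "aurora-mysql"
  engine_version            = "5.7.mysql_aurora.2.07.1"
}

resource "aws_rds_cluster" "primary" {
  cluster_identifier        = "example-primary"
  global_cluster_identifier = aws_rds_global_cluster.example.id
  # Reference the global cluster so the versions cannot drift apart.
  engine         = aws_rds_global_cluster.example.engine
  engine_version = aws_rds_global_cluster.example.engine_version
  # other attributes omitted
}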
@nywilken this issue is related to service/rds, not to service/dynamodb.
Hello guys, I have created a template:

resource "aws_rds_cluster" "default" {
  cluster_identifier              = var.name
  engine                          = "aurora-mysql"
  engine_mode                     = "serverless"
  engine_version                  = "5.7.mysql_aurora.2.07.1"
  availability_zones              = ["us-east-2a", "us-east-2b"]
  master_username                 = var.database_username
  master_password                 = var.database_password
  vpc_security_group_ids          = [aws_security_group.rds.id]
  backup_retention_period         = 7
  preferred_backup_window         = "07:00-09:00"
  db_subnet_group_name            = aws_db_subnet_group.rds.id
  db_cluster_parameter_group_name = aws_rds_cluster_parameter_group.aurora_db_57.id
  final_snapshot_identifier       = "${var.name}-final"
  skip_final_snapshot             = false
  deletion_protection             = true
  apply_immediately               = true

  scaling_configuration {
    min_capacity             = var.min_capacity
    auto_pause               = true
    max_capacity             = var.max_capacity
    seconds_until_auto_pause = 300
    timeout_action           = "ForceApplyCapacityChange"
  }
}

If I change anything in this template, it will delete the RDS cluster and create it again. Is there a way to only modify the RDS instead of deleting it?
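One pattern that often helps here (an assumption about this particular config, not a confirmed diagnosis): attributes like availability_zones can come back from AWS differing from what was declared and then show up as "# forces replacement" on the next plan. You can tell Terraform to leave such attributes alone, and guard against accidental deletion, with a lifecycle block:

resource "aws_rds_cluster" "default" {
  # ... attributes as above ...

  lifecycle {
    # Make Terraform error out instead of ever destroying the cluster.
    prevent_destroy = true
    # availability_zones is a common source of spurious forced
    # replacements on aws_rds_cluster. Check the "# forces replacement"
    # marker in your own plan output to confirm which attribute is
    # actually responsible before ignoring it.
    ignore_changes = [availability_zones]
  }
}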