Terraform-provider-aws: Modify aws_db_instance and delete aws_db_parameter_group breaks

Created on 13 Nov 2018 · 11 Comments · Source: hashicorp/terraform-provider-aws

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

$ terraform -v
Terraform v0.11.8
+ provider.aws v1.41.0

Affected Resource(s)

aws_db_parameter_group
aws_db_instance

Terraform Configuration Files

If I have a running RDS instance which is using a DB parameter group, and I want to modify the DB instance to use a different parameter group, and delete the old parameter group, I'll get a TF plan which looks like this:

  ~ module.my_module.aws_db_instance.application
      parameter_group_name: "pg10-foo" => "pg10-bar"

  - module.my_module.aws_db_parameter_group.foo

Expected Behavior

Modify the RDS instance, then delete the parameter group which is now unused.

Actual Behavior

It tries to delete the parameter group first, which fails because the parameter group is still in use.

InvalidDBParameterGroupState: One or more database instances are still members of this parameter group pg10-foo, so the group cannot be deleted

If it had performed the modify action on the DB instance first, it would then have been able to destroy the now-unused parameter group.

Steps to Reproduce

Run a plan which plans to modify a DB instance to change the parameter group to some other parameter group, and which also plans to delete the now unused parameter group.
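A minimal configuration sketch of the situation described in the steps above. The resource and group names mirror the plan output in this issue; the instance arguments (identifier, instance class, storage, credentials) and the db_password variable are assumptions added only to make the fragment complete. The foo parameter group has been removed from the configuration, so Terraform plans to destroy it in the same run that repoints the instance at bar.

```hcl
# Hypothetical reproduction; only the parameter group names come from the issue.
variable "db_password" {}

# The "foo" group (pg10-foo) used to be declared here and has been deleted
# from the configuration, which produces the "- aws_db_parameter_group.foo" line.

resource "aws_db_parameter_group" "bar" {
  name   = "pg10-bar"
  family = "postgres10"
}

resource "aws_db_instance" "application" {
  identifier        = "application"   # assumed
  engine            = "postgres"
  instance_class    = "db.t3.small"   # assumed
  allocated_storage = 20              # assumed
  username          = "example"       # assumed
  password          = "${var.db_password}"

  # Previously pointed at the foo group; changing it yields
  # parameter_group_name: "pg10-foo" => "pg10-bar" in the plan.
  parameter_group_name = "${aws_db_parameter_group.bar.name}"
  apply_immediately    = true
  skip_final_snapshot  = true
}
```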

bug service/rds

Source

jrobison-sb

👍36

Most helpful comment

Hi guys, I agree with the previous comment. This causes problems for automation and pipelines, in my case with Jenkins. Removing or modifying RDS is impossible in jobs that run Terraform. Unfortunately, I cannot offer a solution in the form of code; I can only assume that this sequence will work:

Roll back to the default RDS parameter group
Delete the parameter group
Remove the RDS instance

Here is our code:

resource "aws_db_parameter_group" "pg" {
  name   = "paramg"
  family = "mysql5.7"

  parameter {
    name  = "log_bin_trust_function_creators"
    value = "1"
  }
}

resource "aws_db_instance" "db" {
  allocated_storage       = 30 # gigabytes
  backup_retention_period = 7  # in days
  engine                  = "mysql"
  engine_version          = "5.7"
  identifier              = "db"
  instance_class          = "db.t3.small"
  multi_az                = true
  name                    = "mydb"
  password                = "password"
  port                    = 5465
  storage_type            = "gp2"
  username                = "devops"
  vpc_security_group_ids  = ["${aws_security_group.DB-SG.id}"]
  parameter_group_name    = "paramg"
  skip_final_snapshot     = true
}
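One thing worth noting about the configuration above: parameter_group_name is the literal string "paramg", so Terraform sees no dependency between the instance and the parameter group and may try to destroy both at the same time. The following is a sketch, not verified against this setup, of the same resources with an explicit reference. With a reference in place, Terraform destroys the instance (the dependent) before the group. The security group is left out here because it is not shown in the original snippet.

```hcl
resource "aws_db_parameter_group" "pg" {
  name   = "paramg"
  family = "mysql5.7"

  parameter {
    name  = "log_bin_trust_function_creators"
    value = "1"
  }
}

resource "aws_db_instance" "db" {
  allocated_storage       = 30
  backup_retention_period = 7
  engine                  = "mysql"
  engine_version          = "5.7"
  identifier              = "db"
  instance_class          = "db.t3.small"
  multi_az                = true
  name                    = "mydb"
  password                = "password"
  port                    = 5465
  storage_type            = "gp2"
  username                = "devops"
  skip_final_snapshot     = true

  # Reference instead of the literal "paramg": this creates the dependency
  # edge that orders the instance destroy ahead of the group destroy.
  parameter_group_name = "${aws_db_parameter_group.pg.name}"
}
```

This only addresses the full destroy case described in this comment. It does not by itself change the modify-then-delete ordering reported at the top of the issue.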

I understand that there are workarounds using the AWS console, but I agree that this is not a solution to the problem.

Result of terraform destroy:

Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group paramg, so the group cannot be deleted
status code: 400, request id: xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxx-xxxxxxxxxx

After that, the pipeline does not proceed.
The problem also reproduces when running the commands directly from the command line.
It happens whether or not apply_immediately is enabled.
If I missed something, please correct me.
Thank you. Regards.

UPD
This problem can be worked around. In Jenkins, we suppress the failure of the first destroy and immediately run a second one, like this:

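pipeline {
  agent any // assumption: the original snippet omitted the enclosing pipeline, agent and stages openers
  stages {
    stage('Terraform Destroy') {
      steps {
        input 'Destroy Plan'
        // Mark this stage as failed but keep the build going so the second destroy can run.
        catchError(buildResult: 'SUCCESS', stageResult: 'FAILURE') {
          sh "${env.TERRAFORM_HOME}/terraform destroy -force -input=false"
        }
      }
    }
    stage('Terraform second Destroy') {
      steps {
        input 'Destroy Plan'
        sh "${env.TERRAFORM_HOME}/terraform destroy -force -input=false"
      }
    }
  }
}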

I hope it will be useful to someone.

ku9n on 27 Jun 2020

👍2

All 11 comments

Hi @jrobison-sb 👋 Thanks for reporting this and sorry for the trouble.

Can you confirm a few pieces of information here?

Are you using a reference such as parameter_group_name = "${aws_db_parameter_group.foo.name}" in your aws_db_instance configuration?
Is apply_immediately set to true in the aws_db_instance configuration?
What is the state of the RDS instance after this apply? Is the parameter group changed? Does it say pending-reboot in the console for the parameter group?

Without seeing the debug log it's not possible to tell definitively, but my initial hunch is that the operations are happening in order and that the parameter group change still requires an instance reboot, something the Update function of aws_db_instance does not currently do. Since the instance has not finished applying the parameter group change, the old parameter group is stuck.

If you would like to see the debug log yourself or provide it for reference in this issue via a Gist, documentation can be found here: https://www.terraform.io/docs/internals/debugging.html

Please let us know about the items above and hopefully we can dig into this further, thanks.

bflad on 13 Nov 2018

@bflad Thanks for your fast response. I'll answer each of your questions below.

Are you using a reference such as parameter_group_name = "${aws_db_parameter_group.foo.name}" in your aws_db_instance configuration?

I'm using aws_db_parameter_group.foo.id, but close enough, yes.

Is apply_immediately set to true in the aws_db_instance configuration?

Yes.

What is the state of the RDS instance after this apply? Is the parameter group changed? Does it say pending-reboot in the console for the parameter group?

No, the RDS instance is unchanged, and still using the old parameter group. Since the RDS instance wasn't changed, there is no pending-reboot needed.

As for a debug log, my module has hundreds (or more) of resources, and a full debug log is 128,000 lines, so I'm hesitant to publicly post that. If it's highly necessary, let me know, and maybe I can creatively grep or parse through it somehow.

jrobison-sb on 13 Nov 2018

I'm experiencing what may be the same problem. It looks like the modification simply does not happen in my case. After an apply, the parameter group used by my instance is still the original one: default.postgres9.5 (in-sync). There is no pending reboot on the instance either.

If you remove the deletion of your old group, does the modification actually happen at all?

liamg-form3 on 14 Dec 2018

👍1

I'm having the same issue when trying to upgrade from Postgres 10.6 to 11.1 on RDS. As with @liamg-form3 the parameter group remains the original one.

My code doesn't explicitly delete the parameter group, the only change was to upgrade the engine version and use a new postgres11 family for the parameter group. I'm using the terraform-aws-rds module.

This workaround has worked fine on several instances with the same issue:

Run Terraform, get the error above about parameter group
Reboot the DB instance. After the reboot, it will be on the new version and using the new parameter group from the postgres11 family
Run Terraform again to ensure the old parameter group is deleted

Not very elegant but quite easy to do, and some downtime is required anyway with RDS when upgrading Postgres.

rdonkin on 7 May 2019

👍1

I believe the key here is:

Reboot the DB instance

After calling ModifyDBInstance. We had done a similar fix during resource creation with snapshots (https://github.com/terraform-providers/terraform-provider-aws/pull/5672), but it looks like we also need to do something similar during update.

bflad on 13 Nov 2019

We got stuck trying to change the parameter group (PG) of our Postgres DB. The error msg:
Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group [our PG name], so the group cannot be deleted

Assuming we want to change from PG _A_ -> _B_, our workaround was:

Via the AWS UI, manually create a copy of PG _A_ called _C_.
Comment out the RDS dependency on the PG in main.tf:

// parameter_group_name = aws_db_parameter_group.default.id

Via the AWS UI, change the RDS instance to use PG _C_.
Run terraform apply. It can now successfully replace _A_ -> _B_, since _A_ is no longer used by RDS.
In main.tf, restore the dependency on the PG (which is now _B_).
Run terraform apply again. RDS now picks up the PG, and the Terraform state is in sync with the real state.
Manually remove PG _C_.

Hope this helps someone in the future.

ePoromaa on 8 Jan 2020

I also encountered this while trying to change the name of a DB parameter group.

Plan:

-/+ resource "aws_db_parameter_group" "scuba" {
      ~ arn         = "arn:aws:rds:xxxxxxxxxxxxxx" -> (known after apply)
        description = "Managed by Terraform"
        family      = "postgres11"
      ~ id          = "xxxxxx-postgres-11" -> (known after apply)
      ~ name        = "xxxxxx-postgres-11" -> "yyyyyy-postgres-11" # forces replacement
      + name_prefix = (known after apply)
      - tags        = {} -> null

        parameter {
            apply_method = "immediate"
            name         = "temp_file_limit"
            value        = "2147483647"
        }
        parameter {
            apply_method = "immediate"
            name         = "work_mem"
            value        = "65536"
        }
    }

aws_db_parameter_group.xxxxx: Still destroying... [id=xxxxx-postgres-11, 2m40s elapsed]
aws_db_parameter_group.xxxxx: Still destroying... [id=xxxxx-postgres-11, 2m50s elapsed]

Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group xxxxx-postgres-11, so the group cannot be deleted
    status code: 400, request id: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx

Action:
I saw that the RDS instance was marked as ready-for-reboot, so I manually rebooted it.
Impact:
No change - same results as above.

Action:
Created a backup of the xxx database group. Tried to delete the xxxxx-postgres-11 database group.
Impact:
Failed to delete xxxxx-postgres-11: One or more database instances are still members of this parameter group xxxxx-postgres-11, so the group cannot be deleted (Service: AmazonRDS; Status Code: 400; Error Code: InvalidDBParameterGroupState; Request ID: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx).

Action:

Logged into the AWS Console
Modified the actual RDS instance back to the default.postgres11 group provided by AWS
Re-ran terraform apply

Impact: Worked. Once the group was no longer actively assigned to a database, Terraform could rename the custom xxxxx-postgres-11 group to yyyy-postgres-11. Terraform then swapped the default.postgres11 group for the yyyy-postgres-11 group by applying the change immediately.

Suggestion:
It looks like Terraform needs to assign a temporary or default group to the RDS instance before modifying the aws_db_parameter_group and, upon completion, restore the intended group.
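For the rename case above, a configuration-level alternative to the temporary group is the create_before_destroy lifecycle setting, which the AWS provider documentation for aws_db_parameter_group describes for replacing an in-use group. The sketch below is an illustration and was not tested in this thread. The group name, family and parameters are taken from the plan output above; the aws_db_instance resource, its arguments and the db_password variable are hypothetical. With this setting Terraform creates the renamed group first, updates the instance to reference it, and only then deletes the old group. Whether that last delete succeeds still depends on the instance having actually switched groups, which per the earlier comments may require a reboot.

```hcl
variable "db_password" {}

resource "aws_db_parameter_group" "scuba" {
  name   = "yyyyyy-postgres-11" # the new name from the plan above
  family = "postgres11"

  parameter {
    apply_method = "immediate"
    name         = "temp_file_limit"
    value        = "2147483647"
  }

  parameter {
    apply_method = "immediate"
    name         = "work_mem"
    value        = "65536"
  }

  # Create the replacement group before destroying the old one, so the
  # instance can be moved off the old group before it is deleted.
  lifecycle {
    create_before_destroy = true
  }
}

# Hypothetical instance; only parameter_group_name matters for the example.
resource "aws_db_instance" "example" {
  identifier           = "example"
  engine               = "postgres"
  engine_version       = "11"
  instance_class       = "db.t3.small"
  allocated_storage    = 20
  username             = "example"
  password             = var.db_password
  parameter_group_name = aws_db_parameter_group.scuba.name
  apply_immediately    = true
  skip_final_snapshot  = true
}
```

If later changes force replacement without changing the name, name_prefix can be used instead of name so that the old and new groups do not collide.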

rightisleft on 28 Apr 2020

👍2

The same issue, two years later. No RDS cleanup/destroy is possible:

Error: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group ambari-hdf-peterz, so the group cannot be deleted
status code: 400, request id: 64d52b07-e31d-4355-89a6-76755072a433

Error: error deleting RDS Cluster (ambari-hdf-peterz): DBClusterSnapshotAlreadyExistsFault: Cannot create the cluster snapshot because one with the identifier ambari-hdf-final-snapshot already exists.
status code: 400, request id: 58b59224-5a06-4586-b6dc-4d9ab62ead67

Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group xxxxx, so the group cannot be deleted
status code: 400, request id: xxxxxx

[terragrunt] 2020/06/19 12:37:01 Hit multiple errors:
exit status 1

pzi123 on 19 Jun 2020

👍1
