$ terraform -v
Terraform v0.11.8
+ provider.aws v1.41.0
If I have a running RDS instance which is using a DB parameter group, and I want to modify the DB instance to use a different parameter group, and delete the old parameter group, I'll get a TF plan which looks like this:
~ module.my_module.aws_db_instance.application
parameter_group_name: "pg10-foo" => "pg10-bar"
- module.my_module.aws_db_parameter_group.foo
Expected behavior: Terraform modifies the RDS instance, then deletes the parameter group, which is now unused.
Actual behavior: Terraform tries to delete the parameter group first, which fails because the parameter group is still in use:
InvalidDBParameterGroupState: One or more database instances are still members of this parameter group pg10-foo, so the group cannot be deleted
If it had performed the modify action on the DB instance first, it would then be able to perform the destroy action on the now-unused parameter group.
To reproduce: run a plan that changes a DB instance's parameter group to a different parameter group and also deletes the now-unused parameter group.
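For readers who want to picture the configuration behind that plan, here is a minimal sketch in Terraform 0.11 syntax. The resource names mirror the plan output above; every argument value (engine, instance class, credentials, family) is a placeholder assumption and not taken from the reporter's module.

```hcl
# Hypothetical reproduction sketch; values are placeholders.
# The old group, aws_db_parameter_group.foo ("pg10-foo"), has been deleted
# from the configuration, so Terraform plans to destroy it.

resource "aws_db_parameter_group" "bar" {
  name   = "pg10-bar"
  family = "postgres10"
}

resource "aws_db_instance" "application" {
  allocated_storage   = 20
  engine              = "postgres"
  instance_class      = "db.t3.small"
  username            = "example"
  password            = "change-me" # placeholder only
  skip_final_snapshot = true
  apply_immediately   = true

  # Previously "${aws_db_parameter_group.foo.id}"
  parameter_group_name = "${aws_db_parameter_group.bar.id}"
}
```

With this shape, the plan contains one in-place update of the instance and one destroy of the old group, which is the combination that triggers the error.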
Hi @jrobison-sb 👋 Thanks for reporting this and sorry for the trouble.
Can you confirm a few pieces of information here?
Are you using a reference such as parameter_group_name = "${aws_db_parameter_group.foo.name}" in your aws_db_instance configuration?
Is apply_immediately set to true in the aws_db_instance configuration?
What is the state of the RDS instance after this apply? Is the parameter group changed? Does it say pending-reboot in the console for the parameter group?
Without seeing the debug log here it's not possible to tell definitively, but my initial hunch is that the operations are happening in order, but that the parameter group change still requires an instance reboot, something which the Update function of aws_db_instance does not currently do. Since the parameter group update isn't successfully completed by the instance, the parameter group is stuck.
If you would like to see the debug log yourself or provide it for reference in this issue via a Gist, documentation can be found here: https://www.terraform.io/docs/internals/debugging.html
Please let us know about the items above and hopefully we can dig into this further, thanks.
@bflad Thanks for your fast response. I'll answer each of your questions below.
Are you using a reference such as parameter_group_name = "${aws_db_parameter_group.foo.name}" in your aws_db_instance configuration?
I'm using aws_db_parameter_group.foo.id, but close enough, yes.
Is apply_immediately set to true in the aws_db_instance configuration?
Yes.
What is the state of the RDS instance after this apply? Is the parameter group changed? Does it say pending-reboot in the console for the parameter group?
No, the RDS instance is unchanged, and still using the old parameter group. Since the RDS instance wasn't changed, there is no pending-reboot needed.
As for a debug log, my module has hundreds (or more) of resources, and a full debug log is 128,000 lines, so I'm hesitant to publicly post that. If it's highly necessary, let me know, and maybe I can creatively grep or parse through it somehow.
I'm experiencing what may be the same problem. It looks like the modification simply does not happen in my case. After an apply, the parameter group used by my instance is still the original one: default.postgres9.5 (in-sync). There is no pending reboot on the instance either.
If you remove the deletion of your old group, does the modification actually happen at all?
I'm having the same issue when trying to upgrade from Postgres 10.6 to 11.1 on RDS. As with @liamg-form3 the parameter group remains the original one.
My code doesn't explicitly delete the parameter group; the only change was to upgrade the engine version and use a new postgres11 family for the parameter group. I'm using the terraform-aws-rds module.
This workaround has worked fine on several instances with the same issue:
postgres11 family
Not very elegant but quite easy to do, and some downtime is required anyway with RDS when upgrading Postgres.
I believe the key here is: "Reboot the DB instance" after calling ModifyDBInstance. We had done a similar fix during resource creation with snapshots (https://github.com/terraform-providers/terraform-provider-aws/pull/5672), but it looks like we also need to do something similar during update.
We got stuck trying to change the parameter group (PG) of our Postgres DB. The error msg:
Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group [our PG name], so the group cannot be deleted
Assuming we want to change from PG _A_ -> _B_, our workaround was as follows.
In main.tf, comment out the parameter group requirement: // parameter_group_name = aws_db_parameter_group.default.id
In the main.tf file, switch back the requirement on the PG (which is now _B_). Hope this helps someone in the future.
I also encountered this while trying to change the name of a db param group
Plan:
-/+ resource "aws_db_parameter_group" "scuba" {
      ~ arn         = "arn:aws:rds:xxxxxxxxxxxxxx" -> (known after apply)
        description = "Managed by Terraform"
        family      = "postgres11"
      ~ id          = "xxxxxx-postgres-11" -> (known after apply)
      ~ name        = "xxxxxx-postgres-11" -> "yyyyyy-postgres-11" # forces replacement
      + name_prefix = (known after apply)
      - tags        = {} -> null

        parameter {
            apply_method = "immediate"
            name         = "temp_file_limit"
            value        = "2147483647"
        }
        parameter {
            apply_method = "immediate"
            name         = "work_mem"
            value        = "65536"
        }
    }
aws_db_parameter_group.xxxxx: Still destroying... [id=xxxxx-postgres-11, 2m40s elapsed]
aws_db_parameter_group.xxxxx: Still destroying... [id=xxxxx-postgres-11, 2m50s elapsed]
Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group xxxxx-postgres-11, so the group cannot be deleted
status code: 400, request id: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx
Action:
I saw the rds instance was marked as ready-for-reboot, manually rebooted the instance.
Impact:
No change - same results as above.
Action:
Created a backup of the xxx database group. Tried to delete the xxxxx-postgres-11 database group.
Impact:
Failed to delete xxxxx-postgres-11: One or more database instances are still members of this parameter group xxxxx-postgres-11, so the group cannot be deleted (Service: AmazonRDS; Status Code: 400; Error Code: InvalidDBParameterGroupState; Request ID: xxxxx-xxxxx-xxxxx-xxxxx-xxxxx).
Action:
Impact:
Worked. Once the group was no longer actively assigned to a database, Terraform could rename the custom xxxxx-postgres-11 to yyyy-postgres-11 database group. TF then swapped the default.postgres.11 group for the yyyy-postgres-11 group by applying the change immediately.
Suggestion:
It looks like Terraform needs to assign a temporary or default group to the RDS instance before modifying the aws_db_parameter_group, and then restore the intended group once that completes.
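A related approach that stays inside Terraform is to let the replacement group be created before the old one is destroyed, so the instance can be moved to the new group in between. The following is a minimal sketch in Terraform 0.12 syntax, assuming a lifecycle block with create_before_destroy and a name_prefix in place of a fixed name. The instance resource name "example" and all of its argument values are hypothetical placeholders, and the thread does not confirm that this avoids the error in every case (a reboot may still be needed before the old group is released).

```hcl
resource "aws_db_parameter_group" "scuba" {
  # A prefix (assumed here) lets the old and new groups coexist during
  # replacement, because Terraform appends a unique suffix to each.
  name_prefix = "yyyyyy-postgres-11-"
  family      = "postgres11"

  parameter {
    apply_method = "immediate"
    name         = "work_mem"
    value        = "65536"
  }

  lifecycle {
    # Create the replacement group first, update dependents, then destroy
    # the old group last.
    create_before_destroy = true
  }
}

# Hypothetical instance; only parameter_group_name matters for this sketch.
resource "aws_db_instance" "example" {
  allocated_storage   = 20
  engine              = "postgres"
  instance_class      = "db.t3.small"
  username            = "example"
  password            = "change-me" # placeholder only
  skip_final_snapshot = true
  apply_immediately   = true

  # A reference (not a literal string) gives Terraform the dependency it
  # needs to order the instance update before the old group's removal.
  parameter_group_name = aws_db_parameter_group.scuba.name
}
```

The design choice here is to change the ordering of Terraform's own operations and avoid a manual detour through the default group in the console.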
The same issue, 2 years later. No RDS cleanup/destroy is possible:
Error: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group ambari-hdf-peterz, so the group cannot be deleted
status code: 400, request id: 64d52b07-e31d-4355-89a6-76755072a433
Error: error deleting RDS Cluster (ambari-hdf-peterz): DBClusterSnapshotAlreadyExistsFault: Cannot create the cluster snapshot because one with the identifier ambari-hdf-final-snapshot already exists.
status code: 400, request id: 58b59224-5a06-4586-b6dc-4d9ab62ead67
Error: Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group xxxxx, so the group cannot be deleted
status code: 400, request id: xxxxxx
[terragrunt] 2020/06/19 12:37:01 Hit multiple errors:
exit status 1
Hi guys, I agree with the previous comment. This causes problems in automation and pipelines; in my case, with Jenkins. Removing or modifying RDS is impossible in the jobs that run Terraform. Unfortunately, I cannot offer a solution to the problem in the form of code; I can only assume that this option will work:
Here is our code:
resource "aws_db_parameter_group" "pg" {
  name   = "paramg"
  family = "mysql5.7"

  parameter {
    name  = "log_bin_trust_function_creators"
    value = "1"
  }
}

resource "aws_db_instance" "db" {
  allocated_storage       = 30 # gigabytes
  backup_retention_period = 7  # in days
  engine                  = "mysql"
  engine_version          = "5.7"
  identifier              = "db"
  instance_class          = "db.t3.small"
  multi_az                = true
  name                    = "mydb"
  password                = "password"
  port                    = 5465
  storage_type            = "gp2"
  username                = "devops"
  vpc_security_group_ids  = ["${aws_security_group.DB-SG.id}"]
  parameter_group_name    = "paramg"
  skip_final_snapshot     = true
}
I understand that there are workarounds using AWS console, but agree that this is not a solution to this problem.
Reaction to destroy:
Error deleting DB parameter group: InvalidDBParameterGroupState: One or more database instances are still members of this parameter group paramg, so the group cannot be deleted
status code: 400, request id: xxxxxxx-xxxxxx-xxxxxxx-xxxxxxxx-xxxxxxxxxx
Then the pipeline will not move.
This problem also reproduces with command-line commands.
It doesn't matter whether apply_immediately is enabled or not.
If I missed something, please correct me.
Thank you. Regards.
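One possible contributor to the destroy failure in the configuration above: parameter_group_name is the literal string "paramg", so Terraform sees no dependency between the instance and the parameter group and may try to delete both at the same time. Below is a minimal sketch, assuming the same two resources, that references the group instead. The security group and several optional arguments are left out to keep it self-contained, and the values are placeholders.

```hcl
resource "aws_db_parameter_group" "pg" {
  name   = "paramg"
  family = "mysql5.7"

  parameter {
    name  = "log_bin_trust_function_creators"
    value = "1"
  }
}

resource "aws_db_instance" "db" {
  allocated_storage   = 30
  engine              = "mysql"
  engine_version      = "5.7"
  identifier          = "db"
  instance_class      = "db.t3.small"
  username            = "devops"
  password            = "password" # placeholder, as in the original
  skip_final_snapshot = true

  # Referencing the resource (instead of the string "paramg") creates an
  # implicit dependency, so on destroy the instance is removed before the
  # parameter group.
  parameter_group_name = "${aws_db_parameter_group.pg.name}"
}
```

This only addresses ordering during a full destroy. The modify-then-delete case reported at the top of the thread already uses a reference and still fails, so this is not a fix for that scenario.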
UPD
This problem can be worked around. In the case of Jenkins, we suppress the error from the first destroy and immediately launch a second one, like this:
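// The enclosing pipeline/agent/stages lines are assumed; the original snippet started mid-file.
pipeline {
    agent any
    stages {
        stage('Terraform Destroy') {
            steps {
                input 'Destroy Plan'
                // Keep the build result as SUCCESS even if the first destroy fails.
                catchError(buildResult: 'SUCCESS', stageResult: 'FAILURE') {
                    sh "${env.TERRAFORM_HOME}/terraform destroy -force -input=false"
                }
            }
        }
        stage('Terraform second Destroy') {
            steps {
                input 'Destroy Plan'
                // The second run removes whatever the first run could not delete.
                sh "${env.TERRAFORM_HOME}/terraform destroy -force -input=false"
            }
        }
    }
}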
I hope it will be useful to someone.
This is preventing me from upgrading SQL Server 2017 to 2019 on RDS.
Can't move the postgres engine version on RDS because of this :(