Terraform: prevent_destroy does not prevent destruction when removing a resource from the configuration

Created on 16 Mar 2018  Â·  29Comments  Â·  Source: hashicorp/terraform

The documentation for prevent_destroy says

This flag provides extra protection against the destruction of a given resource. When this is set to true, any plan that includes a destroy of this resource will return an error message

However, removing or commenting out a resource seems to still destroy it with no warning, even though technically it was a plan that includes the destruction of the resource. What I expected was that the state file would remember that it was created with prevent_destroy, and prevent its destruction even if it was no longer part of the parsed configuration. It would be reasonable to expect that the user would be forced to remove the prevent_destroy flag, terraform apply, then remove it from the configuration if they were absolutely sure they wanted to go ahead.

Terraform Version

Terraform v0.11.3
+ provider.aws v1.11.0

Terraform Configuration Files

locals {
  dynamo_db_mutext_attribute = "LockID"  # The name must be exactly this
}

provider "aws" {
  profile = "test"
  region = "ap-southeast-2"
}

resource "aws_s3_bucket" "test_environment_terraform_backend_storage" {
  region = "ap-southeast-2"
  bucket_prefix = "SOME-BUCKET-NAME_PREFIX-"
  versioning {
    enabled = true
  }
  tags {
    Environment = "test"
  }
  # We explicitly prevent destruction using terraform. Remove this only if you really know what you're doing.
  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_dynamodb_table" "test_environment_terraform_backend_mutex" {
  "attribute" {
    name = "${local.dynamo_db_mutext_attribute}"
    type = "S"
  }
  hash_key = "${local.dynamo_db_mutext_attribute}"
  name = "test_environment_terraform_backend_mutex"
  read_capacity = 5
  write_capacity = 5
  # We explicitly prevent destruction using terraform. Remove this only if you really know what you're doing.
  lifecycle {
    prevent_destroy = true
  }
}


Expected Behavior


Resource is prevented from being destroyed because it was created with lifecycle { prevent_destroy = true }

Actual Behavior


Resource with prevent_destroy set to true is destroyed when commenting out or removing from configuration.

Steps to Reproduce

  • Enter the above Terraform configuration into a new module.
  • terraform apply and say yes.
  • Now, comment out the s3 bucket resource.
  • terraform apply again, and say yes
  • Additional Context

    References

    config documentation enhancement thinking

    Most helpful comment

    Storing last set prevent_destroy in state would be beneficial. Especially when generating tf plans programmatically.

    All 29 comments

    Hi @theflyingnerd! Thanks for pointing this out.

    Currently removing a resource from configuration is treated as intent to remove the resource, since that's an explicit action on your part rather than it happening as a part of some other action, such as resource replacement or a whole-configuration destroy.

    However, your idea of retaining this in the state and requiring multiple steps to actually destroy is an interesting one:

    1. Leave resource in config but remove prevent_destroy
    2. Run terraform apply to update the state to no longer have that flag set.
    3. Remove the resource from configuration entirely.
    4. Run terraform apply again to actually destroy it.

    I'd like to think about this some more because while this does seem _safer_ it also feels like this flow could be quite frustrating if e.g. you've already committed some reorganization to your version control that includes removing a resource and then Terraform refuses to apply it until you revert and do steps 1 and 2 from the above.

    In the mean time, we should at least update the documentation to be more explicit about how prevent_destroy behaves in this scenario.

    Great, thanks for your quick and thorough reply @apparentlymart . I agree that a documentation update would be sufficient for now and the proposed functionality could be left open for debate. My comments about this process were a mixture of how I might expect it to work if I didn't read any more documentation as well as a proposal. It's influenced by the idea of AWS instance termination protection. This protection must be explicitly removed first before the instance may be terminated, even if the instance is explicitly terminated by id.

    Any further thoughts on this?
    I used Terraform to setup build pipelines including AWS CodeCommit for storage of code.
    A mistake in terraform will wipe out the entire contents of the Git repo. I too want a way to protect resources in case they get removed from a script.

    Potential duplicate of #3468

    Strange that 17599 closes 3468 and not the opposite but that is still a very wanted feature.

    This was (is) a major risk factor of accidental deletion and was unacceptable for us using terraform in production.

    This is our workaround for this that runs in our Azure DevOps pipeline on every commit before application, I'll see if I can get the code (Powershell Pester Test) approved to be shared.

    1. Run Terraform plan and get list of resources to-be-destroyed
    2. Review the previous state (retrieved from previous commit) for any to-be-destroyed resources that had prevent_destroy set and if so, fail the test, and stop the pipeline.
    3. To fix, two commits are required, one to remove prevent_destroy (which doesn't trigger the destroy resource and thus doesn't trigger failing the test), and then one to remove the resource from the config.

    In addition to this all production changes have an approval step that formats the terraform apply state very clearly and has to be approved by all the team members.

    I currently don't intend to manage certain things with Terraform due to this. As part of deployment of new code (i.e. lambda, ECS containers, api gateway, etc...) my Terraform files are built dynamically, so although it seems like removing it from the .tf files is explicit approval, it could just as easily be a coding error.

    As such, managing any type of Dynamo table or RDS or something is out of the question, and those will just have to be managed manually. While I'm kind of ok with this, it certainly would be nice to just manage the entire application and still be able to sleep.

    Any update on this? It seems like quite an important feature...

    I would imagine all their energies are on getting 0.12 out the door right
    now.

    On Mon, Mar 4, 2019 at 7:37 PM simlu notifications@github.com wrote:

    Any update on this? It seems like quite an important feature...

    —
    You are receiving this because you commented.
    Reply to this email directly, view it on GitHub
    https://github.com/hashicorp/terraform/issues/17599#issuecomment-469523875,
    or mute the thread
    https://github.com/notifications/unsubscribe-auth/AOjVUmLqlNPxCZNSo6y4R-PqzZVILSSXks5vTeZcgaJpZM4StAww
    .

    +2

    omg. I'm looking at how to just prevent helm provider from destroying a chart every time I deploy. I came here to see this amazing prevent_destroy trait that I want to try out, but judging from this, it doesnt even work?

    You are willing to wipe out a whole database created with prevent_destroy flag on? What the heck?

    @CodeSwimBikeRun Your comment before the edit was more correct - it works, but _not_ if you remove the resource from your .tf file, because by doing so you're then also no longer setting prevent_destroy. This issue is about potentially storing that in the state so that it the instruction can potentially still remain around (I'm not sure if this is particularly what you were after or not).

    You can, however, set flags on some resources (such as AWS EC2 and RDS instances) to prevent deletion at the AWS API level, which will help you no matter what.

    Not sure the language is warranted.

    @CodeSwimBikeRun This issue is about a specific case, in which the configuration block for the resource _no longer exists_. In which case, Terraform is following instructions in the strict sense that the resource is no longer configured with prevent_destroy.

    Terraform interprets the configuration files as your _desired state_, and the state file (and/or refreshed in-memory state) as the current state. Destruction becomes a complex edge-case in this way because the absence of a resource implicitly means that you _desired_ it to be destroyed, but in practice this may not _always_ be true. There are other cases to watch out for which are similar, but are consequences of the basic design. For example, https://github.com/hashicorp/terraform/issues/13549.

    The feature works - in the general sense - and is worth using. But it's still pretty important to examine the plan before applying changes. Declarative configuration systems in general are very powerful, but also come with an inherent risk of accidentally automating failure (see: Automation: Enabling Failure at Scale for an example outside Terraform). I personally find that Terraform's plan feature is actually superior to some other declarative configuration tools I've used in that regard. It will tell you, _before changing anything at all_, that it wants to destroy a resource.

    we had part of our production deleted because of this :(

    Storing last set prevent_destroy in state would be beneficial. Especially when generating tf plans programmatically.

    seems like a no-brainer safety-critical feature ... prevent destroy should work

    Would be nice to have a feature added like Cloudformations Retain. If the resource is removed from the file, CF doesn't try to remove the resource, and just removes it from the state file.

    I believe the correct way to accomplish this would be to use

    terraform state rm <identifier>
    

    or in your case

    terraform state rm aws_s3_bucket.test_environment_terraform_backend_storage
    

    This tells Terraform to stop managing the S3 bucket, but will not destroy it. After removing it from the state, you can safely comment out the resource from the configuration file.

    That said, from an automation perspective you may not have the ability to manipulate the terraform.tfstate state file yourself. You would probably need your automation to run a script that performs the terraform state rm before terraform plan and terraform apply.

    @apparentlymart - in my opinion, @samjgalbraith's comment is not only interesting, it describes the only correct way to do it. Currently I'm not using Terraform to manage my data stores because of the lack of this feature (like some other folks on this thread).

    @eric-spitler i agree on the correct way to do things, but its the incorrect stuff that should not lead to catastrophic failure of infrastructure :)
    An easy way to declare my intent to never delete anything if not explicitly requested would be great.

    A thought on the inconvenience aspects mentioned by @apparentlymart way up top - perhaps there could be an override? Like a command-line flag that globally disabled the prevent_destroy behaviour, or took a list of resources to disable the behaviour for? Something like that could prevent accidental deletion but still allow explicitly intentional deletion without significant inconvenience.

    I just discovered this and am fairly horrified -- I thought this was the entire point of prevent_destroy. Of course I protect at the API level too but this is just not acceptable as a risk. I will have to keep KMS and RDS out of Terraform, which is a real shame.

    I do agree, I would also like some way to REALLY prevent accidental destruction of critical resources. @apparentlymart 's comment at the top, actually mimicks the behavior i expected when I first read about the prevent_destroy flag.
    I would realy like to have something like that.

    i'm having the same problem when using azure rm. my "azurerm_app_service_custom_hostname_binding" gets destroyed every time, which breaks the dns provider proxy and forces me to disable and reconfigure every time. i'll have to rely on a powershell script for now. would be great if this feature would be clarified or enforced. as it is, terraform complains that it cannot follow "the plan", but doesn't explain.

    Although it seems to be the intended behaviour when you remove a resource for it to also remove the prevent_destroy flag and delete protection, it is sometimes risky in the context of CI/CD pipelines. In some cases we keep a central RDS module and individual applications use it, so each application has its own main.tf file where it is called and it sometimes happens where different teams make changes to the code in their repository and run pipelines. We would like to protect the production databases to not be removed by accident even if someone removes the whole RDS resource from their application.

    @burizz You might want to consider keeping the Terraform config for the RDS resources in a separate repo with its own Terraform state, and having the applications' Terraform config use remote state as necessary to access its output.

    @eestolano Thanks, that is not a bad idea. I will keep it in mind in the future. Although at this stage it would require significant refactoring of pipelines and testing since DBs are already used in Production with Terraform that is part of the main application state file.

    I ended up resolving this by just protecting my resources with iam on aws. For me I wanted to prevent our user pool from being deleted.

    resource "aws_iam_policy" "prevent_user_pool_deletion" {
      name = "prevent_user_pool_deletion"
    
      policy = <<EOF
    {
      "Version": "2012-10-17",
      "Statement": [
        {
            "Effect": "Deny",
            "Action": [
              "cognito-idp:DeleteUserPool",
              "cognito-idp:DeleteUserPoolClient",
              "cognito-idp:DeleteUserPoolDomain"
            ],
            "Resource": [
                "arn:aws:cognito-idp:${var.region}:${local.account_number}:userpool/eu-west-3_xxxx"
            ]
        }
      ]
    }
    EOF
    }
    
    resource "aws_iam_group_policy_attachment" "attach" {
      group      = "admin"
      policy_arn = aws_iam_policy.prevent_user_pool_deletion.arn
    }
    

    With this in place I have less fear of accidentally deleting our user pool.

    Was this page helpful?
    0 / 5 - 0 ratings