_This issue was originally opened by @brikis98 as hashicorp/terraform#11047. It was migrated here as a result of the provider split. The original body of the issue is below._
Terraform v0.8.2
This is part of a larger configuration, but I think the relevant parts are as follows.
Under `modules/webserver-cluster/main.tf`, I define a module with the following code:
resource "aws_autoscaling_group" "example" {
launch_configuration = "${aws_launch_configuration.example.id}"
availability_zones = ["${data.aws_availability_zones.all.names}"]
load_balancers = ["${aws_elb.example.name}"]
health_check_type = "ELB"
min_size = 2
max_size = 10
}
resource "aws_launch_configuration" "example" {
image_id = "ami-40d28157"
instance_type = "t2.micro"
security_groups = ["${aws_security_group.instance.id}"]
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "instance" {
name = "my-security-group"
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group_rule" "allow_http_inbound" {
type = "ingress"
security_group_id = "${aws_security_group.instance.id}"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
data "aws_availability_zones" "all" {}
resource "aws_elb" "example" {
name = "my-example-elb"
availability_zones = ["${data.aws_availability_zones.all.names}"]
security_groups = ["${aws_security_group.elb.id}"]
listener {
lb_port = 80
lb_protocol = "http"
instance_port = 80
instance_protocol = "http"
}
health_check {
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 3
interval = 30
target = "HTTP:80/"
}
}
resource "aws_security_group" "elb" {
name = "elb"
}
resource "aws_security_group_rule" "allow_http_inbound" {
type = "ingress"
security_group_id = "${aws_security_group.elb.id}"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
resource "aws_security_group_rule" "allow_all_outbound" {
type = "egress"
security_group_id = "${aws_security_group.elb.id}"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
output "elb_security_group_id" {
value = "${aws_security_group.elb.id}"
}
In a separate folder, I use this module in the usual way, but also add a custom security group rule:
module "webserver_cluster" {
source = "modules/webserver-cluster"
# ... pass various parameters ...
}
resource "aws_security_group_rule" "allow_testing_inbound" {
type = "ingress"
security_group_id = "${module.webserver_cluster.elb_security_group_id}"
from_port = 12345
to_port = 12345
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
I expect to be able to run `terraform apply` and `terraform destroy` without errors. `terraform apply` works fine. Occasionally, `terraform destroy` fails with the following error:

aws_security_group.elb: DependencyViolation: resource sg-344baa48 has a dependent object
It's an intermittent issue, so I can't be sure, but I don't think this error happened with Terraform 0.7.x.
I have run into this issue with Terraform 0.10.6.
+ module.infrastructure.aws_security_group.sg
id: <computed>
description: "Allow traffic to sg from client security groups"
egress.#: <computed>
ingress.#: "1"
ingress.522618655.cidr_blocks.#: "0"
ingress.522618655.from_port: "1234"
ingress.522618655.ipv6_cidr_blocks.#: "0"
ingress.522618655.protocol: "tcp"
ingress.522618655.security_groups.#: "1"
ingress.522618655.security_groups.980544208: "sg-175fa66a"
ingress.522618655.self: "false"
ingress.522618655.to_port: "1234"
name: "sg_ingress_ydqxa4"
owner_id: <computed>
vpc_id: "vpc-63741921"
The delete was retried multiple times before failing with:
* aws_security_group.sg: DependencyViolation: resource sg-234bb25e has a dependent object
status code: 400, request id: bd64a44d-3e84-4ac4-a2c9-4e392f7c88a3
Terraform v0.9.9
Same issue
Terraform v0.10.7
Same issue. Is the only workaround to delete the SG manually and then recreate it via TF?
Same issue. For me, I'm 99% sure it's because there's an EC2 instance that isn't being changed and is still using the security group. So right now it looks like I have to make this change manually.
Yes, that's indeed the case. I don't have access to the Web UI (managed by the client), so I had to resolve it manually. I created an empty security group, replaced the existing one with that empty SG and then reran the Terraform command. That worked fine.
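In Terraform terms, that stopgap looks roughly like the following sketch (the resource names are illustrative, and `var.vpc_id` is assumed to be defined elsewhere):

resource "aws_security_group" "placeholder" {
  # Deliberately empty SG; it exists only so instances can be moved off
  # the problematic group before destroying it.
  name   = "placeholder-empty"
  vpc_id = "${var.vpc_id}"
}

resource "aws_instance" "example" {
  # ... (other params omitted) ...

  # Temporarily reference only the empty SG so the old group has no
  # dependent objects left and can be deleted.
  vpc_security_group_ids = ["${aws_security_group.placeholder.id}"]
}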
Same issue for us. Has anyone tested whether v0.10.8 fixes this?
I'm on v0.10.8 and have experienced this.
Got it again...
* module.network.module.aws_vpc.aws_security_group.default_private (destroy): 1 error(s) occurred:
* aws_security_group.default_private: DependencyViolation: resource sg-cd40e2b6 has a dependent object
status code: 400, request id: 6da496ce-b444-4a5c-b85d-c4f2bbadf842
* module.network.module.aws_vpc.aws_subnet.public[0] (destroy): 1 error(s) occurred:
* aws_subnet.public.0: Error deleting subnet: timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 10m0s)
Hey all –
It sounds like, because this is intermittent, the times it's failing are when `aws_security_group_rule.allow_testing_inbound` is set to be destroyed _after_ the security group itself... I believe because the rule is dependent on the _output_ of the module and not the _group itself_. But I could be wrong.

As a workaround for that, version 1.2.0 of the AWS provider shipped with a new attribute on `aws_security_group` called `revoke_rules_on_delete`. Adding that to the security group in the module will likely work around this.
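Applied to the `elb` security group from the module above, that would look something like this sketch:

resource "aws_security_group" "elb" {
  name = "elb"

  # Ask AWS to revoke all of this group's rules at delete time, so
  # lingering rule references don't block the destroy.
  revoke_rules_on_delete = true
}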
There was other mention of an instance still using the group, can anyone provide a configuration that triggers this with instances?
Thanks!
@catsby I just tried setting `revoke_rules_on_delete` to `true` on my Security Group, but I still get the exact same `aws_security_group: DependencyViolation: resource sg-XXX has a dependent object` error on destroy.
The code doesn't seem to be doing anything very complicated. The simplified version is as follows. I have a module called `single-server`:
resource "aws_instance" "instance" {
ami = "${var.ami}"
instance_type = "${var.instance_type}"
vpc_security_group_ids = ["${aws_security_group.instance.id}"]
user_data = "${var.user_data}"
# ... (other params omitted) ...
}
resource "aws_security_group" "instance" {
name = "${var.name}"
description = "Security Group for ${var.name}"
vpc_id = "${var.vpc_id}"
# This workaround, unfortunately, did not help
revoke_rules_on_delete = true
}
resource "aws_security_group_rule" "allow_outbound_all" {
type = "egress"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
security_group_id = "${aws_security_group.instance.id}"
}
resource "aws_security_group_rule" "allow_inbound_ssh_from_cidr" {
count = "${signum(var.allow_ssh_from_cidr)}"
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["${var.allow_ssh_from_cidr_list}"]
security_group_id = "${aws_security_group.instance.id}"
}
resource "aws_security_group_rule" "allow_inbound_ssh_from_security_group" {
count = "${signum(var.allow_ssh_from_security_group)}"
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
source_security_group_id = "${var.allow_ssh_from_security_group_id}"
security_group_id = "${aws_security_group.instance.id}"
}
I'm using this module in some code that creates a server and two EBS volumes for it:
module "example" {
source = "../../modules/single-server"
name = "example"
instance_type = "t2.micro"
ami = "${var.ami}"
allow_ssh_from_cidr_list = ["0.0.0.0/0"]
vpc_id = "${data.aws_vpc.default.id}"
subnet_id = "${data.aws_subnet.selected.id}"
# Script that attaches and mounts the two EBS volumes
user_data = "${data.template_file.user_data.rendered}"
}
resource "aws_ebs_volume" "example_1" {
availability_zone = "${data.aws_subnet.selected.availability_zone}"
type = "gp2"
size = 5
}
resource "aws_ebs_volume" "example_2" {
availability_zone = "${data.aws_subnet.selected.availability_zone}"
type = "gp2"
size = 5
}
We have automated tests that run against this code and do the following:

1. Run `apply`.
2. Update the `name` parameter and re-run `apply` to force a redeploy.
3. Run `destroy`.

All of this works until `destroy`, where we get the `aws_security_group: DependencyViolation: resource sg-XXX has a dependent object` error. It's happening fairly consistently lately, even though nothing in the code anywhere (including the Terraform code, User Data script, or test code) touches the security group in any way, so I'm quite stumped as to what could possibly be triggering this problem.
I'm on 0.10.8 and am currently experiencing this issue. One tip for figuring out what the "dependent object" is: type the name of the SG into the search box on the EC2 Network Interfaces page (https://serverfault.com/a/866203/223606). It appears that an attached ENI is preventing Terraform from deleting my SG.
@brikis98 are you able to do as @elektron9 mentioned above and determine what the dependency is? The `revoke_rules_on_delete` parameter will only help here if the dependency is due to a security group rule that has caused a dependency loop with another security group. Perhaps yours is something else?
I'll have to check next time I'm working on this code, but I'm pretty sure it's not an ENI, as there are no ENIs being created in that code.
Update: we eventually determined that this issue was being caused by a security group that had an inbound rule for another security group. After we manually removed the inbound rule, Terraform was able to proceed with the destruction of the security group that was causing this issue.
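The problematic shape was roughly the following sketch (resource names here are illustrative, not our real config): an ingress rule on one group referencing another group.

resource "aws_security_group" "app" {
  name = "app"
}

resource "aws_security_group" "client" {
  name = "client"
}

# AWS refuses to delete a group that is referenced by a rule in another
# group, so until this rule is revoked, deleting "client" fails with
# the DependencyViolation error above.
resource "aws_security_group_rule" "app_from_client" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.app.id}"
  source_security_group_id = "${aws_security_group.client.id}"
}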
We did toggle the `revoke_rules_on_delete` setting to `true`, but the Terraform deploy of that change was blocked by this issue.
@elektron9 Can you confirm that `revoke_rules_on_delete` fixes the issue of being unable to delete a security group that had an inbound rule for another security group?
@CamelCaseNotation yes, our issue was resolved after setting `revoke_rules_on_delete` to `true`.
Spent several hours on various configurations trying to work around this. The `DependencyViolation ... has a dependent object` error occurs after the 5-minute timeout in every scenario. The bottom line is that the network interface does not get assigned to the new security group if a new SG resource must be created (e.g. on an SG name change). The new security group is created as desired (if `lifecycle { create_before_destroy = true }` is set) alongside the existing SG that is assigned to the ENI, but the ENI is never reassigned to the new SG.
While Terraform is waiting, I can go into the AWS console and do "Change Security Groups" on either the network interface or the EC2 instance itself, and Terraform will immediately continue its process and remove the old SG before completing.
I also tried several iterations using `aws_network_interface_sg_attachment` without a security group block on the `aws_instance`. This deploys fine, but it relies on the default VPC security group to initially launch the EC2 instance with, which is a security issue for us and leaves the problem of removing it after deployment. Anyway, the idea was to see if a more specific dependency on the ENI would cause Terraform to make the change on AWS (reassigning the ENI to the new SG). It did not work.
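The shape of that attempt was roughly the following sketch (the resource names are illustrative, and `var.ami` is assumed):

resource "aws_instance" "example" {
  ami           = "${var.ami}"
  instance_type = "t2.micro"

  # No vpc_security_group_ids here, so the instance initially launches
  # with the default security group.
}

# Attach the SG to the instance's primary ENI as a separate resource,
# hoping the explicit ENI dependency makes Terraform reassign it.
resource "aws_network_interface_sg_attachment" "example" {
  security_group_id    = "${aws_security_group.instance.id}"
  network_interface_id = "${aws_instance.example.primary_network_interface_id}"
}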
Is there no explicit way of triggering the "Change Security Groups" functionality? It seems this should be done under the hood by Terraform whenever it recognizes that the SG for an instance or ENI will change.
OK, I finally had some time to go back and dig into this, and I think I've figured out what's happening! The code looks roughly like this:
resource "aws_instance" "example" {
# ... (other params omitted) ...
vpc_security_group_ids = ["${aws_security_group.example.id}"]
tags {
Name = "${var.name}"
}
}
resource "aws_security_group" "example" {
name = "${var.name}"
revoke_rules_on_delete = true
# ... (other params omitted) ...
}
In our test code, we are updating `var.name` and running `terraform apply`. Changing the name of a security group means deleting the old one and replacing it with a new one... But Terraform can't do that because `aws_instance.example` still depends on it! That's why we are getting the `DependencyViolation: resource sg-XXX has a dependent object` error.
I think all we really need is a `create_before_destroy = true` on `aws_security_group.example`. I'll try that and report back.
OK, it looks like adding `create_before_destroy = true` and using `name_prefix` instead of `name` fixed this issue. I can't believe it took me this long to figure it out!
resource "aws_security_group" "example" {
name_prefix = "${var.name}"
# ... (other params omitted) ...
lifecycle {
create_before_destroy = true
}
}
Sweet! I tested `create_before_destroy = true` with `name_prefix` instead of `name`, and it fixed it for me! Thank you @brikis98.
@brikis98 @ura718 You can also use `revoke_rules_on_delete = true`.
I am experiencing the issue where security groups are not deleted, with errors referencing dependent objects, because the groups are attached to lingering ENIs.
The ENIs seem to be coming from an `aws_launch_template`/`aws_autoscaling_group` combo, and since I did not experience this behaviour when I was using `aws_launch_configuration`, I suspect that `aws_launch_template` is somehow the cause.
I have tried to solve the problem via `revoke_rules_on_delete`, `lifecycle`, and `name_prefix`, but none of them has any effect since the root cause is the lingering ENIs.
As of 0.11.7 it was fixed by `lifecycle { create_before_destroy = true }`.
@martinbokmankewill I've been running into the same issue recently as well. I noticed the lingering ENIs were almost always previously attached to an ELB. Are you still running into the issue?
I am not running into the issue anymore.
I traced it to not having set `delete_on_termination = true` in the `network_interfaces` block of the `aws_launch_template` resource I was using.
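In config form, that fix looks roughly like this sketch (the `name_prefix` value and `var.ami` are illustrative, and the `instance` security group is assumed from the earlier module):

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = "${var.ami}"
  instance_type = "t2.micro"

  network_interfaces {
    security_groups = ["${aws_security_group.instance.id}"]

    # Without this, ENIs created from the template can linger after
    # instance termination, keeping the SG "in use" and blocking deletion.
    delete_on_termination = true
  }
}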
Nothing I try works. You can try it in this repo. Just make sure you have DEBUG enabled. Does anyone know the solution for this repo?
None of the above works for me; I have to change the SG name in Terraform.
Encountered this today as well.

lifecycle {
  create_before_destroy = true
}

fixed it for me as well.