Terraform: removing security_groups from aws_elb results in "no changes"

Created on 24 Feb 2016 · 11 Comments · Source: hashicorp/terraform

Terraform 0.6.11 on OS X 10.11.3

If you remove the security_groups field from an aws_elb resource, Terraform will not recognize there is any change to the infrastructure.

Example infrastructure:

provider "aws" {
    region = "us-east-1"
}

resource "aws_vpc" "default" {
    cidr_block = "192.168.0.0/16"
}

resource "aws_internet_gateway" "sandbox" {
    vpc_id = "${aws_vpc.default.id}"
}

resource "aws_subnet" "default" {
    vpc_id            = "${aws_vpc.default.id}"
    cidr_block        = "192.168.0.0/20"
    availability_zone = "us-east-1a"
}

resource "aws_security_group" "blue" {
  vpc_id      = "${aws_vpc.default.id}"
}

resource "aws_elb" "deafult" {
    subnets = ["${aws_subnet.default.id}"]
    #security_groups = ["${aws_security_group.blue.id}"]

    listener {
        instance_port = 12345
        instance_protocol = "tcp"
        lb_port = 12345
        lb_protocol = "tcp"
    }
}

To reproduce:

  1. terraform apply
  2. perl -i -pe 's/security_groups/#security_groups/' main.tf
  3. terraform plan

No changes are detected as you can see in this gist of the output.
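
Note that the perl command in step 2 only has an effect if the security_groups line is still active when step 1 runs. A minimal sketch of how the aws_elb block would presumably look before step 2 (an assumption inferred from the repro steps, since the listing above shows the line already commented out):

```hcl
resource "aws_elb" "deafult" {
    subnets = ["${aws_subnet.default.id}"]
    # Assumed state before step 2: the security group is still attached.
    security_groups = ["${aws_security_group.blue.id}"]

    listener {
        instance_port = 12345
        instance_protocol = "tcp"
        lb_port = 12345
        lb_protocol = "tcp"
    }
}
```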

This leaves the security group dangling on that resource in AWS. Trying to remove the security group will result in a timeout (since it's still in use).

bug core v0.11

All 11 comments

Hi @brainsik,

I was able to reproduce this easily enough. I was also able to force the expected behaviour, though:

provider "aws" {
    region = "us-east-1"
}

resource "aws_vpc" "default" {
    cidr_block = "192.168.0.0/16"
}

resource "aws_internet_gateway" "sandbox" {
    vpc_id = "${aws_vpc.default.id}"
}

resource "aws_subnet" "default" {
    vpc_id            = "${aws_vpc.default.id}"
    cidr_block        = "192.168.0.0/20"
    availability_zone = "us-east-1b"
}

resource "aws_security_group" "blue" {
  vpc_id      = "${aws_vpc.default.id}"
}

resource "aws_elb" "deafult" {
    subnets = ["${aws_subnet.default.id}"]
    security_groups = []

    listener {
        instance_port = 12345
        instance_protocol = "tcp"
        lb_port = 12345
        lb_protocol = "tcp"
    }
}

So, instead of completely removing the security groups parameter on the ELB, I made it an empty array.

When you removed security_groups, an internal function, d.HasChange(""), didn't fire, as security_groups wasn't in the state at that point.

The workaround above should unblock you for now. I will mark this as a bug, and we can continue looking into it.

Paul

I guess this is related to a problem that I've detected.

If you have an ELB without the security_groups parameter defined (so the default security group is used for that ELB) and you add a security group using the console, Terraform won't detect any change when you run terraform plan again.

Wow! @brainsik this is an incredible repro. Thank you so much. I got it reproduced here locally, going to take a look at this soon to figure out what the correct path forward is.

Okay, so this is not currently possible to fix without some major changes.

The issue is that both fields are "computed", which means that even in the absence of a value in the config, some value will be created and put in the state. When you remove it from the config, it's still in the state, and when a computed value already exists, Terraform has no reason to believe that it will change.

What we need to do is somehow be able to mark (in the state) where each field's value came from: config or computed. If we can do this, then we'd be able to actually make this change work.

That is quite a bit of a bigger change, so I'm going to move on to look at some other issues, but it's definitely something we need to tackle.

The workaround does not work anymore, as it fails:

* aws_elb.cf_loggregator: Failure applying security groups to ELB: InvalidConfigurationRequest: SecurityGroups must not be empty
    status code: 409, request id: 250c0670-b0d0-11e6-8a9f-d9cc1904b6b3
* aws_elb.cf_doppler: Failure applying security groups to ELB: InvalidConfigurationRequest: SecurityGroups must not be empty
    status code: 409, request id: 251778df-b0d0-11e6-80df-79882f1cac48
* aws_elb.cf_cc: Failure applying security groups to ELB: InvalidConfigurationRequest: SecurityGroups must not be empty
    status code: 409, request id: 25172abe-b0d0-11e6-80df-79882f1cac48
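
Since the ELB API rejects an empty list here, one possible alternative, offered as an untested sketch rather than a confirmed fix, is to keep security_groups set and point it at the VPC's default security group, which is what the ELB falls back to when none is specified. This assumes the aws_vpc resource's default_security_group_id attribute is available in your Terraform version; the resource names are taken from the original example.

```hcl
resource "aws_elb" "deafult" {
    subnets = ["${aws_subnet.default.id}"]

    # Assumption: explicitly naming the VPC's default security group gives
    # Terraform a concrete, non-empty value to diff against, instead of an
    # absent or empty argument.
    security_groups = ["${aws_vpc.default.default_security_group_id}"]

    listener {
        instance_port = 12345
        instance_protocol = "tcp"
        lb_port = 12345
        lb_protocol = "tcp"
    }
}
```

With this in place, terraform plan should have a changed security_groups value to report, and the previously attached group should no longer be in use once the change is applied.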

We're facing #10267, which directed me to #2151, which directed me here.

There hasn't been any update here for quite a while now, and I feel this is a pretty critical problem from a reliability as well as a security point of view (since security group rules are involved).

@apsops I believe the workaround is to use an empty list instead of removing the entry entirely. There's an example of that above.

Holy bejeezus that was a lot of issue hopping to find the one that remains open.

I have the same problem with aws_security_group and removing the last egress rule. It doesn't take effect and remains on my security group.

https://github.com/hashicorp/terraform/issues/2151 got closed and rolled into this bug with supposedly the same root cause, but I'm not sure that's accurate. That dealt with aws_security_group ingress and egress rule changes not taking effect. This seems different since it is lists of security groups in other resources not taking effect. Those issues seem very different.

Hi all,

The commonality between the issues that have been combined into this one is that, in several cases, providers interpret the absence of an argument as an intent to ignore any existing value of that argument, rather than as an intent to explicitly unset it. As a result, in some cases it is impossible to explicitly unset the argument, and in other cases a special parsing mode is activated for that argument to allow for both intents.

At this time it seems most likely that this problem will be resolved by defining some general design patterns that providers can follow in situations where ignoring existing data is desirable, rather than unsetting it. For example, a provider could offer a boolean setting which explicitly opts in to ignoring the existing value, and have the default behavior be to unset it. However, even when such a pattern is defined it will require some changes to provider schema and logic, so each provider will need to navigate those changes while considering compatibility with existing configurations; it's likely that the existing ambiguous cases using the special "attributes as blocks" mode will remain using that special mode until their next major release.
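
To make the proposed pattern concrete, here is a purely hypothetical sketch of how such an opt-in could look in configuration. The resource type example_load_balancer and the argument ignore_existing_security_groups are invented for illustration and do not exist in any provider.

```hcl
# Hypothetical resource type and arguments, for illustration only.

# Default behavior under the proposed pattern: omitting security_groups
# means "unset it", so the next plan would show the groups being removed.
resource "example_load_balancer" "managed" {
    name = "managed"
}

# Explicit opt-in: leave whatever security groups exist in the remote
# system untouched, and do not report them as a change.
resource "example_load_balancer" "unmanaged_groups" {
    name = "unmanaged-groups"
    ignore_existing_security_groups = true
}
```

The point of the design is that each intent has its own spelling in the configuration, so the absence of an argument no longer has to carry two meanings.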

In the meantime, if you have a specific case where unsetting an argument does not unset it in the remote system (ignores it instead), we'd encourage opening an issue against the provider in question and mentioning this issue in the references, so that we give the provider teams an opportunity to seek out local solutions where possible, and so that _this_ issue will grow to have a list of specific situations attached to it that can inform the solution to this issue.

This is happening in the Vault provider too: https://github.com/terraform-providers/terraform-provider-vault/issues/572

In this case, the attribute is simply a set and not a block. It seems like this also causes this particular buggy (?) behaviour to surface.
