Terraform: aws_route: continually deletes and re-updates the network_interface_id attribute

Created on 28 Nov 2015  ·  26 Comments  ·  Source: hashicorp/terraform

To reproduce:

  1. create an aws_route_table resource called "rt" and an aws_instance resource called "inst"
  2. create an aws_route resource pointing to the aws_route_table resource, with an instance_id attribute of ${aws_instance.inst.id}
  3. terraform plan --out=tf.plan
  4. terraform apply tf.plan

Repeat steps 3 and 4. Each time you repeat these steps, Terraform wants to change the aws_route resources by deleting their network_interface_id attributes.
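
A minimal configuration matching the steps above might look like the following sketch. Only the resource names "rt" and "inst" and the instance_id interpolation come from the report; the VPC, subnet, CIDR blocks, AMI ID, instance type, and the route name "r" are placeholder assumptions added to make the example complete.

```hcl
# Hypothetical VPC and subnet, present only so the example is self-contained.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "10.0.1.0/24"
}

# Step 1: the route table and the instance.
resource "aws_route_table" "rt" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_instance" "inst" {
  ami           = "ami-00000000" # placeholder, substitute a real AMI ID
  instance_type = "t2.micro"     # assumption
  subnet_id     = "${aws_subnet.main.id}"

  # Instances that forward traffic usually need this check disabled.
  source_dest_check = false
}

# Step 2: a route that targets the instance by instance_id only.
# network_interface_id is deliberately left unset, which is the case
# that the report says shows up as a change on every plan.
resource "aws_route" "r" {
  route_table_id         = "${aws_route_table.rt.id}"
  destination_cidr_block = "0.0.0.0/0"
  instance_id            = "${aws_instance.inst.id}"
}
```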

bug provider/aws waiting-response

All 26 comments

Bump, just ran into this!

~ module.vpc.aws_route.gateway
    network_interface_id: "eni-<...>" => ""

~ module.kubernetes.aws_route.master
    network_interface_id: "eni-<...>" => ""

On every plan/apply. Interestingly, the routes remain intact, they're not blown away.

What's happening is that when you specify a route with an instance ID, AWS helpfully finds the ENI ID that it's going to use for that route and adds it to the route object internally. Then, when you run terraform plan again, Terraform refreshes the status from AWS, finds the ENI ID, and says, "whoa, I don't have that in my state file, I need to delete that." Then, .... you see the loop.

Just experienced this as well. No work around eh?

It really doesn't require a workaround, it doesn't harm anything.

Maybe it doesn't harm anything, but it's certainly a bug, as Terraform is indicating there's a state change when there shouldn't be one.

Oh, no doubt it's a bug! I'm simply pointing out that one doesn't need a "workaround", because while annoying, it doesn't negatively impact the system.

@SpencerBrown, yes it does negatively impact the system, check out the #4105 linked by @pikeas

I don't think so, see my comment on #4105.

I just encountered this same issue trying to create a route on a route table for private subnets with a NAT instance as the target. I was able to get around it (stop Terraform from thinking that it needs to make a change) by specifying _both_ the instance_id and the network_interface_id on the route. If you're creating/managing these separately, then you can just use the interpolation syntax to specify both attribute values. If not (as I did), then you'll have to get the id for the ENI attached to the instance that you're targeting, and hard-code it 😞. It seems to me (and I think @SpencerBrown is saying the same thing) that Terraform is able to get the ENI ID based on the instance ID, but doesn't record the fact that it created it. So on subsequent plan/apply runs, Terraform thinks "I didn't create that ENI, but the current state reflects that it's there - best that I remove the reference," which then leads to a loop where again... it gets the ENI ID based on the instance ID, but doesn't record that it created it.

TLDR: if you really want to stop Terraform from thinking that a change needs to be made, specify _both_ the instance_id and network_interface_id on the route as a temporary fix.
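
As a rough sketch of that temporary fix, the route below sets both attributes. It assumes an aws_route_table named "private" and an aws_instance named "nat" are defined elsewhere; those names, and the nat_eni_id variable that carries the hard-coded ENI ID, are hypothetical. This reflects only the behavior reported in this thread for 0.6.x and has not been checked against other versions.

```hcl
# Hypothetical variable holding the ID of the ENI attached to the NAT
# instance (the hard-coded value mentioned above), e.g. passed with -var.
variable "nat_eni_id" {}

resource "aws_route" "private_nat" {
  route_table_id         = "${aws_route_table.private.id}" # assumed to exist
  destination_cidr_block = "0.0.0.0/0"

  # Setting both attributes means the ENI ID that AWS reports on refresh
  # matches the configuration, so no change is planned.
  instance_id          = "${aws_instance.nat.id}" # assumed to exist
  network_interface_id = "${var.nat_eni_id}"
}
```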

I am creating routes to the new AWS NAT gateways and TF updates the routing tables on every run, even though there are no changes:

aws_route_table.internet.0: Modifying...
  route.1105048783.cidr_block:                "0.0.0.0/0" => ""
  route.1105048783.gateway_id:                "" => ""
  route.1105048783.instance_id:               "" => ""
  route.1105048783.nat_gateway_id:            "nat-085c1198692ea0caa" => ""
  route.1105048783.network_interface_id:      "" => ""
  route.1105048783.vpc_peering_connection_id: "" => ""
  route.1639638082.cidr_block:                "" => "0.0.0.0/0"
  route.1639638082.gateway_id:                "" => "nat-085c1198692ea0caa"
  route.1639638082.instance_id:               "" => ""
  route.1639638082.nat_gateway_id:            "" => ""
  route.1639638082.network_interface_id:      "" => ""
  route.1639638082.vpc_peering_connection_id: "" => ""
aws_route_table.internet.1: Modifying...
  route.4224434663.cidr_block:                "0.0.0.0/0" => ""
  route.4224434663.gateway_id:                "" => ""
  route.4224434663.instance_id:               "" => ""
  route.4224434663.nat_gateway_id:            "nat-004491712a91b2a74" => ""
  route.4224434663.network_interface_id:      "" => ""
  route.4224434663.vpc_peering_connection_id: "" => ""
  route.478704973.cidr_block:                 "" => "0.0.0.0/0"
  route.478704973.gateway_id:                 "" => "nat-004491712a91b2a74"
  route.478704973.instance_id:                "" => ""
  route.478704973.nat_gateway_id:             "" => ""
  route.478704973.network_interface_id:       "" => ""
  route.478704973.vpc_peering_connection_id:  "" => ""

+1, also encountering this issue with aws_route_table. In my case it's with the use of a nat gateway.

+1. Annoying

Most likely related to #4311.
Submitted PR to fix.

As reported in #4311: https://github.com/hashicorp/terraform/issues/4311#issuecomment-198366104 via @gozer

Hi @SpencerBrown,

I believe this issue may have been fixed as part of the 0.6.13 release - can you have a look and tell me if your config still shows a continual loop?

I was able to recreate it in 0.6.12, but it wasn't happening in 0.6.13.

Thanks

Paul

I've experienced this on master as recently as today

@stack72: 0.6.13 fixes the issue for me. You can close this issue, or am I supposed to do that?

@SpencerBrown this is brilliant news! This suggests to me that the issue that @gozer experienced is a different problem - I will look at this and then close it. Thanks for letting me know so fast @SpencerBrown

This still happens for me on 0.6.13.

@einyx can you post a configuration that will help me track down the issue that you are having?

Thanks

Paul

@stack72 - I'm using terraform from homebrew and I'm still seeing this issue in version 0.6.13. What configuration can I provide to help track down the issue? Not sure if this helps, but we're using terraform at Disney Parks and Resorts, and we'd be glad to help squash this bug.

Hi @sochoa & @einyx

Please can you tell me if this issue still exists in 0.6.15 ?

Thanks

Paul

@stack72 don't know about @sochoa and @einyx but it still does the same for me on 0.6.15 (so, not fixed AFAIK).

OK, I take that back. This is "fixed", at least for the nat gateway use case.

After debugging the TF source for a bit I found it was PEBKAC. I made a mistake in my TF configuration: I used gateway_id instead of nat_gateway_id:

-      gateway_id = "${aws_nat_gateway.subnet_private_nat_natgw.id}"
+      nat_gateway_id = "${aws_nat_gateway.subnet_private_nat_natgw.id}"

If you look _very_ closely at @mtekel's comment (which shows the same thing I was experiencing), things start to become clear. The aws_route_table route attribute set is migrating from one hash (the number between route. and attributes like .nat_gateway_id) to another:

aws_route_table.internet.0: Modifying...
  route.1105048783.cidr_block:                "0.0.0.0/0" => ""
  route.1105048783.gateway_id:                "" => ""
  route.1105048783.instance_id:               "" => ""
  route.1105048783.nat_gateway_id:            "nat-085c1198692ea0caa" => ""
  route.1105048783.network_interface_id:      "" => ""
  route.1105048783.vpc_peering_connection_id: "" => ""
  route.1639638082.cidr_block:                "" => "0.0.0.0/0"
  route.1639638082.gateway_id:                "" => "nat-085c1198692ea0caa"
  route.1639638082.instance_id:               "" => ""
  route.1639638082.nat_gateway_id:            "" => ""
  route.1639638082.network_interface_id:      "" => ""
  route.1639638082.vpc_peering_connection_id: "" => ""

We can clearly see that TF is setting gateway_id and unsetting nat_gateway_id. In fact, infrastructure deployed using the nat gateway in the wrong field will still work fine. After debugging the Amazon API call responses, it appears that even if you pass a nat gateway id in the gateway_id field of a route create request, Amazon will return that id in the nat_gateway_id field. In other words, the AWS API automatically reassigns the nat gateway id to the correct field on subsequent read responses, so the state never matches a configuration that uses gateway_id.
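
For reference, a route table with the corrected inline route might look like the sketch below. The table name "internet" and the NAT gateway resource name are taken from the output and diff above; the aws_vpc.main reference is an assumption, and the aws_nat_gateway resource is assumed to be defined elsewhere.

```hcl
resource "aws_route_table" "internet" {
  vpc_id = "${aws_vpc.main.id}" # assumed VPC, not part of the original comment

  route {
    cidr_block = "0.0.0.0/0"

    # Use nat_gateway_id, not gateway_id, for a NAT gateway target, so the
    # value AWS returns on refresh lands in the same field as the config.
    nat_gateway_id = "${aws_nat_gateway.subnet_private_nat_natgw.id}"
  }
}
```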

There's a discussion in #6551 about making the gateway_id/nat_gateway_id use case more user friendly, which should allow you to close this network_interface_id issue.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
