Terraform: Error creating route for VPC peering connections

Created on 29 Aug 2016 · 4 Comments · Source: hashicorp/terraform

Terraform Version

Terraform v0.7.2

Affected Resource(s)

  • aws_route

Terraform Configuration Files

The full set of templates is fairly large. It creates a VPC, a number of routes, an internet gateway, a NAT gateway, network ACLs, peering connections, and so on. The routes for the VPC peering connections seem to be what causes the problem:

resource "aws_vpc_peering_connection" "vpc_peering_connection" {
  peer_owner_id = "${var.aws_account_id}"
  vpc_id = "${var.origin_vpc_id}"
  peer_vpc_id = "${var.destination_vpc_id}"
  auto_accept = true
  tags { Name = "${var.origin_vpc_name}-to-${var.destination_vpc_name}" }
}

resource "aws_route" "origin_to_destination" {
  count = "${var.num_origin_vpc_route_tables}"
  route_table_id = "${element(split(",", var.origin_vpc_route_table_ids), count.index)}"
  destination_cidr_block = "${var.destination_vpc_cidr_block}"
  vpc_peering_connection_id = "${aws_vpc_peering_connection.vpc_peering_connection.id}"
}

Note that this code has not yet been updated to take advantage of first-class support for lists in Terraform 0.7.x. Is it possible that has anything to do with the problem?
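
For reference, here is a minimal sketch of what the same aws_route resource might look like with a list-typed variable in 0.7.x. It assumes var.origin_vpc_route_table_ids is redeclared as a list (it is a comma-separated string in the code above) and that the aws_vpc_peering_connection resource is unchanged. It only changes how the route table IDs are passed in and is not known to have any effect on the errors below.

```hcl
# Assumption: the route table IDs are passed as a list instead of a
# comma-separated string. The variable name is reused from the original code.
variable "origin_vpc_route_table_ids" {
  type = "list"
}

variable "destination_vpc_cidr_block" {}

resource "aws_route" "origin_to_destination" {
  # length() accepts a list in 0.7.x, so a separate
  # num_origin_vpc_route_tables variable is not needed in this sketch.
  count = "${length(var.origin_vpc_route_table_ids)}"

  # element() indexes the list directly; split() is no longer required.
  route_table_id            = "${element(var.origin_vpc_route_table_ids, count.index)}"
  destination_cidr_block    = "${var.destination_vpc_cidr_block}"
  vpc_peering_connection_id = "${aws_vpc_peering_connection.vpc_peering_connection.id}"
}
```

Deriving count from the list length keeps the number of routes and the number of route table IDs from drifting apart, which the separate count variable in the original code cannot guarantee.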

Expected Behavior

In Terraform 0.6.x, this would create the VPC, routes, and peering connections, _usually_ without problems.

Actual Behavior

On almost every single run with Terraform 0.7.x, I get errors like the following:

* aws_route.origin_to_destination.1: Error creating route: RouteAlreadyExists: The route identified by 10.2.0.0/18 already exists.
    status code: 400, request id: 632fc036-ec35-441c-be0e-4616c2ff8067
* aws_route.origin_to_destination.0: Error creating route: RouteAlreadyExists: The route identified by 10.2.0.0/18 already exists.
    status code: 400, request id: 90b06123-714c-4037-b604-5043e7b9a2f9
* aws_route.origin_to_destination.2: Error creating route: RouteAlreadyExists: The route identified by 10.2.0.0/18 already exists.
    status code: 400, request id: ad67ab36-1d39-4e04-a35a-847195eb80e3
* aws_route.origin_to_destination.3: Error creating route: RouteAlreadyExists: The route identified by 10.2.0.0/18 already exists.
    status code: 400, request id: 1db124b2-a7fb-4b2d-a7ec-58ee5c009388
* aws_route.nat.0: Error finding route after creating it: error finding matching route for Route table (rtb-11fd5977) and destination CIDR block (0.0.0.0/0)

Of course, none of these routes actually existed before I ran terraform apply, so there must be some issue with Terraform trying to create them twice.

Steps to Reproduce

  1. terraform apply

All 4 comments

Update: I've found, through trial and error and copying code examples I found online, that most of the issues I describe in this bug are resolved by adding two depends_on entries to each aws_route resource: one that points to the Internet Gateway in the VPC and one that points to the corresponding aws_route_table resource.

resource "aws_route" "internet" {
    route_table_id = "${aws_route_table.public.id}"
    destination_cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.main.id}"

    # A workaround for a series of eventual consistency bugs in Terraform. For a list of the errors, see the related
    # bugs described in this issue: https://github.com/hashicorp/terraform/issues/8542. The workaround is based on:
    # https://github.com/hashicorp/terraform/issues/5335 and https://charity.wtf/2016/04/14/scrapbag-of-useful-terraform-tips/
    depends_on = ["aws_internet_gateway.main", "aws_route_table.public"]
}

I have no idea why that helps, but it gets rid of _most_ issues. The only one it does NOT get rid of is https://github.com/hashicorp/terraform/issues/8542.

This error is back and I can't seem to work around it. When I try to create routes for a VPC peering connection with Terraform 0.7.8, I see:

Error creating route: RouteAlreadyExists: The route identified by 10.0.0.0/18 already exists.

Update: I took a look at the route table and the route for 10.0.0.0/18 did exist and had a status of "black hole". I think the series of events that led to this was as follows:

  1. I created my first VPC without issues.
  2. Then I ran terraform apply to create the second VPC, plus its peering connection and corresponding route table entries.
  3. The second creation failed because I happened to hit the EIP limit in my AWS account. I sent an email to the AWS rep to request that the limit be increased.
  4. While waiting for AWS to respond, I ran terraform destroy to ensure I didn't have a partially created VPC hanging around.
  5. AWS upped the limit today and I re-ran terraform apply on the second VPC. It failed with the error "The route identified by 10.0.0.0/18 already exists."

My interpretation is that when I ran terraform destroy on the second VPC after its failed creation, it removed the NAT Gateways, subnets, and the VPC itself, but it did NOT clean up the 10.0.0.0/18 route table entries. That left them in a black hole state, so they conflicted the next time I tried to create the VPC and those same route table entries. To work around the issue, I had to clean up those black hole route table entries by hand.

In short, it looks like Terraform may fail to record state (such as a route table entry) when a terraform apply fails due to some sort of AWS error (e.g. hitting the EIP limit).

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
