terraform version - 0.12.9
provider-aws version - 2.26.0
resource "aws_vpc_dhcp_options" "vpc_dhcp_options" {
  domain_name         = "eu-west-1.compute.internal"
  domain_name_servers = ["AmazonProvidedDNS"]
}

resource "aws_vpc" "vpc" {
  cidr_block           = "10.250.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_vpc_dhcp_options_association" "vpc_dhcp_options_association" {
  vpc_id          = "${aws_vpc.vpc.id}"
  dhcp_options_id = "${aws_vpc_dhcp_options.vpc_dhcp_options.id}"
}

resource "aws_eip" "eip_natgw_z0" {
  vpc = true
}

resource "aws_internet_gateway" "igw" {
  vpc_id = "${aws_vpc.vpc.id}"
}

resource "aws_route_table" "routetable_main" {
  vpc_id = "${aws_vpc.vpc.id}"
}

resource "aws_route" "public" {
  route_table_id         = "${aws_route_table.routetable_main.id}"
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = "${aws_internet_gateway.igw.id}"
}

resource "aws_subnet" "public_utility_z0" {
  vpc_id            = "${aws_vpc.vpc.id}"
  cidr_block        = "10.250.32.0/20"
  availability_zone = "us-east-1a"
}

resource "aws_nat_gateway" "natgw_z0" {
  allocation_id = "${aws_eip.eip_natgw_z0.id}"
  subnet_id     = "${aws_subnet.public_utility_z0.id}"
}

resource "aws_route_table" "routetable_private_utility_z0" {
  vpc_id = "${aws_vpc.vpc.id}"
}

resource "aws_route" "private_utility_z0_nat" {
  route_table_id         = "${aws_route_table.routetable_private_utility_z0.id}"
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = "${aws_nat_gateway.natgw_z0.id}"
}
We quite often hit a RouteAlreadyExists error during aws_route creation. It also happens when creating new VPCs, where the aws_route obviously does not exist yet. terraform apply reports:
Error creating route: RouteAlreadyExists: The route identified by 0.0.0.0/0 already exists.
status code: 400, request id: <omitted>
on tf/main.tf line 221, in resource "aws_route" "private_utility_z0_nat":
221: resource "aws_route" "private_utility_z0_nat"
We end up with the aws_route created in AWS but not persisted in terraform.state. This requires manual intervention, either deleting the aws_route or importing it into terraform.state, because every subsequent terraform apply tries to create it again and fails with RouteAlreadyExists.
It does not happen consistently, so I cannot provide clear steps to reproduce.
I checked for existing issues and found terraform-providers/terraform-provider-aws#520 (from 2017), where the same problem was reported and attributed to a data race during route creation.
When I check the logs from the initial terraform apply, it actually fails with:
Error finding route after creating it: Unable to find matching route for Route Table (rtb-1234) and destination CIDR block (0.0.0.0/0).
on tf/main.tf line 183, in resource \"aws_route\" \"private_utility_z0_nat\":
183: resource \"aws_route\" \"private_utility_z0_nat\"
Any idea why it would fail to find the route that it just created?
I see one more issue that describes pretty much the same problem: https://github.com/terraform-providers/terraform-provider-aws/issues/10666.
It looks like increasing the creation timeout for the aws_route fixes/mitigates this issue:
timeouts {
  create = "5m"
}
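For anyone unsure where that block goes, here is a minimal sketch that places it inside the private_utility_z0_nat route from the configuration at the top of this issue. The resource arguments are copied from that configuration and the 5m value is the one suggested above; the comment wording is only an interpretation of what the timeout does, and this is a mitigation, not a confirmed fix.

```hcl
resource "aws_route" "private_utility_z0_nat" {
  route_table_id         = "${aws_route_table.routetable_private_utility_z0.id}"
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = "${aws_nat_gateway.natgw_z0.id}"

  # Workaround from this thread: give the provider more time for the
  # create step before it reports a failure.
  timeouts {
    create = "5m"
  }
}
```

The timeouts block is set per resource, so each aws_route that hits the error (for example aws_route.public) would need its own copy.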
The reported error is not intuitive:
Error finding route after creating it: Unable to find matching route for Route Table (rtb-1234) and destination CIDR block (0.0.0.0/0).
on tf/main.tf line 183, in resource \"aws_route\" \"private_utility_z0_nat\":
183: resource \"aws_route\" \"private_utility_z0_nat\"
@bflad, @radeksimko, @catsby does it make sense to improve the error message?
I've run into this issue in the past and discovered the workaround described in #338 and mentioned again here (setting the create timeout to 5m). But I'm back because I'm experiencing it again even with the elevated timeout. From the timestamps in our logs, it appears that the timeout isn't being honored: I get the Unable to find matching route error less than 2 minutes after seeing the Creating.. log line for the resource.
Issue #10666 may be related (timeouts on route creation).
See also #13138, which suggests that some retry logic may be needed in the AWS provider.