Terraform v0.11.11
+ provider.google v1.20.0

Affected resource: google_compute_network_peering
// Peer the infra vpc with the dev vpc.
resource "google_compute_network_peering" "infra_dev" {
  name         = "infra-dev"
  network      = "${module.infra_network.network_self_link}"
  peer_network = "${module.dev_network.network_self_link}"
}

// Peer the dev vpc with the infra vpc.
resource "google_compute_network_peering" "dev_infra" {
  name         = "dev-infra"
  network      = "${module.dev_network.network_self_link}"
  peer_network = "${module.infra_network.network_self_link}"
  depends_on   = ["google_compute_network_peering.infra_dev"]
}
Error: Error applying plan:
1 error(s) occurred:
* google_compute_network_peering.dev_infra: 1 error(s) occurred:
* google_compute_network_peering.dev_infra: Error adding network peering: googleapi: Error 400: There is a route operation in progress on the local or peer network. Try again later., badRequest
The infra and dev VPCs should be peered.

Instead, apply fails with the error above, so the peering is broken and depends_on has no effect. The depends_on in dev_infra should make that resource WAIT until the first peering operation completes, thereby satisfying the GCP API requirement of one peering operation at a time.
terraform apply

Hi @dgarstang - is this a different issue than https://github.com/terraform-providers/terraform-provider-google/issues/3026?
I will close #3026. I think this ticket describes the situation more clearly.
Hmm, I can't seem to recreate this issue with the following config:
// Peer the infra vpc with the dev vpc.
resource "google_compute_network_peering" "infra_dev" {
  name         = "infra-dev"
  network      = "${google_compute_network.infra_network.self_link}"
  peer_network = "${google_compute_network.dev_network.self_link}"
}

// Peer the dev vpc with the infra vpc.
resource "google_compute_network_peering" "dev_infra" {
  name         = "dev-infra"
  network      = "${google_compute_network.dev_network.self_link}"
  peer_network = "${google_compute_network.infra_network.self_link}"
  depends_on   = ["google_compute_network_peering.infra_dev"]
}

resource "google_compute_network" "infra_network" {
  name                    = "prodfoobar"
  auto_create_subnetworks = "false"
}

resource "google_compute_network" "dev_network" {
  name                    = "devfoobar"
  auto_create_subnetworks = "false"
}
Do you mind running with the full debug logs? i.e. TF_LOG="DEBUG"
I have been working around this with a null resource:
resource "google_compute_network_peering" "to" {
  name         = "to"
  network      = "network-2"
  peer_network = "network-1"
}

resource "google_compute_network_peering" "from" {
  name         = "from"
  network      = "network-1"
  peer_network = "network-2"

  // only one operation at a time for network peering, so we need an explicit serialization
  depends_on = ["null_resource.force_networks_in_order"]
}

resource "null_resource" "force_networks_in_order" {
  provisioner "local-exec" {
    command = "echo ${google_compute_network_peering.to.id}"
  }
}
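For what it's worth, the same echo trick seems to extend to additional peerings: each new peering depends on a null_resource that interpolates the previous peering's id, so the operations run one at a time. A sketch (the "third" and "network-3" names are hypothetical, not from my actual config):

```hcl
// Hypothetical third peering, serialized after "from" via the same trick.
// The null_resource interpolates the previous peering's id, so Terraform
// only starts this peering once the previous one has completed.
resource "null_resource" "force_from_in_order" {
  provisioner "local-exec" {
    command = "echo ${google_compute_network_peering.from.id}"
  }
}

resource "google_compute_network_peering" "third" {
  name         = "third"
  network      = "network-3"
  peer_network = "network-1"
  depends_on   = ["null_resource.force_from_in_order"]
}
```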
@emilymye this is because you need more networks and peerings to reproduce the race condition.
We have a lot of different networks in GCP using a shared VPC. Each service lies in its own separated network and we need to peer each network to allow communication between relevant services.
We hit the race condition every single time and without a dependency hack using input/output it would take like 20 iterations of plan/apply to have all the peerings created from scratch.
Now, the Terraform core team seems to want to keep Terraform itself dumb regarding parallelism - dumb in a good way - and let each provider take care of provider-specific implementation details like the parallelism issue we have here, i.e. "In GCP it is not possible to peer a network with several other networks at the same time".
Our solution is to reproduce a graph of dependency of the peerings using input/output:
1) We have a module that peers two networks in both directions and uses depends_on to create them in sequence:
# Terraform module: gcp/google/vpc_network/network_peering
# Peer a network with another.
# Note: a network cannot be peered to multiple networks simultaneously.
# We have to create the peerings sequentially, thus you'll notice some hacks
# to be able to do so with Terraform 0.12.
resource "google_compute_network_peering" "network" {
  name         = "${var.network_name}-${var.peered_network_name}"
  network      = "${var.network_link}"
  peer_network = "${var.peered_network_link}"
}

resource "google_compute_network_peering" "peered_network" {
  depends_on = ["google_compute_network_peering.network"]

  name         = "${var.peered_network_name}-${var.network_name}"
  network      = "${var.peered_network_link}"
  peer_network = "${var.network_link}"
}
2) This module declares its own network link inputs as outputs:
# Outputs for the gcp/google/vpc_network/network_peering module
# Module dependency hack as of Terraform 0.12.
# We use the network variables to define a chain of dependencies between the
# different calls of this module.
# Note that the values seem to be reversed, but this is expected as we use the
# google_compute_network_peering.peered_network resource, which is the last one
# to be created.
# Inspired by:
# https://github.com/hashicorp/terraform/issues/1178#issuecomment-207369534
output "network_link" {
  value = "${google_compute_network_peering.peered_network.peer_network}"
}

output "peered_network_link" {
  value = "${google_compute_network_peering.peered_network.network}"
}
3) Then the module callers can reproduce the dependency graph like the following (note that we have another module that actually creates the networks; those modules are named <name>_network):
module "peering_A_B" {
  source              = "../../vpc_network/network_peering"
  network_name        = module.A_network.project_name
  network_link        = module.A_network.network_link
  peered_network_name = module.B_network.project_name
  peered_network_link = module.B_network.network_link
}

module "peering_B_C" {
  source              = "../../vpc_network/network_peering"
  network_name        = module.B_network.project_name
  network_link        = module.peering_A_B.peered_network_link
  peered_network_name = module.C_network.project_name
  peered_network_link = module.C_network.network_link
}

module "peering_A_C" {
  source              = "../../vpc_network/network_peering"
  network_name        = module.A_network.project_name
  network_link        = module.peering_A_B.network_link
  peered_network_name = module.C_network.project_name
  peered_network_link = module.peering_B_C.peered_network_link
}
The example above reproduces the following graph:
A -> B -> C
|__________^
The above solution effectively peers A-B, then B-C, then A-C in sequence. Without it, Terraform attempts all three peerings at the same time, which fails twice and requires three apply iterations to complete: the first time, the B-C and A-C peerings fail because A-B is being peered; the second time, A-C fails because B-C is being peered; the third time, A-C is finally created.
So it would be supra-mega cool if the Google provider could handle this for us. One possible approach would be for the provider to allow only one peering operation to run at any given time. It would be slower, but it would work in one pass and we could use count in our peering resources, saving a lot of management burden: the example above is simple, but in a real use case the dependency graph becomes much harder to maintain.
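As an aside, for anyone able to upgrade: Terraform 0.13 added support for depends_on on module blocks, which should let the input/output threading above be replaced with a direct module dependency. An untested sketch (assuming Terraform >= 0.13 and the same module inputs as above):

```hcl
# Sketch, assuming Terraform >= 0.13, where depends_on is allowed on module
# blocks; the network links come straight from the network modules instead of
# being threaded through the previous peering module's outputs.
module "peering_B_C" {
  source              = "../../vpc_network/network_peering"
  network_name        = module.B_network.project_name
  network_link        = module.B_network.network_link
  peered_network_name = module.C_network.project_name
  peered_network_link = module.C_network.network_link

  depends_on = [module.peering_A_B]
}
```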
An easy way to reproduce this is to use google_compute_network_peering with a count of networks. Setting up a hub and spoke network with counts causes this error every single time.
variable "organization_id" {
  description = "The organization where the projects and folders should be created"
  type        = "string"
}

variable "billing_account_id" {
  description = "The ID of the billing account resources should be created under (XXXXXX-XXXXX-XXXXXX)"
  type        = "string"
}

variable "labels" {
  description = "Map of labels that will be applied to all resources that have labels"
  type        = "map"
}

variable "number_of_spokes" {
  description = "How many VPCs should be created and peered with the hub"
  type        = "string"
  default     = 4
}

resource "google_project" "compute_project" {
  name                = "compute-project"
  project_id          = "project-${random_id.compute_project.hex}"
  org_id              = "${var.organization_id}"
  billing_account     = "${var.billing_account_id}"
  labels              = "${var.labels}"
  auto_create_network = false
}

resource "random_id" "compute_project" {
  byte_length = 4
}

resource "google_compute_network" "hub_network" {
  name                            = "hub-network"
  project                         = "${google_project.compute_project.id}"
  auto_create_subnetworks         = false
  delete_default_routes_on_create = true
}

resource "google_compute_subnetwork" "hub_subnetwork" {
  provider         = "google-beta"
  name             = "hub-subnetwork"
  project          = "${google_project.compute_project.id}"
  ip_cidr_range    = "10.1.1.0/24"
  region           = "us-central1"
  network          = "${google_compute_network.hub_network.self_link}"
  enable_flow_logs = true

  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_compute_firewall" "ingress" {
  provider       = "google-beta"
  name           = "hub-firewall"
  network        = "${google_compute_network.hub_network.name}"
  project        = "${google_project.compute_project.id}"
  enable_logging = true

  allow {
    protocol = "tcp"
    ports = [
      "80",  // http
      "443", // https
      "22",  // ssh
    ]
  }
}

resource "google_compute_route" "internet" {
  name             = "hub-network"
  project          = "${google_project.compute_project.id}"
  dest_range       = "0.0.0.0/0"
  network          = "${google_compute_network.hub_network.name}"
  next_hop_gateway = "default-internet-gateway"
  priority         = 1
}

resource "google_compute_network" "vpc_network" {
  count                           = "${var.number_of_spokes}"
  name                            = "spoke-network-${count.index}"
  project                         = "${google_project.compute_project.id}"
  auto_create_subnetworks         = false
  delete_default_routes_on_create = true
  depends_on                      = ["google_compute_subnetwork.hub_subnetwork"]
}

resource "random_id" "vpc_network" {
  count       = "${var.number_of_spokes}"
  byte_length = 4
}

resource "google_compute_subnetwork" "vpc_subnetwork" {
  count            = length(google_compute_network.vpc_network)
  provider         = "google-beta"
  name             = "spoke-subnetwork-${count.index}"
  project          = "${google_project.compute_project.id}"
  ip_cidr_range    = "${cidrsubnet("10.1.1.0/16", 8, count.index + 2)}"
  region           = "us-central1"
  network          = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"
  enable_flow_logs = true

  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

resource "google_compute_network_peering" "hub_to_peer" {
  count        = length(google_compute_network.vpc_network)
  name         = "hub-to-peer-${count.index}"
  network      = "${google_compute_network.hub_network.self_link}"
  peer_network = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"
  depends_on   = ["google_compute_subnetwork.vpc_subnetwork", "google_compute_subnetwork.hub_subnetwork"]
}

resource "google_compute_network_peering" "peer_to_hub" {
  count        = length(google_compute_network.vpc_network)
  name         = "peer-to-hub-${count.index}"
  network      = "${element(google_compute_network.vpc_network.*.self_link, count.index)}"
  peer_network = "${google_compute_network.hub_network.self_link}"
  depends_on   = ["google_compute_subnetwork.vpc_subnetwork", "google_compute_subnetwork.hub_subnetwork"]
}
I also agree that it would be really nice if this worked. Creating a wrapper resource just to fulfill this is pretty painful.
I also don't think anyone has yet mentioned the easiest workaround, which is using:
terraform apply -parallelism=1
That nicely sidesteps the issue, at the expense of deployment time increasing.
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!