Terraform-provider-azurerm: Can create 3 subnets with NSG and Route Table associations, but no more

Created on 11 Dec 2018 · 36 Comments · Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.11.10

  • provider.azurerm v1.19.0

Affected Resource(s)

  • azurerm_subnet
  • azurerm_route_table
  • azurerm_subnet_network_security_group_association
  • azurerm_subnet_route_table_association

Terraform Configuration Files

provider "azurerm" {
  subscription_id = "${var.sub_id}"
  tenant_id       = "${var.tenant_id}"
  client_id       = "${var.tf_sp_appid}"
  client_secret   = "${var.tf_sp_secret}"
}

data "azurerm_resource_group" "vnet_rg" {
  name     = "${var.vnet_rg_name}"
}

data "azurerm_virtual_network" "existing_vnet" {
  name     = "${var.existing_vnet_name}"
  resource_group_name = "${var.vnet_rg_name}"
}

data "azurerm_network_security_group" "required_nsg" {
  name     = "${var.required_nsg_name}"
  resource_group_name = "${var.vnet_rg_name}"
}
# ##
resource "azurerm_subnet" "subnet0" {
  name                 = "${var.subnet_names[0]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[0]}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
  route_table_id       = "${azurerm_route_table.routetable0.id}"
}

resource "azurerm_route_table" "routetable0" {
  name     = "${var.existing_vnet_name}-${var.subnet_names[0]}-Routetable"
  location     = "${data.azurerm_resource_group.vnet_rg.location}"
  resource_group_name     = "${var.vnet_rg_name}"

  tags {
    environment = "${var.tag_environment_name}"
  }
}

resource "azurerm_subnet_network_security_group_association" "nsgassociation0" {
  subnet_id                 = "${azurerm_subnet.subnet0.id}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
}

resource "azurerm_subnet_route_table_association" "routetableassociation0" {
  subnet_id                 = "${azurerm_subnet.subnet0.id}"
  route_table_id = "${azurerm_route_table.routetable0.id}"
}
# ##
resource "azurerm_subnet" "subnet1" {
  name                 = "${var.subnet_names[1]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[1]}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
  route_table_id       = "${azurerm_route_table.routetable1.id}"
}

resource "azurerm_route_table" "routetable1" {
  name     = "${var.existing_vnet_name}-${var.subnet_names[1]}-Routetable"
  location     = "${data.azurerm_resource_group.vnet_rg.location}"
  resource_group_name     = "${var.vnet_rg_name}"

  tags {
    environment = "${var.tag_environment_name}"
  }
}

resource "azurerm_subnet_network_security_group_association" "nsgassociation1" {
  subnet_id                 = "${azurerm_subnet.subnet1.id}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
}

resource "azurerm_subnet_route_table_association" "routetableassociation1" {
  subnet_id                 = "${azurerm_subnet.subnet1.id}"
  route_table_id = "${azurerm_route_table.routetable1.id}"
}
# ##
resource "azurerm_subnet" "subnet2" {
  name                 = "${var.subnet_names[2]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[2]}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
  route_table_id       = "${azurerm_route_table.routetable2.id}"
}

resource "azurerm_route_table" "routetable2" {
  name     = "${var.existing_vnet_name}-${var.subnet_names[2]}-Routetable"
  location     = "${data.azurerm_resource_group.vnet_rg.location}"
  resource_group_name     = "${var.vnet_rg_name}"

  tags {
    environment = "${var.tag_environment_name}"
  }
}

resource "azurerm_subnet_network_security_group_association" "nsgassociation2" {
  subnet_id                 = "${azurerm_subnet.subnet2.id}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
}

resource "azurerm_subnet_route_table_association" "routetableassociation2" {
  subnet_id                 = "${azurerm_subnet.subnet2.id}"
  route_table_id = "${azurerm_route_table.routetable2.id}"
}
# ##
resource "azurerm_subnet" "subnet3" {
  name                 = "${var.subnet_names[3]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[3]}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
  route_table_id       = "${azurerm_route_table.routetable3.id}"
}

resource "azurerm_route_table" "routetable3" {
  name     = "${var.existing_vnet_name}-${var.subnet_names[3]}-Routetable"
  location     = "${data.azurerm_resource_group.vnet_rg.location}"
  resource_group_name     = "${var.vnet_rg_name}"

  tags {
    environment = "${var.tag_environment_name}"
  }
}

resource "azurerm_subnet_network_security_group_association" "nsgassociation3" {
  subnet_id                 = "${azurerm_subnet.subnet3.id}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
}

resource "azurerm_subnet_route_table_association" "routetableassociation3" {
  subnet_id                 = "${azurerm_subnet.subnet3.id}"
  route_table_id = "${azurerm_route_table.routetable3.id}"
}

Debug Output

https://gist.github.com/ewierschke/075040ee240e8c51d0ddb63e1e0779ea

Panic Output


No Panic

Expected Behavior


All subnets should be created against existing VNET with appropriate association of a pre-created NSG and appropriate association of a newly created route table.

Actual Behavior


Only 3 subnets are created and Terraform gets stuck in a seemingly infinite 'Still creating...' loop. Terraform sees one of the four subnets as still creating, along with 3 route table associations and 2 NSG associations.

azurerm_subnet.subnet2: Still creating... (6m50s elapsed)
azurerm_subnet_route_table_association.routetableassociation0: Still creating... (6m40s elapsed)
azurerm_subnet_network_security_group_association.nsgassociation3: Still creating... (6m30s elapsed)
azurerm_subnet_route_table_association.routetableassociation3: Still creating... (6m30s elapsed)
azurerm_subnet_network_security_group_association.nsgassociation1: Still creating... (6m20s elapsed)
azurerm_subnet_route_table_association.routetableassociation1: Still creating... (6m20s elapsed)
azurerm_subnet.subnet2: Still creating... (7m0s elapsed)
azurerm_subnet_route_table_association.routetableassociation0: Still creating... (6m50s elapsed)
azurerm_subnet_route_table_association.routetableassociation3: Still creating... (6m40s elapsed)
azurerm_subnet_network_security_group_association.nsgassociation3: Still creating... (6m40s elapsed)
azurerm_subnet_route_table_association.routetableassociation1: Still creating... (6m30s elapsed)
azurerm_subnet_network_security_group_association.nsgassociation1: Still creating... (6m30s elapsed)
azurerm_subnet.subnet2: Still creating... (7m10s elapsed)

Steps to Reproduce

  1. terraform apply

Important Factoids


Not sure how important this is, but in a larger deployment I am trying to create 12+ subnets at once (the configuration above is the smallest case I have been able to narrow it down to). When I moved the subnet and association resources into a module, more than 3 subnets were created (roughly 8), but the run still got stuck in a similar loop.

If this code is executed with the subnet3 resource and association resources commented out, the run succeeds (limiting to 3 new subnets to create).

subnet_names and subnet_prefixes are lists in my variables file.

variable "subnet_names" {
  default = [
    "subnet0", 
    "subnet1", 
    "subnet2", 
    "subnet3", 
    "subnet4", 
    "subnet5", 
    "subnet6", 
    "subnet7", 
    "subnet8", 
    "subnet9", 
    "subnet10", 
    "subnet11", 
    "subnet12", 
    "subnet13"
    ]
}

variable "subnet_prefixes" {
  default = [
    "10.7.0.0/24", 
    "10.7.1.0/24", 
    "192.168.0.0/24", 
    "192.168.1.0/24", 
    "192.168.2.0/24", 
    "192.168.3.0/24", 
    "192.168.4.0/24", 
    "192.168.5.0/24", 
    "192.168.6.0/24", 
    "192.168.7.0/24", 
    "192.168.8.0/24", 
    "192.168.9.0/24", 
    "192.168.10.0/24", 
    "192.168.11.0/24"
    ]
}

The VNET already exists with 0 subnets and two address spaces.
Two subnets in one address space and one in the other address space get successfully created.
The NSG to be associated with the new subnets is pre-created.

I don't appear to be hitting API rate limits per the debug output.

If the above code is executed with -parallelism=1 the apply succeeds.

Not sure what I might be missing here or if maybe there is a limitation on the Microsoft.Network virtualNetworks API?

References

  • #0000
bug service/network-security service/route-tables

All 36 comments

I was experiencing the same issue and was able to work around it by adding depends_on to both the NSG and route table associations.

For the example above, I would make it:
resource "azurerm_subnet_network_security_group_association" "nsgassociation3" {
  subnet_id                 = "${azurerm_subnet.subnet3.id}"
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
  depends_on                = ["azurerm_subnet.subnet3"]
}

resource "azurerm_subnet_route_table_association" "routetableassociation3" {
  subnet_id      = "${azurerm_subnet.subnet3.id}"
  route_table_id = "${azurerm_route_table.routetable3.id}"
  depends_on     = ["azurerm_subnet.subnet3"]
}

I did it for all my associations, and it was no longer stuck

I'm currently running into this behavior with tf v0.11.13 and provider.azurerm v1.23.0. Adding depends_on to the associations did not help in my case. The trace log never shows tf actually sending a PUT request to create the subnet. Adding --parallelism=1 does fix it for me though.

Looking closer at the logs, it looks like a deadlock is happening with azureRMLockByName() (there's a Locking message for the vnet, but no following Locked message, and something else had successfully Locked it earlier, with no following Unlocked message. This would be a lot easier to decipher if we had locking resource names in the trace logs...)

I was able to get this to work by removing route_table_id and network_security_group_id from the azurerm_subnet definitions. However, that triggers another issue referenced in #2358, so I had to add a lifecycle/ignore_changes block to the subnets to prevent terraform from continually trying to reset the route_table_id and network_security_group_id.
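A rough sketch of what that workaround might look like, reusing subnet3 from the original report. The exact configuration was not posted, so the attribute list in ignore_changes is an assumption based on the description above, written in Terraform 0.11 syntax:

```hcl
# Sketch only: subnet3 from the original report, with the inline
# route_table_id and network_security_group_id arguments removed.
resource "azurerm_subnet" "subnet3" {
  name                 = "${var.subnet_names[3]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[3]}"

  # The attachments are made by the separate *_association resources,
  # so ask Terraform not to reset these attributes on the subnet itself.
  lifecycle {
    ignore_changes = ["route_table_id", "network_security_group_id"]
  }
}
```

The azurerm_subnet_network_security_group_association and azurerm_subnet_route_table_association resources stay as they were in the original configuration.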

@tombuildsstuff are those vnet locks necessary in azurerm_subnet_network_security_group_association and azurerm_subnet_route_table_association? Maybe subnet locks would be sufficient and avoid deadlocks here?

Since the change that made this work for me was removing route_table_id and network_security_group_id from the azurerm_subnet resource, I took a look at the azurerm_subnet code. It looks like removing those causes terraform to skip the nsg and route table locks for the subnet: https://github.com/terraform-providers/terraform-provider-azurerm/blob/54ce52395ce71bcde2e0e983e06061159faea106/azurerm/resource_arm_subnet.go#L165

Maybe it's not even the vnet lock that causes the problem, but the nsg/route table locks (or a combination of both the vnet lock and the other locks?) I would expect the dag to figure this all out and prevent the deadlocks entirely, but that doesn't seem to be happening here. @tombuildsstuff I noticed you're the original committer for those azurerm_subnet_*_association resources, so I thought maybe you'd have some ideas on how to better track down and/or avoid the deadlock here.

I'm facing a similar issue where creating 3 subnets and their network security group associations work, but as soon as I add a 4th set, it gets stuck in an infinite waiting loop.

Both the subnet creation and NSG association creation lock on the vnet + NSG, but they each request the lock in reverse order. This appears to result in a deadlock where a subnet creation step has locked on the vnet, but is waiting for the NSG lock to become unlocked. In the meantime, an NSG association creation step has locked on the NSG, but is waiting for the vnet lock to be unlocked.

Corresponding code in the provider:
https://github.com/terraform-providers/terraform-provider-azurerm/blob/v1.30.1/azurerm/resource_arm_subnet_network_security_group_association.go#L60-L68
https://github.com/terraform-providers/terraform-provider-azurerm/blob/v1.30.1/azurerm/resource_arm_subnet.go#L148-L167

I wonder if this deadlock could be similar to what you're seeing @Moeser - hypothetically, setting parallelism=1 and/or removing the locks removes the deadlock condition.

I'm seeing this on terraform 0.11.11, with terraform-provider-azurerm_v1.30.1_x4.

Interestingly, this seems to happen frequently (approx. 80% of the time) when running on linux. I've tried to reproduce with same versions and scripts on darwin (Mac), but haven't been able to.

Here's the corresponding lock logs (with indexes added at the end of each line to indicate count)

Lock requests for the vnet:

2019-06-10T16:42:30.653Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 0
2019-06-10T16:42:30.653Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locked "azurerm_virtual_network.the-example-virtual-network" 0
2019-06-10T16:42:30.655Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 1
2019-06-10T16:42:30.655Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 2
2019-06-10T16:42:30.838Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 3
2019-06-10T16:42:41.137Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Unlocking "azurerm_virtual_network.the-example-virtual-network" 0
2019-06-10T16:42:41.137Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Unlocked "azurerm_virtual_network.the-example-virtual-network" 0
2019-06-10T16:42:41.137Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locked "azurerm_virtual_network.the-example-virtual-network" 1
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 4
2019-06-10T16:42:51.465Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:51 [DEBUG] Unlocking "azurerm_virtual_network.the-example-virtual-network" 1
2019-06-10T16:42:51.465Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:51 [DEBUG] Unlocked "azurerm_virtual_network.the-example-virtual-network" 1
2019-06-10T16:42:51.465Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:51 [DEBUG] Locked "azurerm_virtual_network.the-example-virtual-network" 2
2019-06-10T16:42:51.500Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:51 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 5
2019-06-10T16:42:51.503Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:51 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network" 6
2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Unlocking "azurerm_virtual_network.the-example-virtual-network" 2
2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Unlocked "azurerm_virtual_network.the-example-virtual-network" 2
2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Locked "azurerm_virtual_network.the-example-virtual-network" 3

Logs for requester #3:
2019-06-10T16:42:30.838Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [INFO] preparing arguments for Azure ARM Subnet creation.
2019-06-10T16:42:30.838Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network"

...

2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Locked "azurerm_virtual_network.luckylake9847-virtual-network"
2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group"

Lock requests for the NSG:

2019-06-10T16:42:20.120Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:20 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group" 0
2019-06-10T16:42:20.120Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:20 [DEBUG] Locked "azurerm_network_security_group.the-example-security-group" 0
2019-06-10T16:42:30.818Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Unlocking "azurerm_network_security_group.the-example-security-group" 0
2019-06-10T16:42:30.818Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:30 [DEBUG] Unlocked "azurerm_network_security_group.the-example-security-group" 0
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group" 1
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locked "azurerm_network_security_group.the-example-security-group" 1
2019-06-10T16:43:01.775Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group" 2
2019-06-10T16:43:01.826Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:43:01 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group" 3

Logs for requester #1:
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [INFO] preparing arguments for Subnet <-> Network Security Group Association creation.
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locking "azurerm_network_security_group.the-example-security-group"
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locked "azurerm_network_security_group.the-example-security-group"
2019-06-10T16:42:41.171Z [DEBUG] plugin.terraform-provider-azurerm_v1.30.1_x4: 2019/06/10 16:42:41 [DEBUG] Locking "azurerm_virtual_network.the-example-virtual-network"

> I wonder if this deadlock could be similar to what you're seeing @Moeser - hypothetically, setting parallelism=1 and/or removing the locks removes the deadlock condition.

Yes, you appear to be running into the same deadlock as me. If you'd like, you could try building the azurerm module with the changes from my pull request #3673 to verify that it fixes the bug for you.

> Interestingly, this seems to happen frequently (approx. 80% of the time) when running on linux. I've tried to reproduce with same versions and scripts on darwin (Mac), but haven't been able to.

This definitely happens to me on darwin too. You might want to try forcing parallelism back to default (10) on your mac. There are combinations of parallelism that will avoid this bug, such as 1, or larger than subnets * 3 (which is why the default of 10 triggers on 4 subnets but not 3). Parallelism values that are exact multiples of 3 (such as 9, 12, etc.) may be less likely to trigger the bug as well, but I haven't verified that.

Thank you very much for the fix in provider version 1.33.1... I'm not sure if there is still a lingering issue.
The network_security_group_id and route_table_id parameters on the azurerm_subnet resource are marked as deprecated. However, when I comment out those two parameters in my configuration above, every terraform apply alternately removes and re-adds the NSG and route table (i.e. run terraform apply and the NSG/route table are attached to the subnet; re-run terraform apply and they are detached). If I leave the deprecated parameters uncommented, the apply cycle completes as expected and keeps those resources attached to the subnet. Are the network_security_group_id and route_table_id parameters still intended to be deprecated?

Hi @ewierschke, the deprecation notices and cycling behavior you are seeing are part of a separate issue and are more clearly documented in issue #3054. The summary is that the next major (2.0) release of the azurerm provider will change how subnets are associated with NSGs and route tables. The warnings could probably use a link or some better info on the expected way to define those resources until 2.0 comes out, but again, that's a separate issue from the deadlocks discussed here.

@tombuildsstuff we should re-open this since the locking change was rolled back.

I tried removing the network_security_group_id from my azurerm_subnet resource and the wheels are still spinning anyways (still stuck waiting).

Terraform v0.11.14
+ provider.azurerm v1.34.0

I am facing something similar, as I described in the issue #4471, I have 4 azurerm_subnet resources already existing and want to attach each subnet to the only network_security_group by creating azurerm_subnet_network_security_group_association and adding network_security_group_id as a property to each subnet. Several subnets are stuck in status "Still modifying..." forever as well as some associations are stuck forever in status "Still creating...". My current workaround is to add the network_security_group_id as a property to subnet ONLY, without creating the associations.
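A minimal sketch of that subnet-only workaround, assuming the resource and variable names from the original report (the commenter's own configuration is not shown in the thread). It relies on the deprecated inline argument, which the maintainers' note above says changes behavior in the 2.0 provider release:

```hcl
# Sketch only: names are borrowed from the original report for illustration.
resource "azurerm_subnet" "subnet0" {
  name                 = "${var.subnet_names[0]}"
  resource_group_name  = "${var.vnet_rg_name}"
  virtual_network_name = "${var.existing_vnet_name}"
  address_prefix       = "${var.subnet_prefixes[0]}"

  # Deprecated inline attachment. With this workaround, no
  # azurerm_subnet_network_security_group_association is declared for the subnet.
  network_security_group_id = "${data.azurerm_network_security_group.required_nsg.id}"
}
```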

I have found that pinning to v1.33.0 resolves the issue, as a work around.

@pedrohdz have just tested on 1.33.0 with 4 subnets and it is still the same - the resources are still stuck being modified/created.

I have exactly the same issue when I create new subnets with a security group association. Will it help to use the (deprecated) network_security_group_id property on the azurerm_subnet resource?

UPDATE: I have now tested with the deprecated property and it works much better 🎉 When will we be unable to use the property network_security_group_id on azurerm_subnet?

Issue still open. As we experience this issue in all our environments, can you please work on a fix?
Thanks!

@ClaudiaBaur I have temporarily "fixed" it with the deprecated property network_security_group_id. You can do the same until a fix is released 😄

@sorenhansendk, yes, we already use the workaround in all environments. However, this property is marked as deprecated and our landscapes are live, so please make sure that it remains available for some more time. Might we run into a problem when the Azure SDKs are updated in the near future, or will you make sure that nothing breaks? Do you have a roadmap for the deprecation and/or a fix?
Thanks,

I am not sure if we're facing this issue as well.
We're trying to create 4 subnets and the process gets stuck at creating network interfaces using version > 1.33.1
Interesting fact, it fails 100% of the time at first execution, but it succeeds most of the time with a second try.
It is 100% replicable in our case so I'm able to test any patches.

We seem to also be hit by this when we try to use any version >1.33.0.

Currently using terraform 0.11.14

@dubuc have you tried v1.36.0? A change in pull request 4501 should fix some of the deadlocks.

Anyone still running into this issue, try 1.36.0 or newer. The change in https://github.com/terraform-providers/terraform-provider-azurerm/pull/4501 should have reduced the deadlocks

I believe we hit the same issue with 1.36.0. I will use the latest release and try again tomorrow when I get to the office. @Moeser sorry about the delay, and thank you for the reply.

I've found a workaround inspired by the comment by @Moeser that it works with parallelism turned off. If you use depends_on to avoid parallelism, it doesn't hang: make each subnet dependent on the previous subnet and each network security group association dependent on the previous association. For example, the following snippet shows dependencies that avoid parallelism. There are 4 subnets: web, management, netscaler and default. Based on the dependencies, they will be created in that order.

resource "azurerm_subnet" "web" {
  name                      = "web"
  ...
  network_security_group_id = azurerm_network_security_group.web_subnet_nsg.id
}

resource "azurerm_subnet" "management" {
  name                      = "management"
  ...
  network_security_group_id = azurerm_network_security_group.management_subnet_nsg.id
  depends_on                = [azurerm_subnet.web]
}

resource "azurerm_subnet" "netscaler" {
  name                      = "netscaler"
  ...
  network_security_group_id = azurerm_network_security_group.netscaler_subnet_nsg.id
  depends_on                = [azurerm_subnet.management]
}

resource "azurerm_subnet" "default" {
  name                      = "default"
  ...
  network_security_group_id = azurerm_network_security_group.default_subnet_nsg.id
  depends_on                = [azurerm_subnet.netscaler]
}

resource "azurerm_subnet_network_security_group_association" "default" {
  subnet_id                 = azurerm_subnet.default.id
  network_security_group_id = azurerm_network_security_group.default_subnet_nsg.id
  depends_on                = [azurerm_subnet.default]
}

resource "azurerm_subnet_network_security_group_association" "web" {
  subnet_id                 = azurerm_subnet.web.id
  network_security_group_id = azurerm_network_security_group.web_subnet_nsg.id
  depends_on                = [azurerm_subnet_network_security_group_association.default]
}

resource "azurerm_subnet_network_security_group_association" "management" {
  subnet_id                 = azurerm_subnet.management.id
  network_security_group_id = azurerm_network_security_group.management_subnet_nsg.id
  depends_on                = [azurerm_subnet_network_security_group_association.web]
}

resource "azurerm_subnet_network_security_group_association" "netscaler" {
  subnet_id                 = azurerm_subnet.netscaler.id
  network_security_group_id = azurerm_network_security_group.netscaler_subnet_nsg.id
  depends_on                = [azurerm_subnet_network_security_group_association.management]
}

@embik we should try this approach

Forcing the resources to serialize via depends_on helps, but people who are using count to make multiple copies will find that harder to do. Serializing via depends_on is a good temporary fix if that works for you.

The locks were introduced to work around a bug in the azure API, where it will return a 409 error if multiple subnet/nsg/route changes are made at the same time. The locks work around the issue by forcing the vnet related resources to serialize. I still think the long term solution is to reduce the locks to one or remove them entirely, but that can't happen until the 409 errors are handled in a retryable way.

This workaround is also difficult to apply when you use multiple modules to create subnets with an NSG and UDR.

If these modules run in parallel, you have a problem, because depends_on does not work with modules.

I was experiencing this hang as well, upgrading to 1.40.0 seems to have resolved my deadlocking issue. https://github.com/terraform-providers/terraform-provider-azurerm/releases

Hi there,

I'm on 1.40.0 as well and have a similar issue. I figured out that it always happens when a single terraform apply covers both subnet creation and NSG association, more specifically when there are two or more of each.
I was creating 10 subnets and then 10 associations, and it managed to create two subnets before it started creating NSG associations for them. That was exactly the point when it started to hang.

I had to add this crazy

depends_on = ["azurerm_subnet.subnet1","azurerm_subnet.subnet2","azurerm_subnet.subnet3","azurerm_subnet.subnet4","azurerm_subnet.subnet5","azurerm_subnet.subnet6","azurerm_subnet.subnet7","azurerm_subnet.subnet8","azurerm_subnet.subnet9","azurerm_subnet.subnet10"]

into

resource "azurerm_subnet_network_security_group_association" "nsgAssociation1" {
  subnet_id                 = azurerm_subnet.subnet1.id
  network_security_group_id = azurerm_network_security_group.nsg1.id
}
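Put together, the association might look roughly like this. It is a sketch, assuming the depends_on line goes inside the resource block as described; it uses unquoted references to match the 0.12-style expressions already in the block, and whether the same list was added to every other association is not stated in the comment:

```hcl
# Sketch only: the association waits for all ten subnets before it is created.
resource "azurerm_subnet_network_security_group_association" "nsgAssociation1" {
  subnet_id                 = azurerm_subnet.subnet1.id
  network_security_group_id = azurerm_network_security_group.nsg1.id

  depends_on = [
    azurerm_subnet.subnet1,
    azurerm_subnet.subnet2,
    azurerm_subnet.subnet3,
    azurerm_subnet.subnet4,
    azurerm_subnet.subnet5,
    azurerm_subnet.subnet6,
    azurerm_subnet.subnet7,
    azurerm_subnet.subnet8,
    azurerm_subnet.subnet9,
    azurerm_subnet.subnet10,
  ]
}
```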

Is it known yet what is causing such deadlocks? Is it too much parallelism?

The version 1.40.0 doesn't solve our deadlock issue. :(

Same issue exists in 1.43.0, only works with the proposed workaround of forcing the sequential creation of the respective resources.

Sadly, I have experienced this in AzureRM 2.0.0. It loops forever during a destroy of 1 NIC and 1 NSG association. Interestingly, it only seems to happen when I have an azurerm firewall that is also attached to the same vnet. The firewall gets deleted successfully, but the lock issue still occurs for the NIC and NSG association. If I remove the firewall from the config, all is well every time.

The same can be said for a NAT gateway: if you try to create one while also creating several NICs, for instance, you get a deadlock condition. Definitely an issue. If I add depends_on to the azurerm_nat_gateway resource, specifying the NICs, it goes through just fine.

Same deadlock issue in azurerm 2.7.0. I could only resolve it by serializing the creation of all azurerm_subnet_network_security_group_association and azurerm_subnet_nat_gateway_association items.

Same issue as @samartzidis: I have both azurerm_subnet_network_security_group_association and azurerm_subnet_nat_gateway_association, 3 of each for 3 subnets. I have to comment them out, create the whole stack, then uncomment and add the associations.

Will try the "depends_on" now, thanks for suggestions :)

I am _still_ running into this issue with 2.39.0, even when using depends_on for all subnet/xxx associations. I'm out of ideas and this is preventing my CI job from terminating successfully, so I would really appreciate some input.

Edit: sorry, my bad, it turns out I have overlooked one resource. depends_on does seem to do the trick for now but it would be nice to have a long term solution eventually.

Nope, cheered too soon. Deadlock is still occurring even with depends_on on all subnet related resources.
