Terraform-provider-azurerm: Subnets on same vnet fail due to parallel setup

Created on 3 Jul 2019 · 24 Comments · Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.3
+ provider.azurerm v1.31.0

Affected Resource(s)

  • azurerm_subnet
  • azurerm_subnet_network_security_group_association
  • Possibly azurerm_virtual_network_peering

Terraform Configuration Files

I'm not able to share the full configuration because it contains sensitive information. Please see this example:

resource azurerm_virtual_network vnet {
  name                = "my_vnet"
  location            = "northeurope"
  resource_group_name = "vnet_example"
  address_space       = ["10.0.0.0/24", ]
}

resource azurerm_subnet subnet1 {
  name                      = "subnet1"
  virtual_network_name      = "${azurerm_virtual_network.vnet.name}"
  resource_group_name       = "vnet_example"
  address_prefix            = "10.0.0.0/28"
}

resource azurerm_subnet subnet2 {
  name                      = "subnet2"
  virtual_network_name      = "${azurerm_virtual_network.vnet.name}"
  resource_group_name       = "vnet_example"
  address_prefix            = "10.0.0.64/28"
}

resource azurerm_subnet subnet3 {
  name                      = "subnet3"
  virtual_network_name      = "${azurerm_virtual_network.vnet.name}"
  resource_group_name       = "vnet_example"
  address_prefix            = "10.0.0.128/28"
}

Expected Behavior

The subnets and other configuration are created on the VNet.

Actual Behavior

Terraform seems to be attempting to apply the separate configurations to the VNet at the same time. I've tried adding dependencies between the subnets, which solved the problem there, but it appeared again when attempting to use an azurerm_subnet_network_security_group_association resource:

Error: Error updating Route Table Association for Subnet "(redacted)" (Virtual Network "(redacted)" / Resource Group "(redacted)"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/(redacted)." Details=[]

Important Factoids

In our configuration, each subnet and security group combination is in its own module. We have explored using dependencies to ensure they run sequentially, but this is not possible with the azurerm_subnet_network_security_group_association resource.

References

Possibly related to these?

  • #2605
  • #3472
bug service/virtual-networks

All 24 comments

I'm getting this same error on Terraform 0.11.13 and AzureRM provider 1.33.1

Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri:

This looks related to #3673

To work around it, we're setting explicit dependencies using "depends_on", but that's an awful hack.
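For readers who want to try the same workaround, a minimal sketch of such a dependency chain might look like the following. It reuses the resource names from the original report and assumes Terraform 0.12 syntax; the order of the chain is arbitrary, and this only serializes the subnets within a single configuration.

```hcl
# Sketch only: force subnets on the same vnet to be created one at a time.
resource "azurerm_virtual_network" "vnet" {
  name                = "my_vnet"
  location            = "northeurope"
  resource_group_name = "vnet_example"
  address_space       = ["10.0.0.0/24"]
}

resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.0/28"
}

resource "azurerm_subnet" "subnet2" {
  name                 = "subnet2"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.64/28"

  # Wait for subnet1 so only one subnet operation touches the vnet at a time.
  depends_on = [azurerm_subnet.subnet1]
}

resource "azurerm_subnet" "subnet3" {
  name                 = "subnet3"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.128/28"

  # Chain onto subnet2 in the same way.
  depends_on = [azurerm_subnet.subnet2]
}
```

On Terraform 0.11 the references are written as quoted strings, for example depends_on = ["azurerm_subnet.subnet1"]. A blunter alternative is running terraform apply -parallelism=1, which serializes every operation in the run at the cost of speed.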

UPDATE
I'm able to replicate the problem even on Terraform 0.12.7 and AzureRM 1.33.1 with a minimal configuration.

Configuration

locals {

}

provider "azurerm" {

}

resource "azurerm_resource_group" "rg" {
  name     = "bug-3780-rg"
  location = "eastus"
}

resource "azurerm_virtual_network" "vnet" {
  name                = "bug-3780-vnet"
  location            = "eastus"
  resource_group_name = "bug-3780-rg"
  address_space       = ["192.168.14.0/24"]
}

resource "azurerm_subnet" "subnet1" {
  name                 = "bug-3780-subnet1"
  address_prefix       = "192.168.14.0/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet2" {
  name                 = "bug-3780-subnet2"
  address_prefix       = "192.168.14.16/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet3" {
  name                 = "bug-3780-subnet3"
  address_prefix       = "192.168.14.32/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet4" {
  name                 = "bug-3780-subnet4"
  address_prefix       = "192.168.14.48/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

Output

azurerm_subnet.subnet1: Creation complete after 2s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet1]
azurerm_subnet.subnet2: Creation complete after 2s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet2]
azurerm_subnet.subnet4: Still creating... [10s elapsed]
azurerm_subnet.subnet4: Creation complete after 12s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet4]

Error: Error Creating/Updating Subnet "bug-3780-subnet3" (Virtual Network "bug-3780-vnet" / Resource Group "bug-3780-rg"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: *<Redacted>* Details=[]

Same on terraform destroy:

Error: Error deleting Subnet "test-subnet" (Virtual Network "test-vn" / Resource Group "playground"): network.SubnetsClient#Delete: Failure sending request: StatusCode=409 -- Original Error: Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/xxx/providers/Microsoft.Network/locations/westeurope/operations/xxx?api-version=2018-12-01." Details=[]

#3673 updated the code so that the subnet is locked instead of the vnet, but I think the vnet still needs to be locked as well:

    locks.ByName(virtualNetworkName, virtualNetworkResourceName)
    defer locks.UnlockByName(virtualNetworkName, virtualNetworkResourceName)

@bcline760 and @florianrusch I'm curious whether the "dependent resource" referred to in the error is actually the vnet, or the resource group. Do you see the same error if you use a different resource group for each subnet?

I'm not suggesting that as a fix, just as a test to narrow down which resource azure is still modifying behind the scenes.

Looks like this error is happening in more places than just Terraform; azure-powershell, for example: https://github.com/Azure/azure-powershell/issues/1817. Adding the vnet lock back forces the subnets to create in serial (and running in serial is the workaround the PowerShell folks seem to be using for now), but it might not carry over to modules running in parallel like the original issue reporter mentioned (note that the original report was for v1.31.0, which still had vnet locks in the subnet resource).

Any way to treat these as retryable errors instead of aborting the run? In all cases I've seen so far, they have been temporary.

I'm also seeing this issue regularly. I have multiple Terraform state files for different module tests, but they all share the same resource group and vnet. Interestingly, I'm using 1.33.1. Today's test log:

Initializing the backend...

Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "azurerm" (hashicorp/azurerm) 1.33.1...
- Downloading plugin for provider "dns" (hashicorp/dns) 2.2.0...
- Downloading plugin for provider "null" (hashicorp/null) 2.1.2...
...

Error: Error updating Route Table Association for Subnet "moduletest-gitlabrunner-subnet" (Virtual Network "moduletest-vnet" / Resource Group "moduletest-rg"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=409 -- Original Error: Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/...

@suonto out of interest do you see this issue when using version 1.33.0 of the Azure Provider instead? The only change between 1.33.0 and 1.33.1 was a change to the locking, so this should isolate if it's that change.

Thanks!

@tombuildsstuff it's 1.33.1. You can see terraform init output in the log I posted.

@suonto sorry, I'm wondering if this disappears/works for you if you switch to 1.33.0? :)

@tombuildsstuff The first report was before 1.33.1, but based on the comments here I think the locking change aggravated the issue for other users. For now the best bet might be to roll back my locking changes. I think long term, the real solution to this 'AnotherOperationInProgress' error is to get azure/go-autorest to treat these 409 errors as retryable (which might not be straightforward since some 409 errors are not retryable like name conflicts, but some like this clearly are where azure even provides a uri to continue checking the status of the request).

The change might even fit better in the api itself, since the api probably shouldn't be returning retryable errors as 409 in the first place. At the very least maybe they could add a retry hint. azure-rest-api-specs has a much lower rate of resolving issues than go-autorest does, so we might not see a fix materialize in the actual api.

If there's a way to catch those retryable 409 errors in the azurerm provider, that would work too, but I suspect that might not be as easy as it sounds.

@tombuildsstuff no, it does not. I've been experiencing this for a long time and have always used the latest provider on ephemeral CI executors. I just wanted to confirm this issue in the hope of getting a proper solution at some point. I'm running 10 module tests in parallel each night. They all have their own tfstate, but they share the VNET as a data source. Every morning I see that at least one of the tests has failed with this specific issue. The state of affairs has been like this for at least 3 months.

@Moeser I'm not sure if your change caused any new issues since 409 errors were seen both pre and post the change.

For my use case, which is just creating multiple azurerm_subnet resources, rolling back to 1.33.0 fixed the issue

Exactly the same problem here with Terraform v0.12.8 and provider.azurerm v1.33.1 while deploying a vnet with multiple subnets and associated NSGs. Downgrading to provider.azurerm v1.33.0 does the trick, and deployments start working smoothly again.

@Moeser

@tombuildsstuff The first report was before 1.33.1, but based on the comments here I think the locking change aggravated the issue for other users. For now the best bet might be to roll back my locking changes.

I think there are two distinct bugs here: the first being the original issue (which I believe should be fixed via additional locking), and the other, more recent one, which I believe is unfortunately related to the recent locking changes. As you've mentioned here, I think it's best to roll those changes back for the moment, since it appears that some additional locks are still required.

I think long term, the real solution to this 'AnotherOperationInProgress' error is to get azure/go-autorest to treat these 409 errors as retryable (which might not be straightforward since some 409 errors are not retryable like name conflicts, but some like this clearly are where azure even provides a uri to continue checking the status of the request).
If there's a way to catch those retryable 409 errors in the azurerm provider, that would work too, but I suspect that might not be as easy as it sounds.

The Azure APIs make this a challenge: whilst in general the same status codes are used for most APIs, 409 is used in a bunch of different ways, such that go-autorest couldn't automatically retry on it without breaking some of those APIs (in addition to the scenario you've mentioned).

I vaguely recall seeing the error code AnotherOperationInProgress used somewhere for something entirely different, so since the SDK is generated it may unfortunately be simpler for this logic to live within the specific resource in Terraform than in the SDK/go-autorest. FWIW the SDK exposes the HTTP status codes, so it should be possible to pull that information out and retry in the resources which need it.

For those running 1.33.1 and running into the AnotherOperationInProgress error while creating subnets, try pinning your terraform azurerm provider to 1.33.0 until the next version is released.
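As a rough illustration of the pinning suggested above, assuming the Terraform 0.12-era convention of setting a version constraint inside the provider block (newer Terraform releases declare provider constraints elsewhere):

```hcl
# Sketch only: pin the azurerm provider to exactly 1.33.0 (Terraform 0.12 style).
provider "azurerm" {
  version = "=1.33.0"
}
```

After changing the constraint, re-running terraform init should fetch the pinned provider version.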

@tombuildsstuff

FWIW the SDK exposes the HTTP Status Codes so it should be possible to pull that information out/retry in the resources which need it.

That would be great. I still feel like the Azure API should be adding retry headers when these 409s are retryable, but adding code to the resources would be a much quicker short term fix.

👋

The recent locking changes have been rolled back in #4320, which has been merged into master, so that fix will ship in version 1.34.0 of the Azure Provider. As such I'm going to hide the comments about this recent bug so that this issue stays focused on the original problem.

Thanks!

My setup is quite similar (and also got the same results with v1.33.0):

Terraform v0.12.8
+ provider.azurerm v1.33.1

I have a single virtual network with 4 subnets and 4 security groups. With all four group/subnet associations commented out, everything runs fine. If I add back one security group association and run terraform apply, it tries to associate the security group with the subnet. If I run terraform apply again (without changing the code itself), the next run tries to remove the associated security group from the subnet.

@Clausewitz45 I don't think that's related. FYI, you can work around that by adding a lifecycle ignore_changes entry for the security group ID.
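A minimal sketch of that suggestion, assuming Terraform 0.12 syntax and assuming the attribute in question is the subnet's network_security_group_id field from the 1.x provider schema (the comment above does not name it exactly; the subnet and vnet names are reused from the original report):

```hcl
# Sketch only: stop Terraform from reverting the NSG set by the association resource.
resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.0/28"

  lifecycle {
    # Assumed attribute name; adjust to whichever field shows up in the plan diff.
    ignore_changes = [network_security_group_id]
  }
}
```

This only hides the flip-flopping diff between runs; it does not address the AnotherOperationInProgress error itself.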

@suonto thanks for mentioning it - it's a nice workaround for the issue (at least I can continue the provisioning of the infrastructure), but this should not be the default behavior. I thought it is related because I got the same error message, but it is working with just a single subnet - not with 3.

This is still occurring with Provider 2.24. Is there a workaround?

Still occurs with provider 2.25 as well. Is there an update or workaround?

Anyone still having this?
I'm using terraform 0.12.29 and azurerm 2.30.0.

Error: Error updating Subnet "subnet-name" (Virtual Network "vnet-name" / Resource Group "BI-Test"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/<subscription-id>providers/Microsoft.Network/locations/brazilsouth/operations/<subscription-id>?api-version=2020-05-01." Details=[]

I'm having the following error with the 2.33.0 provider:

Error: Error updating Subnet "test-subnet" (Virtual Network "test-vnet" / Resource Group "rg-test): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress.
