Terraform-provider-azurerm: [1.34] Destroying load balancer errors due to CannotRemoveRuleUsedByProbeUsedByVMSS

Created on 20 Sep 2019  ·  13 Comments  ·  Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

% terraform -v
Terraform v0.12.9
% terragrunt -v
terragrunt version v0.19.25
% terragrunt providers
.
└── provider.azurerm =1.34

Affected Resource(s)

  • azurerm_lb
  • azurerm_lb_probe
  • azurerm_lb_rule
  • azurerm_virtual_machine_scale_set

Terraform Configuration Files

I managed to condense the test case to the single file below.

terraform {
  backend "local" {}
}

provider "azurerm" {
  version = "=1.34"
}

resource "azurerm_resource_group" "testcase" {
  name     = "testcase-rg"
  location = "West Europe"
}

resource "azurerm_virtual_network" "testcase" {
  name                = "testcase"
  resource_group_name = azurerm_resource_group.testcase.name
  location            = azurerm_resource_group.testcase.location
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "testcase" {
  name                      = "testcase"
  resource_group_name       = azurerm_resource_group.testcase.name
  virtual_network_name      = azurerm_virtual_network.testcase.name
  address_prefix            = "10.0.0.0/24"
}

resource "azurerm_lb" "testcase-lb" {
  name                = "testcase-lb"
  location            = azurerm_resource_group.testcase.location
  resource_group_name = azurerm_resource_group.testcase.name

  frontend_ip_configuration {
    name                          = "testcase-lb-ip"
    subnet_id                     = azurerm_subnet.testcase.id
    private_ip_address_allocation = "Dynamic"
  }
  sku = "Standard"
}

resource "azurerm_lb_backend_address_pool" "testcase-lb" {
  name                = "testcase-lb-address-pool"
  resource_group_name = azurerm_resource_group.testcase.name
  loadbalancer_id     = azurerm_lb.testcase-lb.id
}

resource "azurerm_lb_probe" "testcase" {
  name                = "https-probe"
  resource_group_name = azurerm_resource_group.testcase.name
  loadbalancer_id     = azurerm_lb.testcase-lb.id
  protocol            = "Https"
  port                = 443
  request_path        = "/"
}

resource "azurerm_lb_rule" "testcase-lb" {
  name                           = "Https"
  resource_group_name            = azurerm_resource_group.testcase.name
  loadbalancer_id                = azurerm_lb.testcase-lb.id
  protocol                       = "Tcp"
  frontend_port                  = 443
  backend_port                   = 443
  frontend_ip_configuration_name = "testcase-lb-ip"
  backend_address_pool_id        = azurerm_lb_backend_address_pool.testcase-lb.id
  probe_id                       = azurerm_lb_probe.testcase.id
}

resource "azurerm_virtual_machine_scale_set" "testcase-backend-vmss" {
  name                = "testcase-backend-vmss"
  location            = azurerm_resource_group.testcase.location
  resource_group_name = azurerm_resource_group.testcase.name

  automatic_os_upgrade = false
  overprovision        = true
  upgrade_policy_mode  = "Automatic"

  health_probe_id = azurerm_lb_probe.testcase.id

  sku {
    name     = "Standard_B1s"
    tier     = "Standard"
    capacity = 1
  }

  storage_profile_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  storage_profile_os_disk {
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
    os_type           = "Linux"
  }

  os_profile {
    computer_name_prefix = "testcase"
    admin_username       = "ubuntu"
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path     = "/home/ubuntu/.ssh/authorized_keys"
      key_data = file("id_rsa.pub")
    }
  }

  network_profile {
    name    = "testcase-network-profile"
    primary = true

    ip_configuration {
      name                                   = "testcase-ip-configuration"
      primary                                = true
      subnet_id                              = azurerm_subnet.testcase.id
      load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.testcase-lb.id]
    }
  }
}

resource "azurerm_monitor_autoscale_setting" "testcase" {
  name                = "testcase-autoscale"
  resource_group_name = azurerm_resource_group.testcase.name
  location            = azurerm_resource_group.testcase.location
  target_resource_id  = azurerm_virtual_machine_scale_set.testcase-backend-vmss.id

  profile {
    name = "defaultProfile"

    capacity {
      default = 1
      minimum = 1
      maximum = 1
    }
  }
}

Debug Output

https://gist.github.com/ashemedai/63230ee8d5cb000786b145f338c317b3

I removed the long list of regions with their API versions, which should make the debug log a bit easier to read.

Expected Behavior

All resources should be destroyed correctly in a single run.

Actual Behavior

Terraform errored out after receiving a CannotRemoveRuleUsedByProbeUsedByVMSS error from Azure.

azurerm_lb_rule.testcase-lb: Destroying... [id=/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/Microsoft.Network/loadBalancers/testcase-lb/loadBalancingRules/Https]
azurerm_monitor_autoscale_setting.testcase: Destroying... [id=/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/microsoft.insights/autoscalesettings/testcase-autoscale]
azurerm_monitor_autoscale_setting.testcase: Destruction complete after 0s
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Destroying... [id=/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/Microsoft.Compute/virtualMachineScaleSets/testcase-backend-vmss]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 10s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 20s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 30s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 40s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 50s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 1m0s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Still destroying... [id=/subscriptions/00000000-0000-0000-0000-...MachineScaleSets/testcase-backend-vmss, 1m10s elapsed]
azurerm_virtual_machine_scale_set.testcase-backend-vmss: Destruction complete after 1m10s

Error: Error Creating/Updating Load Balancer "testcase-lb" (Resource Group "testcase-rg"): network.LoadBalancersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="CannotRemoveRuleUsedByProbeUsedByVMSS" Message="Load balancer rule /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/Microsoft.Network/loadBalancers/testcase-lb/loadBalancingRules/Https cannot be removed because the rule references the load balancer probe /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/Microsoft.Network/loadBalancers/testcase-lb/probes/https-probe that is used as health probe by VM scale set /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/testcase-rg/providers/Microsoft.Compute/virtualMachineScaleSets/testcase-backend-vmss. To remove this rule, please update VM scale set to remove the reference to the probe." Details=[]

Steps to Reproduce

  1. Save file to main.tf
  2. terraform apply
  3. terraform destroy
  4. Observe the error

Important Factoids

Azure region: West Europe.
Also tried provider version 1.33, with the same results.

Labels: question, service/vmss

All 13 comments

I can work around the problem by adding a depends_on = [azurerm_lb_rule.testcase-lb] to the azurerm_virtual_machine_scale_set resource. That pushes the destroying of the rule after the destroying of the VMSS.
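Sketched against the resource names from the test case above, the workaround looks like this (a minimal illustration of the depends_on addition, not a complete or tested configuration):

resource "azurerm_virtual_machine_scale_set" "testcase-backend-vmss" {
  # ... all existing arguments unchanged ...

  # Explicit dependency: forces Terraform to destroy this VMSS before
  # the LB rule, avoiding CannotRemoveRuleUsedByProbeUsedByVMSS.
  depends_on = [azurerm_lb_rule.testcase-lb]
}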

@tombuildsstuff Just curious, why did the label change from bug to question?

hi @ashemedai

Thanks for opening this issue - apologies I thought I'd replied to this.

In this case, since there's no dependency between the Load Balancer Rule and the VM Scale Set, Terraform believes that it can destroy both at the same time, rather than treating one as dependent on the other (as Azure does) - as you've mentioned, it's possible to add a depends_on to capture this dependency and ensure the LB Rule is removed prior to the VMSS being deleted.

Unfortunately this is by design, since there's no direct link between the VM Scale Set and the Load Balancer Rule within Azure - as such I don't believe there's much we can do to work around this (besides potentially exposing a load_balancer_rule_id field, which would be equivalent to a depends_on).

As such, our recommendation here is to add a depends_on between the VM Scale Set and the Load Balancer Rule to ensure this dependency is captured - and since there's no dependency between these resources in Azure that we can expose to make this implicit rather than explicit, I'm going to close this issue for the moment. We're currently working on new versions of the Virtual Machine Scale Set resources - as part of this we'll document this requirement in both the new resources and the existing VMSS resource.

Thanks!

@tombuildsstuff That's clear. And thanks for including the note on the documentation, was going to ask that while reading through your answer.

@ashemedai 👍 this is included in #4585 :)

@tombuildsstuff Awesome stuff! Hopefully I can help out on the Go front at some point. But first need to get this infrastructure implemented.

@tombuildsstuff I noticed a related issue with azurerm_lb_backend_address_pool and azurerm_virtual_machine_scale_set. Even with depends_on set, there are issues changing and deleting resources.

The dependency graph here implies that when dependencies change, the dependent resource must be changed first. The process should probably be: remove the settings from the dependent VMSS first, replace the dependencies, then add the new settings to the VMSS at the end.

Is there support for splitting the change like this?

Good point actually, @mbrancato. When I was doing various changes I was constantly following my written down order to do that VMSS, load balancer probe, load balancer dance.

In addition to that, if you, for whatever reason, have manual update mode set, you might be wondering why the heck some of those steps fail. For people in the future running into this problem: it's because your VM instances' models are out of sync with the VMSS definition. You will need to manually update the instances' model definition via something like az vmss update-instances. And judging by the Google hits, I was not the first to fall into the trap of thinking the update referred to the image rather than the model.
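For reference, the manual model sync mentioned above can be done with the Azure CLI. The resource names below are taken from this test case, and the "*" wildcard updates all instances (a sketch, not verified output):

# List instances whose model is out of date (latestModelApplied == false).
az vmss list-instances \
  --resource-group testcase-rg \
  --name testcase-backend-vmss \
  --query "[?latestModelApplied==\`false\`].instanceId" -o tsv

# Roll every instance forward to the current VMSS model definition.
az vmss update-instances \
  --resource-group testcase-rg \
  --name testcase-backend-vmss \
  --instance-ids "*"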

On the one hand I wonder if Terraform should detect this desync in state between VMSS and instances. On the other hand it seems adding this could be out of scope.

@ashemedai @mbrancato it's worth noting when designing the replacement VMSS resources (outlined in #2807 - available in Beta soon) have a flag which intentionally updates the model (and the images) during an update if required.

It's possible that we could introduce some kind of association resource between VMSS and Load Balancers - but I don't think that's a viable solution since the association would need to update the profile/roll the instances within the set, for each kind of association (e.g. App Gateway, Load Balancer etc) - which means you'd be rolling outside of the main VMSS resources, or asking users to do so out of band - neither of which is ideal.

We intentionally shut down the VM Scale Set during deletion as part of the new resources, which alleviates some other kinds of bugs, but it's by no means a silver bullet - IMO the Azure API should allow running these commands in parallel, since whilst they're related, Azure should detect that & lock/poll internally as needed.

@tombuildsstuff Good point actually. The burden should lie on the Azure backend, indeed. I'll have an App Consult session with Microsoft soon, I'll highlight some of the issues I ran into to them via that channel as well.

This has been released in version 1.36.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 1.36.0"
}
# ... other configuration ...

I updated to 1.36.1 and still saw the dependency issue with the CannotRemoveRuleUsedByProbeUsedByVMSS error described here. I've opened a new issue (#4769) to track this as it seems to be related but different from the issue here.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉, please reach out to my human friends 👉 [email protected]. Thanks!
