Terraform-provider-azurerm: NXDOMAIN on mykeyvaultname.vault.azure.net should be (able to be) treated as azurerm_key_vault_{secret,certificate} deleted on refresh

Created on 27 Nov 2018  ·  17Comments  ·  Source: terraform-providers/terraform-provider-azurerm

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

It is generally the expected/desired behavior of Terraform that when a resource is deleted out of band and a subsequent plan or apply is run (after refresh) that terraform proposes to recreate any such resources present in the module but removed from state

In the case of an Azure keyvault's deletion, a terraform module which specifies that keyvault and (some of) that keyvault's contents ought to propose the recreation of the keyvault and those contents.

It does not.

What you get instead is a crashed refresh, with (potentially many) messages like

azurerm_key_vault_certificate.ssl_cert: azurerm_key_vault_certificate.ssl_cert: keyvault.BaseClient#GetCertificate: Failure sending request: StatusCode=0 -- Original Error: Get https://mykeyvaultname.vault.azure.net/certificates/example-certificate/?api-version=2016-10-01: dial tcp: lookup mykeyvaultname.vault.azure.net on [2001:4898::1050:5050]:53: no such host

This is a result of the fact of implementation that a keyvault represents an independent hostname endpoint of the keyvault API, thus when a keyvault is removed, the entire API surface for that kevault is likewise removed, so the refresh fails on trying to get the {secret, certificate} state.

The code for this refresh https://github.com/terraform-providers/terraform-provider-azurerm/blob/f7248221c808b1ce605ff9d2857cfbc68da2e4db/azurerm/resource_arm_key_vault_secret.go#L160-L168

handles 404 on the secret itself as a warn and remove from state, but the NXDOMAIN on the keyvault itself simply crashes the refresh. But it is always the case that a deleted key vault's contents are also deleted, so I contend this sort of failure ought to be considered a deletion as well.

I believe that the resource graph will reasonably ensure a recreate is not attempted until after a recreate of the vault itself has been done

I think this would be a reasonable default (always) behavior, but if the current behavior were considered for any reason desirable, it would be nice if there were at least some way to opt-in to this behavior

New or Affected Resource(s)

  • azurerm_key_vault_certificate
  • azurerm_key_vault_secret
enhancement servickeyvault thinking

Most helpful comment

This needs to be reopened.

Firstly the documentation has not been updated to reflect the change.

Secondly, after moving to key_vault_id with 1.22.1 I can no longer reference a Key Vault that exists in another subscription than the one I'm executing Terraform against. It seems to ignore the subscription in the ID/URI.

data.azurerm_key_vault_secret.foo: data.azurerm_key_vault_secret.foo: Error looking up Secret "my_secret" vault url form id "/subscriptions/subscription_with_key_vault/resourceGroups/my_resource_group/providers/Microsoft.KeyVault/vaults/my_key_vault": Error unable to find KeyVault "my_key_vault" (Resource Group "my_resource_group"): keyvault.VaultsClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="ResourceGroupNotFound" Message="Resource group 'my_resource_group' could not be found."

All 17 comments

The result of this in practice for my team is that people frequently throw away their terraform state as a whole and start from scratch; and people generally consider terraform as a whole unreliable

@donaldguy thanks for reporting this - apologies for the delayed response here.

Thinking about this: I'm wondering if it'd make sense to deprecate the vault_uri field and instead replace this with a key_vault_id field, which would allow us to:

a) look up the Vault URI from the Key Vault resource
b) handle the Key Vault being deleted
c) work around a bug in TF Core where references other than ID don't force a re-creation of child objects (e.g. Access Policy / Certs / Keys / Secrets) when the parent needs recreation until the second run

WDYT @katbyte?

sounds like a good solution to me @tombuildsstuff

After downloading the new module with this fix, there is coming a lot off error when running Terraform Plan
module.key_vault.azurerm_key_vault_secret.kv_secret[5]: azurerm_key_vault_secret.kv_secret.5: Error checking if key vault "" for Secret "sqlserver--admin--password"
in Vault at url "https://akeyvaultulr.vault.azure.net/" exists: keyVaultId is empty

Some way to fix this or must I delete state file and run apply again?

@snerting unfortunately that's a known issue in v1.22 of the Provider which has a fix in #2874 - we'll be releasing an update shortly which includes that fix.

For the moment you can pin to the previous version of the Provider like so:

provider "azurerm" {
  version = "=1.21.0"
}

@tombuildsstuff #2874 states for resources created before 22, but i am trying to create a fresh environment and experiencing @snerting 's error. Is that also address in there?

@MattHartz can you confirm if that’s still an issue on v1.22.1, which includes a fix for this?

@tombuildsstuff Yeah I got it working by refreshing the state if i recall correctly.

Should I be able to retrieve a secret from a key vault that is living in a different subscription with "data.azurerm_key_vault_secret"? (access policies are set correctly)

I'm using the now deprecated vault_uri which sounds like it doesn't matter to which subscription the vault belongs. I'm getting the following error:

* data.azurerm_key_vault_secret.my_secret: data.azurerm_key_vault_secret.my_secret: Unable to locate the Resource ID for the Key Vault at URL "https://myvault.vault.azure.net/": %!s(<nil>)

This needs to be reopened.

Firstly the documentation has not been updated to reflect the change.

Secondly, after moving to key_vault_id with 1.22.1 I can no longer reference a Key Vault that exists in another subscription than the one I'm executing Terraform against. It seems to ignore the subscription in the ID/URI.

data.azurerm_key_vault_secret.foo: data.azurerm_key_vault_secret.foo: Error looking up Secret "my_secret" vault url form id "/subscriptions/subscription_with_key_vault/resourceGroups/my_resource_group/providers/Microsoft.KeyVault/vaults/my_key_vault": Error unable to find KeyVault "my_key_vault" (Resource Group "my_resource_group"): keyvault.VaultsClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="ResourceGroupNotFound" Message="Resource group 'my_resource_group' could not be found."

@mjiderhamn I would expect that you have to define another aliased provider for the other subscription. That is the normal pattern, right?

@diroussel The issue is that it was working beforehand, using the same provider.

I mean the core issue is that keyvault data is exposed via an API divorced from ARM whilst metadata is solidly managed under it

I think in some sense, a separate provider, akin to azuread, would be the correct place for this sort of data source, and indeed secret, key, certificate (but not access policy) resources - but that would definitely recreate issues like my OP

Personally, (speaking as an individual dev not a Microsoft employee,) I think it's bad API design and there is not really a correct pragmatic solution compatible with both terraform's design principles and common workflows

I would contend that it's not absurd for the data source to work cross sub, but I certainly wouldn't assert that it should. I might well consider the old behavior a sometimes convenient bug like the silent importing being removed in 2.0

Til then, people who care should just pin back to 1.21 ; but that's just my outside opinion; I am not a maintainer on this project

I agree with @donaldguy that it might be more proper to make Azure Key Vault a separate provider (compare to Hashicorp Vault).

Regardless, the documentation needs to be updated to reflect the current implementation.

Agree with @mjiderhamn that the docs need to reflect this implementation.

Not updating the documentation when deprecating is sloppy.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

Was this page helpful?
0 / 5 - 0 ratings