Azure-sdk-for-net: [BUG] Class KeyVaultCredential does not reflect change in www-authenticate header

Created on 5 Apr 2019  路  4Comments  路  Source: Azure/azure-sdk-for-net

If the customer moves a key vault from one tenant to another, the www-authenticate header changes, but the KeyVaultCredential class does not detect the new value and use the previous forever. The process keeps requesting tokens for the wrong tenant and becomes unable to access that specific key vault again.

This caused an incident that required rebooting of servers in the cloud.

The bug is related to this code (link):

        var accessToken = await PreAuthenticate(request.RequestUri).ConfigureAwait(false);
        if (!string.IsNullOrEmpty(accessToken))
            // [BUG] If this token is invalid because of wrong audience or resource, the request will get a 401 and the PostAuthenticate will never called to update the cache.
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        else
        {
            ...
        }

The proposed fix is when the request fails with 401 down the pipeline, the PostAuthenticate method is called with the response, allowing the challenge cache to be updated.

Client KeyVault

Most helpful comment

This issue has been addressed and this fix has been published in v3.0.4 (https://www.nuget.org/packages/Microsoft.Azure.KeyVault/3.0.4). This fix updates the challenge class whenever it receives a 401 response with the WWW-Authenticate header. The resulting behavior when a vault is migrated from one tenant to another is that the first call to the KeyVaultClient after the migration will still call the authentication callback with the original authority, and so that call will still fail with a 401 and the user would have to handle the exception. However subsequent calls to the KeyVaultClient will pass the correct authority to the authentication callback.

All 4 comments

Thank you for opening this issue! We are routing it to the appropriate team for follow up.

CC: @schaabs

@msfcolombo can you tell me more about the scenario you are seeing this in? We are trying to figure out how common this scenario is.

It happens when the subscription administrator moves the key vault (or the parent subscription) from one tenant to another.

The detailed scenario is as follows:

  1. Customer has vault contosovault on a tenant A.
  2. Some service kvaccessor is continuously doing requests against contosovault. The service is supposed to run 24x7.
  3. Administrator moves contosovault or the parent subscription from tenant A to tenant B. This should not cause any outage.
  4. Eventually the token used by kvaccessor expires and Key Vault returns 401 with a value of www-authenticate (challenge) that reflects the tenant move.
  5. Key Vault client ignores that and asks for a new token using the previous (wrong) challenge, the one for tenant A.
  6. The new token does not work. Key Vault returns 401 again. The service is locked on HTTP 401 results.

It's uncommon because moving a vault from a tenant to another is uncommon. But when it happens, it appears that the service is facing an outage similar to expired credentials. Rebooting the machine or restarting the process fixes, because it throws away the cache.

This issue has been addressed and this fix has been published in v3.0.4 (https://www.nuget.org/packages/Microsoft.Azure.KeyVault/3.0.4). This fix updates the challenge class whenever it receives a 401 response with the WWW-Authenticate header. The resulting behavior when a vault is migrated from one tenant to another is that the first call to the KeyVaultClient after the migration will still call the authentication callback with the original authority, and so that call will still fail with a 401 and the user would have to handle the exception. However subsequent calls to the KeyVaultClient will pass the correct authority to the authentication callback.

Was this page helpful?
0 / 5 - 0 ratings