Azure-sdk-for-js: [storage-file-datalake] 'Unable to extract accountName' exception when instantiating DataLakeServiceClient with a Gov Cloud ADLS File System endpoint

Created on 6 May 2020 · 13 Comments · Source: Azure/azure-sdk-for-js

  • Package Name: @azure/storage-file-datalake
  • Package Version: 12.0.0
  • Operating system:
  • [x] nodejs

    • version: 12.16.2

  • Is the bug related to documentation in

Describe the bug

Instantiating a DataLakeServiceClient using an ADLS File System endpoint (e.g. https://{account}.dfs.core.usgovcloudapi.net) will fail and throw an exception with the error message: "Error: Unable to extract accountName with provided information."

To Reproduce

  1. Follow the instructions documented at https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-javascript

  2. Attempt to instantiate a DataLakeServiceClient using an ADLS file system endpoint:

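import { DataLakeServiceClient } from "@azure/storage-file-datalake";
import { DefaultAzureCredential } from "@azure/identity";

// Gov Cloud DFS endpoint; {account} is a placeholder for the storage account name
const dataLakeServiceClient = new DataLakeServiceClient(
  "https://{account}.dfs.core.usgovcloudapi.net",
  new DefaultAzureCredential()
);
const fileSystemClient = dataLakeServiceClient.getFileSystemClient("myfilesystemname");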
  3. You should receive an error:
Exception: Error: Unable to extract accountName with provided information.
Stack: Error: Unable to extract accountName with provided information.
at getAccountNameFromUrl (/node_modules/@azure/storage-file-datalake/dist/index.js:3538:15)
at DataLakeServiceClient.StorageClient (/node_modules/@azure/storage-file-datalake/dist/index.js:4631:28)
at new DataLakeServiceClient (/node_modules/@azure/storage-file-datalake/dist/index.js:7108:28)

Expected behavior

I should be able to provide the ADLS file system endpoint from my Storage Account and instantiate a new DataLakeServiceClient.

Additional context

Providing a blob service endpoint instead returns a MissingRequiredHeader RestError that asks for an 'x-ms-blob-type' header.

getAccountNameFromUrl() does an explicit check for blob in the URL and will throw this exception: https://github.com/Azure/azure-sdk-for-js/blob/ef519c3c3f14f167e923d6f55fec8fb88cd315e1/sdk/storage/storage-file-datalake/src/utils/utils.common.ts#L549
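To make the failure mode concrete, here is a minimal sketch of a host check of the kind described above. It is a simplified model written for illustration, not the SDK's actual code: the function name `extractAccountName` and the use of the standard `URL` class are assumptions, and the path-style fallback is assumed as well.

```typescript
// Simplified model of an account-name lookup keyed on "blob" in the host.
// NOT the SDK's implementation; names and structure are illustrative only.
function extractAccountName(url: string): string {
  const { hostname, pathname } = new URL(url);
  const labels = hostname.split(".");
  // Host-style URL: <account>.blob.<suffix>
  // Otherwise assume a path-style URL: http://<ip>:<port>/<account>/
  const accountName = labels[1] === "blob" ? labels[0] : pathname.split("/")[1];
  if (!accountName) {
    throw new Error("Unable to extract accountName with provided information.");
  }
  return accountName;
}
```

Under this model, a host such as `{account}.dfs.core.usgovcloudapi.net` that reaches the check without first being rewritten to a `blob` host has no `blob` label and no account segment in its path, so the lookup throws.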

Labels: Client, Service Attention, Storage, bug, customer-reported

All 13 comments

I don't think there is anything wrong with getAccountNameFromUrl, since it's called with a blob endpoint URL that is transformed from the original URL.

Also, I cannot reproduce it. Note that:

DefaultAzureCredential will first look for Azure Active Directory (AAD)
client secret credentials in the following environment variables:

  • AZURE_TENANT_ID: The ID of your AAD tenant
  • AZURE_CLIENT_ID: The ID of your AAD app registration (client)
  • AZURE_CLIENT_SECRET: The client secret for your AAD app registration

If those environment variables aren't found and your application is deployed
to an Azure VM or App Service instance, the managed service identity endpoint
will be used as a fallback authentication source.

How do you getDefaultCredential()?
This is weird as even if you haven't configured the credential correctly, the error shouldn't be thrown from getAccountNameFromUrl.

Can you try this sample in your environment?

import { DataLakeServiceClient } from "@azure/storage-file-datalake";
import { DefaultAzureCredential } from "@azure/identity";

// Load the .env file if it exists
import * as dotenv from "dotenv";
dotenv.config();

export async function main() {
  // Enter your storage account name
  const account = process.env.ACCOUNT_NAME || "";
  if (!account) {
    console.warn(
      "Account name not provided, but it is required to run this sample. Exiting."
    );
    return;
  }

  // Azure AD Credential information is required to run this sample:
  if (
    !process.env.AZURE_TENANT_ID ||
    !process.env.AZURE_CLIENT_ID ||
    !process.env.AZURE_CLIENT_SECRET
  ) {
    console.warn(
      "Azure AD authentication information not provided, but it is required to run this sample. Exiting."
    );
    return;
  }

  const defaultAzureCredential = new DefaultAzureCredential();
  const dataLakeServiceClient = new DataLakeServiceClient(
    `https://${account}.dfs.core.windows.net`,
    defaultAzureCredential
  );

  const fileSystemClient = dataLakeServiceClient.getFileSystemClient("myfilesystemname");
  await fileSystemClient.exists();
}

main().catch((err) => {
  console.error("Error running sample:", err.message);
});

You're correct, @ljian3377, that is how we're obtaining a DefaultAzureCredential from within an Azure Function.

I omitted the getDefaultCredential(), but it's just an abstraction of the checks you have in your main and returns a DefaultAzureCredential(). For reference, here's that omitted method (it's the same as the sample you provided):

const getDefaultCredential = () => {
  if (
    !process.env.AZURE_TENANT_ID ||
    !process.env.AZURE_CLIENT_ID ||
    !process.env.AZURE_CLIENT_SECRET
  ) {
    console.warn(
      "Azure AD authentication information not provided, but it is required to run this sample. Exiting."
    );
    return;
  }
  return new DefaultAzureCredential({ authorityHost: "https://login.microsoftonline.us/" });
}

Using your sample I am returned the same error:

UnhandledPromiseRejectionWarning: Error: Unable to extract accountName with provided information.
at getAccountNameFromUrl (/node_modules/@azure/storage-blob/dist/index.js:10146:15)
at BlobServiceClient.StorageClient (/node_modules/@azure/storage-blob/dist/index.js:12199:28)
at new BlobServiceClient (/node_modules/@azure/storage-blob/dist/index.js:17869:24)
at new DataLakeServiceClient (/node_modules/@azure/storage-file-datalake/dist/index.js:7111:35)

I updated the original issue for added context: I am accessing a storage account hosted at https://{account}.dfs.core.usgovcloudapi.net. Could that affect the transform to the blob endpoint URL?

https://github.com/Azure/azure-sdk-for-js/blob/f7a74a3bcc53ae6f2e7bb9151e1e9828b4bfedab/sdk/storage/storage-file-datalake/src/utils/constants.ts#L199
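The question above can be illustrated with a simplified model of a suffix-mapping transform. This is only a sketch under stated assumptions: `toBlobEndpointUrlSketch` and `assumedDefaultMappings` are hypothetical names, and the table is assumed to contain only a public-cloud pair, which matches the behaviour reported in this thread but is not copied from the SDK.

```typescript
type HostMapping = [dfsSuffix: string, blobSuffix: string];

// Assumption: a default table that only knows the public-cloud suffix.
const assumedDefaultMappings: HostMapping[] = [
  ["dfs.core.windows.net", "blob.core.windows.net"],
];

// Simplified model: replace the first matching DFS host suffix with its
// blob counterpart; leave the URL unchanged when nothing matches.
function toBlobEndpointUrlSketch(url: string, mappings: HostMapping[]): string {
  const parsed = new URL(url);
  for (const [dfsSuffix, blobSuffix] of mappings) {
    if (parsed.hostname.includes(dfsSuffix)) {
      parsed.hostname = parsed.hostname.replace(dfsSuffix, blobSuffix);
      break;
    }
  }
  return parsed.toString();
}
```

With such a table, a `dfs.core.windows.net` host is rewritten to `blob.core.windows.net`, while a `dfs.core.usgovcloudapi.net` host matches nothing and passes through unchanged, which would be consistent with the stack trace above.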

I'm able to make successful calls if I bypass the try-catch in getAccountNameFromUrl() and hardcode it to return the account name as parsedUrl.getHost().split(".")[0].

@glennmusa did you try the same using an account in the public cloud vs gov?

@isamllr accessing the public cloud storage account works as documented.

Hi, ToBlobEndpointHostMappings and ToDfsEndpointHostMappings are exported by @azure/storage-file-datalake package.

Please add the following mapping to ToBlobEndpointHostMappings and try again.

import { ToBlobEndpointHostMappings } from "@azure/storage-file-datalake";
ToBlobEndpointHostMappings.push(["dfs.core.usgovcloudapi.net", "blob.core.usgovcloudapi.net"]);

//...

@ljian3377 We need to add the national cloud suffixes to the default values of ToBlobEndpointHostMappings and ToDfsEndpointHostMappings.
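If the workaround ends up in code that can run more than once (for example inside a function handler rather than at module load), a small guard keeps the same pair from being pushed repeatedly. The helper below is a hypothetical sketch, not part of the package; the `string[][]` shape is inferred from the `push` call in this comment.

```typescript
// Hypothetical helper: append a [dfsSuffix, blobSuffix] pair only if an
// identical pair is not already in the table. Returns true when it was added.
function addHostMappingOnce(
  mappings: string[][],
  dfsSuffix: string,
  blobSuffix: string
): boolean {
  const exists = mappings.some(
    (pair) => pair[0] === dfsSuffix && pair[1] === blobSuffix
  );
  if (!exists) {
    mappings.push([dfsSuffix, blobSuffix]);
  }
  return !exists;
}

// Intended use with the exported table (assumption, shown as a comment so the
// sketch stays self-contained):
// addHostMappingOnce(ToBlobEndpointHostMappings,
//   "dfs.core.usgovcloudapi.net", "blob.core.usgovcloudapi.net");
```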

@XiaoningLiu Could you help provide a more complete list for this mapping?

Thanks for pointing that out @XiaoningLiu. This works.

We'll use this method for now and will keep an eye out for the implementation that includes the national cloud suffixes.

Appreciate the help @XiaoningLiu, @ljian3377, and @isamllr!

Here's what I pulled from az cloud list

> az cloud list --query '[].{ name:name, storageEndpoint:suffixes.storageEndpoint }' --output table

Name               StorageEndpoint
-----------------  ----------------------
AzureCloud         core.windows.net
AzureChinaCloud    core.chinacloudapi.cn
AzureUSGovernment  core.usgovcloudapi.net
AzureGermanCloud   core.cloudapi.de

According to @XiaoningLiu, we could also add these three:

preprod.core.windows.net
core.microsoft.scloud
core.eaglex.ic.gov
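As a rough sketch of how the suffix lists above could be turned into mapping tables, the snippet below derives both directions from one list. The `dfs.`/`blob.` prefix pattern is extrapolated from the single gov cloud pair shown earlier in this thread, and `buildHostMappings` is a hypothetical helper, so treat the result as an assumption rather than the values the SDK ships.

```typescript
// Storage endpoint suffixes collected in this thread.
const storageSuffixes: string[] = [
  "core.windows.net",
  "core.chinacloudapi.cn",
  "core.usgovcloudapi.net",
  "core.cloudapi.de",
  "preprod.core.windows.net",
  "core.microsoft.scloud",
  "core.eaglex.ic.gov",
];

// Hypothetical helper: build DFS-to-blob and blob-to-DFS pairs per suffix.
function buildHostMappings(suffixes: string[]): {
  toBlob: string[][];
  toDfs: string[][];
} {
  const toBlob = suffixes.map((s) => [`dfs.${s}`, `blob.${s}`]);
  const toDfs = suffixes.map((s) => [`blob.${s}`, `dfs.${s}`]);
  return { toBlob, toDfs };
}

const generatedMappings = buildHostMappings(storageSuffixes);
```

Generating both tables from a single list keeps the two directions in sync when another cloud suffix is added.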

The fix is now available in our July preview release. The GA release will be in August.

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

The fix has already been released in 12.1.0-preview.1. It will be GA in the next release.

This should have been fixed in the GA release of @azure/storage-file-datalake.
