Dvc: Remote WASB via SAS token

Created on 5 Sep 2020  ·  32 Comments  ·  Source: iterative/dvc

Good beautiful day team,

This feature request refers to the following forum post: https://discuss.dvc.org/t/how-use-dvc-with-azure-blob-storage/158/3. The text below is almost identical to what I posted in the DVC forum.

In particular, I am interested in using a DVC remote with a SAS token. When I add the Azure Blob remote to the DVC config via an https:// link and run dvc push, it returns the error 405 The resource doesn't support specified Http Verb.

To sum up, providing the remote as:

https://wasburl/container/?sig=<sastoken>

results in

dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'

From what I have read in different forums, it appears that Azure requires Cross-Origin Resource Sharing (CORS) headers, and DVC in its current form does not appear to support them. Hence this feature request: please have DVC include CORS headers.

===========================
For the record, the current resultant DVC output log is:

SHORT/REGULAR OUTPUT:

(base) ➜  DVC git:(master) ✗ dvc push  [9:36:28]
ERROR: failed to upload '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to 'https://<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7' - '405 The resource doesn't support specified Http Verb.'
ERROR: failed to upload '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e' - '405 The resource doesn't support specified Http Verb.'
ERROR: failed to upload '.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' to 'https://<wasb>/<folder>/66/52640f2cde8646134c9b5cdfb4583f.dir'
ERROR: failed to push data to the cloud - 3 files failed to upload

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

VERBOSE OUTPUT:

(base) ➜  DVC git:(master) ✗ dvc push --verbose  [9:36:40]
2020-09-04 21:40:27,553 DEBUG: Check for update is enabled.
2020-09-04 21:40:27,559 DEBUG: fetched: [(3,)]
2020-09-04 21:40:27,576 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,578 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,581 DEBUG: Preparing to upload data to 'https://<wasb>/<folder>/?sig=<SAStoken>
2020-09-04 21:40:27,581 DEBUG: Preparing to collect status from https://<wasb>/<folder>/?sig=<SAStoken>
2020-09-04 21:40:27,581 DEBUG: Collecting information from local cache...
2020-09-04 21:40:27,582 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' is unchanged since it is read-only
2020-09-04 21:40:27,601 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,602 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' is unchanged since it is read-only
2020-09-04 21:40:27,602 DEBUG: Collecting information from remote cache...
2020-09-04 21:40:27,603 DEBUG: Querying 1 hashes via object_exists
2020-09-04 21:40:28,030 DEBUG: Matched '0' indexed hashes
2020-09-04 21:40:28,030 DEBUG: Querying 3 hashes via object_exists
2020-09-04 21:40:28,328 DEBUG: Uploading '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e'
2020-09-04 21:40:28,328 DEBUG: Uploading '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to https://'<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7'
2020-09-04 21:40:28,556 ERROR: failed to upload '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to 'https://<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7' - '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 30, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/base.py", line 391, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/http.py", line 195, in _upload
    raise HTTPError(response.status_code, response.reason)
dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
2020-09-04 21:40:28,558 ERROR: failed to upload '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e' - '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 30, in wrapper
    func(from_info, to_info, *args, **kwargs)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/base.py", line 391, in upload
    self._upload(  # noqa, pylint: disable=no-member
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/http.py", line 195, in _upload
    raise HTTPError(response.status_code, response.reason)
dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
2020-09-04 21:40:28,561 DEBUG: failed to upload full contents of 'data/prepared', aborting .dir file upload
2020-09-04 21:40:28,562 ERROR: failed to upload '.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' to 'https://<wasb>/<folder>/66/52640f2cde8646134c9b5cdfb4583f.dir'
2020-09-04 21:40:28,563 DEBUG: fetched: [(31,)]
2020-09-04 21:40:28,564 ERROR: failed to push data to the cloud - 3 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/command/data_sync.py", line 50, in run
    processed_files_count = self.repo.push(
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/repo/__init__.py", line 34, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/repo/push.py", line 35, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/data_cloud.py", line 65, in push
    return self.repo.cache.local.push(
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/remote/base.py", line 15, in wrapper
    return f(obj, named_cache, remote, *args, **kwargs)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 417, in push
    return self._process(
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 386, in _process
    raise UploadError(fails)
dvc.exceptions.UploadError: 3 files failed to upload
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-09-04 21:40:28,573 DEBUG: Analytics is enabled.
2020-09-04 21:40:28,708 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpch5v7r19']'
2020-09-04 21:40:28,709 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpch5v7r19']'
Labels: awaiting response, research

All 32 comments

Hi @krstp . Please note that https://wasburl/container/?sig=<sastoken> will be considered as an http remote, not as azure one. For azure, url should start with azure. Have you tried configuring sas_token in our config?

dvc remote add -d myazure azure://container/path
dvc remote modify myazure sas_token <sastoken>

Yes, that is all understood; I expect it to be treated as an http/s remote. In theory it should work, and by a higher directive I am constrained to using a SAS token. However, per the error reported above, it does not work.

Indeed, I tried azure:// before, but it did not work.

dvc remote modify myazure sas_token <sastoken>

yields the following response:

ERROR: configuration error - config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']

Let me know what I might be missing. Thanks.

--
There is one exception: if I do not put the token in _"quotes"_, zsh reports zsh: parse error near '&'. This just means the token always needs to be quoted to be parsed correctly.
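
The parse error arises because an unquoted & is a shell control operator. As a minimal sketch, Python's shlex.quote shows how the argument has to be quoted before it reaches dvc remote modify. The token value is a made-up placeholder, and the remote name azuretest is taken from the error message above.

```python
import shlex

# Placeholder SAS token; real tokens contain several '&'-separated fields.
sas_token = "?sv=2019-12-12&ss=bfqt&sig=PLACEHOLDER"

# shlex.quote wraps the value in single quotes so the shell does not
# treat '&' as a control operator or '?' as a glob character.
quoted = shlex.quote(sas_token)
command = f"dvc remote modify azuretest sas_token {quoted}"
print(command)
```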

More verbose outputs:

  • to the first syntax with token in _"quotes"_:
2020-09-05 18:21:45,011 ERROR: configuration error - config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/main.py", line 75, in main
    ret = cmd.run()
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/command/remote.py", line 74, in run
    section[self.args.option] = self.args.value
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", line 120, in __exit__
    next(self.gen)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/config.py", line 426, in edit
    self.validate(merged_conf)
  File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/config.py", line 436, in validate
    raise ConfigError(str(exc)) from None
dvc.config.ConfigError: config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
  • and to the token without _"quotes"_ yields the same response as in post above.

@krstp Oops, indeed, we didn't add it to our config yet, sorry for confusion. Btw, do you use ~/.azure/config? If so, you could set the sas token there as well. Or in general, have you tried using az CLI utility with that sas token?

Thanks again for your quick turnaround.

I do not think az itself will do what I need. However, yes, I did try the token with regular azcopy and via _Storage Explorer_, and it all works as desired.

The gist is, I want to work with multiple environments/containers (say 100) where each environment has a different SAS token. At the same time, I want multiple users to be able to pull/push DVC data in the active/work container from the other idle data-hosting containers at will. As my tool of choice is DVC, the most logical way would be to set the remote at will per user need, but I guess I'll have to come up with a workaround.

Thanks again, have a great one, and I guess community will have to wait for an update ;)

@krstp If az works fine, then adding sas_token support to our config should do the trick. We already have logic for it in https://github.com/iterative/dvc/blob/93a298343f6f6a744465b41fd34f2b0c40478efa/dvc/tree/azure.py#L48 , but just didn't add it to https://github.com/iterative/dvc/blob/93a298343f6f6a744465b41fd34f2b0c40478efa/dvc/config.py#L193 . Looks like we simply need to add sas_token to that line. Would you be able to modify that line and give it a shot?

For the record, here is the code change you suggested: https://github.com/krstp/dvc/commit/496f49d8dd57b4887c52e7cc063e2c51c2922e21 . The build succeeded and I am able to provide the SAS token via dvc remote modify as you suggested above. However, this does not work; see below.

For the above steps, you advised using the azure:// protocol, which, as we agreed in private, will not work for me: the only input I can deliver is the API endpoint https://<account>.blob.core.windows.net/<container>?sig=<token>. The solution described above relies on DVC's Azure connection_string, which cannot be fed an HTTPS string.
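
To make the mismatch concrete: the HTTPS endpoint carries the account, container, and token in one URL, whereas an azure:// remote expects them as separate values. A minimal sketch of splitting such a URL with the standard library follows. split_sas_url is a hypothetical helper, not part of DVC, and the account, container, and token values are placeholders.

```python
from urllib.parse import urlsplit


def split_sas_url(url):
    """Split an Azure Blob SAS URL into (account, container, sas_token).

    Hypothetical helper for illustration; assumes the usual
    https://<account>.blob.core.windows.net/<container>?<token> layout.
    """
    parts = urlsplit(url)
    account = parts.hostname.split(".")[0]
    # The first path segment is the container; the rest would be a blob prefix.
    container = parts.path.strip("/").split("/")[0]
    return account, container, parts.query


account, container, token = split_sas_url(
    "https://myaccount.blob.core.windows.net/mycontainer?sv=2019-12-12&sig=PLACEHOLDER"
)
print(f"url = azure://{container}/")
print(f"account = {account}")
print(f"token = {token}")
```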

I also tried the latest DVC release via pip install git+https://github.com/iterative/dvc, but the latest fixes do not address the current issue and, as stated, feeding an HTTPS:// URL via AZURE:// will not work. Nonetheless, I went with my own fork and the sas_token code update (pip install git+https://github.com/krstp/dvc.git). The result is still an error.

To sum up, this leaves the issue where it started; the status remains OPEN. Nonetheless, I want to say that I much appreciate all your help and time!

For the record: seems like azure://container/path plus connection string with something like AccountName=myaccount;SharedAccessSignature=mysastoken;(or more) should do the trick, but I didn't test it.

Tested, and it fails :( To document the steps:

dvc remote add azurecache azure://<container>/<path>
dvc remote modify azurecache connection_string "AccountName=<accountname>;SharedAccessSignature=<token>" 

where dvc push yields:

ERROR: unexpected error - Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
RequestId:<request_ID>
Time:2020-09-07T02:54:54.9084796Z
ErrorCode:NoAuthenticationInformation
Error:None

OK, I have the problem more or less pinpointed. I had a chance to test with other users' connection_strings that worked for them, but ultimately I could not get them to work with a recent DVC version. The upshot is that the most recent DVC, 1.6.4 (brew) and 1.6.6 (pip), seems to be buggy and will not connect to Azure.

What did help was a downgrade to 1.1.11 via pip; I did not test the versions in between. With 1.1.11 I also had to downgrade to azure-storage-blob==0.37.1; otherwise there is another issue related to that package.

With those changes and config in one of the forms of:

['remote "sample 1"']
    url = azure://<container>/<folder>/
    connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>

OR

['remote "sample 2"']
    url = azure://<container>/<folder>/
    connection_string = DefaultEndpointsProtocol=https;AccountName=<account_name>;AccountKey=<token>;EndpointSuffix=core.windows.net

OR

['remote "sample 3"']
    url = azure://<container>/<folder>/
    connection_string = azure://ContainerName=<container_name>;BlobEndpoint=https://<account_name>.blob.core.windows.net;SharedAccessSignature=sig=<token>

it works.

Returning my tests for now to version 1.1.11.

@krstp Whoa, great news! Ok, so probably something got messed up during our migration to azure-storage-blob 12. Looking if I can spot something in particular...

@krstp Ok, so here is the suspect: https://github.com/iterative/dvc/pull/4379 . So 1.4.0 is the last dvc version that should be working for you.

@krstp Could you please show a verbose log for the error you are getting with the newer dvc version when using

['remote "sample 1"']
    url = azure://<container>/<folder>/
    connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>

? That would help to pinpoint the issue.

Ok, seems like when we are connecting:

https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L78

from_connection_string ignores the sas_token:

https://github.com/Azure/azure-sdk-for-python/blob/49026681df85f52c6428660697c1fb0ba3b2bfad/sdk/storage/azure-storage-blob/azure/storage/blob/_blob_service_client.py#L164

https://github.com/Azure/azure-sdk-for-python/blob/49026681df85f52c6428660697c1fb0ba3b2bfad/sdk/storage/azure-storage-blob/azure/storage/blob/_shared/base_client.py#L348

so

['remote "sample 1"']
    url = azure://<container>/<folder>/
    connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>

indeed doesn't seem to work. Seems like what we need is

['remote "sample 1"']
    url = azure://<container>/<folder>/
    account = <account_name>
    sas_token = <token>

so then in https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L45 we could

    name = config.get("account") or self._az_config.get("storage", "account", None)

Haven't tried it yet though :slightly_frowning_face:
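
One reading of the line above is a fallback: use the remote's own account option if it is set, otherwise fall back to the az CLI configuration. The sketch below isolates that idea. It is a simplified illustration, not DVC's actual code: resolve_account is a hypothetical function, plain dicts stand in for the DVC remote section and for the storage section of ~/.azure/config, and the account names are placeholders.

```python
def resolve_account(remote_config, az_config):
    """Prefer the remote's 'account' option, else fall back to the az CLI value.

    Both arguments are plain dicts used as stand-ins for this sketch.
    """
    return remote_config.get("account") or az_config.get("account")


# The remote-level option wins when it is present.
from_remote = resolve_account({"account": "mydvctest"}, {"account": "fromazcli"})
# Without it, the az CLI value is used.
from_az = resolve_account({}, {"account": "fromazcli"})
print(from_remote, from_az)
```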

Confirming 1.4.0 works as desired.

And I agree, simplification of config to essential keys and entries would make life easier ;)

Thanks again for your support.

Another option is for us to parse the connection string ourselves and then just not use from_connection_string.
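
A minimal sketch of what parsing the connection string ourselves could look like, using only the standard library. parse_connection_string is a hypothetical helper, not DVC or Azure SDK code, and the account name and token are placeholders. Each pair is split on the first = only, because SAS tokens themselves contain = characters.

```python
def parse_connection_string(conn_str):
    """Parse 'Key=Value;Key=Value' into a dict, splitting on the first '='."""
    settings = {}
    for pair in conn_str.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing ';'
        key, _, value = pair.partition("=")
        settings[key.strip()] = value.strip()
    return settings


parsed = parse_connection_string(
    "AccountName=myaccount;SharedAccessSignature=sv=2019-12-12&sig=PLACEHOLDER;"
)
account = parsed["AccountName"]
sas_token = parsed["SharedAccessSignature"]
print(account, sas_token)
```

With the pieces separated, the account URL and token could presumably be handed to the BlobServiceClient constructor (its account_url and credential arguments) instead of from_connection_string; that part is untested here.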

@krstp Unfortunately I'm not able to get to reproducing that issue right now, but if you have time, you could give the first approach a try. (the first one: adding account config option and using it in addition to sas_token) Feeling good about that one :slightly_smiling_face:

Returning to your earlier request of verbose log; DVC version: 1.6.6 (pip):

(dvc) (base) ➜  DVC-mine git:(master) ✗ dvc push --verbose  [9:48:15]
2020-09-08 21:48:19,479 DEBUG: Check for update is enabled.
2020-09-08 21:48:19,482 DEBUG: fetched: [(3,)]
2020-09-08 21:48:19,495 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,495 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,575 DEBUG: Preparing to upload data to 'azure://<container>/test/.dvc/'
2020-09-08 21:48:19,575 DEBUG: Preparing to collect status from azure://<container>/test/.dvc/
2020-09-08 21:48:19,576 DEBUG: Collecting information from local cache...
2020-09-08 21:48:19,576 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/19/a94bfb19fc864eb3a06f85d29e1221' is unchanged since it is read-only
2020-09-08 21:48:19,578 DEBUG: Collecting information from remote cache...
2020-09-08 21:48:19,578 DEBUG: Querying 1 hashes via object_exists
2020-09-08 21:48:19,605 DEBUG: fetched: [(33,)]
2020-09-08 21:48:19,606 ERROR: unexpected error - cannot import name 'BlobServiceClient' from 'azure.storage.blob' (/Users/<user>/venv/dvc/lib/python3.7/site-packages/azure/storage/blob/__init__.py)
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/main.py", line 75, in main
    ret = cmd.run()
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/command/data_sync.py", line 59, in run
    run_cache=self.args.run_cache,
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/repo/__init__.py", line 34, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/repo/push.py", line 35, in push
    return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/data_cloud.py", line 66, in push
    cache, jobs=jobs, remote=remote, show_checksums=show_checksums,
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/remote/base.py", line 15, in wrapper
    return f(obj, named_cache, remote, *args, **kwargs)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 435, in push
    download=False,
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 330, in _process
    download=download,
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 168, in _status
    self._indexed_dir_hashes(named_cache, remote, dir_md5s)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 239, in _indexed_dir_hashes
    remote.tree.list_hashes_exists(dir_md5s - dir_exists)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/base.py", line 672, in list_hashes_exists
    ret = list(itertools.compress(hashes, in_remote))
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/opt/miniconda3/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/base.py", line 665, in exists_with_progress
    ret = self.exists(path_info)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 129, in exists
    return any(path_info.path == path for path in paths)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 129, in <genexpr>
    return any(path_info.path == path for path in paths)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 132, in _list_paths
    container_client = self.blob_service.get_container_client(bucket)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/funcy/objects.py", line 50, in __get__
    return prop.__get__(instance, type)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 71, in blob_service
    from azure.storage.blob import BlobServiceClient
ImportError: cannot import name 'BlobServiceClient' from 'azure.storage.blob' (/Users/<user>/venv/dvc/lib/python3.7/site-packages/azure/storage/blob/__init__.py)
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-09-08 21:48:19,616 DEBUG: Analytics is enabled.
2020-09-08 21:48:19,688 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpew3yjbjg']'
2020-09-08 21:48:19,689 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpew3yjbjg']'

@krstp That particular error is just because you have an old version of azure-storage-blob installed. I suppose it is a leftover from your experiments with older versions of dvc. You need to pip install 'dvc[azure]' to install appropriate azure deps.
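
A quick way to check for a stale install before retrying is sketched below. It assumes that BlobServiceClient first appeared in the 12.x line of azure-storage-blob, which is consistent with the migration to azure-storage-blob 12 mentioned earlier. has_blob_service_client is a hypothetical helper, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib import metadata


def has_blob_service_client(version):
    """Return True if this azure-storage-blob version is 12 or newer."""
    major = int(version.split(".")[0])
    return major >= 12


try:
    installed = metadata.version("azure-storage-blob")
    print(installed, has_blob_service_client(installed))
except metadata.PackageNotFoundError:
    print("azure-storage-blob is not installed")
```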

Indeed, I just checked it, and changing:
https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L45

to name = config.get("account") does the trick.

I just reinstalled 1.6.6 locally based on latest pull from today's code and works as desired. Thanks!

Nope; update doesn't work.

Tried DVC with latest azure dependencies, but still no go.

@krstp Was able to reproduce the issue. Could you try:

pip install 'git+https://github.com/efiop/dvc.git@fix-4534#egg=dvc[azure]'

make sure that $ dvc version shows 1.6.6+aa569c.mod as version.

That patch makes status/push/pull/etc work fine for me with a config that looks like this:

['remote "myazure"']
url = azure://mycontainer/path
account = mydvctest
sas_token = "?sv=2019-12-12&ss=bfqt&srt=sco...."

Still doesn't work. Version verified. My suspicion is that the sas_token format is wrong; I tried to feed it in multiple ways, but none of them works.

@krstp Note how I've wrapped it with "", did you do the same? Also, it would be great if you could post the error you are getting now.

Indeed tested with "". Same error as before. Sorry for the late reply, life took over; when I find some more time I'll ping you with an update. Thanks.

I was able to get this working with a connection string like so:

account_name = "storageaccountname"
sas_token = "<output of az storage container generate-sas>"
cstring = f"AccountName={account_name};SharedAccessSignature={sas_token};"

I then used cstring in the remote modify command, e.g.

dvc remote modify --local remotename connection_string "<cstring above>"

The quick test I was using to iterate on the connection string formatting:

from azure.storage.blob import BlobServiceClient

# Connection string built as shown above
cstring = "<cstring above>"
bsc = BlobServiceClient.from_connection_string(cstring)
cc = bsc.get_container_client("mycontainer")
# Listing blobs exercises the credentials against the container
print(list(cc.list_blobs()))

Everything almost worked except you don't quite support the SAS I was using because you try to check for blob existence, which requires account-level privileges, and I was using container-level ones. I'll make a PR to fix the latter, but that's a separate thing, I think.
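
One way to see which kind of SAS token is in hand is to look at its query fields. The sketch below assumes the field naming from Azure's SAS documentation as I understand it: account-level tokens carry srt (signed resource types), while service-level tokens carry sr (signed resource, with c for a container). describe_sas is a hypothetical helper and the token values are placeholders.

```python
from urllib.parse import parse_qs


def describe_sas(token):
    """Classify a SAS token as 'account', 'container', or 'other' by its fields."""
    fields = parse_qs(token.lstrip("?"))
    if "srt" in fields:
        return "account"
    if fields.get("sr") == ["c"]:
        return "container"
    return "other"


account_kind = describe_sas("?sv=2019-12-12&ss=bfqt&srt=sco&sp=rwdl&sig=PLACEHOLDER")
container_kind = describe_sas("sv=2019-12-12&sr=c&sp=rwl&sig=PLACEHOLDER")
print(account_kind, container_kind)
```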

@mhworth Great catch! We received a similar comment from another user not long ago. We should definitely reconsider creating a bucket in azure, as it is not how we do things in all other remotes. We could totally consider removing that logic and requiring the user to create the container themselves, and if they have not, show a pretty error.

@efiop No problem! I'll make a quick PR just to show one way to do it. What I did works fine for pushing/pulling and everything. One second and I'll write up a bit about it and send it along.

Here 'tis, won't hurt my feelings if you prefer a different route, just figured I'd share since it worked around my problem!

https://github.com/iterative/dvc/pull/4582

I'm facing this issue without SAS as well. @krstp Are you able to make it work without SAS?

Sorry, I did not have time to play with it. The only option for me was to use an older version of DVC that relied on an older version of azure-blob (this one seems to be the main source of error) via a SAS key. FYI: Azure yields even more generally unrelated issues, but this is a different subject and I guess it is on M$'s back burner.

@krstp Got it, thanks.
