Good beautiful day team,
This feature request refers to the following forum post: https://discuss.dvc.org/t/how-use-dvc-with-azure-blob-storage/158/3. The below post includes almost identical text that is posted in the DVC forum.
In particular, I am interested in using dvc remote with a SAS token. When I add the azureblob remote via an https:// link to DVC (config) and run dvc push, it returns the error 405 The resource doesn't support specified Http Verb.
To sum up, providing the remote as:
https://wasburl/container/?sig=<sastoken>
results in
dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'
From what I read in different forums, Azure appears to require Cross-Origin Resource Sharing (CORS) headers, and DVC in its current form does not appear to support them. Hence this feature request for DVC to include CORS headers.
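For illustration, here is a minimal sketch of how a per-object URL can be derived from a container-level SAS URL while keeping the ?sig=... query, and of the request shape that Azure's Put Blob REST operation expects (PUT with an x-ms-blob-type header). The function names and the account, container and token values are hypothetical, and this is not DVC's code. Whether the HTTP verb rather than CORS explains the 405 above is an assumption that nothing in this thread confirms.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request


def blob_url(container_sas_url: str, blob_name: str) -> str:
    """Insert blob_name into the path of a container SAS URL, keeping the query."""
    parts = urlsplit(container_sas_url)
    path = parts.path.rstrip("/") + "/" + blob_name.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def put_blob_request(container_sas_url: str, blob_name: str, data: bytes) -> Request:
    """Build (but do not send) a Put Blob request for a block blob."""
    return Request(
        blob_url(container_sas_url, blob_name),
        data=data,
        method="PUT",
        headers={"x-ms-blob-type": "BlockBlob"},
    )


# Placeholder account, container and token; not real values.
example = put_blob_request(
    "https://myaccount.blob.core.windows.net/mycontainer/?sig=TOKEN",
    "38/6f68f067e4c588bf4dcc96afdbece7",
    b"payload",
)
print(example.get_method(), example.full_url)
```

The request is only constructed here, never sent, so the sketch can be run without network access or credentials.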
===========================
For the record, the current resultant DVC output log is:
SHORT/REGULAR OUTPUT:
(base) ➜ DVC git:(master) ✗ dvc push [9:36:28]
ERROR: failed to upload '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to 'https://<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7' - '405 The resource doesn't support specified Http Verb.'
ERROR: failed to upload '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e' - '405 The resource doesn't support specified Http Verb.'
ERROR: failed to upload '.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' to 'https://<wasb>/<folder>/66/52640f2cde8646134c9b5cdfb4583f.dir'
ERROR: failed to push data to the cloud - 3 files failed to upload
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
VERBOSE OUTPUT:
(base) ➜ DVC git:(master) ✗ dvc push --verbose [9:36:40]
2020-09-04 21:40:27,553 DEBUG: Check for update is enabled.
2020-09-04 21:40:27,559 DEBUG: fetched: [(3,)]
2020-09-04 21:40:27,576 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,578 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,581 DEBUG: Preparing to upload data to 'https://<wasb>/<folder>/?sig=<SAStoken>
2020-09-04 21:40:27,581 DEBUG: Preparing to collect status from https://<wasb>/<folder>/?sig=<SAStoken>
2020-09-04 21:40:27,581 DEBUG: Collecting information from local cache...
2020-09-04 21:40:27,582 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' is unchanged since it is read-only
2020-09-04 21:40:27,601 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-04 21:40:27,602 DEBUG: Assuming '/Users/<user>/GitHubs/DVC/.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' is unchanged since it is read-only
2020-09-04 21:40:27,602 DEBUG: Collecting information from remote cache...
2020-09-04 21:40:27,603 DEBUG: Querying 1 hashes via object_exists
2020-09-04 21:40:28,030 DEBUG: Matched '0' indexed hashes
2020-09-04 21:40:28,030 DEBUG: Querying 3 hashes via object_exists
2020-09-04 21:40:28,328 DEBUG: Uploading '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e'
2020-09-04 21:40:28,328 DEBUG: Uploading '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to 'https://<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7'
2020-09-04 21:40:28,556 ERROR: failed to upload '.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' to 'https://<wasb>/<folder>/38/6f68f067e4c588bf4dcc96afdbece7' - '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 30, in wrapper
func(from_info, to_info, *args, **kwargs)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/base.py", line 391, in upload
self._upload( # noqa, pylint: disable=no-member
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/http.py", line 195, in _upload
raise HTTPError(response.status_code, response.reason)
dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
2020-09-04 21:40:28,558 ERROR: failed to upload '.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' to 'https://<wasb>/<folder>/d4/1d8cd98f00b204e9800998ecf8427e' - '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 30, in wrapper
func(from_info, to_info, *args, **kwargs)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/base.py", line 391, in upload
self._upload( # noqa, pylint: disable=no-member
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/tree/http.py", line 195, in _upload
raise HTTPError(response.status_code, response.reason)
dvc.exceptions.HTTPError: '405 The resource doesn't support specified Http Verb.'
------------------------------------------------------------
2020-09-04 21:40:28,561 DEBUG: failed to upload full contents of 'data/prepared', aborting .dir file upload
2020-09-04 21:40:28,562 ERROR: failed to upload '.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' to 'https://<wasb>/<folder>/66/52640f2cde8646134c9b5cdfb4583f.dir'
2020-09-04 21:40:28,563 DEBUG: fetched: [(31,)]
2020-09-04 21:40:28,564 ERROR: failed to push data to the cloud - 3 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/command/data_sync.py", line 50, in run
processed_files_count = self.repo.push(
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/repo/__init__.py", line 34, in wrapper
ret = f(repo, *args, **kwargs)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/repo/push.py", line 35, in push
return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/data_cloud.py", line 65, in push
return self.repo.cache.local.push(
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/remote/base.py", line 15, in wrapper
return f(obj, named_cache, remote, *args, **kwargs)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 417, in push
return self._process(
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/cache/local.py", line 386, in _process
raise UploadError(fails)
dvc.exceptions.UploadError: 3 files failed to upload
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-09-04 21:40:28,573 DEBUG: Analytics is enabled.
2020-09-04 21:40:28,708 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpch5v7r19']'
2020-09-04 21:40:28,709 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpch5v7r19']'
Hi @krstp. Please note that https://wasburl/container/?sig=<sastoken> will be considered as an http remote, not as an azure one. For azure, the url should start with azure. Have you tried configuring sas_token in our config?
dvc remote add -d myazure azure://container/path
dvc remote modify myazure sas_token <sastoken>
Yes, that is all understood; I expect it to be treated as an http/s remote. In theory it should work, and by higher directive I am constrained to using a SAS token; however, per the reported feature request/error, it does not work.
Indeed, I tried azure:// before, but it did not work.
dvc remote modify myazure sas_token <sastoken>
yields the following response:
ERROR: configuration error - config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']
Let me know what I might be missing. Thanks.
--
There is one exception: if I do not put the token in "quotes", the shell reports zsh: parse error near '&'. This just means the token always needs to be quoted to be parsed correctly.
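As a small illustration of the quoting point, the sketch below builds the command line with the standard-library shlex.quote, so that the '&' characters in a SAS token are not interpreted by the shell. The helper name and the token value are hypothetical. It only shows the quoting, not whether a given DVC version accepts the sas_token option.

```python
import shlex


def remote_modify_command(remote: str, option: str, value: str) -> str:
    """Return a shell-safe `dvc remote modify` command line."""
    args = ["dvc", "remote", "modify", remote, option, value]
    return " ".join(shlex.quote(arg) for arg in args)


# Placeholder token; a real SAS token likewise contains '&' and '='.
command = remote_modify_command("myazure", "sas_token", "?sv=2019-12-12&ss=b&sig=abc")
print(command)
```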
More verbose outputs:
2020-09-05 18:21:45,011 ERROR: configuration error - config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/main.py", line 75, in main
ret = cmd.run()
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/command/remote.py", line 74, in run
section[self.args.option] = self.args.value
File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/config.py", line 426, in edit
self.validate(merged_conf)
File "/usr/local/Cellar/dvc/1.6.1/libexec/lib/python3.8/site-packages/dvc/config.py", line 436, in validate
raise ConfigError(str(exc)) from None
dvc.config.ConfigError: config file error: extra keys not allowed @ data['remote']['azuretest']['sas_token']
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
@krstp Oops, indeed, we didn't add it to our config yet, sorry for the confusion. Btw, do you use ~/.azure/config? If so, you could set the sas token there as well. Or in general, have you tried using the az CLI utility with that sas token?
Thanks again for your quick turnaround.
I do not think az itself will do what I need; however, yes, I did try the token with regular azcopy and/or via Storage Explorer etc., and it all works as desired.
The gist is, I want to work with multiple environments/containers (say 100) where each environment has different SAS token. At the same time, I want multiple users to be able to pull/push dvc data in active/work container from the other idle data hosting containers at will. As my tool of choice is DVC, the most logical way would be to set remote at will per user need, but I guess I'll have to come up with a workaround.
Thanks again, have a great one, and I guess community will have to wait for an update ;)
@krstp If az works fine, then adding sas_token support to our config should do the trick. We already have logic for it in https://github.com/iterative/dvc/blob/93a298343f6f6a744465b41fd34f2b0c40478efa/dvc/tree/azure.py#L48 , but just didn't add it to https://github.com/iterative/dvc/blob/93a298343f6f6a744465b41fd34f2b0c40478efa/dvc/config.py#L193 . Looks like we simply need to add sas_token to that line. Would you be able to modify that line and give it a shot?
For the record, here is the code change you suggested: https://github.com/krstp/dvc/commit/496f49d8dd57b4887c52e7cc063e2c51c2922e21 . The build succeeded and I am able to provide the SAS token via dvc remote modify as you suggest above; however, this will not work, see below.
For the above steps, you advised using the azure:// protocol, which, as we agreed in private, will not work: the only input I can deliver is the API endpoint https://<account>.blob.core.windows.net/<container>?sig=<token>, and the solution described above relies on DVC's Azure connection_string, which will not work when fed an HTTPS string.
I also tried the latest DVC release via pip install git+https://github.com/iterative/dvc, but the latest fixes do not address the current issue and, as stated, feeding HTTPS:// via AZURE:// will not work. Nonetheless, I went with my own fork and the sas_token code update (pip install git+https://github.com/krstp/dvc.git). The result is still erroneous.
To sum up, this leaves the issue where it started; status remains OPEN. Nonetheless, I want to say I much appreciate all your help and time!
For the record: it seems like azure://container/path plus a connection string with something like AccountName=myaccount;SharedAccessSignature=mysastoken; (or more) should do the trick, but I didn't test it.
Tested and fails :( To document steps:
dvc remote add azurecache azure://<container>/<path>
dvc remote modify azurecache connection_string "AccountName=<accountname>;SharedAccessSignature=<token>"
after which dvc push yields:
ERROR: unexpected error - Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
RequestId:<request_ID>
Time:2020-09-07T02:54:54.9084796Z
ErrorCode:NoAuthenticationInformation
Error:None
OK, I have the problem more or less pinpointed. I had a chance to test with connection_strings from other users that worked for them; ultimately I could not get those to work with a recent DVC version either. The upshot is that the most recent DVC, 1.6.4 (brew) and 1.6.6 (pip), seems to be buggy and will not accept a connection to Azure.
What did help was a downgrade to 1.1.11 via pip; I did not test the versions in between. With 1.1.11 I also had to downgrade to azure-storage-blob==0.37.1; otherwise there is another issue related to that package.
With those changes and config in one of the forms of:
['remote "sample 1"']
url = azure://<container>/<folder>/
connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>
OR
['remote "sample 2"']
url = azure://<container>/<folder>/
connection_string = DefaultEndpointsProtocol=https;AccountName=<account_name>;AccountKey=<token>;EndpointSuffix=core.windows.net
OR
['remote "sample 3"']
url = azure://<container>/<folder>/
connection_string = azure://ContainerName=<container_name>;BlobEndpoint=https://<account_name>.blob.core.windows.net;SharedAccessSignature=sig=<token>
it works.
Returning my tests for now to version 1.1.11.
@krstp Whoa, great news! Ok, so probably something got messed up during our migration to azure-storage-blob 12. Looking if I can spot something in particular...
@krstp Ok, so here is the suspect: https://github.com/iterative/dvc/pull/4379 . So 1.4.0 is the last dvc version that should be working for you.
@krstp Could you please show a verbose log for the error you are getting with the newer dvc version when using
['remote "sample 1"']
url = azure://<container>/<folder>/
connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>
? That would help to pinpoint the issue.
Ok, it seems that when we are connecting at
https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L78
from_connection_string ignores the sas_token, so
['remote "sample 1"']
url = azure://<container>/<folder>/
connection_string = AccountName=<account_name>;SharedAccessSignature=sig=<token>
indeed doesn't seem to work. Seems like what we need is
['remote "sample 1"']
url = azure://<container>/<folder>/
account = <account_name>
sas_token = <token>
so then in https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L45 we could use
name = config.get("account") or self._az_config.get("storage", "account", None)
Haven't tried it yet though :slightly_frowning_face:
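A minimal sketch of the lookup order proposed above: prefer an account set on the DVC remote, and fall back to the az CLI configuration otherwise. The standard-library configparser stands in for DVC's self._az_config object here, and resolve_account is a hypothetical helper, not a DVC function.

```python
import configparser
from typing import Mapping, Optional


def resolve_account(
    remote_config: Mapping[str, str], az_config: configparser.ConfigParser
) -> Optional[str]:
    """Prefer the remote's own 'account'; else use [storage] account from az config."""
    return remote_config.get("account") or az_config.get(
        "storage", "account", fallback=None
    )


# Stand-in for ~/.azure/config; the account names are placeholders.
az = configparser.ConfigParser()
az.read_string("[storage]\naccount = from_az_cli\n")

print(resolve_account({"account": "from_dvc_remote"}, az))
print(resolve_account({}, az))
```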
Confirming 1.4.0 works as desired.
And I agree, simplification of config to essential keys and entries would make life easier ;)
Thanks again for your support.
Another option is for us to parse the connection string ourselves and then just not use from_connection_string.
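The second option could look roughly like the sketch below: split the connection string into key/value pairs and pass the SAS token to the client constructor directly instead of calling from_connection_string. The helper names are hypothetical and this is not DVC's implementation. make_client assumes azure-storage-blob 12 or later is installed and is not exercised here.

```python
from typing import Dict


def parse_connection_string(conn_str: str) -> Dict[str, str]:
    """Split 'Key=Value;Key=Value' pairs; values may themselves contain '='."""
    pairs = {}
    for chunk in conn_str.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';'
        key, sep, value = chunk.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {chunk!r}")
        pairs[key] = value
    return pairs


def account_url(parsed: Dict[str, str]) -> str:
    """Prefer an explicit BlobEndpoint, else derive the URL from AccountName."""
    if "BlobEndpoint" in parsed:
        return parsed["BlobEndpoint"]
    suffix = parsed.get("EndpointSuffix", "core.windows.net")
    return f"https://{parsed['AccountName']}.blob.{suffix}"


def make_client(conn_str: str):
    """Build a client from the parsed pieces, passing the SAS as the credential."""
    from azure.storage.blob import BlobServiceClient  # imported lazily

    parsed = parse_connection_string(conn_str)
    return BlobServiceClient(
        account_url(parsed), credential=parsed["SharedAccessSignature"]
    )


# Placeholder account name and token.
parsed = parse_connection_string("AccountName=acct;SharedAccessSignature=sv=2019&sig=abc=;")
print(parsed, account_url(parsed))
```

Splitting each segment on the first '=' only matters because the SAS value itself contains '=' characters.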
@krstp Unfortunately I'm not able to get to reproducing that issue right now, but if you have time, you could give the first approach a try (adding an account config option and using it in addition to sas_token). Feeling good about that one :slightly_smiling_face:
Returning to your earlier request for a verbose log; DVC version: 1.6.6 (pip):
(dvc) (base) ➜ DVC-mine git:(master) ✗ dvc push --verbose [9:48:15]
2020-09-08 21:48:19,479 DEBUG: Check for update is enabled.
2020-09-08 21:48:19,482 DEBUG: fetched: [(3,)]
2020-09-08 21:48:19,495 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,495 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,575 DEBUG: Preparing to upload data to 'azure://<container>/test/.dvc/'
2020-09-08 21:48:19,575 DEBUG: Preparing to collect status from azure://<container>/test/.dvc/
2020-09-08 21:48:19,576 DEBUG: Collecting information from local cache...
2020-09-08 21:48:19,576 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/38/6f68f067e4c588bf4dcc96afdbece7' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/d4/1d8cd98f00b204e9800998ecf8427e' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/66/52640f2cde8646134c9b5cdfb4583f.dir' is unchanged since it is read-only
2020-09-08 21:48:19,577 DEBUG: Assuming '/Users/<user>/GitHubs/DVC-mine/.dvc/cache/19/a94bfb19fc864eb3a06f85d29e1221' is unchanged since it is read-only
2020-09-08 21:48:19,578 DEBUG: Collecting information from remote cache...
2020-09-08 21:48:19,578 DEBUG: Querying 1 hashes via object_exists
2020-09-08 21:48:19,605 DEBUG: fetched: [(33,)]
2020-09-08 21:48:19,606 ERROR: unexpected error - cannot import name 'BlobServiceClient' from 'azure.storage.blob' (/Users/<user>/venv/dvc/lib/python3.7/site-packages/azure/storage/blob/__init__.py)
------------------------------------------------------------
Traceback (most recent call last):
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/main.py", line 75, in main
ret = cmd.run()
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/command/data_sync.py", line 59, in run
run_cache=self.args.run_cache,
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/repo/__init__.py", line 34, in wrapper
ret = f(repo, *args, **kwargs)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/repo/push.py", line 35, in push
return len(used_run_cache) + self.cloud.push(used, jobs, remote=remote)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/data_cloud.py", line 66, in push
cache, jobs=jobs, remote=remote, show_checksums=show_checksums,
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/remote/base.py", line 15, in wrapper
return f(obj, named_cache, remote, *args, **kwargs)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 435, in push
download=False,
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 330, in _process
download=download,
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 168, in _status
self._indexed_dir_hashes(named_cache, remote, dir_md5s)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/cache/local.py", line 239, in _indexed_dir_hashes
remote.tree.list_hashes_exists(dir_md5s - dir_exists)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/base.py", line 672, in list_hashes_exists
ret = list(itertools.compress(hashes, in_remote))
File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
yield fs.pop().result()
File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 435, in result
return self.__get_result()
File "/opt/miniconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/opt/miniconda3/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/base.py", line 665, in exists_with_progress
ret = self.exists(path_info)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 129, in exists
return any(path_info.path == path for path in paths)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 129, in <genexpr>
return any(path_info.path == path for path in paths)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 132, in _list_paths
container_client = self.blob_service.get_container_client(bucket)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/funcy/objects.py", line 50, in __get__
return prop.__get__(instance, type)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/funcy/objects.py", line 28, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
File "/Users/<user>/venv/dvc/lib/python3.7/site-packages/dvc/tree/azure.py", line 71, in blob_service
from azure.storage.blob import BlobServiceClient
ImportError: cannot import name 'BlobServiceClient' from 'azure.storage.blob' (/Users/<user>/venv/dvc/lib/python3.7/site-packages/azure/storage/blob/__init__.py)
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-09-08 21:48:19,616 DEBUG: Analytics is enabled.
2020-09-08 21:48:19,688 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpew3yjbjg']'
2020-09-08 21:48:19,689 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/rp/40tr0pm57qlggq2jfnwz81940000gn/T/tmpew3yjbjg']'
@krstp That particular error is just because you have an old version of azure-storage-blob installed. I suppose it is a leftover from your experiments with older versions of dvc. You need to run pip install 'dvc[azure]' to install the appropriate azure deps.
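A quick way to check which flavour of azure-storage-blob is importable in the active environment is sketched below. BlobServiceClient exists only in the 12.x line of the package, so its absence points to a leftover older install. The helper name is hypothetical.

```python
import importlib


def has_blob_service_client() -> bool:
    """Return True if azure.storage.blob exposes BlobServiceClient (the 12.x API)."""
    try:
        module = importlib.import_module("azure.storage.blob")
    except ImportError:
        return False  # package missing entirely
    return hasattr(module, "BlobServiceClient")


print("BlobServiceClient available:", has_blob_service_client())
```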
Indeed, I just checked it, and changing
https://github.com/iterative/dvc/blob/67ddb7b41f8ba382048327fa5b5520168ab9c43f/dvc/tree/azure.py#L45
to name = config.get("account") does the trick.
I just reinstalled 1.6.6 locally based on the latest pull from today's code and it works as desired. Thanks!
Nope; update doesn't work.
Tried DVC with latest azure dependencies, but still no go.
@krstp Was able to reproduce the issue. Could you try:
pip install 'git+https://github.com/efiop/dvc.git@fix-4534#egg=dvc[azure]'
Make sure that dvc version shows 1.6.6+aa569c.mod as the version.
That patch makes status/push/pull/etc work fine for me with a config that looks like this:
['remote "myazure"']
url = azure://mycontainer/path
account = mydvctest
sas_token = "?sv=2019-12-12&ss=bfqt&srt=sco...."
Still doesn't work. Version verified. My suspicion is that the sas_token format is wrong; nonetheless, I tried feeding it in multiple ways, and none of them works.
@krstp Note how I've wrapped it with "", did you do the same? Also, it would be great if you could post the error you are getting now.
Indeed tested with "". Same error as before. Sorry for later reply, life took over; when I find some more time I'll ping you with an update. Thanks.
I was able to get this working with a connection string like so:
account_name = "storageaccountname"
sas_token = "<output of az storage container generate-sas>"
cstring = f"AccountName={account_name};SharedAccessSignature={sas_token};"
I then used cstring in the remote modify command, e.g.
dvc remote modify --local remotename connection_string "<cstring above>"
The quick test I was using to iterate on the connection string formatting:
from azure.storage.blob import BlobServiceClient
cstring = "<cstring above>"
bsc = BlobServiceClient.from_connection_string(cstring)
cc = bsc.get_container_client("mycontainer")
list(cc.list_blobs())
Everything almost worked except you don't quite support the SAS I was using because you try to check for blob existence, which requires account-level privileges, and I was using container-level ones. I'll make a PR to fix the latter, but that's a separate thing, I think.
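To tell which kind of SAS is in hand, its query parameters can be inspected: an account SAS carries ss (signed services) and srt (signed resource types), while a service SAS carries sr (signed resource, c for container, b for blob). The sketch below is only a rough classifier under that assumption; sas_scope is a hypothetical helper and the tokens are placeholders.

```python
from urllib.parse import parse_qs


def sas_scope(token: str) -> str:
    """Classify a SAS token as 'account', 'container', 'blob', 'service' or 'unknown'."""
    params = parse_qs(token.lstrip("?"))
    if "ss" in params and "srt" in params:
        return "account"
    if "sr" in params:
        # Signed resource: 'c' is a container, 'b' is a single blob.
        return {"c": "container", "b": "blob"}.get(params["sr"][0], "service")
    return "unknown"


print(sas_scope("?sv=2019-12-12&ss=bfqt&srt=sco&sig=x"))
print(sas_scope("sv=2019-12-12&sr=c&sp=rwl&sig=x"))
```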
@mhworth Great catch! We've received a similar comment from another user not long ago; we should definitely reconsider creating a bucket in azure, as it is not how we do things in all other remotes. We could totally consider removing that logic and requiring the user to create the container themselves and, if they haven't, show a pretty error.
@efiop No problem! I'll make a quick PR just to show one way to do it. What I did works fine for pushing/pulling and everything. One second and I'll write up a bit about it and send it along.
Here 'tis, won't hurt my feelings if you prefer a different route, just figured I'd share since it worked around my problem!
I'm facing this issue without SAS as well. @krstp Are you able to make it work without SAS?
Sorry, I did not have time to play with it. The only option for me was to use an older version of DVC that relied on older version of azure-blob (this one seems to be the main source of error) via SAS key. FYI: Azure yields even more generally unrelated issues, but this is a different subject and I guess it is back on M$ back-burner.
@krstp Got it, thanks.