dvc init --no-scm
dvc import-url s3://some_bucket/some_target -v
2020-07-01 17:14:02,947 DEBUG: fetched: [(3,)]
2020-07-01 17:14:03,123 DEBUG: Removing output 'some_target' of stage: 'some_target.dvc'.
Importing 's3://some_bucket/some_target' -> 'some_target'
2020-07-01 17:14:03,123 DEBUG: Computed stage: 'some_target.dvc' md5: '2f8b87d3b22efd1638f414c3b3f65614'
2020-07-01 17:14:03,123 DEBUG: 'md5' of stage: 'some_target.dvc' changed.
2020-07-01 17:14:04,088 DEBUG: fetched: [(0,)]
2020-07-01 17:14:04,146 ERROR: failed to import s3://some_bucket/some_target. You could also try downloading it manually, and adding it with `dvc add`. - Current operation was unsuccessful because 's3://some_bucket/some_target' requires existing cache on 's3' remote. See <https://man.dvc.org/config#cache> for information on how to set up remote cache.
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/dvc/command/imp_url.py", line 14, in run
self.repo.imp_url(
File "/usr/lib/python3.8/site-packages/dvc/repo/__init__.py", line 36, in wrapper
ret = f(repo, *args, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/repo/scm_context.py", line 4, in run
result = method(repo, *args, **kw)
File "/usr/lib/python3.8/site-packages/dvc/repo/imp_url.py", line 54, in imp_url
stage.run()
File "/home/anotherbugmaster/.local/lib/python3.8/site-packages/funcy/decorators.py", line 39, in wrapper
return deco(call, *dargs, **dkwargs)
File "/usr/lib/python3.8/site-packages/dvc/stage/decorators.py", line 35, in rwlocked
return call()
File "/home/anotherbugmaster/.local/lib/python3.8/site-packages/funcy/decorators.py", line 60, in __call__
return self._func(*self._args, **self._kwargs)
File "/usr/lib/python3.8/site-packages/dvc/stage/__init__.py", line 424, in run
sync_import(self, dry, force)
File "/usr/lib/python3.8/site-packages/dvc/stage/imports.py", line 29, in sync_import
stage.save_deps()
File "/usr/lib/python3.8/site-packages/dvc/stage/__init__.py", line 387, in save_deps
dep.save()
File "/usr/lib/python3.8/site-packages/dvc/output/base.py", line 268, in save
self.info = self.save_info()
File "/usr/lib/python3.8/site-packages/dvc/output/base.py", line 192, in save_info
return self.remote.save_info(self.path_info)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 762, in save_info
return self.tree.save_info(path_info, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 329, in save_info
self.PARAM_CHECKSUM: self.get_hash(path_info, tree=tree, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 297, in get_hash
hash_ = self.get_dir_hash(path_info, tree, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 311, in get_dir_hash
raise RemoteCacheRequiredError(path_info)
dvc.exceptions.RemoteCacheRequiredError: Current operation was unsuccessful because 's3://some_bucket/some_target' requires existing cache on 's3' remote. See <https://man.dvc.org/config#cache> for information on how to set up remote cache.
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
dvc get-url s3://some_bucket/some_target -v
2020-07-01 17:15:39,910 ERROR: unexpected error - 'NoneType' object has no attribute 'cache'
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.8/site-packages/dvc/main.py", line 53, in main
ret = cmd.run()
File "/usr/lib/python3.8/site-packages/dvc/command/get_url.py", line 17, in run
Repo.get_url(self.args.url, out=self.args.out)
File "/usr/lib/python3.8/site-packages/dvc/repo/get_url.py", line 19, in get_url
dep.save()
File "/usr/lib/python3.8/site-packages/dvc/output/base.py", line 268, in save
self.info = self.save_info()
File "/usr/lib/python3.8/site-packages/dvc/output/base.py", line 192, in save_info
return self.remote.save_info(self.path_info)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 762, in save_info
return self.tree.save_info(path_info, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 329, in save_info
self.PARAM_CHECKSUM: self.get_hash(path_info, tree=tree, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 297, in get_hash
hash_ = self.get_dir_hash(path_info, tree, **kwargs)
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 310, in get_dir_hash
if not self.cache:
File "/usr/lib/python3.8/site-packages/dvc/remote/base.py", line 184, in cache
return getattr(self.repo.cache, self.scheme)
AttributeError: 'NoneType' object has no attribute 'cache'
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
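For readers comparing the two tracebacks, here is a minimal sketch of why the same `get_dir_hash` call can fail in two different ways. It is not DVC's actual code: `FakeRemote`, `FakeRepo`, `FakeCacheRegistry`, and `describe_failure` are hypothetical stand-ins. Only the two lines taken from the tracebacks (`getattr(self.repo.cache, self.scheme)` and `if not self.cache: raise RemoteCacheRequiredError(...)`) are mirrored. It assumes that `import-url` runs with a repo object that has no `s3` cache configured, while `get-url` runs with no repo at all, which is what the `'NoneType' object has no attribute 'cache'` message suggests.

```python
class RemoteCacheRequiredError(Exception):
    """Stand-in for dvc.exceptions.RemoteCacheRequiredError."""


class FakeCacheRegistry:
    """Hypothetical stand-in for repo.cache: one attribute per URL scheme."""

    def __init__(self, s3=None):
        self.s3 = s3


class FakeRepo:
    """Hypothetical stand-in for a DVC repo that only carries a cache registry."""

    def __init__(self, cache):
        self.cache = cache


class FakeRemote:
    """Hypothetical stand-in for the remote class seen in the tracebacks."""

    scheme = "s3"

    def __init__(self, repo):
        self.repo = repo

    @property
    def cache(self):
        # Mirrors the line from the traceback; fails if self.repo is None.
        return getattr(self.repo.cache, self.scheme)

    def get_dir_hash(self, path_info):
        # Mirrors the check from the traceback; the return value is a
        # placeholder, not how DVC really computes a directory hash.
        if not self.cache:
            raise RemoteCacheRequiredError(path_info)
        return "dir-hash-placeholder"


def describe_failure(repo):
    """Return the name of the exception raised for the given repo, or 'ok'."""
    try:
        FakeRemote(repo).get_dir_hash("s3://some_bucket/some_target")
    except Exception as exc:
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    # import-url case (assumed): repo exists, but no s3 cache is configured.
    print(describe_failure(FakeRepo(FakeCacheRegistry())))
    # get-url case (assumed): there is no repo object at all.
    print(describe_failure(None))
    # An s3 cache is configured; the bucket name here is made up.
    print(describe_failure(FakeRepo(FakeCacheRegistry(s3="s3://other_bucket/cache"))))
```

Under these assumptions, both errors come from the same directory-hash code path; the difference is only whether a repo object is available when the cache is looked up.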
The same commands work with https://some_domain/some_target URLs, and I don't think an external cache was ever necessary to download files from S3.
Output of dvc version:
$ dvc version
1.1.2
Additional Information (if any):
I found out a couple of things:
- get-url works in 0.93.0
- To make import-url work, one needs to set up an S3 cache in _any_ bucket, not necessarily the bucket that contains the file being imported

That kind of solves the issue, but I don't get the logic behind this. Why would I need a cache in a separate bucket just to download a file from a completely different bucket? It seems weird, because I need to download the file to my local machine anyway in order to compute hashes.
@anotherbugmaster This is a well-known bug that became more intrusive once we adjusted the way we process inputs in get-url and import-url. It will be improved in the near future.