Dvc: Running import-url on S3 URL gives error message "requires existing cache on 's3' remote"

Created on 22 Jul 2020 · 2 Comments · Source: iterative/dvc

Bug Report

We have a corpus (i.e., a directory of text documents) that is stored in a company-wide S3 data store at s3://my-company-research-data/data/corpus.

When I run dvc import-url s3://my-company-research-data/data/corpus ./local/path, I get an error:

ERROR: failed to import s3://duolingo-research-data/det/COCA. You could also try downloading it manually, and adding it with `dvc add`. - Current operation was unsuccessful because 's3://my-company-research-data/data/corpus' requires existing cache on 's3' remote. See <https://man.dvc.org/config#cache> for information on how to set up remote cache.

Per this thread, this appears to be a bug.
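The error message itself suggests downloading the data manually and adding it with `dvc add`. A minimal sketch of that workaround follows. It assumes the AWS CLI (`aws`) is installed and has credentials for the bucket; the helper names `build_workaround_commands` and `run_workaround` are hypothetical and not part of DVC.

```python
import subprocess


def build_workaround_commands(s3_url, local_path):
    """Return the two commands: download the S3 prefix, then track it with DVC."""
    return [
        # Assumption: the AWS CLI is available; `aws s3 sync` copies
        # everything under the S3 prefix into local_path.
        ["aws", "s3", "sync", s3_url, local_path],
        # `dvc add` then tracks the downloaded directory as local data.
        ["dvc", "add", local_path],
    ]


def run_workaround(s3_url, local_path):
    """Run the commands in order, stopping at the first one that fails."""
    for cmd in build_workaround_commands(s3_url, local_path):
        subprocess.run(cmd, check=True)


# Example call (not executed here):
# run_workaround("s3://my-company-research-data/data/corpus", "./local/path")
```

Unlike `dvc import-url`, data tracked this way is not recorded with its S3 source as a dependency, so later changes upstream would have to be downloaded again by hand.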

Please provide information about your setup

Output of dvc version:

$ dvc version

DVC version: 1.1.11
Python version: 3.7.3
Platform: Darwin-18.7.0-x86_64-i386-64bit
Binary: False
Package: pip
Supported remotes: http, https, s3
Repo: dvc, git

Additional Information (if any):

When I run the same command with --verbose, this is what I get:

2020-07-22 17:20:16,566 DEBUG: fetched: [(3,)]                                  
2020-07-22 17:20:16,935 DEBUG: Removing output 'challenge/common/data/COCA-corpus' of stage: 'COCA-corpus.dvc'.
Importing 's3://duolingo-research-data/det/COCA' -> 'challenge/common/data/COCA-corpus'
2020-07-22 17:20:16,936 DEBUG: Computed stage: 'COCA-corpus.dvc' md5: 'fda33f7e862514b4c924b2692aff808d'
2020-07-22 17:20:16,936 DEBUG: 'md5' of stage: 'COCA-corpus.dvc' changed.
2020-07-22 17:20:17,523 DEBUG: fetched: [(0,)]
2020-07-22 17:20:17,524 ERROR: failed to import s3://duolingo-research-data/det/COCA. You could also try downloading it manually, and adding it with `dvc add`. - Current operation was unsuccessful because 's3://duolingo-research-data/det/COCA' requires existing cache on 's3' remote. See <https://man.dvc.org/config#cache> for information on how to set up remote cache.
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/command/imp_url.py", line 18, in run
    no_exec=self.args.no_exec,
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/repo/__init__.py", line 34, in wrapper
    ret = f(repo, *args, **kwargs)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/repo/scm_context.py", line 4, in run
    result = method(repo, *args, **kw)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/repo/imp_url.py", line 54, in imp_url
    stage.run()
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/funcy/decorators.py", line 39, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/stage/decorators.py", line 35, in rwlocked
    return call()
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/funcy/decorators.py", line 60, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/stage/__init__.py", line 424, in run
    sync_import(self, dry, force)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/stage/imports.py", line 29, in sync_import
    stage.save_deps()
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/stage/__init__.py", line 387, in save_deps
    dep.save()
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/output/base.py", line 267, in save
    self.info = self.save_info()
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/output/base.py", line 191, in save_info
    return self.tree.save_info(self.path_info)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/tree/base.py", line 314, in save_info
    self.PARAM_CHECKSUM: self.get_hash(path_info, tree=tree, **kwargs)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/tree/base.py", line 282, in get_hash
    hash_ = self.get_dir_hash(path_info, tree, **kwargs)
  File "/Users/duolingo/Documents/GitHub/det-challenge-development/.pyenv/lib/python3.7/site-packages/dvc/tree/base.py", line 296, in get_dir_hash
    raise RemoteCacheRequiredError(path_info)
dvc.exceptions.RemoteCacheRequiredError: Current operation was unsuccessful because 's3://duolingo-research-data/det/COCA' requires existing cache on 's3' remote. See <https://man.dvc.org/config#cache> for information on how to set up remote cache.
------------------------------------------------------------
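The last frames of the traceback show `get_hash` delegating to `get_dir_hash`, which raises `RemoteCacheRequiredError`. The toy model below illustrates that call chain only. It is not DVC's code: `ToyTree`, its fields, and the placeholder hash strings are hypothetical, and the condition used to raise the error (no cache configured for the scheme) is an assumption inferred from the wording of the error message.

```python
class RemoteCacheRequiredError(Exception):
    """Toy stand-in for dvc.exceptions.RemoteCacheRequiredError."""

    def __init__(self, url, scheme):
        super().__init__(
            f"Current operation was unsuccessful because '{url}' "
            f"requires existing cache on '{scheme}' remote."
        )


class ToyTree:
    """Hypothetical, heavily simplified stand-in for a remote tree."""

    def __init__(self, scheme, cache=None, dirs=()):
        self.scheme = scheme
        self.cache = cache      # None models "no cache set up for this scheme"
        self.dirs = set(dirs)   # URLs this toy tree treats as directories

    def get_hash(self, url):
        # As in the traceback, directories are routed to get_dir_hash.
        if url in self.dirs:
            return self.get_dir_hash(url)
        # Placeholder for the single-file branch, which the traceback does not show.
        return f"file-hash-of:{url}"

    def get_dir_hash(self, url):
        # Assumption: the error is raised when no cache exists for the scheme.
        if self.cache is None:
            raise RemoteCacheRequiredError(url, self.scheme)
        return f"dir-hash-of:{url}"
```

In this model, calling `ToyTree("s3", dirs={url}).get_hash(url)` raises the error, while passing any non-`None` `cache` lets the directory branch return normally.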
bug

All 2 comments

Thanks! I don't think importing should have this condition: requires existing cache on 's3' remote.

Hi @nimrand!

Unfortunately, this is a known bug :slightly_frowning_face: https://github.com/iterative/dvc/issues/4144 We'll try to get to it in the next sprint (starting next week). Thank you for the feedback! Closing this ticket in favor of https://github.com/iterative/dvc/issues/4144.
