UPDATE: Jump to https://github.com/iterative/dvc/issues/4423#issuecomment-676527748
λ dvc import https://github.com/iterative/example-get-started data/data.xml
Importing 'data/data.xml (https://github.com/iterative/example-get-started)' -> 'data.xml'
ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'
Full --verbose output:
λ dvc import https://github.com/iterative/example-get-started data/data.xml -v
2020-08-18 23:22:41,569 DEBUG: Check for update is enabled.
2020-08-18 23:22:41,752 ERROR: interrupted by the user
------------------------------------------------------------
Traceback (most recent call last):
File "c:\users\poj12\dvc\dvc\main.py", line 53, in main
cmd = args.func(args)
File "c:\users\poj12\dvc\dvc\command\base.py", line 40, in __init__
updater.check()
File "c:\users\poj12\dvc\dvc\updater.py", line 58, in check
self._with_lock(self._check, "checking")
File "c:\users\poj12\dvc\dvc\updater.py", line 44, in _with_lock
func()
File "c:\users\poj12\dvc\dvc\updater.py", line 62, in _check
self.fetch()
File "c:\users\poj12\dvc\dvc\updater.py", line 84, in fetch
daemon(["updater"])
File "c:\users\poj12\dvc\dvc\daemon.py", line 105, in daemon
file_path = os.path.abspath(inspect.stack()[0][1])
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1514, in stack
return getouterframes(sys._getframe(1), context)
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1491, in getouterframes
frameinfo = (frame,) + getframeinfo(frame, context)
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 1465, in getframeinfo
lines, lnum = findsource(frame)
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 792, in findsource
module = getmodule(object, file)
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\inspect.py", line 754, in getmodule
os.path.realpath(f)] = module.__name__
File "C:\Users\poj12\AppData\Local\Programs\Python\Python38\lib\ntpath.py", line 647, in realpath
path = _getfinalpathname(path)
KeyboardInterrupt
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-08-18 23:22:41,927 DEBUG: Analytics is disabled.
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ
(.venv) poj12@AP-QDVJ7BLR ~/DVC-repos/tests (master)
λ dvc import https://github.com/iterative/example-get-started data/data.xml -v
2020-08-18 23:22:51,005 DEBUG: Check for update is enabled.
2020-08-18 23:22:51,516 DEBUG: Trying to spawn '['daemon', '-q', 'updater']'
2020-08-18 23:22:52,092 DEBUG: Spawned '['daemon', '-q', 'updater']'
2020-08-18 23:22:52,117 DEBUG: fetched: [(3,)]
2020-08-18 23:22:53,407 DEBUG: Removing output 'data.xml' of stage: 'data.xml.dvc'.
Importing 'data/data.xml (https://github.com/iterative/example-get-started)' -> 'data.xml'
2020-08-18 23:22:53,416 DEBUG: Computed stage: 'data.xml.dvc' md5: 'e7514d625f896d082cc0ca259453b732'
2020-08-18 23:22:53,421 DEBUG: 'md5' of stage: 'data.xml.dvc' changed.
2020-08-18 23:22:53,425 DEBUG: Creating external repo https://github.com/iterative/example-get-started@None
2020-08-18 23:22:53,430 DEBUG: erepo: git clone 'https://github.com/iterative/example-get-started' to a temporary dir
2020-08-18 23:22:56,420 DEBUG: Saving '..\..\AppData\Local\Temp\tmppx2petbedvc-clone\data\data.xml' to '.dvc\cache\a3\04afb96060aad90176268345e10355'.
2020-08-18 23:22:56,428 DEBUG: cache 'C:\Users\poj12\DVC-repos\tests\.dvc\cache\a3\04afb96060aad90176268345e10355' expected 'a304afb96060aad90176268345e10355' actual 'None'
2020-08-18 23:22:56,439 DEBUG: cache 'C:\Users\poj12\DVC-repos\tests\.dvc\cache\a3\04afb96060aad90176268345e10355' expected 'a304afb96060aad90176268345e10355' actual 'None'
2020-08-18 23:22:56,508 DEBUG: Preparing to download data from 'https://remote.dvc.org/get-started'
2020-08-18 23:22:56,512 DEBUG: Preparing to collect status from https://remote.dvc.org/get-started
2020-08-18 23:22:56,517 DEBUG: Collecting information from local cache...
2020-08-18 23:22:56,638 DEBUG: fetched: [(45,)]
2020-08-18 23:22:56,697 ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'
------------------------------------------------------------
Traceback (most recent call last):
File "c:\users\poj12\dvc\dvc\main.py", line 54, in main
ret = cmd.run()
File "c:\users\poj12\dvc\dvc\command\imp.py", line 14, in run
self.repo.imp(
File "c:\users\poj12\dvc\dvc\repo\imp.py", line 6, in imp
return self.imp_url(path, out=out, erepo=erepo, frozen=True)
File "c:\users\poj12\dvc\dvc\repo\__init__.py", line 34, in wrapper
ret = f(repo, *args, **kwargs)
File "c:\users\poj12\dvc\dvc\repo\scm_context.py", line 4, in run
result = method(repo, *args, **kw)
File "c:\users\poj12\dvc\dvc\repo\imp_url.py", line 54, in imp_url
stage.run()
File "c:\users\poj12\dvc\.venv\lib\site-packages\funcy\decorators.py", line 39, in wrapper
return deco(call, *dargs, **dkwargs)
File "c:\users\poj12\dvc\dvc\stage\decorators.py", line 36, in rwlocked
return call()
File "c:\users\poj12\dvc\.venv\lib\site-packages\funcy\decorators.py", line 60, in __call__
return self._func(*self._args, **self._kwargs)
File "c:\users\poj12\dvc\dvc\stage\__init__.py", line 429, in run
sync_import(self, dry, force)
File "c:\users\poj12\dvc\dvc\stage\imports.py", line 30, in sync_import
stage.deps[0].download(stage.outs[0])
File "c:\users\poj12\dvc\dvc\dependency\repo.py", line 97, in download
_, _, cache_infos = repo.fetch_external([self.def_path])
File "c:\users\poj12\dvc\dvc\external_repo.py", line 147, in fetch_external
self.local_cache.save(
File "c:\users\poj12\dvc\dvc\cache\base.py", line 282, in save
return self._save(path_info, tree, hash_, save_link, **kwargs)
File "c:\users\poj12\dvc\dvc\cache\base.py", line 290, in _save
return self._save_file(path_info, tree, hash_, save_link, **kwargs)
File "c:\users\poj12\dvc\dvc\cache\base.py", line 218, in _save_file
with tree.open(path_info, mode="rb") as fobj:
File "c:\users\poj12\dvc\dvc\repo\tree.py", line 372, in open
return dvc_tree.open(path, mode=mode, encoding=encoding, **kwargs)
File "c:\users\poj12\dvc\dvc\repo\tree.py", line 113, in open
return open(cache_path, mode=mode, encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\tests\\.dvc\\cache\\a3\\04afb96060aad90176268345e10355'
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2020-08-18 23:22:56,877 DEBUG: Analytics is disabled.
Similar problem for get:
λ dvc get https://github.com/iterative/example-get-started data/data.xml
ERROR: unexpected error - [Errno 2] No such file or directory: 'C:\\Users\\poj12\\DVC-repos\\.kWasCRdYJMf8x9qBkgneqt\\a3\\04afb96060aad90176268345e10355'
I tried from different locations, inside DVC repos or not.
Output of dvc version:
DVC version: 1.5.1
---------------------------------
Platform: Python 3.8.2 on Windows-10-10.0.18362-SP0
Supports: All remotes
Workspace directory: NTFS on C:\
Repo: dvc, git
@jorgeorpinel, Does this also happen in 1.5.0? Can you checkout to 64f038fda83c2400263b335fca6bb4f63f6ecf0f and try?
Yeah, still happens on the latest master commit.
No, I just wanted to know if it's a recent issue or not. It'd be helpful if you could check on 1.5.0 as well.
Ah OK I got you. Yes, this also happens for me in 64f038fda83c2400263b335fca6bb4f63f6ecf0f (checked it out and ran pip install -e ".[all,tests]").
Can reproduce on linux. This is bad...
Yeah, I can too. Looks like granular import is broken. model.pkl is working, and so is .data
Directory is also not working.
Hm, old dvc versions also don't work. ~It is actually because example-get-started is not dvc pull-able at all, probably some issues with the remote there.~ Need to check (and, well, need a better error on dvc side :smile: )
~example-get-started is currently using a private s3 url, instead of public http, probably someone did that by accident.~
~@shcheklein @jorgeorpinel Hm, looks like https://github.com/iterative/example-get-started/commits/master/.dvc/config didn't even have public remotes. Did someone force-push there? I do remember using that repo for get/import previously.~
For the record: indeed there were some force pushes, but current master has a publicly accessible remote and I can dvc pull from it, but can't dvc get for some reason, not from any dvc version actually, which is strange. Investigating...
Ok, so the issue is that https://github.com/iterative/example-get-started/blob/master/data/data.xml.dvc#L4 is a dvc-import-ed file by itself (i don't think it used to be that way). So dvc is not able to pull it (we didn't officially support chaining imports at all yet) and gives that strange error. Definitely need to at least error-out nicely.
More precisely: https://github.com/iterative/dvc/blob/14745039a4bfd35c34dc342bd5e6d324ecd52640/dvc/repo/tree.py#L105 collects only cache_info.external cache, which repo.cloud.pull is not able to process and just finishes as if there was nothing it needed to do. Def need to handle it in DvcTree as well.
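A minimal sketch of the kind of check that could turn this into a clearer error. It assumes the `.dvc` file has already been parsed into a dict and that an import records its source under a `repo` key in `deps`. `is_import_stage` and the sample dicts are hypothetical, for illustration only, and not part of DVC's API:

```python
def is_import_stage(stage: dict) -> bool:
    """Return True if any dependency of the parsed stage points at another repo."""
    deps = stage.get("deps") or []
    return any(isinstance(dep, dict) and "repo" in dep for dep in deps)


# Hypothetical parsed contents of a .dvc file produced by `dvc import`.
imported = {
    "frozen": True,
    "deps": [
        {
            "path": "get-started/data.xml",
            "repo": {"url": "https://github.com/iterative/dataset-registry"},
        }
    ],
    "outs": [{"path": "data.xml"}],
}

# Hypothetical parsed contents of a .dvc file produced by `dvc add`.
added = {"outs": [{"path": "data.xml"}]}

print(is_import_stage(imported))  # True
print(is_import_stage(added))     # False
```

With a check like this, the target of a `dvc get` or `dvc import` could be rejected up front with a message about chained imports instead of failing later with a missing cache file.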
Directly related to https://github.com/iterative/dvc/issues/3305
Closing in favor of https://github.com/iterative/dvc/issues/3305 , since detecting the import chaining is very close to just adding the support for it (and error-ing out on cycles).
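To illustrate why detection and support are so close: following a chain of imports and erroring out on cycles could look roughly like the sketch below. Everything here is an assumption for illustration; `resolve_import_chain`, `Source`, and the `next_source` callback are hypothetical names, not DVC code:

```python
from typing import Callable, List, Optional, Tuple

Source = Tuple[str, str]  # (repo URL, path inside that repo)


def resolve_import_chain(
    start: Source, next_source: Callable[[Source], Optional[Source]]
) -> List[Source]:
    """Follow imports from `start` to the original file; raise on a cycle."""
    chain = [start]
    seen = {start}
    current = start
    while True:
        # next_source returns where `current` was imported from,
        # or None if `current` is not an import.
        upstream = next_source(current)
        if upstream is None:
            return chain
        if upstream in seen:
            raise ValueError(f"import cycle detected at {upstream}")
        seen.add(upstream)
        chain.append(upstream)
        current = upstream


# Hypothetical lookup table standing in for reading each repo's .dvc files.
links = {("repo-a", "data.xml"): ("repo-b", "data.xml")}
print(resolve_import_chain(("repo-a", "data.xml"), links.get))
```

The last element of the returned chain is the source that actually holds the data, which is where a pull would need to go.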
Ah, glad it's an issue with the repo only, and not DVC 😅
> data.xml.dvc#L4 is a dvc-import-ed file by itself (i don't think it used to be that way). So dvc is not able to pull it
True! This is a change from our new Get Started (https://dvc.org/doc/start/data-access#import-file-or-directory)
I can confirm that both import and get work with https://github.com/iterative/dataset-registry get-started/data.xml.
> gives that strange error. Definitely need to at least error-out nicely
Agree.