dvc.api.open (from the Python API) unexpectedly fails with an AssertionError when trying to open a file that should be contained in a dvc-tracked directory but is missing from the cache.
$ dvc version
DVC version: 1.8.0 (pip)
---------------------------------
Platform: Python 3.6.10 on Darwin-19.6.0-x86_64-i386-64bit
Supports: http, https, s3, ssh
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s5
Workspace directory: apfs on /dev/disk1s5
Repo: dvc, git
In a git and dvc repository:
create a test directory containing two files, test/foo and test/bar;
$ dvc add test && dvc push && git commit -m "test" && git push
then run, in Python:
from dvc.api import open
with open('test/foo') as file:
    pass
which results in the following uncaught exception:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-x-xxxxxxxxxxxx> in <module>
----> 1 with open('test/foo') as file:
2 pass
3
.../python3.6/contextlib.py in __enter__(self)
79 def __enter__(self):
80 try:
---> 81 return next(self.gen)
82 except StopIteration:
83 raise RuntimeError("generator didn't yield") from None
.../python3.6/site-packages/dvc/api.py in _open(path, repo, rev, remote, mode, encoding)
82 with _make_repo(repo, rev=rev) as _repo:
83 with _repo.open_by_relpath(
---> 84 path, remote=remote, mode=mode, encoding=encoding
85 ) as fd:
86 yield fd
.../python3.6/contextlib.py in __enter__(self)
79 def __enter__(self):
80 try:
---> 81 return next(self.gen)
82 except StopIteration:
83 raise RuntimeError("generator didn't yield") from None
.../python3.6/site-packages/dvc/repo/__init__.py in open_by_relpath(self, path, remote, mode, encoding)
616 with self.state:
617 with tree.open(
--> 618 path, mode=mode, encoding=encoding, remote=remote,
619 ) as fobj:
620 yield fobj
.../python3.6/site-packages/dvc/tree/repo.py in open(self, path, mode, encoding, **kwargs)
157 raise
158
--> 159 return dvc_tree.open(path, mode=mode, encoding=encoding, **kwargs)
160
161 def exists(
.../python3.6/site-packages/dvc/tree/dvc.py in open(self, path, mode, encoding, remote)
80 if self.stream:
81 if out.is_dir_checksum:
---> 82 checksum = self._get_granular_checksum(path, out)
83 else:
84 checksum = out.hash_info.value
.../python3.6/site-packages/dvc/tree/dvc.py in _get_granular_checksum(self, path, out, remote)
47
48 def _get_granular_checksum(self, path, out, remote=None):
---> 49 assert isinstance(path, PathInfo)
50 if not self.fetch and not self.stream:
51 raise FileNotFoundError
AssertionError:
Maybe this is an expected failure, but this bare AssertionError suggests it's not.
@hugo-ricateau-tiime, thanks for the bug report. This is not expected; it looks like we failed to wrap arguments somewhere (for PathInfo). I'll take a look.
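For readers following along, here is a minimal sketch of the failure mode described above, assuming the plain string path given to dvc.api.open reaches the assertion in _get_granular_checksum without being converted. The PathInfo class below is a hypothetical stand-in built on pathlib, not DVC's real class, and get_granular_checksum and open_wrapped are illustrative names only; the actual fix in DVC may look different.

```python
from pathlib import PurePosixPath


class PathInfo(PurePosixPath):
    """Hypothetical stand-in for DVC's PathInfo (assumption, not the real class)."""


def get_granular_checksum(path):
    # Mirrors the assertion shown in the traceback: a plain str fails here
    # with a bare AssertionError and no message.
    assert isinstance(path, PathInfo)
    return f"checksum-for-{path}"  # placeholder value, not a real hash


def open_wrapped(path):
    # Sketch of the kind of fix hinted at: wrap the argument into a
    # PathInfo before handing it to code that asserts on the type.
    if not isinstance(path, PathInfo):
        path = PathInfo(path)
    return get_granular_checksum(path)
```

Calling get_granular_checksum('test/foo') with a plain string raises the same kind of bare AssertionError as in the report, while open_wrapped('test/foo') converts the argument first and passes the check.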
Thanks for the quick fix 🎉. Any idea of when this will be released?
@hugo-ricateau-tiime, if you need it urgently, I'd suggest you use:
$ pip install git+https://github.com/iterative/dvc#egg=dvc[s3,ssh]
Pinging @efiop for the release, should be soon.
1.8.1 will be out on pip in ~5 minutes. Thank you @skshetry ! :pray:
I wasn't expecting such a fast release. Thanks for your responsiveness @skshetry and @efiop :)