dvc add failed when I used a wildcard pattern. It appends a wrong .dvc suffix to my pattern.
Output of dvc version:
$ dvc version
Platform: Python 3.8.3 on Windows-10-10.0.18362-SP0
Supports: http, https, ssh
Cache types: hardlink
Cache directory: NTFS on D:\
Workspace directory: NTFS on D:\
Repo: dvc, git
Additional Information (if any):
If applicable, please also provide a --verbose output of the command, e.g. dvc add --verbose.
@karajan1001 What does that error say, btw? And could you also provide verbose output, please?
So far it looks like there might be something about the way your shell is evaluating the pattern. It looks like the shell is not expanding the * and ? wildcards at all and is passing them as-is to dvc. DVC itself doesn't expand wildcards at all; we simply rely on your shell to do that and then pass the resulting list to dvc add. E.g. in bash, dvc add *.mov would actually result in dvc add 1.mov 2.mov ... 10.mov, so dvc gets the list of files and not the original pattern.
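A quick way to check whether a shell expanded the pattern before the program saw it is to look at the arguments the program actually received. The following is a minimal sketch, not part of DVC; the helper name unexpanded_patterns is hypothetical, and it only checks for the common glob metacharacters *, ? and [.

```python
import sys

# Characters that mark an argument as a glob pattern rather than a plain path.
GLOB_CHARS = set("*?[")


def unexpanded_patterns(args):
    """Return the arguments that still contain glob metacharacters."""
    return [arg for arg in args if GLOB_CHARS.intersection(arg)]


def report(args):
    """Print which arguments look like patterns the shell did not expand."""
    leftovers = unexpanded_patterns(args)
    if leftovers:
        print("Shell passed these patterns through unexpanded:", leftovers)
    else:
        print("All arguments look like plain paths:", list(args))
```

Calling report(sys.argv[1:]) from a small script and running it with *.mov as the argument in bash and in PowerShell should show the difference described above, assuming matching files exist in the directory.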
Yes, obviously DVC didn't get the correct file list. This might be an issue with (1) the environment or (2) a package DVC relies on, rather than with DVC itself. But
@karajan1001 Great point about git. Git indeed supports some globbing natively, so to match it we also need to pass the targets through glob.glob. The use case is limited to shells that don't support globbing natively (I'm surprised PS didn't do that, maybe I'm missing something), so it is pretty limited :slightly_frowning_face:
@efiop I tested on my computer, and PowerShell didn't expand the patterns.
According to Stack Overflow:
We have to implement wildcard expansion ourselves.
@karajan1001 Thanks for the research! :pray: So we indeed need to pass targets through glob.glob to implement that. We could start by doing that only in dvc/repo/add.py, but there might be a better way to do it everywhere. Obviously we could add a custom argparse action that would pass the targets through glob.glob, but it seems more fitting to implement it at the API level (dvc/repo/) instead of just the CLI (dvc/command/).
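To make the two options concrete, here is a minimal sketch, not DVC's actual implementation. expand_targets is a hypothetical helper of the kind that could live at the API level, and GlobTargets is a hypothetical argparse action for the CLI level; both names are assumptions for illustration. It uses the standard library's glob.glob and keeps a pattern unchanged when nothing matches, which mirrors bash's default behaviour.

```python
import argparse
import glob


def expand_targets(targets, recursive=False):
    """Expand glob patterns in targets; keep entries with no match as-is."""
    expanded = []
    for target in targets:
        matches = sorted(glob.glob(target, recursive=recursive))
        # When nothing matches, pass the original string through so the
        # caller can report the missing path itself.
        expanded.extend(matches or [target])
    return expanded


class GlobTargets(argparse.Action):
    """Hypothetical CLI-level alternative: expand targets while parsing."""

    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, expand_targets(values))


def build_parser():
    """Build a toy parser (not DVC's) that expands its positional targets."""
    parser = argparse.ArgumentParser(prog="add-sketch")
    parser.add_argument("targets", nargs="+", action=GlobTargets)
    return parser
```

Calling the helper from the API function covers both CLI and Python API users, whereas the argparse action only helps people who go through the command line. That is the trade-off behind preferring the API level.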
+1 on this feature!
Closing in favor of https://github.com/iterative/dvc/issues/4816