UPDATE: Skip to https://github.com/iterative/dvc/issues/5089#issuecomment-744675730
repro fails for any file if there's an error in ./dvc.yaml
λ cat dvc.yaml
stages:
  hi:
    cmd: echo hello
meta:
  string: "some text"
λ dvc repro tmp/dvc.yaml
ERROR: 'dvc.yaml' format error: extra keys not allowed @ data['meta']
Even if tmp/dvc.yaml is valid, repro fails because ./dvc.yaml is invalid. The message is also misleading: "dvc.yaml" here refers to ./dvc.yaml, whereas I specified tmp/dvc.yaml as the target, so I would assume that's the file DVC is referring to.
Expected: repro of tmp/dvc.yaml (as long as it's valid).
Output of dvc version:
λ dvc version
DVC version: 1.10.2 (exe)
---------------------------------
Platform: Python 3.7.9 on Windows-10-10.0.18362-SP0
Supports: All remotes
Cache types: hardlink
Cache directory: NTFS on C:\
Caches: local
Remotes: None
Workspace directory: NTFS on C:\
Repo: dvc, git
Additional Information (if any):
We build the full DAG before doing anything, to check for cycles, overlaps, and so on, so this error is by design. The dvc.yaml in the message is a correct path relative to your current location. We could consider using ./dvc.yaml there instead, but some might consider that a bug or just ugly.
tmp/dvc.yaml also has a DAG so why not build that one only? I think that ignoring any file not given as a target is what the user would expect.
The dvc.yaml is a correct relative path... We could consider using ./dvc.yaml
I didn't quite understand those comments. In the example above I specified which path to use (not that one).
tmp/dvc.yaml also has a DAG
@jorgeorpinel, the DAG in DVC is not explicit. DVC builds it from the outputs and dependencies of the stages.
And, as it's implicit, a dependency of one stage could very well be an output of another stage in a different dvc.yaml file. So, a single dvc.yaml is not a DAG. That's why we need to build one DAG from all of the stages in a repo, to ensure there are no duplicated outputs or cycles.
In the example above I specified which path to use (not that one).
It's the relative path dvc.yaml that you are seeing, which is correct: the file at that path, relative to your working directory, has the error, not tmp/dvc.yaml.
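To illustrate the point about the implicit DAG, here is a minimal Python sketch. It is not DVC's actual implementation: the stage records, file names, and the functions build_graph and has_cycle are all hypothetical. It only shows why stages from every dvc.yaml have to be collected before duplicated outputs or cycles can be detected.

```python
# Hypothetical stage records: (dvc.yaml path, stage name, deps, outs).
# These are made-up examples, not read from any real project.
EXAMPLE_STAGES = [
    ("dvc.yaml", "prepare", [], ["data.csv"]),
    ("tmp/dvc.yaml", "train", ["data.csv"], ["model.pkl"]),
]


def build_graph(stages):
    """Link each stage to the stages that produce its dependencies.

    Raises ValueError if two stages declare the same output.
    """
    producer = {}
    for path, name, _deps, outs in stages:
        for out in outs:
            if out in producer:
                raise ValueError(
                    f"output {out!r} is declared by both "
                    f"{producer[out]} and {(path, name)}"
                )
            producer[out] = (path, name)

    graph = {}
    for path, name, deps, _outs in stages:
        # An edge exists only when some stage produces the dependency;
        # the producing stage may live in a different dvc.yaml.
        graph[(path, name)] = {producer[d] for d in deps if d in producer}
    return graph


def has_cycle(graph):
    """Depth-first search with three colours to detect a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = dict.fromkeys(graph, WHITE)

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)
```

In this sketch the train stage in tmp/dvc.yaml ends up linked to the prepare stage in ./dvc.yaml, even though neither file mentions the other. That link is only visible once both files have been parsed.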
we need to build a DAG, from all of the stages in a repo to ensure there are not any duplications or cycles
I see. Well, that's nice: having the option to split DAGs across several dvc.yaml files. It's a bit unintuitive, but I did double-check that an invalid tmp/dvc.yaml causes dvc repro to fail (even if that file is not given as a target).
But this does mean that our docs on repro are incomplete, maybe incorrect... None of this is expressed there.
the relative path dvc.yaml that you are seeing which is correct
Ah I see. I think it would be better to use ./ in that particular case because it's not clear that 'dvc.yaml' is a path. It feels like just a file name, and since I specified tmp/dvc.yaml, I assumed DVC meant that one.
@efiop @skshetry @pared I've transferred this to docs but one last concern: I tried a .dvc file as target and it still checked all of the dvc.yaml files in my project for validity first. How is that helpful? There's no way they can affect the .dvc file I specified as target, as it's not a stage (not dependent on any actual stage in dvc.yaml files) but rather just some data tracked by DVC. Thanks
@jorgeorpinel There might be another dvc file that is pointing to the same output path, we check for that and that's why we need to collect everything anyway. This behaviour might change in the future, but for now this is by design.
Overlapping output paths... OK got it 👍
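As a rough illustration of the overlap check mentioned above, here is a small Python sketch. It assumes that "overlapping" means two output paths are equal or one is nested inside the other. The tracking-file names and the helpers paths_overlap and find_overlaps are hypothetical, and this is not how DVC itself implements the check.

```python
from pathlib import PurePosixPath


def paths_overlap(a, b):
    """True if the paths are equal or one is nested inside the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def find_overlaps(outputs):
    """Return pairs of tracking entries whose output paths overlap.

    `outputs` maps a hypothetical tracking entry (a .dvc file or a
    "dvc.yaml:stage" label) to the output path it declares.
    """
    items = sorted(outputs.items())
    return [
        (first, second)
        for i, (first, out1) in enumerate(items)
        for second, out2 in items[i + 1:]
        if paths_overlap(out1, out2)
    ]


# Made-up example: a .dvc file tracking a directory, and a stage
# elsewhere that writes a file inside that same directory.
EXAMPLE_OUTPUTS = {
    "data.dvc": "data",
    "dvc.yaml:hi": "data/raw.csv",
    "tmp/dvc.yaml:train": "model.pkl",
}
```

A conflict like the one between data.dvc and the hi stage here can only be found by looking at every tracking file in the project, which is why even a .dvc target triggers parsing of all dvc.yaml files.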