Dvc: dag: not accepting output files/folders

Created on 17 Nov 2020 · 6 Comments · Source: iterative/dvc

Bug Report

Please provide information about your setup

Running dvc dag with the optional target argument set to an output file or folder produces an error:

$ dvc dag postprocessed-data/data.pickle
ERROR: failed to show a pipeline for 'None' - "Stage 'postprocessed-data/data.pickle' not found inside 'dvc.yaml' file"

This happens even though dvc.yaml contains a stage that lists postprocessed-data/data.pickle as an output.

Output of dvc version:

$ dvc version
DVC version: 1.9.1 (pip)
---------------------------------
Platform: Python 3.7.6 on Darwin-19.6.0-x86_64-i386-64bit
Supports: azure, gs, http, https, s3
Cache types: reflink, hardlink, symlink
Repo: dvc, git

Additional Information (if any):

If applicable, please also provide a --verbose output of the command, eg: dvc add --verbose.

bug p1-important

All 6 comments

I remember we used to support this. It might have been deprecated after some refactoring.

I remember we used to support this. It might have been deprecated after some refactoring.

Say you have a module that depends on an intermediate data file xyz.csv that you want to replace. How else would you know how the file was generated, so that you know how to replace it? Do I have to keep track of the pipeline myself by examining the pipeline YAML files? This is currently what I have to do for most changes in the pipeline.

As a heavy user of DVC in production workflows, I think the tool currently suffers from too little support for real-world scenarios, and I would strongly advise against deprecating this.
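The manual lookup described above (finding which stage produced a file by reading the pipeline YAML) can be sketched in a few lines of Python. This is only an illustration, not how DVC resolves targets: the `PIPELINE` mapping stands in for an already parsed dvc.yaml, the stage names `prepare` and `postprocess` and every path except postprocessed-data/data.pickle are made up, and the helpers `find_producer` and `upstream_stages` are hypothetical. It also ignores metrics and plots, which a real dvc.yaml may declare as outputs too.

```python
# Hypothetical parsed contents of a dvc.yaml file. Stage names and all paths
# other than postprocessed-data/data.pickle are invented for illustration.
PIPELINE = {
    "stages": {
        "prepare": {
            "cmd": "python prepare.py",
            "deps": ["raw-data/input.csv"],
            "outs": ["prepared-data/data.pickle"],
        },
        "postprocess": {
            "cmd": "python postprocess.py",
            "deps": ["prepared-data/data.pickle"],
            "outs": ["postprocessed-data/data.pickle"],
        },
    }
}


def _paths(entries):
    """Normalize entries that are either plain strings or one-key mappings."""
    return [e if isinstance(e, str) else next(iter(e)) for e in entries or []]


def find_producer(pipeline, path):
    """Return the name of the stage that lists `path` among its outs, or None."""
    for name, stage in pipeline["stages"].items():
        if path in _paths(stage.get("outs")):
            return name
    return None


def upstream_stages(pipeline, path):
    """Return the stage producing `path`, followed by the stages it depends on."""
    order = []
    pending = [path]
    while pending:
        producer = find_producer(pipeline, pending.pop())
        if producer is None or producer in order:
            continue  # a source file, or a stage that was already visited
        order.append(producer)
        pending.extend(_paths(pipeline["stages"][producer].get("deps")))
    return order
```

Calling `upstream_stages(PIPELINE, "postprocessed-data/data.pickle")` on this made-up pipeline walks from the requested file back through `postprocess` to `prepare`, which is the information the reporter expected dvc dag to provide for an output target.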

@karajan1001 Actually, I don't think we ever supported it for dvc dag or the deprecated dvc pipeline show. The docs mention it because we have generalized stage collection logic, so we assumed that dvc dag already supported it, but it turns out we were wrong :slightly_frowning_face: IIRC, the reason we didn't do it is that --outs would then need additional logic to exclude sibling outputs from the same stage in the DAG.
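To make the sibling-output concern concrete, here is a rough Python sketch under invented assumptions: the `STAGES` mapping, its stage names, and its file names are all hypothetical, and `output_ancestors` is not DVC code. A stage-level view of the ancestors of model.pkl would pull in the whole `split` stage, including test.csv, whereas walking output by output keeps only the files the target actually depends on.

```python
# Hypothetical stages: "split" writes two sibling outputs, and only
# train.csv feeds the "train" stage.
STAGES = {
    "split": {"deps": ["data.csv"], "outs": ["train.csv", "test.csv"]},
    "train": {"deps": ["train.csv"], "outs": ["model.pkl"]},
}


def output_ancestors(stages, target):
    """Collect every file the target output depends on, directly or indirectly."""
    # Map each output path to the stage that produces it.
    producer_of = {out: s for s in stages.values() for out in s["outs"]}
    seen = set()
    pending = [target]
    while pending:
        stage = producer_of.get(pending.pop())
        if stage is None:
            continue  # a source file with no producing stage
        for dep in stage["deps"]:
            if dep not in seen:
                seen.add(dep)
                pending.append(dep)
    return seen
```

Here `output_ancestors(STAGES, "model.pkl")` contains train.csv and data.csv but not the sibling test.csv, which is the kind of filtering the comment says an output-level DAG would need.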

@amka66 No plans to deprecate it :slightly_smiling_face: I created a fix in https://github.com/iterative/dvc/pull/4908, and we'll try to release 1.10.1 with it tomorrow. Thank you for the feedback!

As a heavy user of DVC in production workflows, I think the tool currently suffers from too little support for real-world scenarios, and I would strongly advise against deprecating this.

That's why we're really grateful when users like you report the issues and pain points they experience. :pray:

@amka66 Sorry, I misunderstood. I thought it was similar to #4532.

For the record: the fix was merged and is going to be released in 1.10.2 later today. Thanks again for the feedback! :pray:
