dvc run does not include folder in .gitignore

Created on 3 Mar 2020 · 4 Comments · Source: iterative/dvc

I have a project that has been working properly. It is fundamentally based on a pipeline.
Previously, when I created the pipeline, the models folder was added to .gitignore. Now I can't get DVC to ignore the models folder. Here is the pipeline:

dvc add data

dvc run --no-exec \
    -f train.dvc \
    -d code/train.py \
    -d data/train \
    -o models \
    -M metrics/train.json \
    python code/train.py

dvc run --no-exec \
    -f eval.dvc \
    -d code/eval.py \
    -d data/test \
    -d models \
    -M metrics/eval.json \
    python code/eval.py

Now .gitignore contains:

/data

However, I expect /models to be there too, as it was before (as specified by -o).

Please provide information about your setup
dvc 0.87.0
Python 3.6.9
ubuntu 10.04

bug p1-important product research

All 4 comments

@DavidGOrtega The issue here is --no-exec. It only creates the DVC-file but doesn't act on it, which is why models is not added to .gitignore until you actually run the stage. Could you explain why you need it and why it matters?
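
As a stopgap until the stage has actually been run, the missing entry can be added by hand. The following is a minimal sketch, not DVC functionality: ensure_ignored is a hypothetical helper name, and it assumes the entry should be written exactly as DVC writes it for data (/models, anchored at the repository root).

```shell
#!/bin/sh
# Hypothetical helper (not a DVC command): append an entry to a
# .gitignore file only if that exact line is not already present.
ensure_ignored() {
    entry="$1"
    file="${2:-.gitignore}"
    touch "$file"
    # If the file is non-empty and lacks a trailing newline, add one
    # so the new entry does not get glued onto the last line.
    if [ -n "$(tail -c1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    # -x: match the whole line, -F: fixed string, -q: quiet
    if ! grep -qxF -- "$entry" "$file"; then
        printf '%s\n' "$entry" >> "$file"
    fi
}
```

Calling ensure_ignored /models from the repository root after the dvc run --no-exec commands would leave .gitignore with both /data and /models. It only affects Git; it does not make DVC track the folder.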

@efiop The title of this issue is actually a bit misleading: the problem is not only the missing .gitignore entry, but also that models is not tracked by DVC until you run repro. This matters especially in the CI/CD space.
The workflow is this:

  • set up a DVC pipeline without running repro locally
  • push to the repo
  • CI pulls and runs repro

When CI runs dvc pull, errors appear (requiring --force) about missing caches or models not being tracked:

ERROR: failed to pull data from the cloud - Checkout failed for following targets:
  models
 Did you forget to fetch?

I'm actually reviewing this. I will come back with more info to update this.
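
One way the CI side of this workflow might be sketched is to pull only the data that was added with dvc add, then let repro produce models. This is an assumption, not a confirmed fix: ci_reproduce is a hypothetical function name, data.dvc is assumed to be the DVC-file created by dvc add data, and whether a targeted pull avoids the checkout error above has not been verified here.

```shell
#!/bin/sh
# DVC is overridable for dry runs, e.g. DVC="echo dvc".
DVC="${DVC:-dvc}"

# Hypothetical CI step. The names data.dvc and eval.dvc come from the
# commands in this issue (dvc add data, dvc run -f eval.dvc).
ci_reproduce() {
    # Pull only the data tracked via "dvc add"; the models output has
    # never been produced, so no cache is expected to contain it.
    $DVC pull data.dvc || return 1
    # Reproduce the final stage; repro also runs the upstream
    # train.dvc stage that eval.dvc depends on.
    $DVC repro eval.dvc
}
```

Setting DVC="echo dvc" before calling ci_reproduce prints the two commands instead of running them, which is a cheap way to check the step in a CI config before wiring in a real remote.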

@DavidGOrtega thank you for reporting this! Were you able to find a workaround?

It seems like a valid use case which is not fully supported by DVC.

@efiop is there any specific reason not to create outputs with --no-exec?
If we need to support this scenario, what would be your suggestion: create outputs or introduce a new option?

@dmpetrov We could totally do that automatically. I'm just trying to understand the reasons behind it and whether we really need it. So far the use case is valid, so I think we could proceed with implementing it.
