Dvc: move: Support changing dependencies of related stages

Created on 10 Jan 2019 · 9 Comments · Source: iterative/dvc

echo "hello" > hello

dvc add hello

dvc run -d hello -o copy 'cp hello copy'

dvc move hello greetings

Move should change copy.dvc content as well:

- cmd: cp hello copy
+ cmd: cp greetings copy
  deps:
  - md5: b1946ac92492d2347c6235b4d2611184
-   path: hello
+   path: greetings
  md5: 4aadcd3be9fc768f4a57d7a41236b610
  outs:
  - cache: true
    md5: b1946ac92492d2347c6235b4d2611184
    metric: false
    path: copy
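
The requested behaviour can be sketched in Python. This is only an illustration under stated assumptions, not how DVC implements move: the stage is modelled as a plain dict mirroring the copy.dvc fields above, and rename_dependency is a hypothetical helper. The command is rewritten with a simple pattern match, so it cannot tell a path from an ordinary word that happens to be spelled the same.

```python
import copy
import re


def rename_dependency(stage, old_path, new_path):
    """Return a copy of `stage` with `old_path` replaced by `new_path`.

    Hypothetical helper: `stage` is a dict shaped like the copy.dvc
    content, not a real DVC object.
    """
    updated = copy.deepcopy(stage)

    # Update every dependency entry that points at the moved file.
    for dep in updated.get("deps", []):
        if dep.get("path") == old_path:
            dep["path"] = new_path

    # Naive command rewrite: replace the old path only where it is not
    # part of a longer word or path (e.g. "hello.txt" or "dir/hello").
    pattern = r"(?<![\w./-])" + re.escape(old_path) + r"(?![\w./-])"
    updated["cmd"] = re.sub(pattern, lambda _match: new_path, updated.get("cmd", ""))
    return updated


stage = {
    "cmd": "cp hello copy",
    "deps": [{"md5": "b1946ac92492d2347c6235b4d2611184", "path": "hello"}],
    "outs": [{"cache": True, "md5": "b1946ac92492d2347c6235b4d2611184",
              "metric": False, "path": "copy"}],
}

moved = rename_dependency(stage, "hello", "greetings")
print(moved["cmd"])
print(moved["deps"][0]["path"])
```

Paths hardcoded inside a script or config file that the command calls are out of reach for a rewrite like this.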
enhancement p3-nice-to-have

All 9 comments

This is very important. One thing to consider is that the command itself might reference files elsewhere in the repository, and the relative paths to those files might appear in the command. In that case the command will fail in the new location (assuming the directory depth changes) unless the paths are absolute or are rewritten during the move.
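
To make the directory depth problem concrete, here is a small Python sketch of how a relative path would have to be rewritten when a stage moves. The helper name rebase_relative_path and the example directories are made up for illustration; all directories are assumed to be given relative to the repository root.

```python
import posixpath


def rebase_relative_path(path, old_dir, new_dir):
    """Rewrite `path`, relative to `old_dir`, so it is relative to `new_dir`.

    Both directories are expressed relative to the repository root.
    Absolute paths are returned unchanged.
    """
    if posixpath.isabs(path):
        return path
    # Where the path points, seen from the repository root.
    target = posixpath.normpath(posixpath.join(old_dir, path))
    # The same target, seen from the new directory.
    return posixpath.relpath(target, start=new_dir)


# A stage in stages/ that moves one level deeper, to stages/train/.
print(rebase_relative_path("../data/raw.csv", "stages", "stages/train"))
```

The old path "../data/raw.csv" would resolve to stages/data/raw.csv from the new location, which is why the command fails unless the path is rewritten.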

@yfarjoun That is true, and it is one of the reasons why we didn't implement this feature right away. Parsing the command is indeed hacky, and even that wouldn't help if you have paths hardcoded in your script or config file.

True, but a working solution (such as a bash variable that can be referenced, or the use of dvc root) together with a set of "best practices" that enable moving files would alleviate the problem.
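
One way to read this suggestion is to resolve paths against the project root instead of the current directory. The Python sketch below assumes the root is the nearest ancestor directory that contains a .dvc directory; find_project_root and from_root are hypothetical helpers, not part of DVC.

```python
import tempfile
from pathlib import Path


def find_project_root(start):
    """Return the nearest ancestor of `start` (inclusive) containing a .dvc directory."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".dvc").is_dir():
            return candidate
    raise FileNotFoundError(f"no .dvc directory found above {start}")


def from_root(start, *parts):
    """Build a path anchored at the project root instead of the working directory."""
    return find_project_root(start).joinpath(*parts)


# Demo in a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".dvc").mkdir()
    nested = root / "stages" / "train"
    nested.mkdir(parents=True)
    # The same data file is reached from either directory.
    print(from_root(root, "data", "hello") == from_root(nested, "data", "hello"))
```

A command written against root-anchored paths keeps working when the stage file changes directory depth, which is the property the "best practices" would aim for.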

@yfarjoun Agreed. Referencing variables/params is a part of https://github.com/iterative/dvc/issues/1462 . I guess we better tackle parametrization/evaluation in general first and then proceed with this move feature. Thanks for the feedback! :slightly_smiling_face:

@yfarjoun @mroutis @efiop With the introduction of wdir, could we actually implement this move by just changing wdir? It is the simplest solution that sounds reasonable. Your thoughts?

It is unclear where the resulting files end up when wdir is something other than ".". Do they end up in wdir? Are they moved to be next to the .dvc file? The mechanics aren't clear to me from the documentation.

@yfarjoun Sorry about the confusion, I think I got the idea of this issue wrong. For a moment I thought it was about being able to move a stage file created by dvc run, while it is actually about fixing the existing move logic (which can only move data source stages created by dvc add) so that it takes care of the related stages. Actually, I'm not even sure right now what the semantics of dvc move for a stage file should look like. Should it, for example, change the location of the outputs, and if so, how?

In my world view, the outputs should be right next to the stage files; anything else I find very confusing. So in that world, if you move the stage files, my expectation is that you also move the outputs. The problem is the inputs: some are relative and others are absolute, and they need to be modified both in the command and in the dependencies.

We can at least say to a user: "The following stages depend on 'foo', you need to update them manually: ...".
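
A minimal Python sketch of such a warning, assuming the stages are available as a mapping from stage file name to a dict shaped like a .dvc file. Both stages_depending_on and move_warning are hypothetical names used only for illustration.

```python
def stages_depending_on(stages, path):
    """Return the names of stages that list `path` among their dependencies.

    `stages` maps a stage file name to a dict shaped like a .dvc file.
    """
    return sorted(
        name
        for name, stage in stages.items()
        if any(dep.get("path") == path for dep in stage.get("deps", []))
    )


def move_warning(stages, path):
    """Build the warning text, or return None if nothing depends on `path`."""
    dependents = stages_depending_on(stages, path)
    if not dependents:
        return None
    listing = "\n".join(f"  {name}" for name in dependents)
    return (
        f"The following stages depend on '{path}', "
        f"you need to update them manually:\n{listing}"
    )


stages = {
    "hello.dvc": {"outs": [{"path": "hello"}]},
    "copy.dvc": {"cmd": "cp hello copy", "deps": [{"path": "hello"}]},
}
print(move_warning(stages, "hello"))
```

This only inspects declared dependencies, so it sidesteps command parsing entirely and leaves the actual fix to the user.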
