Dvc: Support revisions in dvc update

Created on 26 Nov 2019  ·  9 Comments  ·  Source: iterative/dvc

dvc update --rev hello_world file.dvc

This is not supported yet, but it seems like a very handy alternative to re-importing:

dvc import --rev hello_world https://github.com/dmpetrov/dataset file
enhancement feature request p0-critical

All 9 comments

Copying main part of the discussion about this from Slack:

@Suor: When you do dvc import -r ... on an existing import stage, it rewrites the revision, thus switching a tag or a branch.

Dmitry: ...we need a shortcut for re-importing/import, and update looks like a reasonable alternative...

Alexander: You mean you want to avoid retyping the url?

Dmitry: Yes. Also, update seems a proper command name for updating version.
I understand that switching branches is kind of an exception, but this is not what I usually do with imported datasets.

Updating vs. re-importing is definitely a source of confusion, and it's reflected in our docs (which also require the term "fixed-revision import"). See the following excerpts:

From https://dvc.org/doc/command-reference/import#example-fixed-revisions-re-importing:

If the Git revision moves (e.g. a branch), you may use dvc update to bring the data up to date. However, for typically static references (e.g. tags), or for SHA commits, in order to actually "update" an import, it's necessary to re-import the data instead, by using dvc import again without or with a different --rev. This will overwrite the import stage...

From https://dvc.org/doc/command-reference/update#examples:

For typically static references (e.g. tags), or for SHA commits, dvc update will not have any effect on the import. Refer to the re-importing example to learn how to "update" fixed-revision imports.

From https://dvc.org/doc/use-cases/data-registry#example (in an expandable section):

In order to actually "update" it, do not use dvc update. Instead, re-import the data

So I like the idea about dvc update --rev because it will probably simplify all these explanations, although it will also require changing its command reference to explain that there are 2 types of updates supported:

  • Simply look for changes in the import as defined in its DVC-file, respecting url AND rev (if present) – only updating the actual data and rev_lock field.
  • Change the rev AND look for changes, updating data and rev_lock as well.
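The two kinds of update listed above can be modeled in a few lines of Python. This is only a sketch of the proposed semantics, not DVC's implementation: the dict stands in for the url, rev and rev_lock fields of an import stage, and `update_import`, `fake_resolve`, `REMOTE` and the commit ids are hypothetical names and values made up for illustration. Falling back to "HEAD" when no rev is set is also an assumption.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the url / rev / rev_lock fields of an import stage.
ImportStage = Dict[str, str]


def update_import(
    stage: ImportStage,
    resolve: Callable[[str, str], str],
    rev: Optional[str] = None,
) -> bool:
    """Refresh rev_lock and return True if the locked commit changed.

    rev=None: first kind of update, url and rev are respected as written.
    rev set:  second kind, the stage is switched to that revision first.
    """
    if rev is not None:
        stage["rev"] = rev
    # Assumption for this sketch: an import without a rev follows "HEAD".
    target = stage.get("rev", "HEAD")
    new_lock = resolve(stage["url"], target)
    changed = new_lock != stage.get("rev_lock")
    stage["rev_lock"] = new_lock
    return changed


# Made-up remote state: revision name -> commit id.
REMOTE = {"HEAD": "c3c3c3", "v1.0": "a1a1a1", "hello_world": "b2b2b2"}


def fake_resolve(url: str, rev: str) -> str:
    """Pretend to ask the remote which commit a revision points to."""
    return REMOTE[rev]


stage = {
    "url": "https://github.com/dmpetrov/dataset",
    "rev": "v1.0",
    "rev_lock": "a1a1a1",
}

# Plain update on a tag that has not moved: nothing changes.
unchanged = update_import(stage, fake_resolve)
# Update with a new revision: rev and rev_lock are both rewritten.
switched = update_import(stage, fake_resolve, rev="hello_world")
```

In this model a plain update on a fixed revision is a no-op, which matches the docs excerpts quoted earlier, while passing a revision covers what currently requires re-importing.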

Alternatively we could introduce a new command or subcommand such as dvc import move to change an import stage's rev value, in order to then dvc update it.

Hey, I'll give it a try to create a solution for this feature!

@dmpetrov hey, what is the expected behavior after dvc update --rev another-rev? Should it lock the original import to another-rev, or should it avoid side effects like that?

The current behaviour is that it locks to another-rev but also pulls the rev from the remote each time. I think it should just use the cache if the last commits are the same?

@ilgooz dvc update and dvc update --rev latest_rev should do the same, so yes, it should lock to another-rev, because that is how dvc update currently works too.

Do you mean that it is pulling the rev on each update? If so, it is how our current caching is implemented. There is no need to tackle that in this PR, feel free to create an issue for it though 🙂
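The caching idea raised in this exchange (skip the pull when the last commit is unchanged) could look roughly like the sketch below. It does not describe how DVC's caching works; `update_with_cache`, `resolve`, `fetch` and the commit ids are hypothetical, and the sketch assumes that resolving a revision to a commit is much cheaper than fetching the data.

```python
from typing import Callable, Dict, List, Tuple


def update_with_cache(
    stage: Dict[str, str],
    resolve: Callable[[str, str], str],
    fetch: Callable[[str, str], None],
) -> bool:
    """Fetch data only when the resolved commit differs from rev_lock."""
    new_lock = resolve(stage["url"], stage["rev"])
    if new_lock == stage.get("rev_lock"):
        return False  # same commit as last time: reuse what is cached
    fetch(stage["url"], new_lock)
    stage["rev_lock"] = new_lock
    return True


# Record every (url, commit) pair that would have been downloaded.
fetched: List[Tuple[str, str]] = []


def fake_resolve(url: str, rev: str) -> str:
    """Made-up remote: the only known revision points at commit b2b2b2."""
    return {"another-rev": "b2b2b2"}[rev]


def fake_fetch(url: str, commit: str) -> None:
    fetched.append((url, commit))


stage = {"url": "https://github.com/dmpetrov/dataset", "rev": "another-rev"}

first = update_with_cache(stage, fake_resolve, fake_fetch)   # downloads once
second = update_with_cache(stage, fake_resolve, fake_fetch)  # skips the download
```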

Thanks, yes, I asked two different Qs and got my answers!

@efiop, why is this p0 btw?

@skshetry Just trying to unblock @andronovhopf. Thank you so much for the fix! 🙏
