Revisions are slightly more accurate - https://git-scm.com/docs/revisions
A revision parameter `<rev>` typically, but not necessarily, names a commit object.
References are the ones living in `.git/refs`: https://git-scm.com/book/en/v2/Git-Internals-Git-References
We use the term _references_ in the `metrics` and `diff` commands, but both commands accept revisions, since `repo.brancher` accepts `revs` (not `refs`).
This change would also require updating the docs.
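To make the distinction concrete, here is a minimal sketch (not DVC code) showing that a branch name is a reference on disk, while expressions such as `master~2` are revisions that no file under `.git/refs` corresponds to. The helper names `is_plain_ref` and `demo` are hypothetical, and the check only looks at loose ref files; real Git also consults packed refs and resolves far more forms through `git rev-parse`.

```python
import os
import tempfile


def is_plain_ref(git_dir, rev):
    """Return True if `rev` names a loose ref file under <git_dir>/refs.

    Hypothetical helper for illustration only: it ignores packed-refs,
    HEAD, and every other form described in the gitrevisions docs.
    """
    for namespace in ("heads", "tags", "remotes"):
        if os.path.isfile(os.path.join(git_dir, "refs", namespace, rev)):
            return True
    return False


def demo():
    """Build a fake .git directory with one branch and classify a few revs."""
    with tempfile.TemporaryDirectory() as git_dir:
        heads = os.path.join(git_dir, "refs", "heads")
        os.makedirs(heads)
        # A loose ref is just a file holding a commit hash.
        with open(os.path.join(heads, "master"), "w") as fobj:
            fobj.write("0" * 40 + "\n")
        candidates = ("master", "master~2", "HEAD^", "7821fa6")
        return {rev: is_plain_ref(git_dir, rev) for rev in candidates}


if __name__ == "__main__":
    for rev, found in demo().items():
        print(f"{rev}: {'reference' if found else 'revision only'}")
```

Every value in that list is something Git could accept as a revision, but only `master` is a reference in the `.git/refs` sense, which is why "revision" is the more accurate word for what the commands accept.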
From the same discussion in https://github.com/iterative/dvc/pull/3244#pullrequestreview-348990111, I think the conclusion is that the easiest thing to do is to use "commit" everywhere, as the official reference for `git diff` does (see, for example, https://git-scm.com/docs/git-diff), instead of both "revision" and "reference", to avoid getting into Git's jargon too much.
`dvc get/import` commands is also OK. Notice that "revision" means "Synonym for commit (the noun)" per https://git-scm.com/docs/gitglossary#def_revision.
p.s. using the more general term "version" instead of revision sometimes is also best. E.g. in the new `dvc diff` description by Ramón: "Compare two different versions of your DVC project (tracked by Git)" 👍
Note: other Git terms we may want to review/avoid are:
I can agree on using _commit_ instead of _references_ or _revisions_, it is more well known and user-friendly.
"working tree" in favor of "current workspace" or something like "local changes", as appropriate.
:+1: just to be clear, this specific case ("working tree" -> "current workspace") would apply only for the messages that users see, since it makes sense to talk about _trees_ on the code.
Maybe "SHA" in favor of "commit hash"? (This is the route we're currently going in docs.)
the only instance of "commit hash" that I found on the code was on completion scripts, and in this case it makes more sense to use only _commit_:
it makes sense to talk about trees on the code
Any time a more precise Git term is available in code var/fn names, comments, docstrings, debug output, etc. it should definitely be used, I think. But yes, for user-facing strings we want to simplify terminology and avoid Git jargon as much as possible – hopefully completely.
the only instance of "commit hash"...
OK cool that's easy then. We should prob also look for just "hash" which depending on the context may refer to a commit SHA. Thanks
p.s. I'm down to take this issue as well as other terminology reviews in the code base (e.g. avoid "checksum") but won't have time for it very soon. Maybe next week if I don't keep track of it...
OK cool that's easy then. We should prob also look for just "hash" which depending on the context may refer to a commit SHA. Thanks
Looks like we don't use _hash_ at all :sweat_smile:
The term "revision" still exists in:
https://github.com/iterative/dvc/blob/7821fa6866a492c0675248028039257d292a09ec/dvc/command/imp.py#L54
https://github.com/iterative/dvc/blob/7821fa6866a492c0675248028039257d292a09ec/dvc/command/get.py#L72
https://github.com/iterative/dvc/blob/de1f67afbd93929d51c5dd052d5f8a951e77c182/dvc/scm/git/tree.py#L105
https://github.com/iterative/dvc/blob/e7b3297c2d6ae1ad633cd0435cca81093cac86ff/dvc/scm/git/__init__.py#L90
and maybe https://github.com/iterative/dvc/blob/fd1e6f457e393a599468f5afc33c2c8b495a1c97/tests/func/test_diff.py#L110
as well as corresponding changes in dvc/scripts/completion/dvc.zsh
besides docstrings, comments, and code, of course – but not expecting those to change right now necessarily (per https://github.com/iterative/dvc/pull/3299#discussion_r377260820)
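For future terminology reviews, a rough sketch of how such instances could be collected automatically. This is not part of DVC: the function name `find_terms`, the term list, and the `*.py` filter are all assumptions, and it reads every file as UTF-8.

```python
import re
from pathlib import Path

# Terms mentioned in this thread; the exact list is an assumption.
JARGON = ("revision", "reference", "working tree", "checksum", "SHA")


def find_terms(root, terms=JARGON, pattern="*.py"):
    """Yield (path, line_number, term) for each whole-word match under root."""
    regexes = {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }
    for path in sorted(Path(root).rglob(pattern)):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, 1):
            for term, regex in regexes.items():
                if regex.search(line):
                    yield str(path), lineno, term


if __name__ == "__main__":
    # Scan the current directory and print one hit per line.
    for path, lineno, term in find_terms("."):
        print(f"{path}:{lineno}: {term}")
```

The output still needs a human pass, since a hit in a variable name or docstring is fine to keep while a hit in a user-facing string is what we want to review.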
@jorgeorpinel `revision` sounds better than `commit` when applied to API/CLI, as `--commit` sounds like some sort of action. `revision` sounds the best to me over all other alternatives, tbh.
Git "revision" means "Synonym for commit (the noun)" per https://git-scm.com/docs/gitglossary#def_revision.
Using the more general term "version" instead of revision when possible is also best. E.g. in the new dvc diff description by Ramón: "Compare two different versions of your DVC project (tracked by Git)" – this is the current approach in docs.
It would look strange and not right to me as an option `--commit`. It'll be confusing for DVC.
@jorgeorpinel I have no problem with "Compare two different versions" in the docs or any explanations, but `version` when applied to CLI/API sounds like a tag to me (just based on what first comes to my mind when I hear it), while `revision` is a more general term (again, IMPO), so I don't really see any point in changing `rev` that we currently have to it.
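One way to reconcile both views, sketched with plain `argparse`: keep the flag name `--rev` and put the friendlier wording only in the metavar and help text. This is a hypothetical parser, not DVC's actual one; the `prog` name, metavar, and help string are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a toy parser whose flag is --rev but whose help says "commit"."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--rev",
        metavar="<commit>",  # what users see in the usage line
        default=None,
        help="Git revision (e.g. branch, tag, or commit hash)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--rev", "HEAD~1"])
    print(args.rev)
```

The option still reads as a noun on the command line, and anything Git accepts as a revision can be passed through unchanged.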
I have nothing against the term "revision". Originally I argued it feels more correct than commit in fact (it encompasses branches, tags, etc, not just a commit SHA) but I thought that per some other discussion with Ivan (not in this issue) we had concluded that it's best to always use "commit" for simplicity and not to get into Git terminology. But I guess that's not the case? I'm more than happy to keep using "revision"!
Just curious about the instances in `dvc/dvc/scm/git/tree.py` and `dvc/dvc/scm/git/__init__.py` I mentioned in https://github.com/iterative/dvc/issues/3245#issuecomment-584414556 above. Otherwise, this is closable for me.