Dvc: post-checkout hook breaks checkout and rebase when adopting DVC in a branch

Created on 28 Jun 2019  Â·  10Comments  Â·  Source: iterative/dvc

When adopting DVC in an existing repository with PR code-review workflow, it will be landing in a branch that isn't master. This branch then has problematic interactions where work continues on master before the dvc adoption lands there.

Specifically, once you've done dvc init, committed that, and then dvc install, the post-checkout hook breaks git checkout and git rebase between the master branch and the adopting-dvc branch.

To reproduce:

mkdir dvc-test
cd dvc-test
git init
echo "Testing" > README.md
git add README.md
git commit -m "Initial commit"
git checkout -b start-using-dvc
dvc init
git commit -m "dvc init"
dvc install

I go on experimenting with dvc, but then I have to pause the dvc adoption to do some work on the master branch.:

git stash save "experimenting with dvc"
git checkout master

This last command succeeds but outputs:

Switched to branch 'master'
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/repos' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.

... uh-oh. I'm on master where dvc hasn't been taken into use, so I (or my tooling) shouldn't be putting anything in .dvc.

I continue with the master work and then return to DVC:

echo "Progress" > README.md
git commit README.md -m "Progress"
git checkout start-using-dvc

That last command fails with:

error: The following untracked working tree files would be overwritten by checkout:
    .dvc/.gitignore
Please move or remove them before you switch branches.
Aborting

... which is not normal in git; with a clean working tree, git checkout to switch branches should always work.

I workaround this and want to keep going, rebasing my dvc experiment branch on current tip of master:

rm .dvc/.gitignore
git checkout start-using-dvc
git rebase master

That rebase fails with:

First, rewinding head to replay your work on top of it...
Adding '.dvc/state' to '.dvc/.gitignore'.
Adding '.dvc/lock' to '.dvc/.gitignore'.
Adding '.dvc/config.local' to '.dvc/.gitignore'.
Adding '.dvc/updater' to '.dvc/.gitignore'.
Adding '.dvc/updater.lock' to '.dvc/.gitignore'.
Adding '.dvc/repos' to '.dvc/.gitignore'.
Adding '.dvc/state-journal' to '.dvc/.gitignore'.
Adding '.dvc/state-wal' to '.dvc/.gitignore'.
Adding '.dvc/cache' to '.dvc/.gitignore'.
Applying: dvc init
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
error: The following untracked working tree files would be overwritten by merge:
    .dvc/.gitignore
Please move or remove them before you merge.
Aborting
error: Failed to merge in the changes.
Patch failed at 0001 dvc init
hint: Use 'git am --show-current-patch' to see the failed patch
Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

Deleting the post-checkout hook makes the rebase work just fine:

rm .git/hooks/post-checkout
git rebase master

So probably the post-checkout hook should abort with an informative message if you're on a branch where dvc _hasn't_ been adopted. Not sure whether that check is as simple as checking existence of .dvc or whether that breaks other legitimate uses of the hook.

My dvc --version is 0.50.1, installed with homebrew (brew cask).

enhancement p1-important

Most helpful comment

Ah, whoops, sorry, you should take it as “away on vacation and not paying
much attention” :) ... I can whip up a simple PR when I'm back home next
weekend.

On Mon, Jul 22, 2019, 23:31 Ruslan Kuprieiev notifications@github.com
wrote:

@gthb https://github.com/gthb Not sure if the message got lost or if I
should take this as "no" :)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/iterative/dvc/issues/2208?email_source=notifications&email_token=AABFP3HVQOQAZ665CT3GTWTQAY7M5A5CNFSM4H4GTSKKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2ROSZQ#issuecomment-513993062,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AABFP3HERHVVK3TS5E7DQG3QAY7M5ANCNFSM4H4GTSKA
.

All 10 comments

Not sure whether that check is as simple as checking existence of .dvc

Hm, probably not that, because git-ignored stuff under .dvc _should_ stick around even when you checkout a branch outside of dvc control. But maybe git ls-files -- .dvc?

With very minimal testing this post-checkout hook seems to fix at least my immediate problem:

#!/bin/sh
LS_FILES=`git ls-files .dvc`
[ "$LS_FILES" = "" ] || exec dvc checkout

Same thing happens on git commit in different branches. So pre-commit hook should also be conditional on this check. Maybe the pre-push hook too?

@gthb dvc detects that this is a dvc repository by finding .dvc directory. When switching branches, you have some gitignored(in dvc branch) files there, so git leaves them under .dvc, and so dvc still detects your non-dvc branch as dvc branch. I don't see any good solution to handle that, except maybe modifying your git-hooks to only run on dvc branch. E.g. .git/hooks/post-checkout

#!/bin/sh                                                        

if [ "$(git rev-parse --abrev-ref HEAD)" == "my-dvc-branch" ]; then     
    exec dvc checkout                                            
fi

@gthb Oops, sorry, your later messages didn't show up for me for some reason. I like the ls-files trick a lot! Very neat! :slightly_smiling_face: Let's think if we could always include it as a header to our git hooks. I can only see a possible problem when you didn't git commit your .dvc files yet, but even that should not be a problem, since you usually git ocmmit that stuff right after dvc init. What do you think?

Actually git ls-files by default shows files that are in the index (added but not yet committed), and dvc init adds the new files to the index, so these hooks will work directly after dvc init.

$ dvc init
[...]
$ git ls-files .dvc
.dvc/.gitignore
.dvc/config
$ git reset
$ git ls-files .dvc
$

But yeah, instructions should probably point out that (with this ls-files change) the hooks installed by dvc install only act in a “dvc-enabled” branch, meaning one that has tracked files under .dvc, and initializing dvc on a branch doesn't affect other branches until the change gets merged to those branches. (So you can safely try it out in a branch before merging to master, for instance.)

@gthb Would you like to contribute a patch for this? :slightly_smiling_face: You'd simply need to add that to https://github.com/iterative/dvc/blob/0.50.1/dvc/scm/git/__init__.py#L230 .

@gthb Not sure if the message got lost or if I should take this as "no" :slightly_smiling_face:

Ah, whoops, sorry, you should take it as “away on vacation and not paying
much attention” :) ... I can whip up a simple PR when I'm back home next
weekend.

On Mon, Jul 22, 2019, 23:31 Ruslan Kuprieiev notifications@github.com
wrote:

@gthb https://github.com/gthb Not sure if the message got lost or if I
should take this as "no" :)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/iterative/dvc/issues/2208?email_source=notifications&email_token=AABFP3HVQOQAZ665CT3GTWTQAY7M5A5CNFSM4H4GTSKKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2ROSZQ#issuecomment-513993062,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AABFP3HERHVVK3TS5E7DQG3QAY7M5ANCNFSM4H4GTSKA
.

@gthb We would really appreciate that :slightly_smiling_face: Have a great vacation!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

dnabanita7 picture dnabanita7  Â·  3Comments

shcheklein picture shcheklein  Â·  3Comments

anotherbugmaster picture anotherbugmaster  Â·  3Comments

ghost picture ghost  Â·  3Comments

GildedHonour picture GildedHonour  Â·  3Comments