When we run a command that runs git checkout internally, it can fail with an error like this:
(node)[ivan@ivan ~/Projects/myrepo]$ dvc metrics show -a
Unexpected error: Cmd('git') failed due to: exit code(1)
cmdline: git checkout bigrams
stderr: 'error: Your local changes to the following files would be overwritten by checkout:
auc.metric
auc.metric.dvc
featurization.py
matrix.pkl.dvc
model.pkl.dvc
Please commit your changes or stash them before you switch branches.
Aborting'
We should improve the message.
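As a rough illustration of what an improved message could look like, the sketch below pulls the affected file names out of git's stderr and builds a shorter hint. The function names `overwritten_files` and `friendly_message` and the wording of the hint are hypothetical; nothing here is dvc's actual code, and the parsing assumes the stderr layout shown in the report above.

```python
from typing import List, Optional

# Text git prints before the list of files it refuses to overwrite
# (taken from the error output quoted in this report).
MARKER = "would be overwritten by checkout:"


def overwritten_files(stderr: str) -> List[str]:
    """Return the file names listed after the 'would be overwritten' line."""
    files = []
    collecting = False
    for line in stderr.splitlines():
        if MARKER in line:
            collecting = True
            continue
        if collecting:
            # The list ends at git's advice line or at a blank line.
            if line.startswith("Please commit") or not line.strip():
                break
            files.append(line.strip())
    return files


def friendly_message(stderr: str, branch: str) -> Optional[str]:
    """Build a short hint, or return None if stderr is some other error."""
    files = overwritten_files(stderr)
    if not files:
        return None
    return (
        f"Cannot switch to branch '{branch}': uncommitted changes in "
        f"{', '.join(files)}. Commit or stash them and try again."
    )
```

A caller would fall back to the raw error whenever `friendly_message` returns `None`, so unrelated git failures are still reported in full.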
Ideally, we should find a way to traverse git's object graph without doing actual checkouts.
We should use git ls-tree and git show branch:file instead of checking out other branches. That will eliminate the problem once and for all. We also need to access dvc cache files directly instead of using dvc checkout. I'm already working on it. Thank you!
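The approach described above can be sketched roughly as follows: list a branch's files with `git ls-tree` and read one with `git show branch:path`, so the working tree is never touched. This is a minimal sketch, not dvc's implementation; the helper names (`_git`, `list_files`, `read_file`, `collect_metrics`) and the injectable `run` parameter are assumptions made for illustration, and reading dvc cache files directly is not covered.

```python
import subprocess
from typing import Callable, Dict, List


def _git(args: List[str], cwd: str = ".") -> str:
    """Run a git command in cwd and return its stdout as text."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def list_files(branch: str, run: Callable[[List[str]], str] = _git) -> List[str]:
    """List every file tracked on a branch without checking it out."""
    # -z separates names with NUL so unusual file names are not quoted.
    out = run(["ls-tree", "-r", "--name-only", "-z", branch])
    return [name for name in out.split("\0") if name]


def read_file(branch: str, path: str, run: Callable[[List[str]], str] = _git) -> str:
    """Return the content of path as stored on branch."""
    return run(["show", f"{branch}:{path}"])


def collect_metrics(
    branches: List[str], path: str, run: Callable[[List[str]], str] = _git
) -> Dict[str, str]:
    """Map each branch that contains path to that file's content."""
    found = {}
    for branch in branches:
        if path in list_files(branch, run):
            found[branch] = read_file(branch, path, run)
    return found
```

Inside a repository, `collect_metrics(["master", "bigrams"], "auc.metric")` would return the metric file's text per branch even when the working tree has uncommitted changes, because neither git command modifies the working tree.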
Fixing this would also solve https://github.com/iterative/dvc/issues/1552
I'm using this script locally right now which uses git show (and jq):
#!/usr/bin/env python3
import sys
import argparse
from jq import jq # pylint: disable=no-name-in-module
from git import Repo
from git.exc import GitCommandError
import json
from pygments import highlight
from pygments.lexers.data import JsonLexer
from pygments.formatters.terminal import TerminalFormatter

parser = argparse.ArgumentParser(description="""
python scripts/metrics.py experiments/**/metrics/ml_model_parameter_search.json
python scripts/metrics.py experiments/**/metrics/rule_model_train.json --jq '.total_untagged'
python scripts/metrics.py experiments/**/metrics/rule_model_parameter_search.json --jq '.level_5.score' --all-branches
python scripts/metrics.py experiments/**/metrics/rule_model_parameter_search.json --jq 'to_entries|map(.key, .value.score)' --all-branches
""",
    formatter_class=lambda prog: argparse.RawDescriptionHelpFormatter(prog, max_help_position=32))
parser.add_argument('filepaths', nargs='+')
parser.add_argument('--jq')
parser.add_argument('--all-branches', action='store_true')
# Show the help text when the script is called without arguments.
args = parser.parse_args(args=None if sys.argv[1:] else ['--help'])

repo = Repo('.')
branches = ['master']
if args.all_branches:
    branches = [x.name for x in repo.branches]

for branch in branches:
    for filepath in args.filepaths:
        try:
            # Read the file straight from the branch, without a checkout.
            content = repo.git.show(f'{branch}:{filepath}')
        except GitCommandError as e:
            # Skip files that do not exist on this branch.
            if 'exists on disk, but not in' in e.stderr or 'does not exist in' in e.stderr:
                continue
            raise
        if args.jq:
            content = json.dumps(jq(args.jq).transform(text=content))
        content = highlight(content, JsonLexer(), TerminalFormatter()).strip()
        print(branch, filepath, end='')
        # Short values stay on the same line; long ones start on a new line.
        prefix = '\n' if len(content) > 50 else ' '
        print(f'{prefix}{content}')
here are some example outputs:

