Need to investigate flaky test, probably hitting some gitpython related race condition again
context:
https://travis-ci.com/github/iterative/dvc/jobs/372602915
https://travis-ci.com/github/iterative/dvc/jobs/372071149
Another failure on a different test:
=================================== FAILURES ===================================
_____________________________ test_show_experiment _____________________________
[gw2] linux -- Python 3.7.1 /home/travis/virtualenv/python3.7.1/bin/python
tmp_dir = PosixTmpDir('/tmp/pytest-of-travis/pytest-0/popen-gw2/test_show_experiment0')
scm = Git: '/tmp/pytest-of-travis/pytest-0/popen-gw2/test_show_experiment0/.git'
dvc = Repo: '/tmp/pytest-of-travis/pytest-0/popen-gw2/test_show_experiment0'
def test_show_experiment(tmp_dir, scm, dvc):
tmp_dir.gen("copy.py", COPY_SCRIPT)
tmp_dir.gen("params.yaml", "foo: 1")
dvc.run(
cmd="python copy.py params.yaml metrics.yaml",
metrics_no_cache=["metrics.yaml"],
params=["foo"],
name="foo",
)
scm.add(["copy.py", "params.yaml", "metrics.yaml", "dvc.yaml", "dvc.lock"])
scm.commit("baseline")
baseline_rev = scm.get_rev()
timestamp = datetime.fromtimestamp(
scm.repo.rev_parse(baseline_rev).committed_date
)
dvc.reproduce(PIPELINE_FILE, experiment=True, params=["foo=2"])
results = dvc.experiments.show()
expected_baseline = {
"metrics": {"metrics.yaml": {"foo": 1}},
"params": {"params.yaml": {"foo": 1}},
"queued": False,
"timestamp": timestamp,
}
expected_params = {"foo": 2}
assert set(results.keys()) == {"workspace", baseline_rev}
experiments = results[baseline_rev]
assert len(experiments) == 2
for rev, exp in experiments.items():
if rev == "baseline":
assert exp == expected_baseline
else:
assert exp["metrics"]["metrics.yaml"] == expected_params
assert exp["params"]["params.yaml"] == expected_params
> assert exp["timestamp"] > timestamp
E assert datetime.datetime(2020, 8, 18, 10, 1, 1) > datetime.datetime(2020, 8, 18, 10, 1, 1)
Could be a one-time failure though, but maybe this assumption about time should not even be here?
Happened again, but with a different message: https://travis-ci.com/github/iterative/dvc/jobs/375383416
self = BaselineMismatchError(<git.Commit "d31b10fcd1a8e1191e215399a2a978db245349d5">, '232a7cbd9b448ce97013eb3db486fc3e2a104b00')
rev = <git.Commit "d31b10fcd1a8e1191e215399a2a978db245349d5">
expected = '232a7cbd9b448ce97013eb3db486fc3e2a104b00'
def __init__(self, rev, expected):
> rev_str = f"{rev[:7]}" if rev is not None else "dangling commit"
E TypeError: 'Commit' object is not subscriptable
I've narrowed down the issue to gitpython Repo.is_dirty()
misbehaving, but still not sure of the actual cause. In the flaky runs, for some reason is_dirty()
is incorrectly returning False.
Inserting this code into experiments._commit
if not self.scm.is_dirty():
tree = self.exp_dvc.tree
logger.debug("--- workspace params.yaml:")
with tree.open("params.yaml") as fobj:
data = fobj.read()
logger.debug(f"{data}")
rev = self.scm.get_rev()
tree = self.scm.get_tree(rev)
logger.debug("--- baseline params.yaml:")
with tree.open(os.path.join(self.exp_dvc.root_dir, "params.yaml")) as fobj:
data = fobj.read()
logger.debug(f"{data}")
logger.debug(f"non-dirty exp workspace @ '{rev}'")
raise UnchangedExperimentError(self.scm.get_rev())
gives
2020-08-20 07:38:35,961 DEBUG: Commiting experiment '484f6348df10e5f532c32e264e5278730ff648f379791b54ecf134caf8582cac'
2020-08-20 07:38:35,996 DEBUG: --- workspace params.yaml:
2020-08-20 07:38:35,996 DEBUG: foo: 2
2020-08-20 07:38:35,998 DEBUG: --- baseline params.yaml:
2020-08-20 07:38:36,000 DEBUG: foo: 1
2020-08-20 07:38:36,001 DEBUG: non-dirty exp workspace @ '627722c61614e0af2aa6f0a4fdf7abd5b81961f5'
2020-08-20 07:38:36,003 DEBUG: Experiment 'f75f01e0ed462d7c07ffd8c16ae87632f19c6a49' identical to baseline '627722c61614e0af2aa6f0a4fdf7abd5b81961f5'
Reproduced experiment '627722c'.
so params.yaml
should be dirty here
edit: the issue here was that exp_dvc.tree.open("params.yaml")
opens cwd / params.yaml
and not the experiments workspace params.yaml
, is_dirty()
was actually working properly
So the underlying issue is related to input collection when we populate the executor tmp dir. The gitpython index/tree doesn't always behave consistently for merge commits, so sometimes the uploaded tree did not contain the correct changes.
Abstracting the tmp dir input collection to use git stash apply
and upload
ing directly from the dirty experiments workspace (the same way we do output collection now) rather than relying on merge commit git trees.
Most helpful comment
I've narrowed down the issue to gitpython
Repo.is_dirty()
misbehaving, but still not sure of the actual cause. In the flaky runs, for some reasonis_dirty()
is incorrectly returning False.Inserting this code into
experiments._commit
gives
so
params.yaml
should be dirty hereedit: the issue here was that
exp_dvc.tree.open("params.yaml")
openscwd / params.yaml
and not the experiments workspaceparams.yaml
,is_dirty()
was actually working properly