I'm getting
failed to read 'my_project/data/metrics/metrics.json' on 'master'
when I run dvc metrics diff master --show-md -vvv in a clean repo/on GitHub Actions, and it ends up failing with ERROR: unexpected error - 'data'. However, when I run it locally it works fine. I've tried dvc push -a -d --run-cache, as well as using an older version of DVC to see if that changes anything. I'm not sure what the next step in debugging this is, nor what the problem is :confused:
@steffansluis
https://discordapp.com/channels/485586884165107732/485596304961962003/753655147149787137
@pared#9484 I suddenly don't seem to be able to reproduce the error with dvc metrics, either locally or on GH Actions. It now breaks both locally and remotely on dvc plots diff ... with the same error: ERROR: unexpected error - 'data'
@steffansluis
https://discordapp.com/channels/485586884165107732/485596304961962003/753722495147704370
Output of dvc version:
$ dvc version
DVC version: 1.6.6 (pip)
---------------------------------
Platform: Python 3.8.2 on Linux-5.4.0-7634-generic-x86_64-with-glibc2.29
Supports: azure, http, https, ssh
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/data-root
Workspace directory: ext4 on /dev/mapper/data-root
Repo: dvc, git
Additional Information (if any):
#!/bin/bash
rm -rf repo storage origin copy
main=$(pwd)
mkdir repo remote origin
pushd origin
git init --quiet --bare
popd
pushd repo
git init --quiet
git remote add origin $main/origin
dvc init --quiet
dvc remote add -d str $main/storage
echo -e "import json\nwith open('data', 'r') as fd:\n val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n json.dump({'m':val}, fd)" > code.py
git add -A
git commit -m "initial"
echo 1 >> data
dvc run -d data --plots metric.json -n run python code.py
git add -A
git commit -m "first run"
git push origin master
dvc push
dvc push --run-cache
popd
git clone origin copy
pushd copy
git checkout -b branch
echo -e "import json\nwith open('data', 'r') as fd:\n val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n json.dump({'m':val+1}, fd)" > code.py
dvc repro run
dvc plots diff master
I am having the same problem with dvc plots show:
dvc plots show metrics/plot.csv
ERROR: unexpected error - 'metrics/plot.csv'
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
DVC version: 1.7.2 (osxpkg)
---------------------------------
Platform: Python 3.7.5 on Darwin-19.5.0-x86_64-i386-64bit
Supports: gdrive, gs, hdfs, http, https, s3, oss, webdav, webdavs
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git
The contents of plots.csv are copied straight from the plots show documentation here:
epoch,accuracy,loss,val_accuracy,val_loss
0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
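One quick way to rule out the data file itself is to parse it outside of DVC. The sketch below is only an illustration: it inlines the rows quoted above and uses Python's standard csv module, and the helper name load_datapoints is made up for this example.

```python
import csv
import io

# The CSV rows quoted above, inlined so the sketch is self-contained.
PLOT_CSV = """epoch,accuracy,loss,val_accuracy,val_loss
0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
"""


def load_datapoints(text):
    """Parse CSV text into a list of dicts, one dict per row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_datapoints(PLOT_CSV)

# The column intended for the x axis must be present in every row.
missing = [col for col in ("epoch",) if col not in rows[0]]
print(len(rows), "rows; missing columns:", missing)
```

If this parses cleanly, the file contents are unlikely to be what triggers the error.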
(Edit: updated logs with new filename.)
@alexdmiller could you please clarify: does it happen in an empty repo, or does the repo have some history? Do you run dvc run before that or not?
@shcheklein here are the contents of my dvc.yaml:
stages:
  training:
    cmd: python train/train.py
    deps:
    - train/train.py
    outs:
    - models/classification/classification-jit.pth
    metrics:
    - metrics/summary.json
    plots:
    - metrics/plot.csv:
        x: epoch
        x_label: Epoch
Interestingly, if I run dvc plots show without passing an argument, it does produce output. But I was under the impression I could specify a chart to output.
Verbose log for the repro script:
+ dvc plots diff master -v
2020-09-21 23:56:31,930 DEBUG: Check for update is enabled.
2020-09-21 23:56:31,945 DEBUG: cache '/home/efiop/git/dvc/copy/.dvc/cache/ae/78ff52c6993d7833da173bf6cda983' expected 'HashInfo(name='md
, value='ae78ff52c6993d7833da173bf6cda983', dir_info=None)' actual 'None'
2020-09-21 23:56:31,946 ERROR: Plot data extraction failed. Please see https://man.dvc.org/plot for supported data formats.
------------------------------------------------------------
Traceback (most recent call last):
File "/home/efiop/git/dvc/dvc/command/plots.py", line 54, in run
plots = self._func(targets=self.args.targets, props=self._props())
File "/home/efiop/git/dvc/dvc/command/plots.py", line 89, in _func
return self.repo.plots.diff(
File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 100, in diff
return diff(self.repo, *args, **kwargs)
File "/home/efiop/git/dvc/dvc/repo/plots/diff.py", line 21, in diff
return plots_repo.plots.show(
File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 95, in show
return self.render(data, revs, props, templates)
File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 73, in render
return {
File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 74, in <dictcomp>
datafile: _render(datafile, desc["data"], desc["props"], templates)
File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 240, in _render
rev_data = plot_data(datafile, rev, datablob).to_datapoints(
File "/home/efiop/git/dvc/dvc/repo/plots/data.py", line 165, in to_datapoints
data = data_proc(
File "/home/efiop/git/dvc/dvc/repo/plots/data.py", line 122, in _find_data
raise PlotDataStructureError()
dvc.repo.plots.data.PlotDataStructureError: Plot data extraction failed. Please see https://man.dvc.org/plot for supported data formats.
------------------------------------------------------------
2020-09-21 23:56:31,948 DEBUG: Analytics is enabled.
2020-09-21 23:56:31,986 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpxlbhsaby']'
2020-09-21 23:56:31,987 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpxlbhsaby']'
@pared Could you elaborate on what is going on and how to solve it?
Seems like the repro script I provided does not work anymore; here is a fixed version:
#!/bin/bash
rm -rf repo storage origin copy
main=$(pwd)
mkdir repo remote origin
pushd origin
git init --quiet --bare
popd
pushd repo
git init --quiet
git remote add origin $main/origin
dvc init --quiet
dvc remote add -d str $main/storage
echo -e "import json\nwith open('data', 'r') as fd:\n val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n json.dump([{'m':val}], fd)" > code.py
git add -A
git commit -m "initial"
echo 1 >> data
dvc run -d data --plots metric.json -n run python code.py
git add -A
git commit -m "first run"
git push origin master
dvc push
dvc push --run-cache
popd
git clone origin copy
pushd copy
#dvc pull
git checkout -b branch
echo -e "import json\nwith open('data', 'r') as fd:\n val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n json.dump([{'m':val+1}], fd)" > code.py
dvc repro run
dvc plots diff master
So what is happening here is that if we uncomment the last dvc pull, there is no problem. So I think that we have some problem with opening the dvc-controlled plots file that is not in cache. Hope to handle it in #4446.
As for @alexdmiller's issue, I will investigate.
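Comparing the fixed script with the original, code.py now dumps a list of records ([{'m': val}]) rather than a single object ({'m': val}), and a commented-out dvc pull was added. The sketch below illustrates why that shape could matter to a plotting tool that expects a sequence of datapoints. The helper find_datapoints is hypothetical, written only for this illustration, and is not DVC's implementation; linking it to the PlotDataStructureError in the log above is an assumption.

```python
import json


def find_datapoints(obj):
    """Return the first non-empty list of dicts inside a parsed JSON value.

    Illustrative helper only; this is NOT how DVC is implemented.
    Returns None when no such list exists.
    """
    if isinstance(obj, list):
        if obj and all(isinstance(item, dict) for item in obj):
            return obj
        return None
    if isinstance(obj, dict):
        for value in obj.values():
            found = find_datapoints(value)
            if found is not None:
                return found
    return None


# Payload shape written by the first repro script.
original = json.loads(json.dumps({"m": 1}))
# Payload shape written by the fixed repro script.
fixed = json.loads(json.dumps([{"m": 1}]))

print(find_datapoints(original))  # None
print(find_datapoints(fixed))     # [{'m': 1}]
```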
@alexdmiller would it be possible to post the result of dvc plots show metrics/plots.csv -v here? I am unable to reproduce with the following script:
#!/bin/bash
main=$(pwd)
rm -rf workspace
mkdir workspace
pushd workspace
mkdir repo
pushd repo
git init --quiet
dvc init --quiet
echo "src" >> src
dvc add src
mkdir metrics
dvc run -d src -n write --plots metrics/plots.csv "cp $main/plots.csv metrics/plots.csv"
dvc plots modify metrics/plots.csv -x epoch --x-label Epoch
dvc plots show metrics/plots.csv
NOTE that plots.csv, containing the data mentioned in https://github.com/iterative/dvc/issues/4559#issuecomment-695100928, needs to be alongside the script for this example to work.
So what is happening here is that if we uncomment the last dvc pull, there is no problem.
@pared But without it you just don't have any cache in your copy repo, so dvc has nowhere to get the data from. Or am I missing something? It would be great if you could post tracebacks for the repro scripts you create. Also note that you don't use set -e in your long scripts, which might be error-prone.
But without it you just don't have any cache in your copy repo, so dvc has nowhere to get the data from.
@efiop you are right, but shouldn't we in that case automatically stream the data from the default remote?
@pared Sure, that's what I was asking about a few posts above. Thanks for narrowing it down! :pray:
I don't think we've done streaming in metrics or plots yet, so it is not really a bug but a missing feature.
@efiop Just to clarify, I ran into this issue while following the instructions to use CML with DVC. They seem to imply that when on a branch and comparing metrics to master, running dvc pull on the master branch is not needed, whereas the reproduction from @pared shows it is needed atm.
@steffansluis Could you point to a specific section, please? Sorry, I might be missing something, but I do see dvc pull --run-cache in there.
@efiop That seems to only pull the cache for the current branch. When running this on CI, the example provided in the README failed and I had to modify it to the following to get it to work:
BRANCH=$(git branch --show-current)
git fetch --prune
git checkout master
dvc pull --run-cache
git checkout $BRANCH
dvc pull --run-cache
Note after edit: I did not test the example 1:1. I'm not actually running dvc repro on CI; however, I would expect that this does not affect the situation presented here.
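The workaround above can also be expressed as an ordered list of commands, which makes the sequence easy to log or dry-run before executing it on CI. This is a minimal sketch that uses only the git and dvc invocations quoted in the workaround; the function names and the branch name my-feature are hypothetical.

```python
import subprocess


def baseline_pull_commands(current_branch, baseline="master"):
    """Commands that pull the DVC cache for the baseline and current branch."""
    return [
        ["git", "fetch", "--prune"],
        ["git", "checkout", baseline],
        ["dvc", "pull", "--run-cache"],
        ["git", "checkout", current_branch],
        ["dvc", "pull", "--run-cache"],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure (like set -e)."""
    for cmd in commands:
        runner(cmd, check=True)


# "my-feature" is a made-up branch name used only for this dry run.
commands = baseline_pull_commands("my-feature")
for cmd in commands:
    print(" ".join(cmd))
```

Calling run_all(commands) inside a checked-out repository would execute the sequence; passing a different runner allows a dry run without touching the repository.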
Ok, this is extremely confusing. Looks like a lot of stuff got mixed up here and the repro script doesn't seem to be reproducing the original issue. I also don't see the traceback for the plots error in the discord channel. So let's start fresh, @steffansluis are you able to reproduce an error now with the newest dvc version? If so, could you show the verbose log for it, please?
Sorry for the inconvenience, we've failed to ask for the verbose logs initially :slightly_frowning_face:
Seems like the original issue (unable to produce a plot if the cache is not fetched; see the repro from my comment) is currently solved on master, while it did fail on the OP's version from the time the issue was created (1.6.6). @steffansluis The problem with CI (GitHub Actions) is that the checkout action only checks out the current branch, so git fetch --prune might always be necessary. Would it be possible to check whether your CI works with only git fetch --prune, that is:
...
git fetch --prune
...
instead of:
...
BRANCH=$(git branch --show-current)
git fetch --prune
git checkout master
dvc pull --run-cache
git checkout $BRANCH
dvc pull --run-cache
...
?
@pared sorry for the delay in my response. Here is the verbose output of the command. Note that since I last posted, I changed the name of the plot to classification-plot.csv.
$ dvc plots show metrics/classification-plot.csv -v
2020-10-05 11:12:03,909 DEBUG: Check for update is enabled.
2020-10-05 11:12:04,059 ERROR: unexpected error - 'metrics/classification-plot.csv'
------------------------------------------------------------
Traceback (most recent call last):
File "dvc/main.py", line 76, in main
File "dvc/command/plots.py", line 54, in run
File "dvc/command/plots.py", line 84, in _func
File "dvc/repo/plots/__init__.py", line 88, in show
File "dvc/repo/plots/__init__.py", line 88, in <genexpr>
KeyError: 'metrics/classification-plot.csv'
------------------------------------------------------------
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
DVC version: 1.7.9 (osxpkg)
---------------------------------
Platform: Python 3.7.9 on Darwin-19.5.0-x86_64-i386-64bit
Supports: All remotes
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git
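The traceback explains the terse message: the bare quoted string after "unexpected error -" is how Python renders a KeyError, so a lookup keyed by the target path failed somewhere inside the show call. A minimal sketch of that mechanism follows; format_unexpected and known_plots are hypothetical stand-ins for illustration, not DVC code.

```python
def format_unexpected(exc):
    """Hypothetical formatter that mimics the message shape seen in the logs."""
    return f"unexpected error - {exc}"


# Stand-in for a lookup table keyed by plot path; the key below is made up.
known_plots = {"some/other/plot.csv": {}}

try:
    known_plots["metrics/classification-plot.csv"]
except KeyError as exc:
    # str() of a KeyError is the repr of the missing key, quotes included.
    message = format_unexpected(exc)

print(message)
```

This is why the error text alone gives no hint about the cause, and why the verbose traceback was needed to see that a KeyError was involved.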
And just as a reminder, if I run it without specifying a target, it works:
$ dvc plots show
file:///Users/miller/Projects/dvc-project/plots.html
It seems like I'm getting the target name wrong. I've tried many different target names, but I get the same error. Here is the relevant portion of my dvc.yaml file:
train-classifier:
  cmd: python3 train/train_classifier.py
  deps:
  - ../common
  - train/train_classifier.py
  outs:
  - models/classification/classification-jit.pth
  metrics:
  - metrics/classification-summary.json
  plots:
  - metrics/classification-plot.csv:
      x: epoch
      x_label: Epoch
@alexdmiller thanks for the response. I'll try to reproduce that.
EDIT: I am unable to do that so far.
@alexdmiller can I ask you to clone this repo: https://github.com/pared/dvc_playground and run ./script.sh?
@pared It works with dvc --version: 1.8.1+15b1e7
and
git fetch --prune
dvc pull --run-cache
So, the original problem has been solved on master; there is still the ambiguous error that @alexdmiller encounters.
@pared I spoke too soon :disappointed:
git fetch --prune
dvc pull --run-cache
fixes the error, but it generates a report that doesn't take the master branch into account, so I end up with all my metrics being labeled as new, no diffs, and a confusion matrix that only includes the branch that my PR is about. I have reverted back to my old workaround for now.
@steffansluis --plots or --plots-no-cache?
@pared --plots. dvc metrics diff master --show-md work fine without me having to explicitly git fetch or dvc pull a run-cache.
runs-on: [ubuntu-latest]
container: dvcorg/cml-py3:latest
...
- uses: actions/checkout@v2
  with:
    submodules: true
...
- run: |
    pip3 install 'git+https://github.com/iterative/dvc#egg=master'
    pip3 install azure-storage-blob knack
...
- run: cml-eval.sh
cml-eval.sh
#!/bin/bash
# Stop on first error
set -e -o pipefail
# Fetch things to compare against master branch, and master branch
# git fetch --prune
# dvc pull generate_metrics_json generate_confusion_matrix --run-cache
BRANCH=$(git branch --show-current)
git fetch --prune
git checkout master
dvc pull generate_metrics_json generate_confusion_matrix --run-cache
git checkout $BRANCH
dvc pull generate_metrics_json generate_confusion_matrix --run-cache
# Generate report
echo "# Difference from master branch" >> report.md
echo "## Metrics" >> report.md
dvc metrics diff master --show-md >> report.md
echo "" >> report.md
npm install -g [email protected] [email protected]
echo "## Plots" >> report.md
echo "### Class confusions" >> report.md
dvc plots diff --target my_project/data/metrics/confusion_matrix.csv --template confusion -x actual -y predicted --show-vega master > vega.json
vl2png vega.json -s 1.5 | cml-publish --md >> report.md
echo "" >> report.md
# Print report for debugging purposes
cat report.md
# Post report as comment on PR and as (always successful) Github check
cml-send-comment report.md
cml-send-github-check report.md
@steffansluis could you try replacing
BRANCH=$(git branch --show-current)
git fetch --prune
git checkout master
dvc pull generate_metrics_json generate_confusion_matrix --run-cache
git checkout $BRANCH
with
git fetch --prune
dvc fetch generate_metrics_json generate_confusion_matrix --all-commits
?
@alexdmiller @steffansluis seems like we found the root cause of the unexpected error issue; it should be fixed in #4762. Can I ask you to check whether it still exists after 1.9.1?