Dvc: `dvc plots diff` gives 'ERROR: unexpected error'

Created on 11 Sep 2020 · 24 Comments · Source: iterative/dvc

Bug Report

Please provide information about your setup

I'm getting failed to read 'my_project/data/metrics/metrics.json' on 'master' when I run dvc metrics diff master --show-md -vvv in a clean repo/on Github Actions, and it ends up failing with ERROR: unexpected error - 'data'. However, when I run it locally it works fine. I've tried dvc push -a -d --run-cache, as well as using an older version of DVC to see if that changes anything. I'm not sure what the next step is in debugging this anymore, nor what the problem is :confused:
@steffansluis

https://discordapp.com/channels/485586884165107732/485596304961962003/753655147149787137

@pared#9484 I suddenly don't seem to be able to reproduce the error with dvc metrics either locally or on GH actions, it now breaks both locally and remote on dvc plots diff ... with the same error: ERROR: unexpected error - 'data'
@steffansluis

https://discordapp.com/channels/485586884165107732/485596304961962003/753722495147704370

Output of dvc version:

$ dvc version
DVC version: 1.6.6 (pip)
---------------------------------
Platform: Python 3.8.2 on Linux-5.4.0-7634-generic-x86_64-with-glibc2.29
Supports: azure, http, https, ssh
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/data-root
Workspace directory: ext4 on /dev/mapper/data-root
Repo: dvc, git

Additional Information (if any):

#!/bin/bash

rm -rf repo storage origin copy
main=$(pwd)

mkdir repo remote origin

pushd origin
git init --quiet --bare
popd

pushd repo
git init --quiet
git remote add origin $main/origin

dvc init --quiet
dvc remote add -d str $main/storage
echo -e "import json\nwith open('data', 'r') as fd:\n  val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n  json.dump({'m':val}, fd)" > code.py

git add -A 
git commit -m "initial"

echo 1 >> data

dvc run -d data --plots metric.json -n run python code.py

git add -A
git commit -m "first run"
git push origin master
dvc push
dvc push --run-cache

popd

git clone origin copy

pushd copy

git checkout -b branch

echo -e "import json\nwith open('data', 'r') as fd:\n  val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n  json.dump({'m':val+1}, fd)" > code.py

dvc repro run
dvc plots diff master

See also: https://github.com/iterative/dvc/issues/4446

Labels: bug, research

All 24 comments

I am having the same problem with dvc plots show:

dvc plots show metrics/plot.csv       
ERROR: unexpected error - 'metrics/plot.csv'

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
DVC version: 1.7.2 (osxpkg)
---------------------------------
Platform: Python 3.7.5 on Darwin-19.5.0-x86_64-i386-64bit
Supports: gdrive, gs, hdfs, http, https, s3, oss, webdav, webdavs
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git

The contents of plots.csv are copied straight from the plots show documentation:

epoch,accuracy,loss,val_accuracy,val_loss
0,0.9418667,0.19958884770199656,0.9679,0.10217399864746257
1,0.9763333,0.07896138601688048,0.9768,0.07310650711813942
2,0.98375,0.05241111190887168,0.9788,0.06665669009438716
3,0.98801666,0.03681169906261687,0.9781,0.06697812260198989
4,0.99111664,0.027362171787042946,0.978,0.07385754839298315
5,0.9932333,0.02069501801203781,0.9771,0.08009233058886166
6,0.9945,0.017702101902437668,0.9803,0.07830339228538505
7,0.9954,0.01396906608727198,0.9802,0.07247738889862157

(Edit: updated logs with new filename.)

@alexdmiller could you please clarify: does it happen in an empty repo, or does the repo have some history? Do you run dvc run before that or not?

@shcheklein here are the contents of my dvc.yaml:

stages:
  training:
    cmd: python train/train.py
    deps:
      - train/train.py
    outs:
      - models/classification/classification-jit.pth
    metrics:
      - metrics/summary.json
    plots:
      - metrics/plot.csv:
          x: epoch
          x_label: Epoch

Interestingly, if I run dvc plots show without passing an argument, it does produce output. But I was under the impression I could specify a chart to output.

Verbose log for the repro script:

+ dvc plots diff master -v                                                                                                              
2020-09-21 23:56:31,930 DEBUG: Check for update is enabled.                                                                             
2020-09-21 23:56:31,945 DEBUG: cache '/home/efiop/git/dvc/copy/.dvc/cache/ae/78ff52c6993d7833da173bf6cda983' expected 'HashInfo(name='md
, value='ae78ff52c6993d7833da173bf6cda983', dir_info=None)' actual 'None'                                                               
2020-09-21 23:56:31,946 ERROR: Plot data extraction failed. Please see https://man.dvc.org/plot for supported data formats.             
------------------------------------------------------------                                                                            
Traceback (most recent call last):                                                                                                      
  File "/home/efiop/git/dvc/dvc/command/plots.py", line 54, in run                                                                      
    plots = self._func(targets=self.args.targets, props=self._props())                                                                  
  File "/home/efiop/git/dvc/dvc/command/plots.py", line 89, in _func                                                                    
    return self.repo.plots.diff(                                                                                                        
  File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 100, in diff                                                              
    return diff(self.repo, *args, **kwargs)                                                                                             
  File "/home/efiop/git/dvc/dvc/repo/plots/diff.py", line 21, in diff                                                                   
    return plots_repo.plots.show(                                                                                                       
  File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 95, in show                                                               
    return self.render(data, revs, props, templates)                                                                                    
  File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 73, in render                                                             
    return {                                                                                                                            
  File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 74, in <dictcomp>                                                         
    datafile: _render(datafile, desc["data"], desc["props"], templates)                                                                 
  File "/home/efiop/git/dvc/dvc/repo/plots/__init__.py", line 240, in _render                                                           
    rev_data = plot_data(datafile, rev, datablob).to_datapoints(                                                                        
  File "/home/efiop/git/dvc/dvc/repo/plots/data.py", line 165, in to_datapoints                                                         
    data = data_proc(                                                                                                                   
  File "/home/efiop/git/dvc/dvc/repo/plots/data.py", line 122, in _find_data                                                            
    raise PlotDataStructureError()                                                                                                      
dvc.repo.plots.data.PlotDataStructureError: Plot data extraction failed. Please see https://man.dvc.org/plot for supported data formats.
------------------------------------------------------------                                                                            
2020-09-21 23:56:31,948 DEBUG: Analytics is enabled.                                                                                    
2020-09-21 23:56:31,986 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpxlbhsaby']'                                      
2020-09-21 23:56:31,987 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpxlbhsaby']'                                              

@pared Could you elaborate on what is going on and how to solve it?

It seems the repro script I provided no longer works; here is a fixed version:

#!/bin/bash

rm -rf repo storage origin copy
main=$(pwd)

mkdir repo remote origin

pushd origin
git init --quiet --bare
popd

pushd repo
git init --quiet
git remote add origin $main/origin

dvc init --quiet
dvc remote add -d str $main/storage
echo -e "import json\nwith open('data', 'r') as fd:\n  val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n  json.dump([{'m':val}], fd)" > code.py

git add -A 
git commit -m "initial"

echo 1 >> data

dvc run -d data --plots metric.json -n run python code.py

git add -A
git commit -m "first run"
git push origin master
dvc push
dvc push --run-cache

popd

git clone origin copy

pushd copy

#dvc pull
git checkout -b branch

echo -e "import json\nwith open('data', 'r') as fd:\n  val=int(fd.read())\nwith open('metric.json', 'w') as fd:\n  json.dump([{'m':val+1}], fd)" > code.py

dvc repro run
dvc plots diff master

So what is happening here is that if we uncomment the last dvc pull, there is no problem. So I think we have a problem opening a DVC-controlled plots file that is not in the cache. I hope to handle it in #4446.

As for @alexdmiller's issue, I will investigate.
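For readability, the code.py that the fixed script writes with echo -e expands to roughly the Python below. Compared with the first script, the only change in code.py is that the value is dumped as a list with one record ([{'m': val}]) instead of a bare dict ({'m': val}). That is presumably what the PlotDataStructureError in the verbose log was about. The write_metric wrapper, its workdir and offset parameters, and the temporary directory are additions for illustration and are not part of the original script.

```python
import json
import os
import tempfile


def write_metric(workdir, offset=0):
    """Mirror of code.py: read an int from 'data', dump it to 'metric.json'.

    `workdir` and `offset` are illustrative parameters; the original script
    works in the current directory and hard-codes val or val+1.
    """
    with open(os.path.join(workdir, "data"), "r") as fd:
        val = int(fd.read())
    with open(os.path.join(workdir, "metric.json"), "w") as fd:
        # A list of records, not a bare dict as in the first repro script.
        json.dump([{"m": val + offset}], fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "data"), "w") as fd:
            fd.write("1\n")  # same content as `echo 1 >> data`
        write_metric(tmp)  # version committed on master
        with open(os.path.join(tmp, "metric.json")) as fd:
            print(fd.read())
```

Calling write_metric(tmp, offset=1) corresponds to the modified code.py on the branch.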

@alexdmiller would it be possible to post the result of dvc plots show metrics/plots.csv -v here? I am unable to reproduce with the following script:

#!/bin/bash

main=$(pwd)
rm -rf workspace
mkdir workspace
pushd workspace

mkdir repo
pushd repo

git init --quiet
dvc init --quiet

echo "src" >> src
dvc add src

mkdir metrics
dvc run -d src -n write --plots metrics/plots.csv "cp $main/plots.csv metrics/plots.csv"
dvc plots modify metrics/plots.csv -x epoch --x-label Epoch
dvc plots show metrics/plots.csv

NOTE that plots.csv containing the data mentioned in https://github.com/iterative/dvc/issues/4559#issuecomment-695100928 needs to be alongside the script for this example to work.

> So what is happening here is that if we uncomment the last dvc pull, there is no problem.

@pared But without it you just don't have any cache in your copy repo, so dvc has nowhere to get the data from. Or am I missing something? It would be great if you could post tracebacks for the repro scripts you create. Also note that you don't use set -e in your long scripts, which might be error-prone.

> But without it you just don't have any cache in your copy repo, so dvc has nowhere to get the data from.

@efiop you are right, but shouldn't we in that case automatically stream the data from the default remote?

@pared Sure, that's what I was asking about a few posts above. Thanks for narrowing it down! :pray:

I don't think we've done streaming in metrics or plots yet, so it is not really a bug but a missing feature.

@efiop Just to clarify, I ran into this issue while following the instructions to use CML with DVC. They seem to imply that when on a branch and comparing metrics to master, running dvc pull on the master branch is not needed, whereas the reproduction from @pared shows it is needed atm.

@steffansluis Could you point to a specific section, please? Sorry, might be missing something, but I do see dvc pull --run-cache in there.

@efiop That seems to only pull the cache for the current branch. When running this on CI, the example provided in the README failed and I had to modify it to the following to get it to work:

BRANCH=$(git branch --show-current)

git fetch --prune
git checkout master
dvc pull --run-cache

git checkout $BRANCH
dvc pull --run-cache

Note after edit: I did not test the example 1:1, I'm not actually running dvc repro on CI, however I would expect that does not affect the situation presented here.

Ok, this is extremely confusing. Looks like a lot of stuff got mixed up here and the repro script doesn't seem to be reproducing the original issue. I also don't see the traceback for the plots error in the discord channel. So let's start fresh, @steffansluis are you able to reproduce an error now with the newest dvc version? If so, could you show the verbose log for it, please?

Sorry for the inconvenience, we've failed to ask for the verbose logs initially :slightly_frowning_face:

It seems the original issue (unable to produce a plot if the cache is not fetched; see the repro in my comment) is currently solved on master, while it did fail on the OP's version at the time the issue was created (1.6.6). @steffansluis The problem with CI (GitHub Actions) is that the checkout action only checks out the current branch, so git fetch --prune might always be necessary. Would it be possible to check whether your CI works with only git fetch --prune, that is:

...
git fetch --prune
...

instead of:

...
BRANCH=$(git branch --show-current)

git fetch --prune
git checkout master
dvc pull --run-cache

git checkout $BRANCH
dvc pull --run-cache
...

?

@pared sorry for the delay in my response. Here is the verbose output of the command. Note that since I last posted, I changed the name of the plot to classification-plot.csv.

$ dvc plots show metrics/classification-plot.csv -v
2020-10-05 11:12:03,909 DEBUG: Check for update is enabled.
2020-10-05 11:12:04,059 ERROR: unexpected error - 'metrics/classification-plot.csv'
------------------------------------------------------------
Traceback (most recent call last):
  File "dvc/main.py", line 76, in main
  File "dvc/command/plots.py", line 54, in run
  File "dvc/command/plots.py", line 84, in _func
  File "dvc/repo/plots/__init__.py", line 88, in show
  File "dvc/repo/plots/__init__.py", line 88, in <genexpr>
KeyError: 'metrics/classification-plot.csv'
------------------------------------------------------------

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
DVC version: 1.7.9 (osxpkg)
---------------------------------
Platform: Python 3.7.9 on Darwin-19.5.0-x86_64-i386-64bit
Supports: All remotes
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git

And just as a reminder, if I run it without specifying a target, it works:

$ dvc plots show
file:///Users/miller/Projects/dvc-project/plots.html

It seems like I'm getting the target name wrong. I've tried many different target names, but I get the same error. Here is the relevant portion of my dvc.yaml file:

  train-classifier:
    cmd: python3 train/train_classifier.py
    deps:
    - ../common
    - train/train_classifier.py
    outs:
    - models/classification/classification-jit.pth
    metrics:
    - metrics/classification-summary.json
    plots:
    - metrics/classification-plot.csv:
        x: epoch
        x_label: Epoch
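The shape of the message "unexpected error - 'metrics/classification-plot.csv'" is consistent with the KeyError at the bottom of the traceback above: in Python, str() of a KeyError is the repr of the missing key, quotes included. The sketch below only illustrates that formatting. find_plot, describe_failure, and the plots dict are hypothetical stand-ins and not DVC's actual code.

```python
def find_plot(plots, target):
    """Hypothetical lookup: return the entry for `target` from a dict of plots."""
    return plots[target]  # raises KeyError if `target` is not a key


def describe_failure(plots, target):
    """Format a failed lookup the way the CLI message in the log appears."""
    try:
        find_plot(plots, target)
    except KeyError as exc:
        # str(KeyError("x")) is "'x'", so the missing key shows up quoted.
        return f"unexpected error - {exc}"
    return "ok"


if __name__ == "__main__":
    plots = {"some/other/path.csv": {}}  # illustrative contents only
    print(describe_failure(plots, "metrics/classification-plot.csv"))
```

Read this way, the original report's "unexpected error - 'data'" would likewise point at a missing key named data rather than at a file called data.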

@alexdmiller thanks for the response. I'll try to reproduce that.
EDIT: I am unable to do that so far.
@alexdmiller can I ask you to clone this repo: https://github.com/pared/dvc_playground and run ./script.sh?

@pared It works with dvc --version: 1.8.1+15b1e7 and

git fetch --prune
dvc pull --run-cache

So the original problem has been solved on master; there is still the ambiguous error that @alexdmiller encounters.

@pared I spoke too soon :disappointed:

git fetch --prune
dvc pull --run-cache

fixes the error, but it generates a report that doesn't take the master branch into account, so I end up with all my metrics being labeled as new, no diffs, and a confusion matrix that only includes the branch that my PR is about. I have reverted back to my old workaround for now.

@steffansluis

  1. How do you mark plots? --plots or --plots-no-cache?
  2. The problem exists on github runner, right? You are fine locally?
  3. What system is the runner? Ubuntu?

@pared

  1. --plots.
  2. Yes, locally dvc metrics diff master --show-md works fine without me having to explicitly git fetch or dvc pull a run-cache.
  3.
    runs-on: [ubuntu-latest]
    container: dvcorg/cml-py3:latest
...

      - uses: actions/checkout@v2
        with:
          submodules: true
...

      - run: |
          pip3 install 'git+https://github.com/iterative/dvc#egg=master'
          pip3 install azure-storage-blob knack

...
      - run: cml-eval.sh

cml-eval.sh

#!/bin/bash

# Stop on first error
set -e -o pipefail

# Fetch things to compare against master branch, and master branch
# git fetch --prune
# dvc pull generate_metrics_json generate_confusion_matrix --run-cache

BRANCH=$(git branch --show-current)

git fetch --prune
git checkout master
dvc pull generate_metrics_json generate_confusion_matrix --run-cache

git checkout $BRANCH
dvc pull generate_metrics_json generate_confusion_matrix --run-cache

# Generate report
echo "# Difference from master branch" >> report.md
echo "## Metrics" >> report.md
dvc metrics diff master --show-md >> report.md
echo "" >> report.md

npm install -g vega-cli vega-lite

echo "## Plots" >> report.md
echo "### Class confusions" >> report.md
dvc plots diff --target my_project/data/metrics/confusion_matrix.csv --template confusion -x actual -y predicted --show-vega master > vega.json
vl2png vega.json -s 1.5 | cml-publish --md >> report.md
echo "" >> report.md

# Print report for debugging purposes
cat report.md

# Post report as comment on PR and as (always successful) Github check
cml-send-comment report.md
cml-send-github-check report.md

@steffansluis could you try replacing

BRANCH=$(git branch --show-current)

git fetch --prune
git checkout master
dvc pull generate_metrics_json generate_confusion_matrix --run-cache

git checkout $BRANCH

with

git fetch --prune
dvc fetch generate_metrics_json generate_confusion_matrix --all-commits

?

@alexdmiller @steffansluis it seems we found the root cause of the unexpected error issue; it should be fixed in #4762. Could you check whether it still occurs after 1.9.1?
