Dvc: plots diff: WARN about bad templates sent to -t [qa]

Created on 8 Jun 2020 · 13 Comments · Source: iterative/dvc

Bug Report

UPDATE: Jump to https://github.com/iterative/dvc/issues/3969#issuecomment-640879725


OLD:

Using logs.csv from the example in https://dvc-landing-2020-05-31-evbiu1f.herokuapp.com/doc/command-reference/plots/modify#examples (see iterative/dvc.org/pull/1382):

dvc plots diff -t train.json
file://C:\Users\poj12\DVC-repos\tests\plots.html

Please provide information about your setup

Output of dvc version:

λ dvc version
DVC version: 1.0.0a9+3683c2
Python version: 3.8.2
Platform: Windows-10-10.0.18362-SP0
Binary: False
Package: None
Supported remotes: azure, gdrive, gs, hdfs, http, https, s3, ssh, oss
Cache: reflink - not supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('NTFS', 'C:\\')
Repo: dvc, git
Filesystem type (workspace): ('NTFS', 'C:\\')

enhancement ui

All 13 comments

Page source:

<!DOCTYPE html>
<html>
<head>
    <title>DVC Plot</title>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
    <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
</head>
<body>
    <div id = "plot0"></div>
<script type = "text/javascript">
    var spec = epoch,accuracy,loss
0,0.9403833150863647,0.2019129991531372
1,0.9733833074569702,0.08973673731088638
2,0.9815833568572998,0.06529958546161652
3,0.9861999750137329,0.04984375461935997
4,0.9882333278656006,0.041892342269420624
;
    vegaEmbed('#plot0', spec);
</script>
</body>
</html>

That's pretty strange, works for me just fine:

<!DOCTYPE html>
--
  | <html>
  | <head>
  | <title>DVC Plot</title>
  | <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
  | <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
  | <script src="https://cdn.jsdelivr.net/npm/[email protected]"></script>
  | </head>
  | <body>
  | <div id = "plot0"></div>
  | <script type = "text/javascript">
  | var spec = {
  | "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  | "data": {
  | "values": [
  | {
  | "accuracy": "0.9403833150863647",
  | "epoch": "0",
  | "index": 0,
  | "loss": "0.2019129991531372",
  | "rev": "workspace"
  | },
  | {
  | "accuracy": "0.9733833074569702",
  | "epoch": "1",
  | "index": 1,
  | "loss": "0.08973673731088638",
  | "rev": "workspace"
  | },
  | {
  | "accuracy": "0.9815833568572998",
  | "epoch": "2",
  | "index": 2,
  | "loss": "0.06529958546161652",
  | "rev": "workspace"
  | },
  | {
  | "accuracy": "0.9861999750137329",
  | "epoch": "3",
  | "index": 3,
  | "loss": "0.04984375461935997",
  | "rev": "workspace"
  | },
  | {
  | "accuracy": "0.9882333278656006",
  | "epoch": "4",
  | "index": 4,
  | "loss": "0.041892342269420624",
  | "rev": "workspace"
  | }
  | ]
  | },
  | "title": "",
  | "mark": {
  | "type": "line"
  | },
  | "encoding": {
  | "x": {
  | "field": "index",
  | "type": "quantitative",
  | "title": "index"
  | },
  | "y": {
  | "field": "loss",
  | "type": "quantitative",
  | "title": "loss",
  | "scale": {
  | "zero": false
  | }
  | },
  | "color": {
  | "field": "rev",
  | "type": "nominal"
  | }
  | }
  | }
  | ;
  | vegaEmbed('#plot0', spec);
  | </script>
  | </body>
  | </html>

@jorgeorpinel did you use standard templates?

<!DOCTYPE html>
--
 | <html>
 | <head>
...

Tehehe copy paste from Chrome "view source" introduces a bunch of characters (works properly with ctrl+shift+v for me).

did you use standard templates?

🤦! I mistakenly used the -t train.json flag thinking it meant --targets, but it means --template! dvc plots diff --targets train.json works.

Maybe we should reconsider the option letters? Or at least print a warning when a provided template file does not exist or is invalid! Cc @pared Thanks!

@jorgeorpinel It already throws if the template is not found:

(dvc-3.8.3) ➜  dvc git:(fix-3897) ✗ dvc plots diff -t sdf
ERROR: Template 'sdf' not found.

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

It is not able to detect a bad template, that is true, but I'm not completely sure if we can do that in a general case. Seems like the best we could do is to check that it is indeed a JSON/YAML file (once we get rid of HTML support), but I'm not sure we could require it to hit any quotas on the anchors that it uses. CC @pared
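A minimal sketch of the "is it indeed JSON" check mentioned above, using only the Python standard library. The function name `is_json_document` is hypothetical and not part of DVC; the CSV sample is the first rows of the metrics file from the page source in this report.

```python
import json


def is_json_document(text):
    """Return True if `text` parses as a JSON object, False otherwise."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    # A plot specification is a JSON object, not a bare number or string.
    return isinstance(parsed, dict)


# The CSV metrics file that was passed to -t by mistake is rejected ...
csv_text = "epoch,accuracy,loss\n0,0.9403833150863647,0.2019129991531372\n"
print(is_json_document(csv_text))  # False

# ... while a small JSON specification is accepted.
print(is_json_document('{"mark": {"type": "line"}}'))  # True
```

This only catches files that are not JSON at all; a JSON file that is not a plot template would still pass.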

not able to detect a bad template, that is true, but I'm not completely sure if we can do that in a general case

Maybe let's just keep an eye on this to see if users make the same error I did and at least detect when a given template is actually a metrics file in that case (and WARN/ERROR out).

We can check that the template is valid by checking for the DVC_METRIC_DATA anchor. I don't see a use case where one would like to use dvc plots and not provide it.
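The anchor check could be sketched roughly as below. This is an illustration, not DVC's implementation: `check_template` and the warning text are hypothetical, and the way the anchor is written inside the sample template (angle brackets in a string value) is an assumption. The check itself only looks for the anchor name as a substring.

```python
# Anchor name as given in the comment above; its surrounding syntax in real
# templates is an assumption here, so only the bare name is searched for.
DATA_ANCHOR = "DVC_METRIC_DATA"


def check_template(text, name="template"):
    """Return a warning message if the data anchor is missing, else None."""
    if DATA_ANCHOR not in text:
        return (
            f"WARNING: '{name}' does not contain the {DATA_ANCHOR} anchor; "
            "is it really a plot template?"
        )
    return None


# Hypothetical template that uses the anchor.
good = '{"data": {"values": "<DVC_METRIC_DATA>"}, "mark": {"type": "line"}}'
# The metrics file from the report, passed to -t by mistake.
bad = "epoch,accuracy,loss\n0,0.9403833150863647,0.2019129991531372\n"

print(check_template(good, "default.json"))  # None
print(check_template(bad, "train.json"))
```

Because it is a plain substring test, it works for JSON, YAML, or HTML templates alike, at the cost of not verifying where the anchor appears.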

Also, if we are talking about JSON templates, a common property of a JSON specification is the $schema field, e.g.:

  • for vega:
    "$schema": "https://vega.github.io/schema/vega/v5.json"
  • vega lite:
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json"

It can even be done by JSON schema validation, but that seems overkill (unless there's a quick Python lib that does it easily).
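Short of full JSON schema validation, looking at the $schema field alone needs only the standard library. The sketch below is a hypothetical helper (`detect_spec_kind` is not a DVC function) that classifies a template by the two URL prefixes quoted above.

```python
import json

# Prefixes taken from the two $schema URLs quoted in this thread.
KNOWN_SCHEMA_PREFIXES = {
    "https://vega.github.io/schema/vega/": "vega",
    "https://vega.github.io/schema/vega-lite/": "vega-lite",
}


def detect_spec_kind(text):
    """Return 'vega', 'vega-lite', or None if no known $schema is found."""
    try:
        spec = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(spec, dict):
        return None
    schema = spec.get("$schema")
    if not isinstance(schema, str):
        return None
    for prefix, kind in KNOWN_SCHEMA_PREFIXES.items():
        if schema.startswith(prefix):
            return kind
    return None


lite = '{"$schema": "https://vega.github.io/schema/vega-lite/v4.json"}'
print(detect_spec_kind(lite))  # vega-lite
print(detect_spec_kind("epoch,accuracy,loss\n"))  # None
```

This confirms only that the file declares a Vega or Vega-Lite schema; it does not validate the specification against that schema.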

Taking a quick peek at available resources, https://github.com/Julian/jsonschema seems like the most popular maintained Python package.

The question is whether we want to include a new package just to validate one piece of functionality.

If we do, it might help us solve plot update (if we decide to do it) from https://github.com/iterative/dvc/issues/3906

I think it would make sense.

@pared So looks like our templates have the schema already, which validates the generated template, right? So we are covered there. The only thing that we need to validate is that the template is indeed using our anchors.

@efiop yep, I think this would solve this issue.
