dvc plots diff: use a descriptive filename instead of plots.html by default

Created on 3 Jul 2020 · 11 Comments · Source: iterative/dvc

Current implementation

By default, dvc plots diff wraps the plot in boilerplate HTML and writes it to $(pwd)/plots.html. If I then run dvc plots diff -x another-axis -y another-metric, it overwrites plots.html.

You can pass dvc plots diff ... --out path/to/file.html to write to the provided path instead; however, this requires the user to think about which path to use if they want to keep the plots.

_Note: it's fair to say that I myself probably don't want to keep those plots files, I keep deleting those, so this feature would be a very theoretical improvement for an abstract user_.

Proposal

Provide a default output file name that is unique to the commits that are being compared, and the metrics that are being plotted, like this:

`dvc-plots-{x}-{y}-{sha1}-{sha2}.html`

The command should still accept the --out option, which overrides that default with whatever the user provided.
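A minimal sketch of how such a default name could be built. `default_plots_filename` and its parameters are hypothetical and not part of DVC; the sketch also assumes the two revisions are already available as short hashes.

```python
import re
from typing import Optional


def default_plots_filename(x: str, y: str, rev_a: str, rev_b: str,
                           out: Optional[str] = None) -> str:
    """Hypothetical helper (not part of DVC): choose the output path for a plots diff."""
    if out is not None:
        # An explicit --out value always wins over the generated default.
        return out

    def slug(value: str) -> str:
        # Replace characters that are awkward in file names (slashes, spaces,
        # hyphens) so that "-" stays reserved as the field separator.
        return re.sub(r"[^A-Za-z0-9._]+", "_", value).strip("_")

    parts = [slug(x), slug(y), slug(rev_a), slug(rev_b)]
    return "dvc-plots-" + "-".join(parts) + ".html"
```

With this sketch, plotting `epoch` against `val/loss` for two revisions yields a name that differs from any other x/y/revision combination, so repeated runs with different arguments would not overwrite each other.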

If you think this is an okay idea, I'd be willing to implement and test this and do a pull request ;-) WDYT?

All 11 comments

> dvc-plots-{x}-{y}-{sha1}-{sha2}.html

This doesn't really work when you have multiple plots in the project with different templates. plots.html contains one or more plots. I'm also not sure {x} and {y} are always available in all templates. Plus, even if we introduce this change, it will break backward compatibility for everyone already using dvc plots show/diff. So I don't think this is worth it, especially since we have --out already. Unless I'm missing some good scenario here, of course :slightly_smiling_face:

Thank you for explaining the issues with the templates — indeed, I never _really_ used them, and before making any arguments, I feel like I need to get a better understanding of the flexibility that's already there, and how people actually use plots today.

The issue is safe to close. I'll reopen it if I get up to speed and have any arguments to improve how plots are generated, or create an issue in the dvc.org repo if I think better documentation on templates is needed.

I think it'd make sense to generate plots.html file based on the timestamp.

So, something like: plots.1594187557.html.
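A rough sketch of that naming, assuming a hypothetical helper `timestamped_plots_name` (not part of DVC). The `readable` switch is an extra assumption that formats the same instant as a UTC date and time instead of raw epoch seconds.

```python
from datetime import datetime, timezone


def timestamped_plots_name(epoch_seconds: int, readable: bool = False) -> str:
    """Hypothetical helper (not part of DVC): build a timestamp-based plots file name."""
    if readable:
        # Format as UTC; colons are avoided because they are not valid
        # in file names on every platform.
        moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
        stamp = moment.strftime("%Y-%m-%dT%H-%M-%S")
    else:
        stamp = str(epoch_seconds)
    return f"plots.{stamp}.html"
```

In real use the argument would come from something like `int(time.time())` at the moment the plot is written.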

Also, it makes me wonder if it'd be good to have some functionality to directly open the plots.html in the browser.

> I think it'd make sense to generate plots.html file based on the timestamp.

@skshetry Not sure I understand why it is useful. Visually it doesn't help to differentiate those files.

> Also, it makes me wonder if it'd be good to have some functionality to directly open the plots.html in the browser.

This one is a good one. Indeed, we might consider launching something like xdg-open plots.html to open the browser with it. But at the same time you really need to click the link and after that you could just refresh the page (at least that's how I do it). Please feel free to create an issue for it.
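For illustration, a small sketch of the auto-open idea using Python's standard `webbrowser` module rather than shelling out to xdg-open, which is Linux-specific. `open_plots` and its `opener` parameter are hypothetical and not part of DVC; the injectable opener is only there so the function can be exercised without launching a browser.

```python
import webbrowser
from pathlib import Path
from typing import Callable


def open_plots(path: str, opener: Callable[[str], bool] = webbrowser.open) -> str:
    """Hypothetical helper (not part of DVC): open a generated plots file in the browser."""
    # Browsers expect a URL, so turn the path into an absolute file:// URI.
    uri = Path(path).resolve().as_uri()
    opener(uri)
    return uri
```

Calling `open_plots("plots.html")` right after the file is written would hand the file to the system's default browser.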

I don't think that creating new HTML files by default is a good idea. There is the -o option. Generating many files with unclear names will pollute the repository with unused plots that we forgot to remove.

@pared, I think it depends on what the user's intention is. For creating one-off plots to view and compare, I think users don't care much about names; they are fine with some plots from a temp dir (or from the workspace) and don't want them to get overwritten (I feel -o is quite an effort).

If the user wants something meaningful, there's --out. Again, I have limited experience with plots, and can speak only for myself. :smile:

Pawel, you assumed that file names won’t be “clear”. I agree that generating and adding a bunch of files to the repo, if it’s unclear what their names or content are about, is a bad practice.

However, no one is suggesting that. On that matter, a raw timestamp won't work since it's not readable; the date and time would be in a different, human-readable format.

The names should be unique: if we adopt this, no plots file should be overwritten.

As to whether to add those HTML files to the repo or ignore them, or which dir to put them in, that's really up to the user. It's probably a good practice to put them in a special dir by default, though, so they are not in the repo root.

+1 for auto opens.

@xnutsive I think a readable date makes the user's life easier only while they have just a few plots.
After ls-ing plot-10.04.2019 11:30, plot-10.04.2019 11:33, plot-23.04.2019 08:30 and a few more, it will be easier to just remember which revision we wanted to check and rerun dvc plots diff than to look for the plot we wanted to see.

The date is detached from the content, and that is why such an approach does not seem intuitive (to me) in the long term. If we are to do it at all, I would rather focus on the HTML's content, e.g.
plot_diff_roc_HEAD_v1_v3.html, which gives us an idea of the content just by looking at the name.

Makes sense. Also, we can't really use HEAD in names or content since it's dynamic?

Yup, probably short sha would be a better idea.
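
A sketch of what that could look like: resolve every revision to an abbreviated commit hash before building the name, so that a moving reference such as HEAD does not end up in it. It relies on `git rev-parse --short`; the helper names `git_short_sha` and `plot_diff_filename`, and the injectable `resolve` parameter, are hypothetical and not part of DVC.

```python
import subprocess
from typing import Callable, Iterable


def git_short_sha(rev: str) -> str:
    """Resolve a revision (HEAD, a tag, a branch) to its abbreviated commit hash."""
    result = subprocess.run(
        # "^{commit}" peels annotated tags down to the commit they point at.
        ["git", "rev-parse", "--short", rev + "^{commit}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def plot_diff_filename(plot: str, revs: Iterable[str],
                       resolve: Callable[[str], str] = git_short_sha) -> str:
    """Hypothetical helper (not part of DVC): name a diff plot after fixed commit hashes."""
    return "_".join(["plot_diff", plot, *map(resolve, revs)]) + ".html"
```

The name then stays valid after HEAD moves, at the cost of being less readable than one built from tag names.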
