Dvc: Use progress bar for dvc pull/checkout

Created on 20 Feb 2019  ·  8 Comments  ·  Source: iterative/dvc

As per discussion here:

https://discordapp.com/channels/485586884165107732/485586884165107734/547833091264086067

One of the suggestions is to use npm-style progress bar instead of printing the files one by one. Something similar to this:

https://terminalizer.com/view/1f4a8cae675

enhancement

All 8 comments

@efiop, @shcheklein Would we rather have our own progress bar, or can we consider a third-party one, like https://github.com/tqdm/tqdm ? It has an MIT license, so I guess it would be OK to use it.

I don't see any reason to write our own library. Looks like a great library, @pared!

@pared, I like tqdm too; the only concern I see is that every library we add to dvc would need to be included in the packaged version.

By the way, python-tqdm is already in the community package repository of Arch Linux, which makes me think it is frequently used as a dependency.

We did try tqdm in the early stages of dvc, but couldn't make it work the way we wanted with multithreading and stuff. But maybe it will fit into the current dvc, there have been so many changes since then after all 🙂

I forgot that we already have a progress bar, I assume for the reasons mentioned by @efiop. In that case I am not sure that introducing a new dependency is a good idea. We should probably either stay with our current progress bar or try to apply tqdm everywhere. In the second case I think we should treat it as a separate task. What do you guys think?

Also there is one more thing to discuss before I do something:
I do believe that we could implement this feature by using dvc.progress.progress in dvc.repo.checkout. However, there is the question of what to do with information that is logged "below" it. The cases I have found so far:

  • logger.info("Linking directory '{}'.".format(dir_relpath)) in remote.local.do_checkout
  • "Stage '{}' didn't change.".format(self.relpath) in dvc.stage

Those info messages will break the display of the progress bar and result in partial progress bar updates mixed with logged information. How would we like to proceed from here?
I see two options:

  1. Change info to debug, and accept the ugly output in debug mode.
  2. Make the logger write to a file somewhere under the .dvc dir and keep a minimalistic progress display in stdout.

What do you guys think?
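
To make option 2 concrete, here is a minimal sketch using only the standard library. It is not dvc code: `make_file_logger`, `checkout`, the logger name and the log file location are all hypothetical names chosen for illustration, and the "checkout" only pretends to do work. The point is just that info messages end up in a file while stdout carries a single, self-overwriting progress line.

```python
import logging
import sys


def make_file_logger(log_path, name="checkout-sketch"):
    """Return a logger whose INFO messages go to log_path only (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Do not pass records on to the root logger, so nothing reaches the terminal.
    logger.propagate = False
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def checkout(paths, logger, out=sys.stdout):
    """Pretend to check out each path, showing only a one-line counter on `out`."""
    total = len(paths)
    for done, path in enumerate(paths, start=1):
        # This message is written to the log file, not to `out`.
        logger.info("Linking directory '%s'.", path)
        # "\r" returns to the start of the line so the counter overwrites itself.
        out.write("\rCheckout: {}/{}".format(done, total))
        out.flush()
    out.write("\n")
```

Calling `checkout(["data/raw", "data/processed"], make_file_logger("checkout.log"))` would leave two "Linking directory" lines in the log file and only the counter on the terminal. Option 1 needs no extra machinery: it amounts to replacing `logger.info` with `logger.debug` at the call sites listed above.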

@pared I agree, migrating to tqdm is a separate task, unless our current progress bar is not suitable for the current task and tqdm is.

Have you tested that though? If the recent progress bar changes didn't break anything, it should print info() cleanly while a progress bar is being displayed and then start the progress bar over. It is hard to explain in words; you need to try it and see :) Basically, the progress bar behaves like the last dynamic line in your terminal, and all info()s, debug()s, etc. are printed above it.
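
The "last dynamic line" behaviour described above can be illustrated with a small standalone sketch. This is not the implementation in dvc.progress; `LastLineProgress` and its methods are hypothetical names, and the sketch assumes a terminal that honours `\r` and a `total` greater than zero. Before a message is printed, the bar is erased; after the message, the bar is redrawn on the new last line.

```python
import sys


class LastLineProgress:
    """Keep a progress bar on the last terminal line; print messages above it."""

    def __init__(self, total, width=20, out=sys.stdout):
        self.total = total  # assumed to be > 0
        self.width = width
        self.out = out
        self.current = 0
        self._line = ""  # text of the bar as last drawn

    def _render(self):
        filled = self.width * self.current // self.total
        return "[{}{}] {}/{}".format(
            "#" * filled, "." * (self.width - filled), self.current, self.total
        )

    def _clear(self):
        # Overwrite the old bar with spaces, then return to the line start.
        self.out.write("\r" + " " * len(self._line) + "\r")

    def _draw(self):
        self._line = self._render()
        self.out.write(self._line)
        self.out.flush()

    def update(self, current):
        """Redraw the bar in place with a new progress value."""
        self._clear()
        self.current = current
        self._draw()

    def message(self, text):
        """Erase the bar, print text on its own line, redraw the bar below it."""
        self._clear()
        self.out.write(text + "\n")
        self._draw()
```

A logging handler could call `message()` instead of writing to the stream directly, which is one way info() and debug() output could end up above the bar rather than in the middle of it.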
