Pip: Parallel download: How should we display the progress?

Created on 4 Aug 2020 · 12 Comments · Source: pypa/pip

This issue is opened to discuss the UI to be implemented for parallel download, with accessibility as well as platform compatibility in mind (relevant ticket: GH-8518). As for cursor-movement support, the safest approach is to have a meta progress bar at the bottom of the output, i.e. something like apt's but not stuck to the bottom of the viewport (since that would require curses-or-similar support and defeat the purpose):

[Screenshot: apt's progress bar]
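
A bar that sits on the last line of output without any cursor addressing can be approximated by redrawing a single line with a carriage return. The following is only a sketch of that idea; `render_bar` and `draw` are hypothetical names and not part of pip:

```python
import sys


def render_bar(done, total, width=30):
    """Return a one-line text bar such as '[#####-----]  50%'."""
    # Treat an empty job as complete and clamp the fraction to [0, 1].
    fraction = float(done) / total if total else 1.0
    fraction = min(max(fraction, 0.0), 1.0)
    filled = int(width * fraction)
    return "[{}{}] {:3.0f}%".format(
        "#" * filled, "-" * (width - filled), fraction * 100
    )


def draw(done, total, stream=sys.stderr):
    # '\r' moves back to the start of the current line, so the bar is
    # redrawn in place; no curses-like support is assumed.
    stream.write("\r" + render_bar(done, total))
    stream.flush()
```

Calling `draw(done, total)` after each chunk keeps the bar on the current last line; anything logged afterwards simply starts on a new line.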

@pradyunsg suggested having multiple progress bars, similar to what is shown in python-poetry/poetry#2595 and what rich provides (it unfortunately requires Python >=3.6, but a similar TUI can be implemented or found elsewhere):

[Screenshot: rich's multiple progress bars]

I believe that there are other designs as well and would love to hear more suggestions and opinions!

Labels: cli, download, UX, needs discussion, question

McSinyx

All 12 comments

Hello. Author of Rich here. Text-based progress bars are one of my favourite subjects. Mention me if I can contribute in any way.

willmcgugan on 4 Aug 2020

Thanks for your attention @willmcgugan! I have yet to look into rich's codebase: is there any barrier to implementing cross-platform multiple progress bars on Python 2 (since pip is going to support it at least until the end of the year)? In addition, is there any magic involved in making it work in Jupyter Notebook? I notice that ncurses apps don't seem to work well there (e.g. top) but I don't know any more than that :smile:

McSinyx on 5 Aug 2020

It wouldn't be much of a problem to get progress bars running on both Python 2 and 3. The principle is the same.

There is a bit of magic for Jupyter support. I don't think anything that moves the cursor around would work in Jupyter.

Jupyter supports HTML widgets, so you could either implement it in HTML, which gives you a lot of freedom, or convert your text-based bars into HTML (which is what Rich does).

I'm surprised to hear people run pip in Jupyter tbh. But I'm not a big Jupyter user.

willmcgugan on 6 Aug 2020

I'm surprised to hear people run pip in Jupyter tbh. But I'm not a big Jupyter user.

I'm pretty sure it's a bad idea. I dabble in Jupyter, but wouldn't say I'm a heavy user. But I'd have thought that you'd need to restart the kernel after installing to pick up the changes correctly. I'm -1 on having any special handling in pip for it being run in Jupyter, it's not a scenario I think we should encourage.

pfmoore on 6 Aug 2020

Every data science workshop/talk I’ve ever attended (granted I seldom do) that involves live coding uses %pip install in a Jupyter Notebook. It sounds like a bad idea, but is extremely prevalent. But I agree pip doesn’t need to do too many special things for it.

uranusjr on 6 Aug 2020

It might be worth at least detecting Jupyter and disabling progress bars. Otherwise I suspect you will get a stream of progress bars in the output.

As much as I like multiple progress bars, does pip need them? I wouldn't care too much about the progress of individual downloads, but I would be interested in a single bar for the progress as a whole.

willmcgugan on 6 Aug 2020

👍2

Every data science workshop/talk I’ve ever attended (granted I seldom do) that involves live coding uses %pip install

I note that uses %pip. I don't know if there's special support in the pip magic command to make it work better. Equally, installing stuff in-process doesn't often fail, it's just risky. But the whole thing is a long-standing (core Python) issue, and affects more than just Jupyter - the discussion has come up for Idle, and I assume that other IDEs like VS Code and PyCharm need to deal with this as well. I don't think it's something pip should concern itself with directly (other than to point out that it's not our issue 🙂).

It might be worth at least detecting Jupyter and disabling progress bars.

Hopefully, our code for detecting when we aren't running in a terminal can cope with that (or can be made to, without needing special "detect Jupyter" code). Or the Jupyter people can make the %pip magic command add --no-progress automatically.

pfmoore on 6 Aug 2020

Hopefully, our code for detecting when we aren't running in a terminal can cope with that (or can be made to, without needing special "detect Jupyter" code)

Good point. sys.stdout.isatty() reports False in Jupyter. That should be enough.
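
As a minimal sketch of that check (the helper name `should_show_progress` is hypothetical and this is not pip's actual code), progress output could be gated on whether the stream is a terminal:

```python
import sys


def should_show_progress(stream=None):
    """Return True only when the output stream looks like a terminal."""
    stream = sys.stdout if stream is None else stream
    # Some replacement streams may lack isatty(); treat those as
    # non-interactive rather than raising.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())
```

With this, output redirected to a file or captured by a front end whose stream reports `isatty()` as False would get no live progress bar, without any Jupyter-specific detection.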

willmcgugan on 6 Aug 2020

👍3

Now that the Jupyter-related side-discussion has concluded...

The situation that this issue is for: after dependency resolution, pip has a list of links that it needs to download, before proceeding to installation. It needs to perform these downloads, while showing progress to the end user.

I basically see 2 options for how to show progress:

1. Show progress for each download separately, kinda how https://github.com/python-poetry/poetry/pull/2595 does things. Rich would not be what we use, but there's almost certainly another library that implements this.
2. Have a single progress bar, that shows the combined progress, with some indication when a file is fully downloaded.
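
The second option can be sketched roughly as follows. This is an illustration only: `CombinedProgress` and `fake_download` are hypothetical names, the downloads are simulated, and it assumes every file's size is known up front (for example from a Content-Length header), which may not always hold.

```python
import threading


class CombinedProgress:
    """Thread-safe byte counter shared by all download workers."""

    def __init__(self, sizes):
        # sizes: mapping of file name -> expected size in bytes
        self._sizes = dict(sizes)
        self._received = {name: 0 for name in self._sizes}
        self._lock = threading.Lock()
        self.completed = []  # names of fully downloaded files, in order

    def update(self, name, nbytes):
        """Record nbytes more of `name`; note when the file is finished."""
        with self._lock:
            self._received[name] += nbytes
            finished = self._received[name] >= self._sizes[name]
            if finished and name not in self.completed:
                self.completed.append(name)

    def fraction(self):
        """Overall progress across all files, between 0.0 and 1.0."""
        with self._lock:
            total = sum(self._sizes.values())
            done = sum(
                min(self._received[n], self._sizes[n]) for n in self._sizes
            )
        return done / total if total else 1.0


def fake_download(progress, name, size, chunk=1024):
    """Stand-in for a real download: reports progress chunk by chunk."""
    remaining = size
    while remaining > 0:
        n = min(chunk, remaining)
        progress.update(name, n)
        remaining -= n


if __name__ == "__main__":
    sizes = {"a.whl": 300000, "b.whl": 100000}
    progress = CombinedProgress(sizes)
    workers = [
        threading.Thread(target=fake_download, args=(progress, n, s))
        for n, s in sizes.items()
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("{:.0%} done, finished: {}".format(
        progress.fraction(), progress.completed))
```

A single bar would be drawn from `fraction()`, while `completed` gives the "file is fully downloaded" indication.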

pradyunsg on 6 Aug 2020

How many downloads are we talking about here? I can imagine having two or three independent progress bars being OK, but if we have 40 I doubt anyone would be impressed... Conversely, why do we need to indicate when a file is fully downloaded if we have a single progress bar? Why not just show progress of the overall task? Is it because we can't accurately determine "progress" at that level?

This is probably something that the UX studies are already looking at, for pip's existing progress reporting. Do they have any insights?

As a user, what I mostly want to know is:

Up front, how long will I have to wait for this task to finish (in terms of time, not some arbitrary measure like bytes)?
As the job progresses, does that initial estimate change?
Some visual indication that pip's still doing something - or if pip isn't, then what it's waiting for.
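
A time-based estimate of this kind could be derived from the average transfer rate so far. The sketch below is one simple way to do that, not how pip computes it; `estimate_remaining_seconds` is a hypothetical name, and it assumes the total size is known and the rate stays roughly constant.

```python
def estimate_remaining_seconds(bytes_done, bytes_total, elapsed_seconds):
    """Estimate the remaining time from the average rate so far.

    Returns None when there is not enough data to estimate yet.
    """
    if bytes_done <= 0 or elapsed_seconds <= 0:
        return None
    rate = bytes_done / elapsed_seconds  # bytes per second so far
    return max(bytes_total - bytes_done, 0) / rate
```

Recomputing the value as the job progresses shows whether the initial estimate is changing.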

I'm not really interested in details like "see how clever pip is, it's worked out that you need FORTY files when you thought you only needed one!!!" 🙂

Given that the only options I have are to sit it out or kill the process, I'd probably also like some postmortem information (but only if I kill the job) that would help me work out what options I have to speed up the process. Honestly, I don't know what options there are, though. Maybe:

Download some particularly costly files and host them locally
Adjust my install command to omit optional stuff that costs time I'm not willing to spend
Split the install into parts, so I can do the most essential bit now and the less important bits later
Fix my network speed
Even just "tough, do the install some time when you can afford that long of a wait"

pfmoore on 6 Aug 2020

Good point. sys.stdout.isatty() reports False in Jupyter. That should be enough.

It indeed is. I was worrying that we'd have to dump thousands of lines to the output :sweat_smile:

As a user, what I mostly want to know is:

Up front, how long will I have to wait for this task to finish (in terms of time, not some arbitrary measure like bytes)?

As the job progresses, does that initial estimate change?

Some visual indication that pip's still doing something - or if pip isn't, then what it's waiting for.

[...] Given that the only options I have are to sit it out or kill the process

This sums up my UX as well, and IMHO apt, for example, does a good job of it. Unless someone is strongly against it, I'll go with the single progress bar first (since it's also easier to prototype). If the UX research team finds a better alternative (e.g. multiple bars), I'll be happy to iterate on the UI.

Regarding the postmortem information, it seems like something super nice to have, given that in parallel download the choice of the number of connections may significantly affect the speed. The content of such a message should be taken care of once we've got the implementation working, and it might deserve a separate tracking ticket.

McSinyx on 6 Aug 2020

The implementation using a single progress bar is now available at GH-8771 for review.

McSinyx on 20 Aug 2020
