Pip: [Improvement] Pip could resume a package download partway through when the connection is poor

Created on 20 Oct 2017 · 9 Comments · Source: pypa/pip

  • Pip version: 9.0.1
  • Python version: 3.6.2
  • Operating system: macOS 10.13

Description

When I have a poor internet connection (the network is cut unexpectedly), updating a pip package is painful. When I retry the pip install, it stops at the midpoint and gives me the same md5 error.

My workaround is to:

  1. Download the package from PyPI (using a browser or wget, both of which can retry/resume)
  2. pip install the downloaded file
  3. Remove the downloaded file

If pip's downloads had a resume feature, the problem would be solved.

What I've run

pip install -U jupyterlab under poor network conditions

Collecting jupyterlab
  Downloading jupyterlab-0.28.4-py2.py3-none-any.whl (8.7MB)
    4% |█▋                              | 430kB 1.1MB/s eta 0:00:08
THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    jupyterlab from https://pypi.python.org/packages/b1/6d/d1d033186a07e08af9dc09db41401af7d6e18f98b73bd3bef75a1139dd1b/jupyterlab-0.28.4-py2.py3-none-any.whl#md5=9a93b1dc85f5924151f0ae9670024bd0:
        Expected md5 9a93b1dc85f5924151f0ae9670024bd0
             Got        4b6835257af9609a227a72b18ea011e3
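One plausible reading of the output above is that pip hashed a file that had only been partly written. The sketch below is only an illustration of that idea, not pip's code: the payload is a made-up stand-in for the wheel's bytes, and md5_hex is a hypothetical helper. It shows that the digest of a truncated copy differs from the digest of the complete file, which would produce an "Expected / Got" mismatch of this kind.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex md5 digest of a bytes object (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()


# Made-up stand-in for the wheel's contents; NOT the real jupyterlab file.
full_payload = bytes(range(256)) * 4096  # 1 MiB of repeating bytes

# Pretend the connection dropped after roughly 4% of the file arrived.
truncated_payload = full_payload[: len(full_payload) // 25]

expected = md5_hex(full_payload)
got = md5_hex(truncated_payload)

print("Expected md5", expected)
print("     Got    ", got)
print("Digests match:", expected == got)
```

If the partial file is kept and hashed again on the next attempt, the same wrong digest comes back, which would be consistent with seeing the same md5 error on every retry.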
Labels: download, awaiting PR, enhancement

All 9 comments

I don't know how pip's hashing works, but here's some working, easy, simple, modular resume code in a single file/function: https://gist.github.com/CTimmerman/ccf884f8c8dcc284588f1811ed99be6c

I have a poor connection and I often resume pip's downloads manually using wget.

This is easy for a wheel: run wget -c, then install the wheel with pip. When it's a tarball, though, I have to use the setup script, and I don't get quite the same result, even though it works in the end.

This should be easier to implement now since all the logic regarding downloads is isolated in pip._internal.network.download.

Any updates on this? I was installing a huge package (specifically TensorFlow, 500+ MB), and for some reason pip was killed at 99% of the download... I re-ran the command and it started downloading from 0...

@johny65 No updates.

Folks are welcome to contribute this functionality to pip. As noted by @chrahunt, there's a clear part of the codebase for these changes to be made in. :)

I have a few questions about the design for this enhancement. First, why (or how) does this happen?

"When I have a poor internet connection (the network is cut unexpectedly) [...] When I retry the pip install, it stops at the midpoint and gives me the same md5 error."

My guess would be that back then the wheels were stored directly in the cache dir instead of being downloaded to a temporary location, as is done now. If so, the hashing error should no longer occur.

However, because the wheel being downloaded is in a directory that will be cleaned up afterward, do we want to make that location configurable (e.g. pip install --wheel-dir=<user-assigned path> <packages>), or do we want to offer, as a last resort for people with poor connections, pip download -d <user-assigned path> <packages> followed by pip install? Personally I prefer the latter approach, for which we'd need to make pip download write directly to the specified dir, and I'm not sure whether doing that would break any existing use case.
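For reference, the second option can already be approximated from a script today. This is a rough sketch, not a proposal for pip's internals: download_cmd, install_cmd and download_then_install are hypothetical helper names, and the only pip options used are pip download -d plus pip install --no-index --find-links. The --wheel-dir option on pip install mentioned above is only an idea from this thread, so it is not used here.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List, Sequence


def download_cmd(packages: Sequence[str], dest: Path) -> List[str]:
    """Command line for `pip download -d <dest> <packages>`."""
    return [sys.executable, "-m", "pip", "download", "-d", str(dest), *packages]


def install_cmd(packages: Sequence[str], dest: Path) -> List[str]:
    """Command line that installs only from the files already in <dest>."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", "--find-links", str(dest), *packages,
    ]


def download_then_install(
    packages: Sequence[str],
    dest: str,
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Two-step flow: fetch into a directory the user controls, then install.

    `run` is injectable so the flow can be exercised without network access.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    run(download_cmd(packages, dest_path))  # may be retried on failure
    run(install_cmd(packages, dest_path))   # no index access needed here
```

Calling download_then_install(["jupyterlab"], "/tmp/pip-downloads") would run both steps; the path is just an example. Note that this does not resume a partially downloaded file. It only keeps the download directory out of pip's temporary, cleaned-up location.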

"Any updates on this? I was installing a huge package (specifically TensorFlow, 500+ MB), and for some reason pip was killed at 99% of the download... I re-ran the command and it started downloading from 0..."

Same with PyTorch, which was 1 GB in size. My day's data quota was exhausted with nothing to show for it.

fwiw, you can always curl manually (applying whatever resuming logic you need and checking the integrity yourself) and pip install the downloaded file instead.
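The same manual approach can be sketched in Python with only the standard library. This is an illustration under assumptions, not how pip works: resume_download is a hypothetical function, the caller is assumed to know the expected sha256 digest in advance, and the server is assumed to honour HTTP Range requests (if it answers 200 instead of 206, the sketch starts over from byte 0).

```python
import hashlib
import os
import urllib.error
import urllib.request


def resume_download(url, dest, expected_sha256, chunk_size=64 * 1024,
                    opener=urllib.request.urlopen):
    """Continue a partial download of `url` into `dest`, then verify it.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    logic can be tried without a network connection.
    """
    # Bytes already on disk from an earlier, interrupted attempt.
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0

    request = urllib.request.Request(url)
    if offset:
        # Ask the server for everything from `offset` onward.
        request.add_header("Range", f"bytes={offset}-")

    try:
        with opener(request) as response:
            # 206 = partial content, safe to append; anything else means
            # the server sent the whole file, so overwrite from scratch.
            mode = "ab" if offset and response.status == 206 else "wb"
            with open(dest, mode) as out:
                while True:
                    chunk = response.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
    except urllib.error.HTTPError as err:
        # 416 = requested range not satisfiable, which usually means the
        # file on disk is already complete; fall through to verification.
        if err.code != 416:
            raise

    # Integrity check over the whole file, read in chunks.
    digest = hashlib.sha256()
    with open(dest, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    if digest.hexdigest() != expected_sha256:
        raise ValueError(f"sha256 mismatch for {dest}")
    return dest
```

Once the function returns without raising, the file at dest can be passed to pip install as described above. If the connection drops again mid-transfer, calling the function a second time picks up from the bytes already written.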

Folks are welcome to contribute this functionality to pip.
