Pipenv: Make pipenv sync skip installed dependencies

Created on 17 Oct 2018 · 8 comments · Source: pypa/pipenv

We are currently using pipenv sync to deploy some of our apps with the versions of packages that we have tested. The problem we're having is just how long it takes for a sync to run. We might install just one new package (or update one version), then have to wait several minutes for a sync to complete.

Using pipenv sync --verbose it looks like a pip install is being called for every single dependency. Wouldn't it be faster to check the existing hashes in the lock file against the dependencies which are already installed to confirm that they match? I don't actually know what those hashes are calculated over so this is a bit of a naive request. Ultimately my goal is just to make pipenv sync not spin for several minutes on existing dependencies when they don't need to be updated.
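The fast path being requested here could, in outline, look like the sketch below: before shelling out to pip, compare each locked package's version against what is already installed and skip the ones that match. This is purely illustrative (the function name, and treating the lockfile as a simple name-to-specifier mapping, are assumptions, not pipenv's real internals), and as noted it compares versions rather than the lockfile's artifact hashes.

```python
# Illustrative fast-path: skip packages whose installed version already
# matches the lockfile pin. Not pipenv's actual implementation.
from importlib.metadata import version, PackageNotFoundError

def needs_install(name: str, locked_version: str) -> bool:
    """True if `name` is missing or installed at a different version."""
    try:
        return version(name) != locked_version.lstrip("=")
    except PackageNotFoundError:
        return True

# e.g. parsed from Pipfile.lock (shape assumed for illustration)
locked = {"pip": "==18.1"}
to_install = [n for n, v in locked.items() if needs_install(n, v)]
```

Only the packages left in `to_install` would then be handed to pip, so an up-to-date environment would result in no pip invocations at all.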

Also of note, we came from using our own virtual environments with a normal requirements file for pip. Adding a new dependency or updating a version was _much_ quicker then, so it has to be something specific to the way pipenv is running independent installs that takes so long.

Currently using Python 3.6.6 with pip 18.1 and pipenv 2018.10.13

Labels: CLI · Requires PEEP · Type · help wanted

Most helpful comment

Edit: if running pipenv 2020.5.28 or later, pipenv sync should do the right thing by default, and the below workaround should no longer be needed.


When an environment is already up to date, I'd expect pipenv sync --dev to be at least as fast as combining it with pip-tools and running:

pipenv lock --keep-outdated --requirements > requirements.txt
pipenv run pip install -c requirements.txt pip-tools
pipenv run pip-sync

At the moment, this isn't the case: the running time of pipenv sync --dev depends on the number of packages in the environment, rather than the number of packages that have changed version since the last sync operation.

Does it really make sense for every pipenv sync call to be verifying the environment against the wheel archives? If I wanted that behaviour, I'd expect to have to pass a --verify-installed flag or similar, rather than having it be the default.

Note: to check hashes properly, at least the first install should still be done with a regular pipenv sync call (the above faster sync status checking operation doesn't check the hashes if it actually does need to install something)

Edits:

  • updated the example to match what we're actually doing to speed up our sync processes (without the second line to ensure pip-tools has been installed, the pip-sync line would fail in a fresh venv)
  • removed --dev, as that doesn't work as expected due to #3316
  • added note about the lack of hashes in the generated requirements.txt file (as per #4189)

All 8 comments

I think there is already an open issue about this. We don’t object, but don’t see it as a priority either. You can implement it if you want it now.

FYI, it's actually the hash checking that is going to be the slow part here. pip won't actually install something that is already up to date, unless we are passing a bad flag somewhere, in which case I'd be interested to hear about it. Is it actually installing anything, or simply taking a long time to no-op?

I think it is the overhead of launching no-op pip subprocesses…
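That overhead is easy to see in isolation. The snippet below (a rough, illustrative measurement, not pipenv's own benchmark) times a handful of no-op interpreter launches; each one costs tens of milliseconds, so one pip subprocess per package adds up even when pip ultimately does nothing.

```python
# Rough measurement of per-subprocess launch overhead: spawn a few
# interpreters that do nothing and time the total.
import subprocess
import sys
import time

start = time.perf_counter()
for _ in range(5):
    subprocess.run([sys.executable, "-c", "pass"], check=True)
elapsed = time.perf_counter() - start
print(f"5 no-op subprocesses took {elapsed:.2f}s")
```

Launching `pip` itself is heavier still, since pip imports a substantial amount of code before deciding there is nothing to do.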

@techalchemy Is hash checking (locking in general) a CPU bound operation or network based?

I was wondering how much of a speedup we could get by using Cython.

@devxpy sorry for the delay, hash checking will depend, it's partly that we just need to chunk and hash every file in order to actually perform a hash check. We do have a cache for hashes which can speed things up for files that are already downloaded, but files that we download also have to be hashed for hash checking, so it's kind of both.

And launching subprocesses has overhead as well, which we've measured. However, when I run pipenv with no additional subprocesses it has basically no impact. I've been curious about Cython myself.
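The chunk-and-hash step described above can be sketched as follows. This is an outline of what an artifact hash check involves (the sha256 in the lock file is computed over the downloaded wheel or sdist), not pip's or pipenv's actual code, which lives in their internals.

```python
# Sketch of hashing a downloaded artifact chunk by chunk, as a lock
# file hash check must do. Illustrative only.
import hashlib

def file_sha256(path: str, chunk_size: int = 8192) -> str:
    """Return the hex sha256 of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Reading in fixed-size chunks keeps memory flat for large wheels, but the work is still I/O plus CPU proportional to total file size, which is why hash checking dominates when many artifacts are involved.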

Note: anyone using my suggested workaround above will need to be aware of #3316, as passing --dev when generating a requirements file means you may end up with a requirements.txt file that's missing production dependencies that you expected to be included.
