We are currently using pipenv sync to deploy some of our apps with the versions of packages that we have tested. The problem we're having is just how long it takes for a sync to run. We might install just one new package (or update one version), then have to wait several minutes for a sync to complete.
Using `pipenv sync --verbose`, it looks like a `pip install` is being called for every single dependency. Wouldn't it be faster to check the existing hashes in the lock file against the dependencies that are already installed, to confirm that they match? I don't actually know what those hashes are calculated over, so this is a bit of a naive request. Ultimately, my goal is just to keep `pipenv sync` from spinning for several minutes on existing dependencies when they don't need to be updated.
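A rough sketch of the kind of short-circuit being suggested here: compare the versions pinned in the lock file against what's already installed, and only touch the packages that differ. This is not pipenv's actual implementation; the `outdated_packages` helper and the simplified lock-file shape are hypothetical, and `importlib.metadata` requires Python 3.8+ (newer than the 3.6.6 mentioned below).

```python
import json
from importlib import metadata  # stdlib in Python 3.8+; hypothetical for 3.6

def outdated_packages(lockfile_path):
    """Return lock-file entries whose pinned version differs from the
    installed version (or that aren't installed at all)."""
    with open(lockfile_path) as f:
        lock = json.load(f)
    stale = []
    for name, entry in lock.get("default", {}).items():
        # Lock-file entries pin versions as "==1.2.3"; strip the operator.
        pinned = entry.get("version", "").lstrip("=")
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            stale.append((name, installed, pinned))
    return stale
```

Only the packages returned by such a check would then need a real install; everything else could be skipped without spawning a subprocess.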
Also of note: we came from using our own virtual environments with a normal requirements file for pip. Adding a new dependency or updating a version was _much_ quicker then, so it has to be something specific to the way pipenv runs individual installs that takes so long.
Currently using Python 3.6.6 with pip 18.1 and pipenv 2018.10.13
I think there is already an open issue about this. We don't object to it, but we don't see it as a priority either. You're welcome to implement it if you want it now.
FYI, it's actually the hash checking that is going to be the slow part here. pip won't actually install something that is already up to date (unless we are passing a bad flag somewhere, in which case I'd be interested in hearing about it). Is it actually installing anything, or simply taking a long time to no-op?
I think it is the overhead of launching no-op pip subprocesses…
@techalchemy Is hash checking (locking in general) a CPU-bound operation or network-bound?
I was wondering how much of a speedup we could get by using Cython.
Similar to https://github.com/pypa/pipenv/issues/2895
@devxpy sorry for the delay. Hash checking will depend: it's partly that we need to chunk and hash every file in order to actually perform a hash check. We do have a cache of hashes, which can speed things up for files that are already downloaded, but files that we download also have to be hashed for hash checking, so it's a bit of both.
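The per-file work described above is essentially chunked SHA-256 hashing: the entire archive has to be read and fed through the digest before it can be compared against the lock file. A minimal sketch (the chunk size is illustrative, not pip's exact value):

```python
import hashlib

def hash_file(path, chunk_size=1024 * 1024):
    """Compute the sha256 digest of a file by reading it in chunks,
    as a lock-file hash check must do for every downloaded archive."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()
```

This is why the check is partly CPU-bound (hashing every byte) and partly network-bound (the file has to be downloaded before it can be hashed, unless it's already in the cache).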
And launching subprocesses has overhead as well, which we've measured. However, when I run pipenv with no additional subprocesses, it has basically no impact. I've been curious about Cython myself.
Edit: if running pipenv 2020.5.28 or later, `pipenv sync` should do the right thing by default, and the below workaround should no longer be needed.

When an environment is already up to date, I'd expect `pipenv sync --dev` to be at least as fast as combining it with pip-tools and running:

```
pipenv lock --keep-outdated --requirements > requirements.txt
pipenv run pip install -c requirements.txt pip-tools
pipenv run pip-sync
```

At the moment, this isn't the case: the running time of `pipenv sync --dev` depends on the number of packages in the environment, rather than the number of packages that have changed version since the last sync operation.

Does it really make sense for every `pipenv sync` call to verify the environment against the wheel archives? If I wanted that behaviour, I'd expect to have to pass a `--verify-installed` flag or similar, rather than having it be the default.

Note: to check hashes properly, at least the first install should still be done with a regular `pipenv sync` call (the faster sync status check above doesn't verify the hashes when it actually does need to install something).
Edits:

- added the `pip install ... pip-tools` step (until `pip-tools` has been installed, the `pip-sync` line would fail in a fresh venv)
- dropped `--dev`, as that doesn't work as expected due to #3316

Note: anyone using my suggested workaround above will need to be aware of #3316, as passing `--dev` when generating a requirements file means you may end up with a `requirements.txt` file that's missing production dependencies that you expected to be included.