Environment: python:3.7-stretch, running on CircleCI.

I am reasonably sure that installing packages using poetry install instead of pip install takes significantly longer.
To compare: here is a pip install taking 35 seconds: https://circleci.com/gh/MichaelAquilina/S4/464#action-103
Here is another build with poetry install and the same requirements taking 1:27: https://circleci.com/gh/MichaelAquilina/S4/521#action-104
In both cases, both dev and non-dev requirements were installed.
Thanks for your interest in Poetry!
This is expected, because Poetry orders the packages to install so that the deepest packages in the dependency graph are installed first, to avoid errors at installation time. This requires installing the packages sequentially, which takes longer but is more "secure".
Also, Poetry checks the hashes of installed packages for security reasons, and due to the way pip works, Poetry has to generate a temporary requirements.txt file to make pip check hashes. This adds overhead, which explains the difference between the two tools.
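As a rough illustration of that scheme (not Poetry's actual code; the package name and hash below are made up), the per-package flow amounts to something like this:

# Illustrative sketch only: write a one-line, hash-pinned requirements file and
# hand it to pip in a fresh subprocess, once per package.
import subprocess
import sys
import tempfile

def install_one(requirement, hashes):
    line = requirement + " " + " ".join("--hash=" + h for h in hashes)
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as req:
        req.write(line + "\n")
    # A new pip subprocess per package: the interpreter startup cost is paid
    # every single time, which is where much of the overhead comes from.
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--no-deps", "-r", req.name],
        check=True,
    )

# e.g. install_one("requests==2.22.0", ["sha256:<hash taken from poetry.lock>"])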
Thanks for taking the time to explain :)
Would it be possible for poetry to just download the target packages in parallel and install in sequence to speed up the process?
I intend to improve the installation part of Poetry sometime in the future to speed things up, yes. I can't give you an ETA though; the only thing I can tell you is that it will likely be after the 1.0 milestone, since I want to stabilize Poetry before changing critical parts, like this one.
I'll be sure to keep you posted if anything changes on that front.
That sounds great @sdispater. Of course I fully understand your reasoning and agree that ensuring stability is a lot more important than performance.
Thanks for the great work on Poetry!
Let me start by saying I recently discovered poetry
and I think it's awesome :)
Now, is there any progress on this? I just migrated a couple of projects to poetry and my container builds went from 2s to 5 minutes each. The reason why it's so slow is twofold. With pip I could do:
ADD requirements.txt /tmp
RUN pip install -r requirements.txt
ADD . /project
...
so I could cache that layer. With poetry I have to do:
ADD . /project
RUN cd /project && poetry install
Which means I need to install the dependencies over and over even though I haven't changed a single dependency. I am trying to do something equivalent with:
ADD poetry.lock /tmp
ADD pyproject.toml /tmp
RUN cd /tmp && /root/.poetry/bin/poetry update --no-dev
I am not sure if this is the right thing to do, but it does the job. It feels extremely hacky, so I'd appreciate some advice :P
It takes a few seconds with pip and ~5 minutes with poetry :( Monitoring the network, I can see pip downloading everything very fast at a few MB/s, while poetry downloads packages one by one, never reaching more than a dozen KB/s.

Re this previous comment, it would be ideal if there was an interface to install when only the lock file is available, so that we can continue to make good use of Docker layer caching. I haven't been able to get an install to succeed when the source isn't also available.
Just did a comparison for our 190-package project (according to poetry), and the time difference between pipenv and poetry was the following:
Poetry (prerelease): 201.54 real 138.98 user 43.05 sys
Pipenv: 75.50 real 200.14 user 51.65 sys
Pip with hashes: 83.60 real 53.16 user 21.66 sys
Now, yes, pipenv probably doesn't take installation order into account and just installs packages concurrently, and yes, I have had the problem where a package needed its dependencies present in order to install correctly (which shouldn't be the case, but sometimes you don't have control). Still, I think if you look at the dependency tree you can install a lot of leaves and branches in parallel without this being an issue, because they don't meet up until the end.
Still, pip with a requirement file with hashes is a lot faster....
Still, pip with a requirement file with hashes is a lot faster
It feels like you are comparing apples to oranges here. Pip doesn't resolve dependencies. Poetry install does.
However, those dependencies are only resolved during the lock phase; the install phase (which is performed more often than locking) just installs the packages.
We could consider doing some time-expensive ordering/parallelization calculations during the lock phase?
Yeah, the dependency resolution taking long is fine, but doing a poetry export -f requirements.txt to generate the requirements file with all dependencies and installing that with pip is faster than just using pure poetry.
Like I said, that skips the install ordering, which only seems to be needed if skipping it actually causes issues.
Maybe the solution would be to give the install command a flag (and environment variable) to install concurrently, to speed things up when ordering is not important. The default would then stay slow and safe, but if you know ordering isn't needed, it would speed up CI build times significantly.
Also, I just saw this in the pip documentation, stating that pip also installs packages in order... now I'm really confused about what poetry is doing to make it go so much slower.
@hvdklauw Pip documentation just says that they are installing in topological order. That would still allow parallelism for packages that are not forced into a specific order by a dependency relationship.
(But looking at the discussions around pip, they don't do parallel installs either.)
I've dug into this a bit, and it seems that the slowness originates from the fact that poetry runs a new pip subprocess for each package. This incurs the Python startup time for each package, and that adds up if you have many packages to install. I noticed this by profiling with pycallgraph.
As I understand @sdispater, this is done to ensure the ordering of packages. However, pip as of version 6.0 orders packages topologically, which IMO is good enough, so there is no need anymore to do the package ordering inside poetry.
Another reason is hash-checking, which is also supported in pip.
Hence, I don't see any benefit from running separate pip processes other than nicer console output.
Another idea worth investigating is calling pip as an API. Something like this should be possible instead of using a subprocess:
from pip._internal.commands.install import InstallCommand

# Note: pip._internal is not a public API, so this may change between pip versions.
# Equivalent to `pip install foobar==10.0`:
cmd = InstallCommand()
cmd.main(['foobar==10.0'])
This would avoid the repeated Python startup-time.
There are some issues with this, though: the packages would be installed into the environment where poetry itself is executed, which is unlikely to be the one we want to install the packages into.

This came out of more of a curious investigation into pip and poetry, and that concern alone makes me think that this is really not so easy to implement. Although, the pip options -t and --prefix may be worth investigating for this.
If those don't pan out, I think it's still absolutely acceptable to call pip only once with all the dependencies (given that it supports topological ordering and hash verification).
We are currently using the following script in CI to make installation faster (2x in one of our projects):
poetry export -f requirements.txt --dev | poetry run -- pip install -r /dev/stdin
I realize this is probably naive, but having had some experience with building my own tooling around pip install, I'm curious about something here.
Since poetry resolves dependencies upfront and then, at least for the pip_installer, uses pip install --no-deps to install each package independently of all the others (this is what my own build tools have done), it seems to me to be safe to execute multiple pip install operations in parallel.
From my own testing, changing this for loop in installer.py into a ThreadPool.map gives a speedup that scales with the number of packages you're trying to install, for already-locked dependencies.
self._io.write_line("")
# Original serial loop:
# for op in ops:
#     self._execute(op)
# Parallel hack (requires `import multiprocessing.pool` at the top of installer.py):
multiprocessing.pool.ThreadPool(len(ops)).map(self._execute, ops)
This takes a fresh install (including new venv creation) of about 30 dependencies from ~40s down to about 10s. For comparison, a serial pip install -r requirements.txt -t test --no-deps using the exported requirements.txt file takes about 15s.
I don't know if the lack of parallelization here is because removing or updating things might be more sensitive to some kind of cross-dependency race, or if there are other reasons why a parallel per-dependency install isn't good standard practice. But in my limited testing, that very simple change makes poetry ~4x faster for a decently-sized application.
Ok, sorry, long post.
I had the opportunity of discussing the issue with @sdispater recently, and we came to the conclusion that one of the ways to dramatically speed up the download time would be to download fewer bytes, and especially to download just the ones we need.
Let's add a few things:
I've done a POC for just this (not trying to integrate this into poetry, just seeing if I could achieve anything noteworthy by playing around).
The result is https://github.com/ewjoachim/quickread
My first conclusions: the byte ranges that actually need to be fetched from the wheel are RangeSet{Range[6964008, 6965518), Range[7105384, 7428297)}.
Given the zip file format, at first I thought it wouldn't be possible to improve on that, but then I looked again, and now I think it's feasible to do much, much better.
More details on the analysis of the problems with the ZIP format follow. These are technical details, not needed to understand the whole problem, but if you're interested, here you are.
Ok, so first things first: I'm not an expert on the zip format. Most of this is stuff I discovered today by reading Google and Wikipedia. If you think there's a mistake, you're most probably right, so please say so.
A zip file is composed of the individual (compressed) file entries, followed by a central directory that lists them, and finally an End Of Central Directory (EOCD) record, optionally followed by a comment.
If you want to read a single file, here are the minimal steps one needs to take: search backwards from the end of the archive for the EOCD signature 0x06054b50, and then check that the length of the comment section (at offset 20 of the record) matches the offset we found through our search; the EOCD then points at the central directory, which in turn gives the offset of the file entry we want.
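To make those steps concrete, here is a small self-contained sketch of that EOCD search against a local wheel file (a wheel is just a zip). The file name is hypothetical, and this is neither Poetry's code nor the POC's code:

# Locate the End Of Central Directory record of a zip/wheel, searching backwards.
import struct

EOCD_SIGNATURE = struct.pack("<I", 0x06054B50)  # EOCD record signature
EOCD_SIZE = 22                                  # fixed part of the record, without the comment

def find_eocd(data):
    # The trailing comment can be at most 65535 bytes, so only the tail needs scanning.
    start = max(0, len(data) - EOCD_SIZE - 0xFFFF)
    pos = data.rfind(EOCD_SIGNATURE, start)
    while pos != -1:
        comment_len = struct.unpack_from("<H", data, pos + 20)[0]
        # Validate the candidate: the comment length (at offset 20) must reach end of file.
        if pos + EOCD_SIZE + comment_len == len(data):
            return pos
        pos = data.rfind(EOCD_SIGNATURE, start, pos)
    raise ValueError("EOCD record not found")

with open("example-1.0-py3-none-any.whl", "rb") as f:  # hypothetical file name
    print(find_eocd(f.read()))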
That being said, we know that the wheels are most likely created by the wheel tool, and we can make assumptions, like, plenty of them. This helps because:

- The last entries of the archive are predictable: name-version.dist-info/WHEEL (the file we want to read), name-version.dist-info/entry_points, name-version.dist-info/top_level.txt and name-version.dist-info/RECORD.
- Given name-version, we know exactly the offset we need to read to get the location of the WHEEL file. This is given by the formula offset_from_end_of_file = -248 - 4 * len(name_dash_version), where name_dash_version is the variable part of the .dist-info folder name, which is also present in the wheel filename (see the small worked example below).
- We can then read 20 + len(name_dash_version) bytes there, so that we'd get the file offset in the first 4 bytes and then name-version.dist-info/WHEEL, which would allow us to check that the wheel follows our assumptions. Thing is, we cannot request both a length and a negative offset in the HTTP spec (as far as I understand), but that just means we'll have to read 300 bytes instead of 30, and it really doesn't make a difference.
- The compressed data can then be decompressed with zlib.decompress from stdlib, and then we have the WHEEL file.

This method would allow getting by with only 2 requests and 5kb (or we can make it 3 requests and probably 1kb), so for Django, we're talking about dividing the download time by about 10 000. That should speed up poetry.
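As a purely illustrative worked example of that formula (the name-version string below is made up):

# Hypothetical numbers, just to show the arithmetic of the formula above.
name_dash_version = "django-3.0.2"  # variable part of the .dist-info folder name
offset_from_end_of_file = -248 - 4 * len(name_dash_version)
print(offset_from_end_of_file)  # -296, i.e. read roughly the last 300 bytes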
And the best part is that if there's no wheel, or the wheel isn't in the expected format, then we can just detect it and fall back to today's strategy.
I'm going to try and upgrade the POC to showcase reading the WHEEL file given a wheel URL.
Another possible, simpler solution would be to read the last, say, 3kb of the zip and look for 0x04034b50 (which marks the beginning of a new file entry), then read the file name: if it is WHEEL, we've found it; if it is entry_points, top_level or RECORD, we've gone too far and we can restart with the last 10kb; if it is anything else, continue reading.
Reading efficiently through the bytes can be done with the struct module; we can even re-use the struct definitions from the zipfile module.
EDIT: yeah, but... we have no idea how long the central directory may be (anywhere between 1kb and 1MB), so by default this is hard. We're probably better off reading the central directory to know where to search.
Ok, POC updated at https://github.com/ewjoachim/quickread/blob/master/quickread/wheel.py (usage at https://github.com/ewjoachim/quickread/blob/master/script2.py).
We can get the full requirements for a wheel with 2 requests, each downloading 0.5 to 2 kb.
I'll try to see if there's a way to integrate this into poetry.
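For readers who just want the flavour of a partial fetch like this, here is a minimal HTTP range-request sketch. The URL and the byte range are placeholders, and the actual ZIP bookkeeping (EOCD, central directory, offsets) lives in the POC linked above:

# Fetch only the tail of a remote wheel using an HTTP Range header.
import requests

url = "https://files.pythonhosted.org/packages/example-1.0-py3-none-any.whl"  # placeholder URL
resp = requests.get(url, headers={"Range": "bytes=-2048"}, timeout=30)  # last 2 kB of the file
resp.raise_for_status()
if resp.status_code != 206:
    raise RuntimeError("server did not honour the range request")
tail = resp.content  # the EOCD and central directory entries can be parsed from these bytes
print(len(tail))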
Hm :/
I implemented that on a branch and the results are not as impressive as expected.
Running on poetry itself, in the dev env:
$ # On master
$ poetry run poetry update -vvv
...
1: Version solving took 3.156 seconds.
1: Tried 1 solutions.
0: Complete version solving took 52.984 seconds for 4 branches
0: Resolved for branches: (>=2.7,<2.8 || >=3.4,<3.5), (>=2.7,<2.8), (>=3.4,<3.5), (>=3.5,<4.0)
$ # On my branch
$ poetry run poetry update -vvv
...
1: Version solving took 2.874 seconds.
1: Tried 1 solutions.
0: Complete version solving took 45.825 seconds for 4 branches
0: Resolved for branches: (>=2.7,<2.8 || >=3.4,<3.5), (>=2.7,<2.8), (>=3.4,<3.5), (>=3.5,<4.0)
Full output: https://gist.github.com/ewjoachim/6fed55fe84da9c90d6452b73ed64cdbd
Additional ideas for speeding up: we're currently using requests.get everywhere, and not creating a requests.Session. Using a Session should speed things up, I guess.
I'm posting my branch (#1803), but probably not spending a lot more time on that unless someone has a very smart idea :)
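For context, switching to a Session is the standard connection-reuse pattern in requests; a minimal sketch (not the branch's actual code):

# A single Session reuses TCP/TLS connections across calls instead of
# reconnecting for every bare requests.get().
import requests

session = requests.Session()

def fetch(url):
    resp = session.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content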
@ewjoachim Apologies if I'm missing something here, but it looks like the solution you are investigating would potentially have an impact on an add/update (i.e. the resolution process), but _not_ on the install process (which is the focus of this issue?).
Definitely not being dismissive of the work - as I think anything that speeds up solving time is great (during development I definitely spend more time solving than installing) - I just don't think it's relevant to this particular thread.
--
I suspect the other posts calling out the per-module pip subprocess call have identified the primary cause of the difference between pip and poetry _on install_. I think in some cases the pure parallelism approach would break, mostly due to modules doing custom things in their setup.py (which implicitly requires dependencies to already be installed). Something that parallelized each layer of depth n up the dependency graph, starting with the deepest, would still probably always work; I haven't looked through the lockfile structure to see if the information needed to do this is present there.
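A minimal sketch of that layered idea, assuming the dependency graph is already available from the lock file (the package names and the install stub are made up, and this is not what poetry actually does):

# Install the dependency graph in "layers": everything whose dependencies are
# already installed can be handled in parallel; the deepest packages go first.
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+

def install(package):
    print("installing", package)  # stand-in for the real installer call

# package -> set of its direct dependencies (illustrative only)
dependencies = {
    "app": {"requests", "click"},
    "requests": {"urllib3", "idna"},
    "urllib3": set(),
    "idna": set(),
    "click": set(),
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()
with ThreadPoolExecutor() as pool:
    while sorter.is_active():
        layer = sorter.get_ready()      # every package whose dependencies are done
        list(pool.map(install, layer))  # install the whole layer in parallel
        for package in layer:
            sorter.done(package)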
Hm, you're right... This stems from a discussion I had with @sdispater, and I mistook this ticket for the one regarding the problem we discussed :(
Sorry for the noise. I'll try to find the proper ticket or create one.
From reading this thread, would someone be able to clarify why poetry needs to install each package individually? If you've already computed the dependency closure, what is the additional bookkeeping vs. generating multiple requirements.txt files and installing in parallel similar to pipenv?
We've really enjoyed the usability boost of poetry commands, but in our CI system, builds can take up to 20 minutes to install even if everything is already downloaded (I assume because of subprocessing out to pip). I'm thinking of setting up our CI to do poetry export -f requirements.txt, but I'd love to _not_ do that :)
Yeah, I tested it again yesterday in our CI environment: poetry install takes minutes longer than doing the export and then using pip, both with and without the packages cached in a folder.
Installing packages in parallel would be really nice.
Pipenv does this in a really bad way: ignoring package dependencies completely. Sometimes installation fails and those packages are simply retried in the end. Not a good approach.
But since we have the complete dependency graph we could find the packages that are safe to install in parallel and do that.
Pipenv does this in a really bad way: ignoring package dependencies completely. Sometimes installation fails and those packages are simply retried in the end. Not a good approach.
But since we have the complete dependency graph we could find the packages that are safe to install in parallel and do that.
This safe way is definitely _ideal_ BUT the dumb way prob works in 90% of cases (and is significantly simpler). I wonder if you could get a tradeoff by topologically sorting, then doing parallel installs from deps up to top level requirements? (possibly weighting shared dependencies higher)
This safe way is definitely _ideal_ BUT the dumb way prob works in 90% of cases (and is significantly simpler). I wonder if you could get a tradeoff by topologically sorting, then doing parallel installs from deps up to top level requirements? (possibly weighting shared dependencies higher)
This is a quick hack using depths for parallel installation: #2374. The idea is that it should be safe.
EDIT: It is about four times faster on my computer, while still being safe in that it respects the dependencies of the packages.
For anyone else stumbling on this thread, and since it hasn't been mentioned so far, it looks like https://github.com/python-poetry/poetry/pull/2595 has been merged and should help with this issue: https://github.com/python-poetry/poetry/pull/2595#issuecomment-653663078
Just curious: does the new installer pre-build for the local env, cache the built dists, and soft-link to them rather than copying them (like pip-accel did)?