Sometimes the installation order of packages matters. But even if you arrange the order in your requirements.in, the output from pip-compile is sorted by package name, which can cause problems later. My example is with numpy and gdal: at install time, gdal checks whether numpy is already installed, and only if it is does it provide certain additional bindings.
---requirements.in---
gdal==2.4.2
numpy
Results in the file
---requirements.txt---
#
# This file is autogenerated by pip-compile
# To update, run:
#
# pip-compile
#
gdal==2.4.2
numpy==1.17.4
Now after pip-sync you'll get gdal installed first. Then
python -c 'from osgeo import gdal_array'
will give you an error.
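The same check can be scripted, for example to run it in CI after pip-sync. This is a minimal sketch: `can_import` is a hypothetical helper name, not something provided by gdal or pip-tools, and it only reports whether the import succeeds.

```python
import importlib


def can_import(module_name):
    """Return True if the module imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers a missing module as well as a module that fails to load.
        return False
    return True


if __name__ == "__main__":
    # Per the report above, this import fails when gdal was installed before numpy.
    print("gdal_array available:", can_import("osgeo.gdal_array"))
```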
Optionally keep the order of the packages in requirements.txt the same as in requirements.in.
My solution at the moment is to edit requirements.txt by hand to change the order, but this breaks some of the nice automatic features.
To offer an alternative perspective, maybe gdal and similar packages ought to be using extras specifications, so that you'd add gdal[numpy] to your .in file.
To be clear, pip makes no promises to install the packages in the order they are declared in the requirements file.
The only promise is "dependencies before dependant".
Ref: https://pip.pypa.io/en/stable/reference/pip_install/#installation-order
Considering that, I'm not in favour of adding features that encourage relying on the declaration order in the requirements file.
@AndydeCleyre's suggestion makes sense, and is the right way to go right now IMO.
Although, after using pip-compile, the resulting requirements.txt will declare both gdal[numpy] and numpy as top-level dependencies, so we'd need to validate that pip would still consider numpy as a dependency of gdal[numpy], despite also being top-level, and install numpy first.
@pradyunsg May I shamelessly summon your knowledge to consider what would be the order in this case above?
--
Rambling thought: this would be a case for a mechanism similar to extras, but for build dependencies (PEP 518). Not sure how that would work or be used though.
@vphilippon
> Although, after using pip-compile, the resulting requirements.txt will declare both gdal[numpy] and numpy as top-level dependencies
Are you sure? I have a tiny package with some optional dep groups and it doesn't seem to mark the optional deps as top-level.
Without any optional dep groups:
requirements.in:
ptrender
requirements.txt:
plumbum==1.6.8 # via ptrender
ptrender==0.0.3 # via -r requirements.in (line 1)
pyratemp==0.3.2 # via ptrender
With one optional dep group (that just directly adds the dep strictyaml):
requirements.in:
ptrender[yaml]
```console
$ grep '^strictyaml' requirements.txt
strictyaml==1.0.6 # via ptrender
```
With more (`dev` adds `ipython` and `flit`):
`requirements.in`:
```
ptrender[yaml,dev]
```
```console
$ grep -E '^(strictyaml|ipython|flit)=' requirements.txt
flit==2.2.0 # via ptrender
ipython==7.12.0 # via ptrender
strictyaml==1.0.6 # via ptrender
```
These are defined for the package via a flit-flavored `pyproject.toml` as:
```toml
[tool.flit.metadata.requires-extra]
dev = ["flit", "ipython"]
yaml = ["strictyaml"]
```
And this becomes the following in the flit-generated legacy-compatible setup.py file, though I don't think it's used at all in my installation procedure:
extras_require = \
{'dev': ['flit', 'ipython'], 'yaml': ['strictyaml']}
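To make the effect of the extras concrete, here is a rough sketch of how requesting extras widens the set of direct requirements. It is an illustration only, not how pip or flit implement this: `base_requires` is assumed from the `# via ptrender` lines above, and `direct_requirements` is a hypothetical helper.

```python
# Assumed from the compiled output above: ptrender's unconditional dependencies.
base_requires = ["plumbum", "pyratemp"]

# Same mapping as the flit-generated setup.py shown above.
extras_require = {"dev": ["flit", "ipython"], "yaml": ["strictyaml"]}


def direct_requirements(requested_extras):
    """Return the base requirements plus those added by each requested extra."""
    requirements = list(base_requires)
    for extra in requested_extras:
        requirements.extend(extras_require[extra])
    return sorted(requirements)


if __name__ == "__main__":
    print(direct_requirements([]))
    print(direct_requirements(["yaml", "dev"]))
```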
@AndydeCleyre So yes, I'm sure, but I might have expressed what I meant in a confusing way.
So, pip-compile will effectively mark the dependencies with the correct # via X... comment in any case, giving the information to anyone reading it that it's a transitive dependency, coming from X.
What I meant is that for pip, those packages are now all equally top-level dependencies, because the requirements now all come directly from the requirements.txt.
I'll take this example:
requirements.in:
ptrender[yaml]
Let's say you run pip install -r requirements.in directly.
The result is that pip will read the `requirements.in` and take `ptrender[yaml]` as the only top-level dependency. pip will then perform its usual dependency discovery mechanism, adding plumbum, pyratemp and strictyaml, all as second-level dependencies (pip discovered them through `ptrender[yaml]`, not directly from `requirements.in`), and so on...
Now with the pip-compile requirements.in output:
requirements.txt:
plumbum==1.6.8 # via ptrender
ptrender[yaml]==0.0.1 # via -r requirements.in (line 1)
pyratemp==0.3.2 # via ptrender
python-dateutil==2.8.1 # via strictyaml
ruamel.ordereddict==0.4.14 # via ruamel.yaml
ruamel.yaml.clib==0.2.0 # via ruamel.yaml
ruamel.yaml==0.16.10 # via strictyaml
six==1.14.0 # via python-dateutil
strictyaml==1.0.6 # via ptrender
If I run pip install -r requirements.txt, pip will take all of these dependencies listed in the requirements.txt as top-level dependencies, since they're all explicitly put in there.
To be extra clear: pip doesn't parse those via X... comments to determine the dependency order or anything.
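As a rough illustration of what is left once the comments are ignored, the sketch below strips them from a few of the compiled lines. It is a simplification, not pip's actual parser: it treats everything after the first `#` as a comment, which would be wrong for URLs containing a fragment, and `requirement_specs` is a hypothetical helper name.

```python
def requirement_specs(lines):
    """Drop comments and blank lines, keeping only the requirement specs."""
    specs = []
    for line in lines:
        # Simplification: everything after the first '#' is treated as a comment.
        spec = line.split("#", 1)[0].strip()
        if spec:
            specs.append(spec)
    return specs


compiled = [
    "plumbum==1.6.8 # via ptrender",
    "ptrender[yaml]==0.0.1 # via -r requirements.in (line 1)",
    "strictyaml==1.0.6 # via ptrender",
]

if __name__ == "__main__":
    # Only a flat list of pins remains; the "via" information is gone.
    print(requirement_specs(compiled))
```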
So the question I ask myself about pip behavior is:
Will pip perform the "dependency discovery" mechanic and build the "install order" and apply it as usual? Or does the fact that all the dependencies are listed together in the requirements.txt make pip simply consider them as top-level dependencies and install them without the dependency order guarantee?
Now that I think of it, we can test and check pip logs:
vphilippon@MTL-BJ270 MINGW64 ~/tests/playground
$ pip install -r requirements.txt
Collecting plumbum==1.6.8
Using cached plumbum-1.6.8-py2.py3-none-any.whl (115 kB)
Collecting ptrender[yaml]==0.0.1
Using cached ptrender-0.0.1-py2.py3-none-any.whl (3.1 kB)
Collecting pyratemp==0.3.2
Using cached pyratemp-0.3.2.tgz (52 kB)
Collecting python-dateutil==2.8.1
Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Collecting ruamel.ordereddict==0.4.14
Using cached ruamel.ordereddict-0.4.14-cp27-cp27m-win32.whl (28 kB)
Collecting ruamel.yaml.clib==0.2.0
Using cached ruamel.yaml.clib-0.2.0-cp27-cp27m-win32.whl (98 kB)
Collecting ruamel.yaml==0.16.10
Using cached ruamel.yaml-0.16.10-py2.py3-none-any.whl (111 kB)
Requirement already satisfied: six==1.14.0 in c:\users\vphilippon\tests\playground\venv\lib\site-packages (from -r requirements.txt (line 8)) (1.14.0)
Collecting strictyaml==1.0.6
Using cached strictyaml-1.0.6.tar.gz (47 kB)
Building wheels for collected packages: pyratemp, strictyaml
Building wheel for pyratemp (setup.py) ... done
Created wheel for pyratemp: filename=pyratemp-0.3.2-py2-none-any.whl size=20404 sha256=6dba6829d89e36cd644c8f90f7fbb33633f57263be5a89d2f738b26b7afe4992
Stored in directory: c:\users\vphilippon\appdata\local\pip\cache\wheels\96\f2\f5\7ac098d380df05194b670310e746e4b4480e52afa8a13929fe
Building wheel for strictyaml (setup.py) ... done
Created wheel for strictyaml: filename=strictyaml-1.0.6-py2-none-any.whl size=25319 sha256=8eafa08589ecaf01548ecc1124dbf3db0642b07fb62a5426aa606dce8c04da64
Stored in directory: c:\users\vphilippon\appdata\local\pip\cache\wheels\80\50\17\337fe2b6709ea577f61881540f8ad4b4e4b2b01696fe4bad04
Successfully built pyratemp strictyaml
Installing collected packages: plumbum, pyratemp, ruamel.ordereddict, ruamel.yaml.clib, ruamel.yaml, python-dateutil, strictyaml, ptrender
Successfully installed plumbum-1.6.8 ptrender-0.0.1 pyratemp-0.3.2 python-dateutil-2.8.1 ruamel.ordereddict-0.4.14 ruamel.yaml-0.16.10 ruamel.yaml.clib-0.2.0 strictyaml-1.0.6
If I'm right, the interesting part is this one:
Installing collected packages: plumbum, pyratemp, ruamel.ordereddict, ruamel.yaml.clib, ruamel.yaml, python-dateutil, strictyaml, ptrender
That's the install order, and it respects the dependency order!
As proof, here is the output of pipdeptree, a nifty tool that displays the dependency tree (super useful when debugging this kind of thing):
$ pipdeptree
pip-tools==4.5.0
  - click [required: >=7, installed: 7.0]
  - six [required: Any, installed: 1.14.0]
pipdeptree==0.13.2
  - pip [required: >=6.0.0, installed: 20.0.2]
ptrender==0.0.1
  - plumbum [required: Any, installed: 1.6.8]
  - pyratemp [required: Any, installed: 0.3.2]
setuptools==44.0.0
strictyaml==1.0.6
  - python-dateutil [required: >=2.6.0, installed: 2.8.1]
    - six [required: >=1.5, installed: 1.14.0]
  - ruamel.yaml [required: >=0.14.2, installed: 0.16.10]
    - ruamel.ordereddict [required: Any, installed: 0.4.14]
    - ruamel.yaml.clib [required: >=0.1.2, installed: 0.2.0]
wheel==0.34.2
We can see that plumbum, pyratemp, ruamel.ordereddict and ruamel.yaml.clib are leaves of the dependency tree, so they should indeed be installed first by pip.
Then ruamel.yaml and python-dateutil, before strictyaml.
Then strictyaml, before ptrender
(sadly, pipdeptree is unable to link strictyaml to ptrender, because the specified [yaml] extra isn't registered in the environment at installation, so the link isn't obvious for the tool).
And finally ptrender, the "True" top-level in this situation.
six doesn't show up in the Installing ... message because it was already installed before my test, as a dependency of pip-tools. You can redo the test on your own using separate venvs, and validate my analysis :) .
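The "dependencies before dependents" ordering walked through above can be reproduced with a plain topological sort. This is only a sketch of the idea, not pip's resolver code: the graph is transcribed by hand from the pipdeptree output, with the ptrender[yaml] to strictyaml link added manually, and it relies on `graphlib` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each package maps to the set of packages it depends on.
dependencies = {
    "ptrender": {"plumbum", "pyratemp", "strictyaml"},
    "strictyaml": {"python-dateutil", "ruamel.yaml"},
    "python-dateutil": {"six"},
    "ruamel.yaml": {"ruamel.ordereddict", "ruamel.yaml.clib"},
}

# static_order() yields every node after all of its dependencies.
install_order = list(TopologicalSorter(dependencies).static_order())

if __name__ == "__main__":
    print(install_order)
```

Several orders satisfy the constraint, so the exact sequence can differ from pip's log; what matters is that each package appears after everything it depends on, with ptrender last.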
So bottom line, it seems that even if all of the packages are theoretically "top-level dependencies" because they are listed directly in the requirements.txt, pip will do the intelligent thing and determine the install order by processing the dependencies of each of them. (👏 pip!)
That means that @AndydeCleyre's suggestion of using "extras" would indeed work in this case 👍, and is almost certainly the thing gdal should implement to address this.
(Reminder that I hold the right to be wrong, that this was a mix of analysis, learning new things, and trying to put a demonstration together, and that I welcome anyone who knows better to challenge this 😄)
(Also thank you @AndydeCleyre for pointing to ptrender, I now have a nice test case to use in my future extras handling debugging sessions!)
something I do for this case is:
# requirements-base.in
numpy
# requirements-gdal.in
-r requirements-base.txt
gdal
pip install --no-deps --require-hashes -r requirements-base.txt
pip install --no-deps --require-hashes -r requirements-gdal.txt
but really this is a bug in gdal - it shouldn't be inspecting the state of the virtualenv at install time
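The two-step install above can also be driven from a small script, so the order is enforced in one place. This is a sketch under assumptions: the helper names are hypothetical, the file names are the ones from the comment above, and the pip flags are the same ones used there.

```python
import subprocess
import sys


def build_install_commands(requirement_files):
    """Return one pip command per compiled file, in the order given."""
    return [
        [sys.executable, "-m", "pip", "install", "--no-deps", "--require-hashes", "-r", path]
        for path in requirement_files
    ]


def install_in_sequence(requirement_files):
    """Run each install to completion before starting the next one."""
    for command in build_install_commands(requirement_files):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands here; call install_in_sequence() to actually run them.
    for command in build_install_commands(["requirements-base.txt", "requirements-gdal.txt"]):
        print(" ".join(command))
```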
> The only promise is "dependencies before dependant".
Yup -- pip installs "bottom up" in the dependency tree -- installing the dependencies before the dependent package, regardless of what is specified on the top level.
> Will pip perform the "dependency discovery" mechanic and build the "install order" and apply it as usual? Or does the fact that all the dependencies are listed together in the requirements.txt make pip simply consider them as top-level dependencies and install them without the dependency order guarantee?
pip doesn't care about the "original source" of how it "discovered" a requirement to satisfy.
> @pradyunsg May I shamelessly summon your knowledge to consider what would be the order in this case above?
Yup. :)
Relevant bit of code: https://github.com/pypa/pip/blob/9e884a46f632815826be51a55eb7a285b351f513/src/pip/_internal/resolution/legacy/resolver.py#L402
@rdenham
Given all this, I think we can close this, yes? To sum up:
[...] txt anyway
[...] txts and install in sequence, also codifying relationships with constraints or includes
I know this issue is closed but I want to make a use case if this is ever revisited.