What's the problem this feature will solve?
Projects that have build requirements and optional build requirements cannot use pyproject.toml because of build isolation.
Example (https://bitbucket.org/fluiddyn/fluidfft): depending on whether mpi4py is installed, MPI extensions are built or not. It is very important to be able to install and use the package without MPI, so mpi4py cannot be added to `build-system.requires`.
If a pyproject.toml is used, we get an ImportError when importing mpi4py in the setup.py, even when mpi4py is installed, because of the isolation! So the MPI extensions are never built.
Describe the solution you'd like
Something like this could work:
```toml
[build-system]
# Minimum requirements for the build system to execute.
requires = [
    "setuptools", "wheel", "numpy", "cython", "jinja2", "fluidpythran>=0.1.7"
]
# packages that are used during the build only if they are installed
optional = ["mpi4py", "pythran"]
```
Then mpi4py would be available during the build only if it is installed. We keep the advantages of the isolation (discussed in https://www.python.org/dev/peps/pep-0517/) but optional build dependencies are allowed.
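To make the proposed semantics concrete, here is a small sketch (not pip code; the function name and the `optional` key are assumptions taken from this proposal) of how a frontend could promote an optional requirement to a real build requirement only when it is already importable:

```python
# Sketch of the proposed "optional" semantics; resolve_build_requires
# and the "optional" key are illustrative, not part of pip or PEP 518.
from importlib.util import find_spec

def resolve_build_requires(requires, optional):
    """Return requires plus every optional name already importable."""
    effective = list(requires)
    for name in optional:
        if find_spec(name) is not None:  # installed -> usable during the build
            effective.append(name)
    return effective

# "json" is importable (stdlib), "no_such_pkg_xyz" is not
print(resolve_build_requires(["setuptools", "wheel"], ["json", "no_such_pkg_xyz"]))
# -> ['setuptools', 'wheel', 'json']
```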
Alternative Solutions
```toml
[build-system]
# Minimum requirements for the build system to execute.
requires = [
    "setuptools", "wheel", "numpy", "cython", "jinja2", "fluidpythran>=0.1.7"
]
isolation = False
```
You can use the `--no-build-isolation` flag.
From the help message ("--no-build-isolation: Disable isolation when building a modern source distribution. Build dependencies specified by PEP 518 must be already installed if this option is used."), I don't think it is a solution!
The easy solution is to get rid of the pyproject.toml file, but it would actually be good to be able to use it (because of real build dependencies).
More generally, the solution cannot be a pip option. There are many situations (for example installing with tox) where you cannot use a pip option.
Here's a workaround I suggest: manually install all the `build-system.requires` items that you want available (e.g. via `pip install setuptools wheel numpy cython jinja2 "fluidpythran>=0.1.7"`) prior to running `pip install --no-build-isolation ...`.
Please include the pip version you are using and other details. ~~They're in the template for good reasons.~~ They're not in this template. Aha!
What version of pip are you using? You are mentioning PEP 517, but PEP 517 isn't out in a public release yet.
> I don't think it is a solution!
If you disable isolation, the user has to manually install the build-time dependencies that they want to be available. The help text is written assuming that all PEP 518 `build-system.requires` are indeed required, as defined in the PEP. Perhaps the help text could be improved.
> There are many situations (for example install with tox) where you cannot use a pip option.
tox can: https://tox.readthedocs.io/en/latest/example/basic.html#further-customizing-installation
Honestly, if your CLI tool doesn't allow passing custom options to pip, I don't think that would be pip's problem.
I get this (correct) behavior with the latest release of pip (18.1). That is why I didn't mention it.
Adding more general features to pyproject.toml requires a PEP to extend the standard.
Really, the `--no-build-isolation` option is not a solution for us.
I'd like to use the `build-system.requires` key of pyproject.toml. It would allow users to just do `pip install fluidfft`. Of course I don't want to use `--no-build-isolation`, because with this option `build-system.requires` is not taken into account and we lose all the benefit of pyproject.toml.
We need to be able to detect in setup.py whether some packages are installed and, if they are, to use them during the build.
> tox can: https://tox.readthedocs.io/en/latest/example/basic.html#further-customizing-installation

Good to know! But once again, it is not a solution to this problem.
> Adding more general features to pyproject.toml requires a PEP to extend the standard.
PEP 518 is still "Provisional". Could it be modified to add a way to declare optional dependencies?
Not likely. Provisional in this case pretty much just means it's accepted, but it hasn't been implemented/released yet, so we might still need to make small tweaks in order to make it functional in the real world.
Adding new features is generally out of scope.
Then packages with optional build dependencies CANNOT (and won't be able to) use pyproject.toml? It is a problem! It is not completely crazy to have a `try: import ... except ImportError:` in a setup.py.
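For context, the `try: import ... except ImportError:` pattern referred to here typically looks like the following sketch (the helper and the extension-building step are hypothetical, not taken from fluidfft):

```python
# Hypothetical setup.py fragment: build MPI extensions only when
# mpi4py happens to be importable in the build environment.
def is_importable(name):
    try:
        __import__(name)
        return True
    except ImportError:
        return False

ext_modules = []
if is_importable("mpi4py"):
    # hypothetical: append the MPI extension modules here, e.g.
    # ext_modules.extend(make_mpi_extensions())
    pass

print(is_importable("json"))  # stdlib module, always importable -> True
```

With build isolation, `is_importable("mpi4py")` is always False unless mpi4py is listed in `build-system.requires`, which is exactly the problem described above.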
Note that the solution with `build-system.optional` seems very simple to implement.
I mean, then the question becomes: what if there are two things listed in `optional`? Do we need to install both or neither? Or can we install just one of them? Should we be trying to opportunistically install them and fail silently if we can't, or should they be opt-in or opt-out?
What if we want to have groups of optional dependencies? Should that be supported? All of the same questions asked above apply to that too.
None of these questions are, on the surface, super hard. But they all require sitting down and working through the options and making a decision about how they should function. That's something that deserves wider discussion.
By all means, you're free to try and raise a discussion on the discuss.python.org Packaging category or on distutils-sig to see if other people are open to amending PEP 518 for this feature. I just think it's more likely going to be the case that we want an additional PEP for that feature.
I agree with @dstufft. There's nothing fundamentally difficult to decide here, but it needs to be driven by use cases and a reasonable sense of what's important and what is generalisation for its own sake.
I see this as being a follow-up PEP that extends PEP 518, and not suitable as an amendment to PEP 518. If you want to propose and champion such a PEP, then as @dstufft says, you need to start the discussion on distutils-sig or the Discourse category.
That being said, PEP 518 is released in pip now, I think, yeah? We should probably move it out of provisional anyway.
It seems very complicated, whereas the need is really simple.
It could be that the name "optional" is not appropriate. Maybe something like "used_if_importable" would be better. pip does not have to try to install packages listed in `used_if_importable`. For each package, if it is importable, it has to be importable during the build. Otherwise, there is nothing to do. It's both very simple and very useful.
I hope `build-system.requires` is implemented in a way that packages already installed are reused (with symbolic links in the isolation virtual env or something like this)? Is it?
If yes, a simple implementation of `build-system.used_if_importable` could be as simple as adding, at line 55 of pyproject.py:
if "used_if_importable" in build_system:
requires = build_system.setdefault("requires", [])
for package_name in build_system["used_if_importable"]:
try:
import_module(package_name)
except ImportError:
continue
requires.append(package_name)
It is sad that this very useful file, pyproject.toml, which has been presented as the new clean universal way to declare build dependencies, cannot be used by all packages having optional build dependencies.
For example, I think for mpi4py it's bad news. There is the function `mpi4py.get_include()`, which is mainly used only during the build. Without a proper fix, all projects using `mpi4py.get_include()` won't be able to use pyproject.toml.
@dstufft Yeah.
@paugier Fair enough. I understand why the current functionality isn't enough for you. I agree with @dstufft here - then this'll need more discussion in distutils-sig or on discuss.python.org, since it affects more tools than pip.
@paugier But... if a project is "used if importable", then why isn't `--no-build-isolation` sufficient? You install all the mandatory requirements, and then you decide which of the "used if importable" ones you want to install. Once that's done, build your wheel.
I guess I don't follow how you expect pip to choose whether to install "used if importable" build dependencies. What criteria should we use? Your sample code sets `requires` to include things that are already present, which seems pointless because we then simply won't install them (because they are already there...). It's not like the value of `requires` is available outside of pip for further use.
Apologies if I'm being dumb here somehow. (Note that "to explain the use case clearly to people who don't understand the requirement" is another benefit of writing this stuff up in a PEP, by the way :wink:)
> But... if a project is "used if importable", then why isn't --no-build-isolation sufficient?
This was my thought as well, but I see where @paugier is coming from. They want the "hard dependencies" installed automatically, and the optional dependencies used opportunistically. Installing the "hard dependencies" separately requires either parsing pyproject.toml or specifying build dependencies separately.
Basically, what they are asking for (which I think is reasonable) is a way for the "isolated" build environment to still be created, but as an environment similar to a virtualenv created with `--system-site-packages`.
I don't think the PEP needs to be modified in order for pip to support this use case.
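As a rough illustration, the stdlib `venv` module can already create an environment that behaves like a virtualenv made with `--system-site-packages` (this is not how pip builds its environments; it is only a sketch of the mode being discussed):

```python
# Create a venv that can still see the host's site-packages,
# similar to virtualenv --system-site-packages.
import os
import tempfile
import venv

target = os.path.join(tempfile.mkdtemp(), "build-env")
venv.EnvBuilder(system_site_packages=True, with_pip=False).create(target)

# the setting is recorded in the environment's pyvenv.cfg
with open(os.path.join(target, "pyvenv.cfg")) as f:
    cfg = f.read().lower()
print("include-system-site-packages = true" in cfg)
# -> True
```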
Let's take an example. Fluidsim is a package to run numerical simulations. It is used for research with very big simulations on clusters. It is also used for education, by students running small simulations on laptops.
It would be good if users could install it with just `pip install fluidsim` from a clean environment. The `build-system.requires` key of pyproject.toml would be very useful for that because we have build requirements.
Without pyproject.toml, the minimal installation with pip is something like:

```
pip install numpy cython fluidpythran
pip install fluidsim
```
Then, we have 2 optional build requirements that are used only if they have been installed before by the user:

- `mpi4py`: just used for MPI simulations (on clusters).
- `Pythran`: a Python-Numpy compiler.

So that, from a clean environment, I would like to have:

1. `pip install fluidsim` -> no MPI support and no Pythran compilation
2. `pip install mpi4py; pip install fluidsim` -> MPI support and no Pythran compilation
3. `pip install pythran; pip install fluidsim` -> no MPI support and Pythran compilation
4. `pip install mpi4py pythran; pip install fluidsim` -> MPI support and Pythran compilation

If we want to use pyproject.toml (to get 1.), we would need for case 3. to tell users to install with:
```
pip install numpy cython fluidpythran pythran
pip install fluidsim --no-build-isolation
```
which is worse than what we have now without pyproject.toml. Users will run the first line, forget the `--no-build-isolation`, and end up with no Pythran compilation even though Pythran was installed by the user just before.
To summarize:

- We need `build-system.requires` so that users do not have to manually install `cython` or `fluidpythran` even if it is required by setup.py.
- We need to be able to tell pip not to isolate the build from some packages, in my case mpi4py and pythran, maybe with something like:
```toml
[build-system]
...
# packages that are used during the build only if they are installed
no_isolation_for = ["mpi4py", "pythran"]
```
> Basically what they are asking for (which I think is reasonable) is a way for the "isolated" build environment still be created, but as an environment similar to a virtualenv created with --system-site-packages.
OK, thanks, I see what is being requested now.
That sounds like a new pip flag that enables behaviour somewhere between the default build isolation and `--no-build-isolation`: it creates a build environment and installs the dependencies from pyproject.toml, but doesn't isolate that environment.
That's not what the original request said, though (it was asking for a new key in pyproject.toml).
> I don't think the PEP needs to be modified in order for pip to support this use case.
The relevant PEP for build isolation is PEP 517, specifically here. For a pyproject.toml change, it would be PEP 518.
Agreed, the isolation change does not require a change to the PEP. The originally requested new key in pyproject.toml would, though. What pip currently provides for build isolation is PEP-compliant, but it's certainly reasonable for pip to provide additional features above the minimum required by the PEP.
It's not necessarily simple to provide the new feature (pip doesn't use virtualenv, so there's no existing means to provide `--system-site-packages`), but it could be done. It would need coding (along with tests and documentation), so it's basically a case of "PRs welcome".
"hard dependencies" installed automatically, and the optional dependencies used opportunistically
Exactly!
It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice, in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.
It's a property of the package, not of one installation. And we have a file to contain this type of information, called pyproject.toml... Moreover, it would be nearly useless to install packages that need this feature without this feature.
The maintainers would get several installation issues just because users forget to use this unknown pip option, especially if we have `--no-build-isolation` (no build isolation and no automatic installation of hard dependencies) and another option `???` for just no build isolation (with automatic installation of hard dependencies)...
How would it work for dependencies? All dependencies would be installed with this mode even if they don't need that? If not, how would one add this option for one dependency?
> It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.
This is not something I would expect you to want end users to do at all. It seems like it would be a terribly bad idea to have the capabilities of your deployed application depend on the build-time environment of the users, except in very rare cases.
I don't like the idea of opportunistic dependencies at build time: that means what gets installed depends on what happened to be installed when you were doing the build, which is something I think should be opt-in for end users. What may be useful would be a build-time equivalent of `extras`, so that you could maybe have something like `pip install mypkg` and `pip install mypkg[mpi]`, where the latter would also install the "mpi" dependencies into the build environment when doing a build.
I think the traditional way to do this is to break out the "optional" parts of your package into separate packages and make them extras dependencies, though I don't think that would work in the case of something like `pyyaml`, which for some time (maybe still) had a pure-Python version that was installed if you didn't have `libyaml` available at build time, and a C extension that was built otherwise (that wouldn't matter here because it's not pip-installable anyway, but you get the idea: an enhancement that can't easily be broken into a separate package).
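The pyyaml-style fallback described here can be sketched roughly like this (the header check and all names are simplified stand-ins for what such a setup.py actually does):

```python
# Hedged sketch: decide whether to build an optional C extension
# depending on whether a system library appears to be available.
import os

def have_libyaml(search_dirs=("/usr/include", "/usr/local/include")):
    # real projects usually try to compile a test program; a header
    # check is a simplified stand-in used here for illustration only
    return any(os.path.exists(os.path.join(d, "yaml.h")) for d in search_dirs)

ext_modules = []
if have_libyaml():
    # hypothetical: Extension("_yaml", ["ext/_yaml.c"], libraries=["yaml"])
    ext_modules.append("_yaml")

# either way the package installs; only the accelerated extension varies
print(isinstance(ext_modules, list))
# -> True
```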
> It seems to me that it would be much better to write this information (that some packages need to be accessible to setup.py opportunistically) in the repository (in practice in the pyproject.toml file) rather than to ask all users to add an option to the pip command line.
OK, then if you want that, you definitely need a PEP revision/update. Some things that will need to be considered:

- Frontends other than pip also implement `build_wheel` with build isolation, so a standards update will mean changes to those as well as to pip.

As a pip option, this is simply a quality-of-implementation option for pip. As a standard pyproject.toml key, it has significant implications for the whole ecosystem (in theory).
As another possibility, maybe there's an option to make it a pip option, but add a feature to pip to read project-specific options from a tool-specific `[tool.pip]` section in pyproject.toml? That would still be a lot of work, but it would be limited to affecting pip, and it could be handled within the existing standards.
> How would it work for dependencies?
Good question.
Conversely, for your option, how would something like `pip install mpi4py fluidsim` work, given that pip makes no guarantees that it will install the 2 listed items in any particular order? I don't think we want the installed result to be different depending on an arbitrary decision that pip makes.
Optional build dependencies seem to me like a bad idea.
If A is an optional build dependency of B, then `pip install A; pip install B` and `pip install B; pip install A` would both end up with A & B installed, but somehow the installed code of B would be different?
This seems like a perfect recipe for painful debugging.
This would also mean that the wheel cache should be deactivated for such packages. Or we would need different caches depending on the currently installed packages :no_good_man:
> Optional build dependencies seem to me like a bad idea.
Interesting that you think it's bad. Can you please propose a better approach?
I haven't experienced that much "painful debugging" related to this aspect of the packages I work with.
If someone sees that a feature is not present (for example no MPI support), it's pretty simple to rebuild the package (`pip install fluidsim --force-reinstall --no-cache-dir`) and it's solved. Of course, it is less simple than for simpler packages without optional build dependencies, but it is not so bad. I think it is not the solution which is bad or too complicated for the use case; it is the use case which is complicated.
Note that optional build dependencies are quite common for non-Python dependencies. See for example how mpi4py chooses which MPI library should be used, or how projects choose which BLAS implementation should be used.
I use optional build dependencies for (1) MPI support and (2) Python-Numpy compilers (here, only Pythran).
Also, pip is of course not able to install the MPI library itself, so pip cannot install mpi4py on machines where the user has not previously installed OpenMPI or another implementation.
Then, we can think about alternatives to my "bad idea" (build depending on installed packages):

- `pip install fluidsim[mpi]`: could be OK, but not so simple (because there is no wheel for mpi4py and pip can't install OpenMPI).

@xavfernandez How would you solve this problem then?
I think it's going to be more and more important to be able to compile the same Python-Numpy code with different tools, for example to target different hardware.
So the package should be able to use such tools opportunistically. Note that such tools are quite new (not 100% sure everything works everywhere) and can lead to long and memory-consuming builds. We really don't want to add pythran as a hard build dependency (except if we provide a wheel, but here I can't).
I agree that in that case we could use `pip install fluidsim[pythran]`, but we would have the same problems as with my solution:

```
pip install pythran
pip install fluidsim
```

(which is reasonable from the user's point of view) would lead to no Pythran compilation. "This seems like a perfect recipe for painful debugging", as you say :slightly_smiling_face:
How do you plan to build and publish wheels for this project, in such a way that the "optional dependencies" you're requiring would be correctly supported/handled? Or, to put it another way, how would you expect your proposal to work for the `pip wheel` command?
> How do you plan to build and publish wheels for this project, in such a way that the "optional dependencies" you're requiring would be correctly supported/handled? Or, to put it another way, how would you expect your proposal to work for the pip wheel command?
Thank you for asking good questions...
I don't think it is a good idea to publish wheels with MPI support. MPI is too dependent on the hardware.
But for Python-Numpy ahead-of-time compilers, it could just work fine with (from a clean environment):

```
pip install pythran
pip wheel fluidsim
```

which should build a wheel with Pythran compilation and without MPI support. (Note that Pythran is only a build dependency and not a runtime dependency, so this wheel should work fine without Pythran.)
The build of wheels meant to be published is much less of a problem because the maintainers control the environment. For example, in this case, I could choose to prepare the build environment with `pip install pythran` and without `pip install mpi4py`.
In practice, I'm not going to publish such wheels on PyPI because they would be preferred over the sdist and would disable the MPI build.
To provide compiled versions of the packages to the users, we would have to use conda, which can install OpenMPI.
> Conversely, for your option, how would something like pip install mpi4py fluidsim work - given that pip makes no guarantees that it will install the 2 listed items in any particular order? I don't think we want the installed result to be different depending on an arbitrary decision that pip makes.
True.
pip would have to be aware of the optional dependencies, so that even with the command `pip install fluidsim-ocean mpi4py fluidsim` (fluidsim being a dependency of fluidsim-ocean), pip installs in the following "topological order": mpi4py, fluidsim, fluidsim-ocean, and NOT fluidsim, fluidsim-ocean and then mpi4py.
Then, we could use something like (which would be compatible with the PEPs):
```toml
[build-system]
# Minimum requirements for the build system to execute.
requires = [
    "setuptools", "wheel", "numpy", "cython", "jinja2", "fluidpythran>=0.1.7"
]

[tool.pip]
# packages that are used during the build only if they are installed
optional_build_tools = ["mpi4py", "pythran"]
```
I admit I'm ignorant on this topic. But without knowing anything, would it be possible, as a workaround, for there to be a wrapper package created (called something like `mpi4py-pythran`) whose only responsibility would be to detect / bring in the optional dependencies if they exist? This wrapper package could be included as a mandatory requirement. This way pip would only have to know about the wrapper package. Its purpose would be to hide the other projects so that pip wouldn't have to be told about or know about them. Or maybe this doesn't add anything and just pushes the problem down further.
> To provide compiled versions of the packages to the users, we would have to use conda
Hmm, OK. So this is why the "optional dependencies" approach is a "bad idea" - if it doesn't support building and publishing wheels, it doesn't really fit well with pip's build model.
I guess I'm -1 on this idea in that case. I appreciate that you have a difficult issue to solve, and as things stand, pip doesn't have a good solution for you, but I'm not particularly comfortable with a partial solution that requires such extensive changes to how pip works for such a rare use case.
Ultimately, `--no-build-isolation` is the "escape hatch" we provide for this situation. I know it means you have to manually manage the whole build environment (whether you do that by parsing pyproject.toml yourself or using another means to record the build dependencies), so it's not ideal for you, but unless you can capture a solution to your use case in a PEP that covers all of the issues discussed here (building and publishing wheels, supporting build frontends other than pip, etc.), I don't think there's much else that can be done.
(The possibility of a "use system site packages" type of isolation mode, triggered by a pip option, is still a reasonable pip enhancement, but given that you've discounted it as a solution for your use case, there's no identified situation where it's going to be of benefit to anyone, so it's going to be a bit of a hard sell to get it added to pip).
The fact that I won't provide wheels for fluidsim and fluidfft is really related to MPI and not to this feature request!
For example, for the package fluidimage, we would also need this feature, and we don't use MPI during the build, so it would be totally compatible with publishing a wheel built with manylinux.
pip is the official Python installer. It seems to me that it should support the use cases of most Python packages. There are cases for which wheels are not suitable; use of MPI is a good example.
> The possibility of a "use system site packages" type of isolation mode, triggered by a pip option, is still a reasonable pip enhancement
If this mode could also be triggered with a `[tool.pip]` option in pyproject.toml, it would be really fine.
The situation that we have now (and that we will have for a long time if nothing is done) is that the existence of a pyproject.toml implies a totally isolated build, so many packages can't use pyproject.toml at all. [OK, I understood: if there is a pyproject.toml file, you can still use pip, but all users would have to use `--no-build-isolation`, and you lose the main advantage of having a pyproject.toml file.]
Really, something has to be done.
Another solution is to stop with this pyproject.toml idea and to use setup.cfg and `setup_requires`. But then, it would have to be well supported.
To summarize, what would be useful for packages with build tools used opportunistically:

1. a `[tool.pip]` option in pyproject.toml to trigger this mode for the package
2. a `[tool.pip]` option to declare `optional_build_tools`, which would allow pip to install things in a better order.

Note that it is fully compatible with wheels, except in special cases where you don't want to use wheels anyway, in particular with MPI.
@cjerdonek I'm sorry but I don't understand your proposed solution...
To rephrase, my idea was the following. Using your original post as a starting point, instead of adding `optional = ["mpi4py", "pythran"]` to your pyproject.toml, you would add an additional (mandatory) dependency to your `requires`: a new package whose name could be something like `mpi4py-pythran`. The package's only purpose would be to contain the logic of detecting and installing the optional (sub-)dependencies `mpi4py` and `pythran` if they exist. I'm wondering if it would be possible to create such a package. Perhaps this would allow more flexibility, since the `mpi4py-pythran` package could have installation logic that wouldn't be limited by the parent project's pyproject.toml file.
Alternatively, maybe a custom backend that extended the setuptools backend could implement the necessary logic. Possibly the `prepare_metadata_for_build_wheel` hook could be used to detect what's available and make the necessary decisions?
These ideas seem difficult to implement. The setup.py file is executed in the isolated environment. It seems difficult to de-isolate an isolated environment from within that isolated environment.
The new package `mpi4py-pythran` would also be installed in the isolated environment, so it seems difficult to (i) detect if some packages are installed in the environment from which pip has been invoked (we know nothing about it from there) and (ii) tweak the isolated environment to make those packages usable from the isolated environment.
For the second idea, how could we change the setuptools backend before the execution of setup.py? Is `prepare_metadata_for_build_wheel` executed in the main environment or in the isolated environment? Can we control from there how the isolated environment is created?
I'm really open to other solutions but I don't understand.
> For the second idea, how could we change the setuptools backend before the execution of setup.py?
You don't; you write your own backend that is specified in pyproject.toml. That backend can do whatever you want, but I'd assume it simply runs setuptools for the majority of the processing. The setuptools backend itself is only a very small file in the setuptools distribution.
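A custom backend along these lines can indeed be tiny. Here is a minimal sketch that just re-exports the setuptools hooks (the module name `my_backend` is made up; a real project would point `build-backend` at it in pyproject.toml and customize whichever hooks it needs):

```python
# my_backend.py - minimal PEP 517 backend delegating to setuptools.
# The hook names are fixed by PEP 517; everything else is ours to change.
from setuptools import build_meta as _orig

get_requires_for_build_wheel = _orig.get_requires_for_build_wheel
get_requires_for_build_sdist = _orig.get_requires_for_build_sdist
prepare_metadata_for_build_wheel = _orig.prepare_metadata_for_build_wheel
build_wheel = _orig.build_wheel
build_sdist = _orig.build_sdist
```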
Anyhow, there are a fair number of possibilities for you to explore here, and I don't think the people here really understand your problem in enough detail to give you anything other than pointers, so you'll need to flesh out the details yourself.
> pip install fluidsim[mpi]. Could be ok but not so simple (because no wheel for mpi4py and pip can't install OpenMPI).
I think this is the cleanest option, short of rolling your own PEP 517 backend (which shouldn't be too hard by design, btw).
> I don't think the people here really understand your problem in enough detail to give you anything than pointers, so you'll need to flesh out the details yourself.
+1
Hi,
The hints based on a PEP 517 backend (`prepare_metadata_for_build_wheel`) or on a new package mpi4py-pythran really do not help with this problem.
I quote PEP 517 (https://www.python.org/dev/peps/pep-0517/#build-environment):
> A build backend MUST be prepared to function in any environment which meets the above criteria. In particular, it MUST NOT assume that it has access to any packages except those that are present in the stdlib, or that are explicitly declared as build-requirements.
The whole build is fully isolated, so it is impossible from the build backend, or from an intermediate package in `build-system.requires`, to know anything about the environment where the package is installed, or of course to use any packages installed in that environment.
The only way to use, during the build, packages that are not in `build-system.requires` is mentioned in PEP 517:
> Therefore, build frontends SHOULD provide some mechanism for users to override the above defaults. For example, a build frontend could have a --build-with-system-site-packages option that causes the --system-site-packages option to be passed to virtualenv-or-equivalent when creating build environments [...]
This is point 1. in https://github.com/pypa/pip/issues/6144#issuecomment-455500440
@paugier FTR here's how I'm trying to hack it: https://github.com/aio-libs/aiohttp/pull/3589
AFAICT, there's only one actionable thing here: adding something like the `--build-with-system-site-packages` option mentioned in the PEP.
Honestly, I think we should make it possible on a per-package basis to expose this information (i.e. a PEP 517 update?).
@pradyunsg This would be great and would nicely solve the problems of PEP 517 / conda incompatibility and PEP 517 / mpi4py incompatibility.
Some high-performance computing packages incompatible with isolated builds would be able to opt out of build isolation (and all its good characteristics) while still using pyproject.toml, to gain more stability and correct behavior (pip install ... works fine without import errors and builds what can be built) in different environments (conda, and environments with and without MPI libraries).
Otherwise, we'll need to avoid pyproject.toml and keep using the old, somewhat broken and deprecated `setup_requires` solution (which is what we now do in https://bitbucket.org/fluiddyn/fluidsim/).
I think it'd be a good idea to have some mechanism for declaring "this package can optionally use that dependency if it's possible to install it", maybe as some extension to pyproject.toml.
@paugier FTR I've hacked this via another PEP 517 backend which, under the hood, tries doing `pip install` on the optional deps and ignores any failures: https://github.com/aio-libs/aiohttp/pull/3589/files#diff-522adf759addbd3b193c74ca85243f7dR21