It appears that the locations submodule, which was public API until 9.x, has moved to pip._internal.locations and is becoming private API in 10.0.
Using pip.locations is the only API available to know the distutils_scheme for the current installation prefix.
It is used by a number of extension modules to determine where distutils' install_headers directive places the headers that are passed to it. For example, pybind11's get_include() function returns locations.distutils_scheme('pybind11')['headers'].
Hence, I would like to argue that it should remain public API.
We cannot rely on the sys prefix alone, because certain installations, such as Homebrew's Python, have exotic distutils schemes and place headers in non-standard directories.
Hi - use of pip.locations was never actually a supported API. We announced this change some months ago, to give projects time to migrate to supported alternatives. I guess you didn't see the announcement (we're looking into what options we have to announce changes like this better in future to ensure we reach more of the community).
I don't really have a good solution for you at this point - I guess you could copy the code used by pip into the pybind11 library as a local module, as a short-term measure, while you investigate whether there's any other project offering a supported means of getting this information. If it's a common need (it seems like it would be if mainstream distributions like homebrew have exotic schemes that need to be handled) then hopefully something will exist.
Regarding your announcement, I did not know that locations was supposed to be "internal API". The thing is that I don't think that there is any alternative at the moment. We have done some research and could not find one.
There has been a thread on distutils-sig about actually including this API into distutils since no alternative exists at the moment, and the response was that distutils should not be iterated upon anymore.
Why not keep locations public? Note that pybind11 is a utility that is meant for extension authors and is quite popular, so this impacts many people. (Search for distutils_scheme on GitHub and you will find a lot of uses: 168000 instances.)
Besides pybind11-based extensions, I know of (many) other packages that make use of the install_headers directive and the distutils_scheme function to retrieve the installation directory.
One way to take into consideration the users of that API would be to have a deprecation period first, rather than completely removing it.
From the linked announcement:
Just in case it's not clear, simply finding where the internal APIs
have moved to and calling them under the new names is not what
people should do. ... the idea of this change is to give people the incentive
to find a supported approach, not just to annoy people who are doing
things we don't want them to ;-)
and from the second post in this issue:
We announced this change some months ago, to give projects time to migrate to supported alternatives.
In principle, this seems fine. And yet this logic seems postulated on the assumption that a supported alternative exists. I'm not aware of one, but if you can point us to one we'd be happy to adapt. If there isn't one, that means the only alternatives are, by definition, unsupported, and in such a case "simply finding where the internal APIs have moved to and calling them under new names" seems like the least bad alternative.
In this case, I can't offer an alternative (honestly, I don't know all of the packaging-related projects on PyPI). What I can suggest is that if you take pip's code and copy it into your project (or better still, package it up and publish it on PyPI) you will be supporting it yourselves, and then by definition it will be a supported option. Heck, if you publish and support it, pip may even switch to using it as a dependency instead of maintaining the internal code!
What the announcement was intended to convey was that the (very small, all-volunteer) pip development team doesn't have the resources to provide the sort of backward compatibility guarantees, or bug fixing in the face of unusual environments, that is implied by an API being "supported". We have enough to do simply maintaining a command line program (pip). We never did offer that level of support, and the radical internal changes to pip needed as part of implementing the new features in pip 10 meant that we would be breaking code that imported pip on the assumption that it was supported anyway - so we decided to take the "opportunity" (if that's how you can describe the fact that we knew we'd be breaking people's code[1] 😞) to make a clean break, and explicitly declare that the code was desupported.
[1] even if those people were doing something undocumented, and which a relatively superficial search of the pip issue tracker or the pre-release documentation would have confirmed is unsupported.
have a deprecation period first
Can you suggest how we could have done that? I appreciate that it's too late now, but we did think long and hard about how we could do something like that, and frankly there just wasn't an obvious way. As this issue shows, posting announcements is not enough. Documentation obviously isn't going to be effective. Adding deprecation warnings isn't possible, because pip itself uses those APIs (legitimately) and we would either have to have a way of disabling the warnings (which people importing pip would likely just use) or impact the majority of pip users with warnings that wouldn't apply to them as they are only using pip the command line.
Ultimately we did our best - that may not have been sufficient for everyone, but the choice was deliberate, and done with the intent of providing the best outcome we could manage for as many of our users as we could.
Well, pip.locations would be a deprecated shim on pip._internal.locations.
Also I don't understand why it is too late since 10.0 is not out.
Well, pip.locations would be a deprecated shim on pip._internal.locations.
Yes, something like that could have been possible - but only if we knew which APIs needed it (which we didn't). There's no way we'd have been able to replicate every possible use of pip's internals with a deprecation warning. (Remember - this is people importing code that there's no documentation about - they could be doing literally anything).
Honestly, I'm pretty confused as to why "it's not documented, so you use it at your own risk" isn't sufficient here. But clearly some people feel that they "had" to use undocumented APIs because we'd (in some sense) "failed" to provide a supported means - although why doing so results in any sort of entitlement to support is beyond me, to be frank.
Also I don't understand why it is too late since 10.0 is not out.
10.0 release is 1 week from now, and we're not accepting any further changes other than fixes for release blockers. A change like this would be too big to get sorted in that timeframe, even if it were a release blocker (which it isn't).
Anyway, I think this discussion has probably reached the end of its usefulness. Hopefully it clarified things a little.
Here's the thing: we're coming to you and suggesting that there is a significant need for a particular feature that is already in pip, and has been there for a while. pip 10 more explicitly marks it as internal use only. We're not saying we need the entire locations module exposed publicly, but are just asking you to provide some public API into it for one particular part of pip's functionality that doesn't exist elsewhere (precisely because it's the sort of thing that fits with pip's purpose and really doesn't fit with the purpose of some other random project).
OK, that's a feature request we can consider. It'd be for post-10.0, and I don't want to give you any false hope (we've pretty strongly indicated in the past that we don't have the manpower to provide supported programming APIs), but I'm OK with marking this issue as a post-10.0 feature request.
distutils_scheme is a widely used API, as you can see from the GitHub search results. It would not make much sense to release 10.0 with this removal, considering the huge amount of code that it would break.
One way to do this would be to use the ShimModule approach from IPython, which would automatically result in a ShimWarning when importing directly from locations.
We relied a lot on ShimModule when we split IPython into all the Jupyter subprojects, so I think that it is a fairly robust solution.
You can see the code for ShimModule here: https://github.com/ipython/ipython/blob/master/IPython/utils/shimmodule.py
I think that it would be a good thing for modules that were displaced to have a shim using this technique in 0.10.0.
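To make the shim idea concrete, here is a minimal sketch of a forwarding module that warns on use. It is not IPython's ShimModule and not anything pip ships; the class, the demo_pkg names and the fake distutils_scheme return value are all hypothetical stand-ins chosen for illustration.

```python
import sys
import types
import warnings


class ShimModule(types.ModuleType):
    """Module stand-in that forwards attribute access to another module."""

    def __init__(self, name, target):
        super().__init__(name)
        # Store the target in __dict__ directly so lookups for it never
        # fall through to __getattr__.
        self.__dict__["_target"] = target

    def __getattr__(self, attr):
        # Only reached when normal lookup fails. Do not forward dunder
        # probes (for example __path__) made by the import machinery.
        if attr.startswith("__"):
            raise AttributeError(attr)
        target = self.__dict__["_target"]
        warnings.warn(
            f"{self.__name__} is deprecated; it forwards to {target.__name__}",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(target, attr)


# Hypothetical private module with a fake implementation.
_private = types.ModuleType("demo_pkg._internal.locations")
_private.distutils_scheme = lambda dist_name: {"headers": "/tmp/include/" + dist_name}

# Register the shim under the hypothetical old public name.
sys.modules["demo_pkg_locations"] = ShimModule("demo_pkg_locations", _private)
```

With this in place, code that accesses demo_pkg_locations.distutils_scheme keeps working but triggers a DeprecationWarning on each access, which is the behaviour being proposed for the old import path.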
Well, how about moving the locations code to something like distlib or its own package instead? As far as I can tell, no matter what you do, the code that trespasses on pip internals will break starting with the next major pip release. Why not simply publish a new lib, then adapt pip to use it later on?
In our client code, we are working on another solution, but there is a huge amount of work that will break with this change. I am arguing that starting with a deprecation is a better way to take this into consideration.
Given that it has been documented for years that pip has no official API, a deprecation is entirely unwarranted. From my pov it's just that the people who cut effort short in the beginning, by simply ignoring all details and doing what works at best by accident, are now bitten.
From my pov that sting is well deserved - if you make something you need to rely on, it's due diligence to check whether the building blocks are something you can rely on, and the pip API is not something you can rely on in general as a 3rd party.
All the people involved in this thread are volunteer maintainers and developers of large open source projects with a lot of users and a limited number of contributors so we are all on the same side here.
I am genuinely trying to provide constructive feedback and offer technical solutions, but I don't think that the "well deserved" comment on the users of an API really helps. As I stated above, we have a workaround, but there is still this massive code base of extension modules making use of distutils_scheme, which in my opinion should be taken into consideration.
Hence my proposal is to acknowledge that use by deprecating APIs and issuing a warning rather than removing it all at once.
Like @pfmoore said, this was NOT a public API so I really don't see why we should burden us with any sort of migration path for pip internals.
A fix like https://github.com/pybind/pybind11/pull/1190/files seems simple enough...
[...] I really don't see why we should burden us with any sort of migration path for pip internals.
because a lot of people use it as per earlier comments.
@SylvainCorlay that doesn't in any way explain why their lack of diligence should impose more work on pip.
I think the actionable thing here is for the code from pip's pip._internal.locations to move into an external library that pip then uses...
@vsajip Your thoughts on adding something like this to distlib? (or does it already exist?)
As for making this an API in pip itself -- a strong -1 from me for that.
We could just make a small, independent library.
@dstufft I have no problem with that. I presume we're talking solely about the distutils_scheme() function (and running_under_virtualenv, which it uses)? The rest of the stuff is pretty pip-specific.
I don't see how a truly independent library could be used for this, unless pip is guaranteed to always use it now and in the future - but it seems this wouldn't be the case, otherwise the pip team wouldn't have made it an internal API.
IIUC pip decides where to put headers when installing using the distutils_scheme API, and then puts them there. The location is different to where e.g. distutils would put headers, which I think would be based on a sysconfig scheme.
Therefore, as long as everything is always installed using pip, the locations will tie up when queried using the distutils_scheme API - but as far as I know, if you used a different tool to install something (even python setup.py install), then all bets are off regarding getting the location where some header file was installed.
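For comparison, this is how the stdlib side of that question can be queried. It is only a sketch of what sysconfig reports for the running interpreter's default scheme; whether that matches where pip or any other installer actually placed a given header is exactly the open point raised above.

```python
import sysconfig

# Paths for the default install scheme of the running interpreter,
# as reported by the standard library.
paths = sysconfig.get_paths()
stdlib_include = paths["include"]

print("default scheme include dir:", stdlib_include)
print("scheme keys:", sorted(paths))
```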
Perhaps I haven't understood correctly? If so, please put me right.
@vsajip from pip's pov it ALWAYS was an internal API - the proposed solution just puts it into a reusable place by extracting it into a library that then will get vendored.
from pip's pov it ALWAYS was an internal API
Yes, I meant originally, and not the recent move. The main point I was making is orthogonal to that, though. From what I can see, the implementation is completely unconnected to e.g. sysconfig norms / schemes, and is not standards-based (other than pip being a de facto standard, much like setuptools).
@vsajip the main reason the pip team doesn't turn every little piece of code into a library is that doing so adds sub-project and vendoring overhead. As explained earlier, it's entirely fine to do so for a constrained piece after consideration.
The "pip will use it" requirement will be met by pip switching from its own implementation to the vendored one after the extraction.
@vsajip Fundamentally, the question of "for this Python installation, where should the various components of a package be installed" is not a pip specific question. But the stdlib doesn't provide an easy way to answer it. From a brief scan of pip's code in distutils_scheme() I see 3 main issues:
sysconfig doesn't allow for the possibility of overrides like --home or --prefix.
The worst case here is apparently the install location for headers in a virtualenv, but there's no reason that I can see why the stdlib couldn't report an "official" answer in that case which matches what pip does, nor do I know why what the stdlib does report isn't suitable.
But if we take as granted that the stdlib does a bad job here, then "what should installers do" is very much not a pip specific question. It's probably a very good candidate for a packaging standard, actually. I imagine the only reason pip has an internal function for getting this data is expediency - there was nothing official, and at the time there was no real concept of a "packaging standard".
So from where we are now, this function is ideal for standardisation and migrating to an external library. If it were standardised, packaging would probably be a good place. But one core principle of packaging is that it implements behaviour defined by agreed standards, so in the absence of a standard, some other library is the obvious answer. A standalone library for just this one function is one option, although it's a lot of overhead for one function. Or distlib (as a place where packaging-related things that aren't yet backed by a standard are welcome) is another alternative. There may be others, I don't know. My preference as a pip maintainer would be "something we already vendor" or "something small that we can vendor without much problem", in that order. We would want to vendor this in, and not duplicate the functionality.
But honestly, the people wanting to use this data from outside of pip are probably going to need to do some of the work here. I'd be happy to write PRs to move distutils_scheme to packaging, but I'm not willing to deal with the process of standardisation. I'm not willing to maintain a new library just for this function that I'd never use myself, and I'm not going to try to persuade you to accept it into distlib.
Judging by https://github.com/pybind/pybind11/pull/1190/files, it looks like people may be choosing to duplicate pip's code in their own projects. I don't like that, but I can't fault them for choosing expediency over long term "good practice".
@pfmoore I pretty much agree with everything you said. There's an extra wrinkle as regards virtualenvs - the virtualenv.py layout is slightly different from the results of python -m venv. For example, on POSIX, for a some_venv created with virtualenv.py, some_venv/include/python3.6m is a symlink to the system Python location /usr/include/python3.6m, which means you can't (and shouldn't) install .h files into it. Hence the need for the separate /include/site/python3.6m location computed by distutils_scheme. For a venv created with python -m venv, it's not a symlink, so header files can go into it.
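The virtualenv wrinkle can be sketched roughly as follows. This is a paraphrase of the idea, not pip's actual code: the function names are hypothetical, and the exact directory suffix (for example whether an ABI flag such as the m in python3.6m is included) is an assumption left to the caller.

```python
import os
import sys


def in_virtual_environment():
    """Best-effort check for an isolated environment."""
    # Older virtualenv.py releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # python -m venv makes sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def venv_headers_dir(dist_name, prefix=None, py_version=None):
    """Header directory of the form <prefix>/include/site/pythonX.Y/<dist>."""
    if prefix is None:
        prefix = sys.prefix
    if py_version is None:
        py_version = "{}.{}".format(*sys.version_info[:2])
    return os.path.join(prefix, "include", "site", "python" + py_version, dist_name)


if in_virtual_environment():
    print("headers would go to:", venv_headers_dir("pybind11"))
else:
    print("not in a virtual environment; the default scheme applies")
```

The point of the separate site directory is that it stays writable even when include/pythonX.Y itself is a symlink into the system installation.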
I agree that standardisation is the next step. I would certainly consider implementing the functionality in distlib if it were standardised. While I have implemented things in distlib where they weren't backed by standards, they were either areas where there were no interoperability concerns (e.g. locators, resources and scripts sub-modules) or the standards were being drafted/discussed and the distlib implementation was a sort of proof-of-concept (e.g. stuff in support of PEP 426, which muddled along for quite a while before being deferred and then ultimately withdrawn). I'm not really keen to implement the functionality if it's not standardised.
My off-the-top-of-my-head thoughts re. standardisation are that it could just be another sysconfig scheme that all installers agree to use, or a modification of existing schemes if that doesn't lead to backward compatibility issues (it _shouldn't_, if the schemes are used as they should be, but then again you never know ... I just hit a problem today where setuptools.setup() seems to ignore a custom installer command class passed to setup(), and I can't see why - distutils.core.setup() invokes it as expected).
I too don't have the time or inclination right now to take something like this through standardisation, though I would try to join in the discussion if someone else were to take the lead on it.
it looks like people may be choosing to duplicate pip's code in their own projects
Indeed, and this may come to bite them.
I also noticed that distutils_scheme doesn't actually return a valid scheme, in the sense that the key for the include path in a sysconfig-returned scheme is include, but in the return value of distutils_scheme, the key is headers. So the name distutils_scheme could be considered a misnomer.
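The key mismatch is easy to see side by side. The sketch below renames sysconfig's keys to the distutils-style ones; the mapping table and function name are hypothetical, and the result is only the stdlib's default answer, not the per-distribution header directory that pip computes.

```python
import sysconfig

# Hypothetical mapping: distutils-style key -> sysconfig path name.
DISTUTILS_TO_SYSCONFIG = {
    "purelib": "purelib",
    "platlib": "platlib",
    "headers": "include",  # the one key whose name differs
    "scripts": "scripts",
    "data": "data",
}


def distutils_style_paths():
    """Return sysconfig's default paths under distutils-style key names."""
    paths = sysconfig.get_paths()
    return {key: paths[name] for key, name in DISTUTILS_TO_SYSCONFIG.items()}


for key, value in distutils_style_paths().items():
    print(key, "->", value)
```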
Pushing this down the road to the next release since I don't think this'd happen in time for 18.0.
I'm taking this out of the 18.1 milestone since I don't see any activity here and I doubt this would be ready in time for 18.1.
I think the consensus here is that we need a separate external library to be developed based on pip's locations module, which would then be vendored in pip. I'll update the title to reflect that.