Pip: With Enum34 installed on python3.6+, pip picks up enum34 instead of the stdlib enum package

Created on 9 May 2020  ·  29 Comments  ·  Source: pypa/pip

Environment

  • pip version: pip 20.1
  • Python version: Python 3.6.10 [This was reported to me against python3.6 so I tested and fixed it there but it should affect later versions of python as well]
  • OS: Fedora 31

Description

When the enum34 package is installed on a system, pip can no longer install a project that uses pyproject.toml instead of setup.py. Instead, it fails with a traceback inside the stdlib, because pip finds the enum34 package's enum.py instead of the stdlib version. This happens in the re-invocation of pip from https://github.com/abadger/pip/blob/master/src/pip/_internal/build_env.py#L171 (not pip/_internal/commands/install.py, as this description originally said), although I think the bug should be fixed in __main__.py instead.

(edited: Put the correct file into the description)

The reason is that __main__.py has code which changes sys.path in case we are running pip from a wheel. That code appears to be changing sys.path in certain circumstances when we aren't running from a wheel by mistake. So that code is where the bug lies, not in install. It may be correct to change install.py as well but I am leaning towards not doing that... It looks like the code in install is making sure that it reinvokes itself by specifying the actual path to pip which it was invoked with. If we tried to use something different like python -m there, then we wouldn't be assured of executing the code from the same install of pip.
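To make the mechanism concrete, here is a minimal sketch of the check described above, written as a standalone function rather than as pip's actual module-level code. The function name entry_added_by_main and the wheel path are illustrative assumptions, not names from pip.

```python
import os


def entry_added_by_main(main_file, package):
    """Return the directory a pip-style __main__.py would prepend to sys.path.

    Mirrors the check discussed here: only when __package__ is the empty
    string is the grandparent directory of __main__.py added.
    Returns None when nothing would be added.
    """
    if package == "":
        # First dirname strips '/__main__.py', second strips '/pip'.
        return os.path.dirname(os.path.dirname(main_file))
    return None


installed_main = "/home/badger/.local/lib/python3.6/site-packages/pip/__main__.py"

# Running from a wheel: only the wheel itself is added (the intended case).
wheel_case = entry_added_by_main(
    "/tmp/pip-20.1-py2.py3-none-any.whl/pip/__main__.py", ""
)

# Running an installed pip directory: all of site-packages is added.
installed_case = entry_added_by_main(installed_main, "")

# `python -m pip` sets __package__ to 'pip', so nothing is added.
module_case = entry_added_by_main(installed_main, "pip")
```

The second case is the problematic one: the directory that gets added holds every package installed next to pip, not just pip.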

I'll add an easy reproducer using pip's repository in the reproducer section.

Fixed by #8213

Expected behavior

Even if enum34 is installed, pip should still be able to install other packages with a pyproject.toml file.

How to Reproduce

  1. python3.6 -m pip install --user enum34
  2. python3.6 -m pip install --user git+git://github.com/pypa/pip
  3. A traceback occurs:

Output


[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip install --user enum34
/usr/lib64/python3.6/enum.py
Requirement already satisfied: enum34 in /home/badger/.local/lib/python3.6/site-packages (1.1.10)
Could not build wheels for enum34, since package 'wheel' is not installed.
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip --version
/usr/lib64/python3.6/enum.py
pip 20.1 from /home/badger/.local/lib/python3.6/site-packages/pip (python 3.6)
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 --version *[fix-pip-to-work-in-the-face-of-enum34]  (08:47:21)
Python 3.6.10
[pts/16@peru /srv/git/python-pip/pip/news]$ python3.6 -m pip install --user git+git://github.com/pypa/pip
Collecting git+git://github.com/pypa/pip
  Cloning git://github.com/pypa/pip to /var/tmp/pip-req-build-edpkq5kv
  Running command git clone -q git://github.com/pypa/pip /var/tmp/pip-req-build-edpkq5kv
  Installing build dependencies ... error
  ERROR: Command errored out with exit status 1:
   command: /usr/bin/python3.6 /home/badger/.local/lib/python3.6/site-packages/pip install --ignore-installed --no-user --prefix /var/tmp/pip-build-env-aap_1xn0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel
       cwd: None
  Complete output (14 lines):
  Traceback (most recent call last):
    File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
      "__main__", mod_spec)
    File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
      exec(code, run_globals)
    File "/home/badger/.local/lib/python3.6/site-packages/pip/__main__.py", line 23, in <module>
      from pip._internal.cli.main import main as _main  # isort:skip # noqa
    File "/home/badger/.local/lib/python3.6/site-packages/pip/_internal/cli/main.py", line 5, in <module>
      import locale
    File "/usr/lib64/python3.6/locale.py", line 16, in <module>
      import re
    File "/usr/lib64/python3.6/re.py", line 142, in <module>
      class RegexFlag(enum.IntFlag):
  AttributeError: module 'enum' has no attribute 'IntFlag'
  ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3.6 /home/badger/.local/lib/python3.6/site-packages/pip install --ignore-installed --no-user --prefix /var/tmp/pip-build-env-aap_1xn0/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel Check the logs for full command output.

needs discussion bugfix enhancement

All 29 comments

This happens in the re-invocation of pip from pip/_internal/commands/install.py

Can you provide a link to the relevant line of code? I can't find what you're referring to here 🙁

Also, I just tried this and it worked fine for me:

[root@696e6f345c66 /]# python3.6 -m pip install --user enum34
WARNING: Running pip install with root privileges is generally not a good idea. Try `python3.6 -m pip install --user` instead.
Collecting enum34
  Downloading https://files.pythonhosted.org/packages/63/f6/ccb1c83687756aeabbf3ca0f213508fcfb03883ff200d201b3a4c60cedcc/enum34-1.1.10-py3-none-any.whl
Installing collected packages: enum34
Successfully installed enum34-1.1.10
WARNING: You are using pip version 19.1.1, however version 20.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[root@696e6f345c66 /]# python3.6 -m pip install --user git+git://github.com/pypa/pip
WARNING: Running pip install with root privileges is generally not a good idea. Try `python3.6 -m pip install --user` instead.
Collecting git+git://github.com/pypa/pip
  Cloning git://github.com/pypa/pip to /tmp/pip-req-build-st28fks3
  Running command git clone -q git://github.com/pypa/pip /tmp/pip-req-build-st28fks3
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: pip
  Building wheel for pip (PEP 517) ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-fbj8jirm/wheels/1f/11/b0/5bd125f64cf9eedf300124c783996f30957e0d1b4fac0c38d3
Successfully built pip
Installing collected packages: pip
Successfully installed pip-20.2.dev0

@pfmoore It's this line here: https://github.com/abadger/pip/blob/master/src/pip/_internal/build_env.py#L171

/me looks through your output to see if he can spot what is different between our two environments.

pip_location is defined at the top of the file like this: from pip import __file__ as pip_location

Oh! I think I might know why you couldn't reproduce. Where is your initial pip located? My initial pip is also in ~/.local. That would come into play because the line in build_env.py is going to result in a directory relative to the pip installation you are running being added to sys.path. So if you were running the initial pip from /usr/lib/[...]/site-packages but installed enum34 into ~/.local/[...]/site-packages, then you wouldn't trigger the issue where enum from enum34 is used in preference to the one from the stdlib.

/me goes forth to verify this is the difference now that he's posted the theory here.

EDITED: My language was unclear initially. The code in build_env re-invokes pip as sys.executable /path/to/the/pip/install/directory/. This results in a directory relative to the pip installation ending up in the re-invoked pip's sys.path. However, the code in pip/__main__.py is what actually puts that directory into sys.path.

Yep, that's what it was. So if you want to reproduce from where you said it worked for you, you can simply run the last pip command again. Since you just installed pip from git into the user home dir, the second time the command runs, it will use that pip and then fail.

If you want to start over, here are all the steps I took with a container to reproduce (Leaving off output but adding comments)

# Host computer
podman run -it fedora /bin/bash
# Now inside the container
# Running ensurepip with --user puts pip into the same site-packages as enum34
# will be installed into
python3.8 -m ensurepip --user
python3.8 -m pip install --user enum34
python3.8 -m pip install --user git+git://github.com/pypa/pip
# Traceback from the initial bug report here

Oops, and I see why you couldn't find the line I was talking about earlier too... I posted the wrong file at the top of the issue. Sorry. I was grepping the code for the command line strings to see if I could find any other occurrences that would have the same problem and I cut and pasted the wrong filename into the issue report.

In python-poetry/poetry#1122 this happened with Python 3.7.3.

So this is related to running pip for the isolated build environment. That makes sense, and was what I initially assumed you might be thinking about, but your reference to the wrong file put me off track. Thanks for clarifying.

So I think there probably is something to address here, but I'm not sure your proposed fix is the right approach. I still think that running pip as python /path/to/pip isn't something we should support - and if that means we need to consider another way of running pip in build_env.py then let's look at that.

Yeah, that's fair. So there are several things I am thinking....

  • From the comment in the source code of __main__.py I interpret that the current code in __main__ is overzealous in what it adds. If you don't think the scope of that code should be decreased, what should the comment actually say? What use case in addition to a pip wheel is it meant to address?
  • It seems like there should be a different way to address this particular issue inside of build_env. The strand that has to be pulled apart for any other solution to work, though, is that it needs to load the correct library for pip without affecting the other libraries that might be pulled in. Python has all sorts of ways to invoke pip but half of them will end up affecting sys.path in a global manner. I think you could separate out the pip library that you want to use (turn it into a wheel, copy the directory structure to its own directory, etc.), then place that onto PYTHONPATH before re-invoking pip using sys.executable -m. That way only the library that you separated out would override everything else on the PYTHONPATH.
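A rough sketch of that last idea, assuming the pip package has already been copied into a directory of its own. The helper name build_reinvocation and the directory /var/tmp/pip-standalone are hypothetical; nothing like this exists in pip as far as this thread shows.

```python
import os
import sys


def build_reinvocation(isolated_pip_parent, pip_args, base_env=None):
    """Build (argv, env) for re-invoking pip via `-m` with one extra path entry.

    isolated_pip_parent is a hypothetical directory that contains nothing
    but a copy of the `pip` package, so putting it first cannot shadow
    any stdlib module.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    parts = [isolated_pip_parent] + ([existing] if existing else [])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    argv = [sys.executable, "-m", "pip"] + list(pip_args)
    return argv, env


argv, env = build_reinvocation(
    "/var/tmp/pip-standalone",
    ["install", "--ignore-installed", "setuptools", "wheel"],
    base_env={"PYTHONPATH": "/opt/extra"},
)
# The pair could then be handed to subprocess.run(argv, env=env).
```

PYTHONPATH entries still land ahead of the stdlib in sys.path, so this only helps because the separated directory holds pip and nothing else.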

I've figured out another way to change build_env.py that looks like it works.... but I'm afraid it might be an undocumented implementation detail of the way python works so I'm not sure if it's safe to rely on. Instead of executing the directory where pip lives, execute pip's __main__.py file instead. Executing the directory will end up setting __package__ to the empty string. Executing the __main__.py file will set __package__ to None. __main__.py only adds to sys.path if __package__ is the empty string so it will work for this as well (Probably needs to have a comment if you go this route, though, so that everyone knows that the condition needs to be True for empty string and False for None).

If you want to go that route, it would look something like this:

diff --git a/src/pip/__main__.py b/src/pip/__main__.py
index a529758b..2e9ff7e7 100644
--- a/src/pip/__main__.py
+++ b/src/pip/__main__.py
@@ -12,16 +12,16 @@ if sys.path[0] in ('', os.getcwd()):

 # If we are running from a wheel, add the wheel to sys.path
 # This allows the usage python pip-*.whl/pip install pip-*.whl
+# Do not enter this codepath if __package__ is None as that
+# means that pip/__main__.py was invoked directly, which is what
+# build_env.py does (issue #8214)
 if __package__ == '':
     # __file__ is pip-*.whl/pip/__main__.py
     # first dirname call strips of '/__main__.py', second strips off '/pip'

diff --git a/src/pip/_internal/build_env.py b/src/pip/_internal/build_env.py
index b8f005f5..77e192f2 100644
--- a/src/pip/_internal/build_env.py
+++ b/src/pip/_internal/build_env.py
@@ -167,8 +167,9 @@ class BuildEnvironment(object):
         prefix.setup = True
         if not requirements:
             return
+        command = os.path.join(os.path.dirname(pip_location), '__main__.py')
         args = [
-            sys.executable, os.path.dirname(pip_location), 'install',
+            sys.executable, command, 'install',
             '--ignore-installed', '--no-user', '--prefix', prefix.path,
             '--no-warn-script-location',
         ]  # type: List[str]

Documentation on __package__ that I can find:

If you want to go this route, perhaps we should ping ncoghlan at that time to see if he considers the None/empty string split to be something that we can depend upon here.
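The empty-string/None split can be observed directly with a throwaway package. This is only an empirical check on whatever interpreter runs it, not evidence that the behaviour is guaranteed; the directory name demo and the helper name are made up for the example.

```python
import os
import subprocess
import sys
import tempfile


def observed_package_values():
    """Run a throwaway __main__.py two ways and report repr(__package__)."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = os.path.join(tmp, "demo")
        os.mkdir(pkg_dir)
        main_py = os.path.join(pkg_dir, "__main__.py")
        with open(main_py, "w") as f:
            f.write("print(repr(__package__))\n")

        def run(target):
            result = subprocess.run(
                [sys.executable, target],
                stdout=subprocess.PIPE,
                universal_newlines=True,
                check=True,
            )
            return result.stdout.strip()

        # "directory" is how build_env.py invokes pip today;
        # "file" is the invocation proposed in the diff above.
        return {"directory": run(pkg_dir), "file": run(main_py)}


values = observed_package_values()
print(values)
```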

I'd prefer that we switch our BuildEnvironment logic to be not dependent on the path/to/pip hack, instead of actively supporting python path/to/pip invocations in general.

Directly invoking __main__.py is perfectly reasonable!

Okay, two of the three ways I could see this working are easy to implement so I implemented them in two separate PRs. I'm not going to implement turning the pip directory into a wheel and then running that in build_env because there's probably going to be design decisions that would require some back and forth with committers in order to come up with the right way to do it and I don't have the time to commit to that. Here's my summary of the three methods:

  • Only deal with wheels in __main__.py: https://github.com/pypa/pip/pull/8213

    • Breaks the use case of running python path/to/pip/ in a checkout unless PYTHONPATH is set (ie: this will work: PYTHONPATH=$(pwd)/src/ python src/pip).
    • Makes the code in __main__.py conform to the comment which says that adding to the front of sys.path is to make wheel files work.
  • Separate the build_env invocation from the wheel and development invocation: https://github.com/pypa/pip/pull/8217

    • Makes the code in build_env.py use a different method of invoking pip so that we can handle it differently (ie: not at all) in pip/__main__.py.
    • Might be depending on Python implementation details. Should ask ncoghlan, the PEP 366 author, whether we should be depending on empty string vs None if this is how we want to proceed.
    • Makes comments in __main__.py conform to expectations set forth in these PRs and bug reports.
    • python path/to/pip/ continues to work as well as it did before
    • EDIT: I've been thinking about this this week and, although I haven't tested it, I think this means that pip might not execute the same library version that path/to/pip/__main__.py is a part of. I think __main__.py might be treated as a script in this case (not a module whose __main__ is being executed) and therefore, python won't treat the directory that __main__.py is in specially for looking up additional paths. This will mean that when pip reinvokes itself, it could end up running a different version of pip.
  • EDIT (added this solution): When adding the path to pip to sys.path, make sure we add the path after the stdlib paths: https://github.com/pypa/pip/pull/8241

    • @uranusjr 's idea from: https://github.com/pypa/pip/issues/8214#issuecomment-626395247
    • The only potential drawback with this is that I don't know just how tricky it is to determine all of the stdlib paths. The PR I submitted searches for three paths that I know should count as "stdlib paths":
      • sysconfig.get_path('stdlib')
      • sysconfig.get_path('platstdlib')
      • sysconfig.get_config_var('DESTSHARED')
    • The first two of those are documented on https://docs.python.org/3/library/sysconfig.html#installation-paths as being associated with the stdlib. The last one is where compiled extension modules from the stdlib live on the Linux systems I have used. I do not know if this will also contain the stdlib compiled extension modules on Windows systems or if there's a different var that needs to be scanned for.
  • Build a pip wheel or copy of the pip dir for the isolated build: No sample implementation.

    • Might be a good fit for the model of an isolated build.
    • More intrusive than either of the other solutions.
    • Should be doable without breaking python path/to/pip/

Note that even though any one of these will fix this issue, none of them are mutually exclusive. ie: you could implement all of them if you felt that together, they made the code make more sense as a whole.
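For comparison, the "insert after the stdlib paths" option could look roughly like the following. This is a sketch built on the three sysconfig lookups listed above, not the code from the PR; the function names and the fall-back of appending at the end when no stdlib entry is found are my own assumptions.

```python
import sysconfig


def stdlib_dirs():
    """Collect the directories treated as 'stdlib paths' in this thread."""
    candidates = [
        sysconfig.get_path("stdlib"),
        sysconfig.get_path("platstdlib"),
        sysconfig.get_config_var("DESTSHARED"),  # may be None on some platforms
    ]
    return [c for c in candidates if c]


def insert_after_stdlib(path_list, new_entry, stdlib):
    """Return a copy of path_list with new_entry placed after the last stdlib dir.

    Assumption: if no stdlib directory is present, append at the end.
    """
    last = -1
    for index, entry in enumerate(path_list):
        if entry in stdlib:
            last = index
    result = list(path_list)
    if last == -1:
        result.append(new_entry)
    else:
        result.insert(last + 1, new_entry)
    return result


example = insert_after_stdlib(
    [
        "/usr/lib64/python3.7",
        "/usr/lib64/python3.7/lib-dynload",
        "/usr/lib/python3.7/site-packages",
    ],
    "/usr/lib/python3.7/site-packages",
    stdlib=["/usr/lib64/python3.7", "/usr/lib64/python3.7/lib-dynload"],
)
```

In real use the stdlib argument would come from stdlib_dirs(), and the comparison would probably need path normalisation, which is left out here.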

Catching up on the comments, but I wanted to respond to this (from the other thread):

Paul seemed to be arguing that invoking pip like that from the CLI is not supported. That it's only supported from within BuildEnvironment right now. @pfmoore is that a method of invocation that you do want to support?

No, you misunderstood me badly. I do not think we should support python /path/to/pip as a valid invocation method. I think how build_env.py invokes pip should be changed (I see you've now submitted a PR doing this).

Maybe you misunderstood because I noted that python /path/to/pip.whl (note this is the wheel) is IIUC used by get-pip.py (I was still confused by your report at that point, and thought that was related to what you were saying). But that's still not supported outside of get-pip.py, and if it broke we'd look at changing get-pip.py, not making it supported.

Thanks for responding. Unless I'm misunderstanding you _now_, I think I understood you just fine. Here's what I said in the portion you just quoted with some emphasis added:

Paul seemed to be arguing that invoking pip like that from the CLI is not supported.

which seems to match what you just said, yes?

I do not think we should support python /path/to/pip as a valid invocation method.

I was asking for confirmation of that because @uranusjr was using a user invoking python /path/to/pip (directory) as his example of what did not work with #8213: https://github.com/pypa/pip/pull/8213#issuecomment-626244364 If python /path/to/pip/ is not a supported use case, then the fact that python /path/to/pip stops working doesn't seem like it is a valid argument against #8213.

which seems to match what you just said, yes?

Yes it does. Sorry, this thread got massively confusing and I misread 🙁

If python /path/to/pip/ is not a supported use case, then the fact that python /path/to/pip stops working doesn't seem like it is a valid argument against #8213.

The various misinterpretations, incorrect statements, and general confusion have rendered this discussion pretty near to impossible to follow, TBH. Some of that is my fault, for which I apologise.

#8213 is entitled "fix when pip is invoked as python /path/to/pip", and I think most of us have been responding to the implication from that title, that it should be allowed for pip to be invoked that way.

In actual fact, the issue now appears to be something more like "Pip's machinery for creating an isolated build environment doesn't correctly ignore packages installed in the user's environment". And we seem to have jumped straight from there into debating solutions, rather than understanding what's actually going on here. For example, why is enum34 being picked up at all, given that the invocation you pointed to specifies --ignore-installed?

I'm going to ignore this thread for a few days, to give things a chance to settle down, and a clearer picture to emerge (in my own head, if nowhere else!) I also intend to ignore all of the proposed fixes, and focus solely on understanding what's going on with the issue when I come back to it - and frankly, I'd suggest others do the same. We can evaluate possible fixes once we better understand what we're fixing...

Cool, I'm available on IRC when you are ready to talk about it from a deeper level. I can talk there and post summaries here or someone can ping me to take a look at updates here. (If I'm busy, once something goes out of my mental buffer, it can be hard to get it back in without pinging me there to get me to look at something ;-)

When you get back to this, regarding --ignore-installed, I believe that --ignore-installed is for pip when it is deciding what packages need to be installed. This issue doesn't have to do with the packages pip is depsolving for. It has to do with what packages the python interpreter is locating for satisfying imports as it runs pip itself. So --ignore-installed won't have any effect on that.

I'm going to post this here, too because I can see a misunderstanding but I have no expectation that you'll read it and get back to me for a while:

I changed the title on #8213. Your suggested title was slightly different than what the problem actually is. It's not that pip should ignore the user's environment when it reinvokes itself for an isolated build; it's that pip should not place any directories with libraries other than pip before the stdlib in sys.path.

Here's a step through of how sys.path is being modified:

  • I have pip and enum34 installed into /usr/lib/python3.7/site-packages/
  • I invoke /usr/bin/pip install git+git://github.com/pypa/pip
  • At this point, the relevant sys.path entries look like this: ['/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']

    • The first two entries are for the stdlib. The last entry is for site-packages. The stdlib entries come first so python is going to find the stdlib's enum.py. These are just standard sys.path entries; unmodified by anything the user has done.

  • pip does its work and decides it needs to build in an isolated environment. So it ends up in the build_env.py file and re-invokes itself via: /usr/bin/python3.7 /usr/lib/python3.7/site-packages/pip/
  • Python decides this is a python module and it can invoke it. It finds the __main__.py and starts executing it.
  • sys.path should be the same as the outer invocation of pip as PYTHONPATH hasn't changed and build_env didn't try to make any changes to what libraries are available to what pip sees. So the relevant portions would once again be: ['/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
  • When Python gets here: https://github.com/pypa/pip/blob/master/src/pip/__main__.py#L15 __package__ is the empty string because that's what it is set to when you invoke a module this way in python. So the conditional is True and at this point it modifies sys.path.
  • Since pip is in /usr/lib/python3.7/site-packages/pip, the effect of the path modification is to re-add site-packages to sys.path, but it gets added at the front of the list. So the relevant entries in sys.path now look like this: ['/usr/lib/python3.7/site-packages', '/usr/lib64/python3.7', '/usr/lib64/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']

    • site-packages now has two entries in sys.path. The one pip.__main__ just added is the first one.

  • The stage is now set for the bug to occur. As more code is executed, Python eventually gets to a chain of imports which yield the Python-3.6 (or later) version of re.py. That version of re.py needs to import enum. It does so and because the re-invoked pip has modified sys.path, it finds and imports /usr/lib/python3.7/site-packages/enum.py which is the enum.py from the enum34 package.
  • re.py in Python-3.6+ expects enum.py to have an IntFlag member. The old version of enum in the enum34 package does not have this member. So when that code is executed, it fails with a traceback.

So the breakage is independent of the usage of the user's site-packages directory. The breakage can occur when pip puts any directory which contains more modules than just pip into sys.path before the stdlib entries there. The changes in any of the proposed fixes are aimed at limiting the scope of the modification to sys.path. #8213 and #8217 work by modifying sys.path in fewer cases. The fix I did not implement works by creating a different sys.path entry which would only contain pip.
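The effect of that extra leading entry can be reproduced without touching a real installation. In the sketch below, two temporary directories stand in for the stdlib and site-packages, and a placeholder module named demo_enum stands in for enum.py; all of those names are invented for the demonstration.

```python
import importlib.machinery
import os
import tempfile


def resolve(name, search_path):
    """Return the file a fresh import of `name` would load from search_path."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec.origin if spec else None


with tempfile.TemporaryDirectory() as tmp:
    stdlib = os.path.join(tmp, "stdlib")
    site_packages = os.path.join(tmp, "site-packages")
    for directory in (stdlib, site_packages):
        os.mkdir(directory)
        # 'demo_enum' stands in for enum.py so the real enum is never touched.
        with open(os.path.join(directory, "demo_enum.py"), "w") as f:
            f.write("# placeholder\n")

    normal_path = [stdlib, site_packages]
    # What the re-invoked pip ends up with: site-packages again, at the front.
    modified_path = [site_packages] + normal_path

    found_before = os.path.dirname(resolve("demo_enum", normal_path))
    found_after = os.path.dirname(resolve("demo_enum", modified_path))
    stdlib_wins_before = os.path.samefile(found_before, stdlib)
    site_packages_wins_after = os.path.samefile(found_after, site_packages)
```

The lookup simply takes the first directory that has a match, which is why the position of the added entry matters more than which directory it is.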

I'm sorry for any confusion; I think I tend to think things through in a different way than most people. I read books backwards and get the most out of movies when I've already read the plot :-) So when I lay out how something works, perhaps I lay out my thoughts and analysis in a way that is backwards for other people.

(This popped up in my email, so I ended up reading it - thanks for the detailed post, but please be aware that I haven't thought it through yet. This note is mostly just something that came to mind now, and I want to record it so I don't forget. I'll follow up with a better analysis later).

So this is basically related to the ages-old debate as to whether it's acceptable to "shadow" stdlib modules. It seems to me there are two main points here:

  1. Why does pip need to insert its directory at the front of sys.path (as that opens up the possibility of shadowing)?
  2. Why does enum34 use the import name enum but not present a compatible API to the stdlib enum library that it will inevitably shadow if it ends up ahead of the stdlib on sys.path?

IMO, enum34 isn't correct - it should either be 100% compatible, or have a different name (and require users to try to import enum, and if that fails, import enum34 as enum). But there's not much we can do about that bug, except report it to them and see what they say.

So we're left with working around the issue in pip. Putting the new entry at the end of sys.path would solve the issue, but I've no idea at this point what other issues might be introduced by doing that. Trying to only put our directory at the front of sys.path "when it's safe" probably only adds more fragility to the whole mechanism (what would be safe, after all?)

As I say, I'm not going to think through this now, though.

Reading through the thread leaves me feeling we might be over-complicating the problem. pip does not depend on anything third-party, and no standard library components depend on pip. So maybe the solution is simply that the sys.path manipulation needs to be smarter and inject the directory after stdlib, but before site-packages. I believe this would solve the enum34 issue, while not affecting any of the usages (at least none already mentioned).

Whether the solution would induce too much maintenance or other overhead to be worthwhile is another problem. But I feel it should be taken into account while we look for other solutions (e.g. to de-support python dir/to/pip entirely), since any of them may induce maintenance overhead as well.

I've implemented @uranusjr 's solution in https://github.com/pypa/pip/pull/8241 @pfmoore doesn't like it, but it's one more sample solution to compare.

This issue is contributing to a crash I'm experiencing when deploying a Python 3.6 app to AWS.

enum34 is a dependency of grpcio, but is only installed if the Python version is less than 3.4. My app uses grpcio, and for some reason AWS is still installing enum34 even though my app is Python 3.6 (which is a separate issue). Once enum34 is installed, however, pip will attempt to use it and fail with the usual error:

AttributeError: module 'enum' has no attribute 'IntFlag'

From that point forward I am unable to use pip, even to uninstall enum34. Until this bug is fixed, the only way I can think of for getting around this is by manually deleting the enum34 file in site-packages.
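For anyone stuck in the same state, a small helper like the one below may help locate the offending file before deleting it by hand. It is a plain directory scan run with the interpreter directly rather than through pip; the function name is made up, and it ignores zip imports and compiled extension modules.

```python
import os
import sys


def module_candidates(name, search_path):
    """List every file on search_path that could satisfy `import name`."""
    found = []
    for entry in search_path:
        base = entry or os.getcwd()  # an empty entry means the current directory
        for candidate in (
            os.path.join(base, name + ".py"),
            os.path.join(base, name, "__init__.py"),
        ):
            if os.path.isfile(candidate) and candidate not in found:
                found.append(candidate)
    return found


# More than one hit for 'enum' suggests a third-party copy such as enum34's.
enum_candidates = module_candidates("enum", sys.path)
for path in enum_candidates:
    print(path)
```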

Copying @uranusjr's comment from #8272 into here, for reference:

I think it'd be better to just revert #5841 than accepting #8213. They would both break the usage #5841 tried to fix, but at least reversion is unlikely to introduce more problems.

We seem to have quite a lot of proposed solutions around - #8213, #8217 and #8241 - as well as reverting #5841.

But I did some digging, following some links in the threads referenced from #5841, and it gets quite complex, going back to before PEP 517 support was added. So whatever we do, we need to be careful how we handle it (there's some really weird setups out there, not all of them in obscure places 🙁).

Longer term, I think that build isolation is way too complicated at the moment, and we should look at delegating it to a 3rd party library that can really do it well (maybe the code in pep517 can be promoted to "reference implementation" status?) There's a load of complexity to discuss and iron out if we take that route, though, not least of which is how we communicate to the environment builder "how should I install dependencies using the exact same version of pip, with the exact same settings, as the main pip invocation?"

Longer term, I think that build isolation is way too complicated at the moment, and we should look at delegating it to a 3rd party library that can really do it well (maybe the code in pep517 can be promoted to "reference implementation" status?)

Yes, but pip, as usual is a special case.

A lot of the code in pip (around build isolation) is "defensive" -- meant to allow us to use recursive subprocess calls while preventing fork bombing OR to make sure we pass any user-provided options down into the process that's populating the build environment.

I think the pep517 implementation was the original PoC that our logic is based off.

Overall, I think we need to move this logic into a reference library, with an additional API plug in point for "populate this environment with these packages" that gets handled differently by different end users (Linux distros, pip, poetry want to do different things here).

And, within pip itself, we should stop having the recursive nature of build environment population, and transition to doing it in a single process now that we have a proper resolver and better understanding of the expectations around this area. It would be faster and less use of filesystem-as-running-program-state, making it generally more maintainable (I think).

A lot of the code in pip (around build isolation) is "defensive" -- meant to allow us to use recursive subprocess calls while preventing fork bombing OR to make sure we pass any user-provided options down into the process that's populating the build environment.

But surely (?) both of those are needed even in a general library? Protecting against fork-bombs is a general need, and a library needs to allow the user to specify how to invoke pip.

And, within pip itself, we should stop having the recursive nature of build environment population, and transition to doing it in a single process now that we have a proper resolver and better understanding of the expectations around this area.

+1 on this

But surely (?) both of those are needed even in a general library? Protecting against fork-bombs is a general need, and a library needs to allow the user to specify how to invoke pip.

Yea, but we'd need to disentangle that logic which is what I was trying to flag. :)

Also, it's only needed somewhere in the recursive chain, and pep517 doesn't come in the recursive loop. IOW, pip has these protections and since pep517 is currently just straight up using pip, pep517 doesn't need to have any protections of its own and can rely on the fact that pip does it.

Keeping the expectations the same or making pep517 keep track of the recursion might be a design decision to make during this refactor. :)

I've been thinking about this recently. Maybe we shouldn't really run the pip installation to populate the isolated environment in the first place? pip can (for example) build itself into a zip somewhere, and run that instead of the installed source. This zip-building process can be done on-demand when an isolated environment is needed, and cached for the entire session to minimise performance impact (although honestly the delay is probably minor compared to the wheel-building process itself).
