Pip: Allow to skip downloading manylinux wheels

Created on 16 May 2016  ·  62 Comments  ·  Source: pypa/pip

  • Pip version: 8.1.{0,1,2}
  • Python version: 2.7
  • Operating System: CentOS 6

    Description:

When pip 8.1 introduced support for manylinux1 wheels, a few issues started to show up when attempting to build wheels. We use a custom Python installation, installed in a location other than the system Python's. When we upgrade our package requirements, we rebuild the wheels ourselves using the command below:

$ pip wheel -w /path/to/wheel -f /path/to/wheel --use-wheel  -r requirements.txt

Before pip 8.1, this command did what I expected: it built wheels for new packages in the requirements.txt file. Since pip 8.1, it just downloads manylinux wheels, despite the fact that /path/to/wheel already has a wheel for the requirement.

Let's take an example: one of the packages we use is numpy. The requirement string looks like this: numpy==1.10.4. This ensures we use only this package version. What pip pre-8.1 did: it detected that a wheel for numpy 1.10.4 was already built and did nothing else. What pip 8.1.x does: it downloads the manylinux1 wheel, despite the fact that I already have a wheel for 1.10.4 and wasn't updating the numpy wheel.
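The pre-8.1 behaviour described here amounts to a check along the following lines. This is only a minimal sketch: has_local_wheel is a hypothetical helper, not pip code, and it assumes wheel file names follow the usual name-version-... layout with non-alphanumeric runs in the project name replaced by underscores.

```python
import os
import re


def has_local_wheel(wheelhouse, name, version):
    """Return True if the wheelhouse already holds a wheel for name==version.

    Hypothetical helper for illustration; pip's real finder is more involved.
    """
    # Wheel file names replace runs of characters such as '-' and '.' in the
    # project name with a single underscore.
    escaped = re.sub(r"[-_.]+", "_", name).lower()
    prefix = "{}-{}-".format(escaped, version)
    for filename in os.listdir(wheelhouse):
        if filename.lower().startswith(prefix) and filename.endswith(".whl"):
            return True
    return False
```

With a check like this, a pinned requirement such as numpy==1.10.4 would be skipped whenever any wheel for that exact version is already present, whatever its platform tag.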

You might suggest adding --no-binary numpy, but that won't solve my problem either -- I don't want to rebuild numpy package _every time_ I build wheels, and I don't want to select only updated packages to build wheels. I like what I had before: -r requirements.txt and it did the job.

So what I'm asking here is either of two:

  1. If a wheel is already built, just skip it; don't attempt to download a manylinux wheel.
  2. An option to disable downloading manylinux wheels.
wheel no action enhancement

All 62 comments

You can drop a _manylinux.py file in site-packages or in the standard library or wherever that will make it importable with contents like:

manylinux1_compatible = False

Does that satisfy your use case?
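For readers wondering why this file has any effect: the manylinux1 specification lets an importable _manylinux module override the installer's own compatibility detection. The following is a rough sketch of that lookup, not pip's actual code; manylinux1_override is a hypothetical name, and the module name is a parameter only so the function can be tried without touching a real environment.

```python
import importlib


def manylinux1_override(module_name="_manylinux"):
    """Return True/False if an override module sets the flag, else None.

    None means "no override found", in which case an installer would fall
    back to its own heuristics (for example a glibc version check).
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    value = getattr(module, "manylinux1_compatible", None)
    return None if value is None else bool(value)
```

When the function returns False, manylinux1 wheels are treated as incompatible and are no longer candidates for download.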

Thank you for the quick response. It might -- I need to test this.

But I'd still prefer some kind of command line option (ie --no-manylinux1) for pip.

Got bitten by this too. Creating a _manylinux.py file works, but imo, a command-line option or even an environment variable to exclude manylinux wheels would be a lot cleaner.

We got bitten by this too. Builds that used to work (on CentOS 5, CentOS 6) still seemed to work but the final PyInstaller build does not work.

@dstufft -- I'm wondering if you decided on implementing an option to skip downloading manylinux1 wheels? For some reason, I'm getting bitten by this over and over. Keeping a _manylinux.py file in site-packages is nice, but I can't remember to _always_ add it... :(

There is an issue suggesting that the non-manylinux1 tag should have precedence over the manylinux1 tag. Would that solve your issue?

No it will not. I'd like a flag to ignore manylinux entirely (and download source distribution or other wheels).

Have you tried publishing a package that provides _manylinux.py and installing that into your virtual environments as a matter of course?

Yes, but I consider it a dirty hack (why should installing a package have a side-effect of changing how pip functions?).

It also doesn't work in all cases, for example with system pip, where I do not control dist-packages.

If anyone is interested in writing a PR for this, that would probably help move it forward (a command line option would make the most sense, as that would automatically support setting it in the ini file or via an environment variable).

Otherwise, supplying _manylinux.py is probably the best option for now. (You could set PYTHONPATH to a directory of your choosing, and add _manylinux.py there, that would make it visible in all environments)
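The PYTHONPATH suggestion can be tried out as follows. This is a sketch under the assumption that the directory is one you control; both helper names are made up for illustration. It writes the one-line module and then asks a fresh interpreter, started with PYTHONPATH pointing at that directory, whether the flag is visible.

```python
import os
import subprocess
import sys


def write_manylinux_override(directory):
    """Create a _manylinux.py in `directory` that opts out of manylinux1."""
    path = os.path.join(directory, "_manylinux.py")
    with open(path, "w") as handle:
        handle.write("manylinux1_compatible = False\n")
    return path


def check_override_visible(directory):
    """Return what a new interpreter sees when PYTHONPATH is `directory`."""
    env = dict(os.environ, PYTHONPATH=directory)
    output = subprocess.check_output(
        [sys.executable, "-c",
         "import _manylinux; print(_manylinux.manylinux1_compatible)"],
        env=env,
    )
    return output.decode().strip()
```

Exporting the same PYTHONPATH in a shell profile or CI configuration would make the override apply to every environment started from there.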

I started working on a patch, does this seem sane before I start figuring out how to test this?

https://github.com/pypa/pip/compare/master...asottile:no_manylinux?expand=1

For example:

Default

$ pip download libsass --dest foo
Collecting libsass
  Using cached libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
  Saved ./foo/libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
Collecting six (from libsass)
  Using cached six-1.10.0-py2.py3-none-any.whl
  Saved ./foo/six-1.10.0-py2.py3-none-any.whl
Successfully downloaded libsass six
$ ls foo/
libsass-0.11.1-cp27-cp27mu-manylinux1_x86_64.whl
six-1.10.0-py2.py3-none-any.whl

With --no-manylinux

$ pip download libsass --dest foo --no-manylinux
Collecting libsass
  Using cached libsass-0.11.1.tar.gz
  Saved ./foo/libsass-0.11.1.tar.gz
Collecting six (from libsass)
  Using cached six-1.10.0-py2.py3-none-any.whl
  Saved ./foo/six-1.10.0-py2.py3-none-any.whl
Successfully downloaded libsass six
$ ls foo/
libsass-0.11.1.tar.gz  six-1.10.0-py2.py3-none-any.whl

@asottile -- why don't you submit a pull request so somebody can review it and provide comments?

Sure, felt like I should get a first round of feedback on the initial approach since it is untested currently but I can do that

Could this be a problem with the new manylinux download being run unnecessarily?
If you have an existing wheel locally, pip doesn't need to fetch a new wheel, i.e. by default it shouldn't download a manylinux wheel if there is already a non-manylinux wheel available locally.

Then if someone really wants pip to not use a locally available wheel, they shouldn't put it in a place where pip looks for local wheels.

Could this be a problem with the new manylinux download being run unnecessarily?

This too. But most importantly, I want an option in pip which will disable manylinux completely. Currently, I need to check if manylinux wheel somehow got downloaded and kill it in the wheelhouse. This is very annoying at times.
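The manual clean-up described here could be scripted roughly like this. It is a sketch only: find_manylinux_wheels is a hypothetical helper, and it assumes the platform tag is the last dash-separated field of the wheel file name.

```python
import os


def find_manylinux_wheels(wheelhouse):
    """Return sorted file names of wheels whose platform tag mentions manylinux."""
    found = []
    for filename in os.listdir(wheelhouse):
        if not filename.endswith(".whl"):
            continue
        # The platform tag is the last dash-separated field before ".whl".
        platform_tag = filename[:-len(".whl")].rsplit("-", 1)[-1]
        if "manylinux" in platform_tag:
            found.append(filename)
    return sorted(found)
```

A build script could fail, or delete the offending files, whenever this list is non-empty.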

Our main usecase is we do a one-time download / build of wheels to put in our internal pypi server and manylinux wheels are _very much_ incompatible with our security requirements.

Any solution which focuses on manylinux will be linux specific.

manylinux wheels are very much incompatible with our security requirements.

If I understand correctly, you want pip to not download binary from a foreign repo, but you are happy with a binary being built and used locally.

But there could be a Windows shop which has the same security requirements, and a manylinux solution won't work for them.

In which case you want to be able to disable binary for foreign repo, and then the Windows shop will also be able to use the solution.

It's not that it's binary from a foreign repo, it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight _vendored into the wheel_.

The windows shop is probably already ok with --no-binary ':all:' which'll avoid the win32 / win_amd64 wheels?

it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight vendored into the wheel.

Can you give an example?

Maybe there is a generic way to improve pip such that it excludes/rejects those prebuilt wheels, only when they include undesirable contents, which could also occur on Windows.

@jayvdb an example is numpy -- it comes as a manylinux wheel, with built-in libraries which aren't compatible with the system, and it caused me a couple of hours of headache when I accidentally downloaded the manylinux wheel and used it to install numpy. So I'd rather have a --no-manylinux flag for pip {download,install,wheel} than waste time later figuring out why something suddenly doesn't work the way it should.

https://github.com/numpy/numpy/issues/7570 appears to be the only open issue related to manylinux. It does confirm they are shipping .so's and causing many problems in the process. :/

I'd rather not have a "manylinux" specific option.

Maybe a more general option --only-pure-python or something akin to --no-binary or --only-binary...

Manylinux is already a special case in pip. I also still want to be able to download normal binary wheels (such as from an internal pypi server).

Manylinux is already a special case in pip.

Is manylinux not simply a specific compatibility tag? (I'm not that familiar with how manylinux is implemented). I would assume that Linux platforms state that they support a set of compatibility tags that includes manylinux, but that they prefer platform-specific binaries over manylinux. In which case, the more general option would be to have something that allows users to remove tags from the list of supported compatibility tags.

In any case, I agree with @xavfernandez that we should prefer general solutions over special cases. The compatibility tag mechanism handles this (or should, it's what it was designed for) so I'd prefer manylinux to work within that framework (and then this issue becomes "we need a way to override the default platform compatibility list").

OK. I wonder why it _replaces_ linux with manylinux, rather than just _adding_ a lower-priority manylinux.

@pfmoore it adds a manylinux flavor to the arch in addition to the supported vanilla arch.

We could maybe piggyback on #3760 and allow --platform option for pip install.

It doesn't replace it. It just adds another manylinux which is preferred over the generic linux. See:

>>> from pip.pep425tags import get_supported
>>> for supported in get_supported():
...     print(supported)
...
('cp35', 'cp35m', 'manylinux1_x86_64')
('cp35', 'cp35m', 'linux_x86_64')
('cp35', 'abi3', 'manylinux1_x86_64')
('cp35', 'abi3', 'linux_x86_64')
('cp35', 'none', 'manylinux1_x86_64')
('cp35', 'none', 'linux_x86_64')
('py3', 'none', 'manylinux1_x86_64')
('py3', 'none', 'linux_x86_64')
('cp35', 'none', 'any')
('cp3', 'none', 'any')
('py35', 'none', 'any')
('py3', 'none', 'any')
('py34', 'none', 'any')
('py33', 'none', 'any')
('py32', 'none', 'any')
('py31', 'none', 'any')
('py30', 'none', 'any')

If you notice that line of code does _two_ things:

arches = [  # Split over two lines to make it easier to see
    # Add manylinux, which can be computed by swapping the platform tag from the tag
    # returned by distutils
    arch.replace('linux', 'manylinux1'),
    # Add the default platform tag, which is just what distutils returns.
    arch,
]

So yea, manylinux1 isn't special other than we have to implement the generation ourselves rather than relying more on distutils to do it for us. In that vein, I agree that we don't want a manylinux specific option though and we should try to find a more general purpose option.

@dstufft can the order be changed, i.e. can manylinux1 be given lower priority? That would avoid downloading manylinux1 wheels when a pre-built wheel for the platform already exists.

Either way, manylinux _is_ a special case for Linux and I'd really like an option to turn it off.

@sashkab I'm not sure that changing the order is the correct way to handle this-- at least by default. Our order puts more specific wheels before less specific wheels, and the tag linux_x86_64 for example is the most generic linux tag you can get before losing compiled wheels entirely.
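To illustrate how this ordering produces the behaviour reported at the top of the thread, here is a simplified sketch. It is not pip's selection logic (which also weighs versions and other factors); pick_best and the candidate labels are invented for the example, and the tag list is abridged from the get_supported() output shown earlier.

```python
# Tag triples in priority order, abridged from the get_supported() output above.
SUPPORTED = [
    ("cp35", "cp35m", "manylinux1_x86_64"),
    ("cp35", "cp35m", "linux_x86_64"),
    ("py3", "none", "any"),
]


def pick_best(candidates, supported):
    """Return the label of the candidate whose tag comes earliest, or None.

    `candidates` maps a label (for example where the wheel lives) to its
    tag triple; candidates with unsupported tags are ignored.
    """
    best_label, best_rank = None, None
    for label, tag in candidates.items():
        if tag not in supported:
            continue
        rank = supported.index(tag)
        if best_rank is None or rank < best_rank:
            best_label, best_rank = label, rank
    return best_label


candidates = {
    "local wheelhouse": ("cp35", "cp35m", "linux_x86_64"),
    "index": ("cp35", "cp35m", "manylinux1_x86_64"),
}
winner = pick_best(candidates, SUPPORTED)
# Same candidates once manylinux tags are no longer considered supported.
winner_without = pick_best(
    candidates, [t for t in SUPPORTED if not t[2].startswith("manylinux")]
)
```

Under these assumptions the wheel from the index wins while the manylinux tag is in the list, and the locally built wheel wins once it is removed.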

@asottile manylinux1 is not special, it is one of many platform tags generated. That being said, none of us are against providing some mechanism which you can use to turn it off, we just want to ensure that it is general enough to support other use cases. Options have associated costs and the more general we can make those options, the better.

For instance, let's pretend that we just add a --no-manylinux wheel option, then in the future we add manylinux2, now some people will be happy with the default (use manylinux1 or manylinux2) but some will want to either not use both of them (so --no-manylinux will work for them), but some will only want to use one or the other (meaning we'd need to add --no-manylinux1 and --no-manylinux2).

Another case, let's say we add tags specific to the distro you're currently on, linux ubuntu_16_04_x86_64, some people would be fine using those, but don't want manylinux wheels, while some people would want neither, and some people are fine using either. Should we then add a --no-linux-distro?

Further beyond that, what if someone doesn't want precompiled wheels on OS X? Should we then add a --no-macosx? What about if someone doesn't want any wheels except pure Python wheels? Should we also add --only-pure?

You can see how adding specific options can quickly turn into a combinatorial explosion of options, which is hard to support, hard to implement, hard to maintain, and hard to understand. Much better is something more generic that can cover all of those cases with as few options as possible.
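One shape such a generic mechanism could take is a list of glob patterns applied to the platform part of each supported tag. This is purely a hypothetical sketch to make the idea concrete; no such pip option exists in this thread, and filter_platform_tags is an invented name.

```python
from fnmatch import fnmatchcase


def filter_platform_tags(supported, excluded_patterns):
    """Drop tag triples whose platform part matches any glob pattern."""
    return [
        triple
        for triple in supported
        if not any(fnmatchcase(triple[2], pattern) for pattern in excluded_patterns)
    ]


supported = [
    ("cp35", "cp35m", "manylinux1_x86_64"),
    ("cp35", "cp35m", "linux_x86_64"),
    ("py3", "none", "any"),
]
no_manylinux = filter_platform_tags(supported, ["manylinux*"])
pure_only = filter_platform_tags(supported, ["manylinux*", "linux_*"])
```

A single pattern list like this would cover the manylinux1, manylinux2, distro-specific and pure-Python-only cases listed above without one flag per case.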

After looking more at PR #3760, I tend to agree with @xavfernandez that adding --platform (and other options) to pip install would make it possible to solve this issue.

Should the opposite --no-platform also be added?

3760 has been merged.

@asottile A --no-platform doesn't seem necessary, do you see a use case?

--no-platform manylinux

--platform linux_x86_64 should produce the same thing

In our case it does not. We have other platform strings to support multi-platform via https://github.com/asottile/pip-custom-platform

I'd need something like --platform 'linux_x86_64 OR (computed string based on OS information)' when really what I *actually* want is --no-platform manylinux. There's precedent for other options in pip to provide both the positive and negative (--index, --no-index, --binary, --no-binary, etc.), why oppose here?

As a kind of workaround, I've made a package you can install to disable manylinux1 (using the _manylinux trick above): https://github.com/asottile/no-manylinux1

+1 I don't want to pip install manylinux wheels, only "ordinary" ones. I need to build packages which depend on OS libraries (e.g. numpy) myself.

I'd like to mention another use case that I think some of the current command-line proposals don't cover. I'm considering producing manylinux1 wheels for a package I maintain, but I know they won't suit everyone (in my case, because the package has optional features that are included if the system supports them at compile time, and which can't be included in the manylinux1 wheels). So some users will want to disable the manylinux1 wheels for this package (while still allowing them for other packages). They could use --no-binary <pkg>, but that would also prevent using locally cached wheels generated by pip. Ideally there would be an option similar to --no-binary that could take either :all: or a list of package names, and which could be used in a requirements.txt file.

The opposite mode might be useful for people with security concerns: if package X takes a long time or is a pain to compile and the manylinux1 wheel has been audited for security concerns, it could be white-listed while still banning other manylinux1 wheels.
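The per-package form suggested in this comment could be parsed much like the value of --no-binary. The sketch below is hypothetical throughout: the function names are invented, and it only shows how ":all:" versus a comma-separated list of names might be interpreted for either the deny-list or the allow-list mode.

```python
def normalize(name):
    """Compare package names case-insensitively, treating '_' and '-' alike."""
    return name.strip().lower().replace("_", "-")


def parse_package_spec(value):
    """Parse ':all:' or a comma-separated list of package names.

    Returns None to mean "every package", otherwise a set of normalized names.
    """
    if value.strip() == ":all:":
        return None
    return {normalize(part) for part in value.split(",") if part.strip()}


def applies_to(spec, package):
    """Return True if the parsed spec covers `package`."""
    return spec is None or normalize(package) in spec
```

For the allow-list case, the same parser could be reused with the result inverted, so that only audited packages keep their manylinux1 wheels.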

After PIP 8.1, it just download manylinux wheels despite the fact that /path/to/wheel already has an wheel for the requirement.

This is due to the fact that manylinux tag is more specific than linux_x86_64 one as stated by @dstufft:

Our order puts more specific wheels before less specific wheels, and the tag linux_x86_64 for example is the most generic linux tag you can get before losing compiled wheels entirely.

Astonishingly, the issue "Should manylinux1 comes before linux_* in precedence?", raised by @dholth, the creator of the wheel format himself, has not been referenced here yet. It seems that many people affected by this issue are surprised, as @dholth was, to see the manylinux tag treated as more specific, not less, than the linux_x86_64 tag.
That being said, it seems the solution is for affected people to introduce their own platform tag, make it more specific than the manylinux1 one, and build wheels with this tag, which would prevent manylinux1 wheels from being used at all. A clear description of how to do this would go a long way towards helping affected people.

We lost many hours of investigation in our company when we updated pip from 8.0.3 to 8.1.

Pip 8.0.3 installs the linux_x86_64 Pillow wheel, but pip 8.1 installs the manylinux1 Pillow wheel, which is more generic (it includes .libs/*.so, for example .libs/libjpeg-bcb94a84.so.9.2.0).

It is crucial for us that Pillow does not contain a local libjpeg-bcb94a84.so.9.2.0 library. We must freeze the pip version at 8.0.3 in order to install the correct Pillow wheel.

We lost many hours of investigation in our company when we updated pip from 8.0.3 to 8.1.

And how many hours of work is your company saving by using free software and depending on the work of countless volunteers?

Don't answer - it was a rhetorical question ☮️

You can install a package called no-manylinux1 to prevent pip from installing these packages. It sets a flag as described in the manylinux specification.

Are folks still facing this issue?

There are multiple manylinux versions in the wild, so the no-manylinux1 package isn't going to be enough. Given that a manylinux-specific option in pip is a no-go, does someone have a suggestion on how they'd imagine this working?

I would still like to see an option _in pip_ which disables all manylinux specifications (disabling wheels containing vendored packages ~essentially) -- is there a link to updated manylinux specs? I can add the other attributes to no-manylinux1 if they're missing

There's manylinux2010 and manylinux2014.

I think you can publish no-manylinux FWIW. I'm willing to mark that as an official-ish workaround for this issue.

I've:

  • renamed no-manylinux1 to no-manylinux
  • uploaded no-manylinux==2.0.0 which contains markers for manylinux{1,2010,2014}
  • uploaded no-manylinux1==1.2.0 which is a transitional package that only deps no-manylinux and contains no code

I still think there should be an official way to do this that doesn't depend on installing this package

here's the new repo btw: https://github.com/asottile/no-manylinux

I still think there should be an official way to do this that doesn't depend on installing this package

Hmm... Any specific reasons for that preference? If it's the maintenance responsibility of that package, I'll be happy to share that load.

I ask because the way this package operates is 100% in sync with the PEP and provides all the functionality that we'd have had with a flag, with no need for special casing a platform in pip.

I actually prefer it this way, since "I don't want to use manylinux wheels in this system (for reasons)" -> "pip install no-manylinux" is a straightforward enough UX IMO.

The biggest reason being it's not possible to avoid them in a single install command, the other reason is it's yet another thing to update when new manylinux standards appear (apparently it's been years since 2010 but hadn't been updated until today for instance). The other is it's not really a _system_ per se, it's more akin to the --no-binary case (I could totally imagine a --no-binary :manylinux: or something which exactly solves this issue)

apparently it's been years since 2010

Oh, the number reflects the date of the oldest systems that are compatible with that version. I don't have the dates for manylinux2010 offhand, but manylinux2014 became an approved PEP last month. 🙃

Alrighty. Let's add an option to pip to handle this on a per-install basis. I honestly don't think this is high priority for me but we'll be happy to accept a PR for this (subject to the regular PR reviews).

I'm still not convinced we need a special case option in pip. We already have a plethora of options for special-case tweaking of what gets installed, and the maintenance overhead is non-trivial. While I don't particularly think that a "no manylinux" option will add significantly to that burden, it is nonetheless another step down that slope.

I'm not going to block a PR for this, but I want to strongly advise caution when considering it. How many users will it benefit? How often will it be the only possible solution for such users? How does that measure against the added technical debt (and user confusion cost) that this incurs?

I know this is a long conversation already, but one thing that has not been pointed out is that, from a security perspective, this seems _not_ to be a manylinux-specific thing at all.

As mentioned above:

It's not that it's binary from a foreign repo, it's that shared object files of libraries (that often have security fixes such as libxml, libssl, etc.) are straight vendored into the wheel.

This vendoring can occur not just via auditwheel (Linux) but also via delocate on macOS. The concept is the same; a third-party, not-whitelisted library is being bundled into the wheel. So a --no-manylinux tag is probably not holistic enough in that sense. If you wanted something besides --no-binary, it would probably need to account for the more intricate logic of disallowing wheels that contain bundled libraries specifically.
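A check for bundled libraries, as opposed to a check on the platform tag, might look something like this. It is a sketch under stated assumptions: bundled_libraries is a hypothetical helper, the ".libs" directory convention is the one visible in the Pillow example earlier in the thread, and the ".dylibs" directory name for delocate is an assumption on my part.

```python
import zipfile


def bundled_libraries(wheel_path):
    """List archive members that sit inside a '.libs' or '.dylibs' directory."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    bundled = []
    for name in names:
        if name.endswith("/"):
            continue  # skip explicit directory entries
        directories = name.split("/")[:-1]
        if any(d.endswith(".libs") or d.endswith(".dylibs") for d in directories):
            bundled.append(name)
    return sorted(bundled)
```

A policy built on this kind of inspection would apply equally to Linux and macOS wheels, which is the point being made about a manylinux-only flag.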

With PEP 600 and pip moving to use packaging.tags (a common shared implementation for generating compatibility tags), pip's codebase no longer directly controls or generates the compatibility tags. Further, with PEP 600, there's now no need for any updates to no-manylinux, once support for disabling PEP 600-style manylinux wheels is added to it.

As it is already possible to disable this behavior, via pip install no-manylinux from PyPI, and given that the overhead of keeping that package functioning is very low now, I no longer think that we should do this.

the no-manylinux package still feels like a huge hack :S -- but I did update it a few months ago for PEP 600 https://github.com/asottile/no-manylinux/commit/5e5dea7135857dcadbf4991f94cddf151cc716d8
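For reference, a _manylinux.py that opts out of every manylinux flavour could look roughly like the following. This is a sketch based on the override hooks described in the manylinux PEPs, not a copy of the no-manylinux package's source.

```python
# Hypothetical contents of a _manylinux.py that rejects all manylinux wheels.

# Attributes consulted for the fixed-name tags (PEP 513, PEP 571, PEP 599).
manylinux1_compatible = False
manylinux2010_compatible = False
manylinux2014_compatible = False


def manylinux_compatible(tag_major, tag_minor, tag_arch):
    """PEP 600 hook: returning False rejects the tag for any glibc version."""
    return False
```

Because the PEP 600 hook is a function of the glibc version and architecture, a module like this should not need updating when newer manylinux_x_y tags appear.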

For what it's worth, there's a usecase here that isn't easily covered by the no-manylinux package approach: If the system on which the wheels are built/collected supports manylinux, but they're being collected to build a container/image for a system that won't.

For example, packaging up a package and its dependencies with pip wheel -w dist mypackage, and then a Dockerfile like this:

FROM python:3.8-alpine

COPY dist /wheels
RUN pip install --no-index --find-links /wheels mypackage; rm -rf /wheels

Alpine does not support manylinux packages (docker build for the above will overlook any manylinux dependencies and say ERROR: Could not find a version that satisfies the requirement at the RUN step), but the system on which pip wheel runs does, and should not generally avoid them.

Essentially, this is similar to cross-compiling, so would it be acceptable to allow constraining which packages pip is allowed to use in a wheel run? This constraint set could be "exported" from another installation's pip.pep425tags.get_supported. If I'm not mistaken, this might actually make it a more generic version of --no-binary, since that just maps to only allowing any?
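The idea of checking wheels against a tag set exported from the target system can be sketched as follows. Everything here is illustrative: the helper names are invented, the target tag list is a made-up stand-in for what the Alpine image would report, and the parser assumes the last three dash-separated fields of the file name are the Python, ABI and platform tags.

```python
import itertools


def wheel_tags(filename):
    """Expand the tag triples encoded in a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: {}".format(filename))
    stem = filename[:-len(".whl")]
    python_tags, abi_tags, platform_tags = stem.split("-")[-3:]
    # Compressed tag sets such as "py2.py3" expand to one triple per combination.
    return set(itertools.product(
        python_tags.split("."), abi_tags.split("."), platform_tags.split(".")
    ))


def usable_on_target(filename, target_supported):
    """Return True if the wheel carries at least one tag the target accepts."""
    return bool(wheel_tags(filename) & set(target_supported))


# Stand-in for a tag list exported from the target environment.
target_tags = [("cp38", "cp38", "linux_x86_64"), ("py3", "none", "any")]
```

Running usable_on_target over the dist directory before building the image would flag manylinux wheels that the target cannot install.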

@tobyp I believe that's a related but entirely separate problem -- and in general, building wheels on a disparate platform is not going to work, for example if you produce a wheel which links against libc (even if pip isn't downloading a manylinux wheel, which is what this issue is about).

@asottile Thanks for the reply, and explaining the difference between this issue and my use case! I've managed to solve my problem using multi-stage docker builds that produce dist already inside a container, which sidesteps the libc-related problems you mention.

For what it's worth, if it's only about preventing the installation of manylinux packages, I can't think of any case that wouldn't be covered by the no-manylinux package either.
