Pipenv: (feature request) More support for conditional installation

Created on 24 Jan 2018 · 36 comments · Source: pypa/pipenv

First of all, apologies if what I'm requesting is already possible! I did my best to search for anything relevant.

Context

We make use of Tensorflow in our codebase, which has the annoying quirk that there is a GPU-enabled wheel called "tensorflow-gpu" and a CPU-only wheel called "tensorflow". The GPU-enabled package will crash on a system without the CUDA libs installed. The wheels also share most of their package structure and seem to clobber each other if you accidentally install both. Some of that behaviour is probably a bit un-Pythonic, but that's what has been shipped.

Now, ideally Google would build a single wheel that gracefully falls back to CPU-only if CUDA isn't installed, but that probably isn't happening any time soon. Their official solution seems to be not to specify TF as a dependency for anything, and to require the user to manually install the correct build [1]. This is less than ideal for configuration management.

Feature Request

I'd very much like it if I could set up one Pipfile that defaults to installing the CPU package unless I set some custom flag indicating GPU is available.

Discussion

The closest existing functionality I'm aware of is the support for environment markers; if there were an extension to allow user-defined markers, this could be a fairly clean solution. I'm unsure how pipenv handles the lockfiles when there are multiple possible configurations, but this would continue working the same way.

As a rough example of what I'm suggesting:
Pipfile:

[packages]

tensorflow-gpu = { version = "*", gpu_enabled = "=='1'" }
tensorflow = { version = "*", gpu_enabled = "==''"  }

Installing CPU version: pipenv install
Installing GPU version: pipenv --env_marker "gpu_enabled=1" install

[1] https://github.com/deepmind/sonnet "Sonnet will work with both the CPU and GPU version of tensorflow, but to allow for that it does not list Tensorflow as a requirement, so you need to install Tensorflow separately if you haven't already done so."

EDIT: Another downside to Google's approach that just occurred to me: manually installing tensorflow will really not work well with pipenv - every time the env gets purged, tensorflow will need to be manually installed again.

Most helpful comment

@Froskekongen I don't really want to go down that route. "gpu_enabled" seems like a fairly ill-defined ad-hoc flag to add to a multi-platform standard. "cuda_enabled" would be easier to define but is even more ad-hoc - I suspect the suggestion would be thrown out even if we did try to get it into the spec.

Also it turns out "extras" would not help here - that just allows a package to specify optional dependencies.

Basically, I can already tell pipenv to only install a package if it's running on Windows. The best example of this in our codebase is:

numpy = { version = "==1.13.3", os_name = "!='nt'", index="pypi" }
"c500f7b" = {path = "numpy-1.13.3+mkl-cp36-cp36m-win_amd64.whl", os_name = "=='nt'"}

My original request was simply a way to do the same thing, except instead of looking at os_name, it would look at a user-defined variable.

All 36 comments

Related to #1303.

we currently support all the PEP 508 markers, so I'd start with getting gpu_enabled added to that list.

@kennethreitz do you support the "extra" marker?

@kennethreitz: Can you give some pointers on how to use PEP 508 markers in Pipfiles, for people without experience in this domain? I have exactly the same issue as the OP.

@Froskekongen What conditions exactly are you looking for? The linked PEP 508 itself is already very informative about the available markers, and Pipfile's readme already showcases how they can be used in a Pipfile.
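
For example (a minimal illustration, with pywinusb standing in for any Windows-only dependency), a marker can be attached to an individual requirement in a Pipfile like this:

[packages]
pywinusb = { version = "*", markers = "sys_platform == 'win32'" }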

@uranusjr: It would be great, for instance, to showcase how one could solve this issue using "extra". Would something along these lines work, e.g.:

tensorflow_gpu = {*, markers="extra == 'gpu'"}
tensorflow = {*, markers="extra == 'cpu'"}

And how do I then set the extra marker?

I guess another way of doing this would be to extend the [dev-packages] support to add extra sets of packages that are optional.

@ed-alertedh: Did you contact python developers about adding gpu_enabled in PEP 508? (Or maybe cuda_enabled would be even better)

@Froskekongen I don't really want to go down that route. "gpu_enabled" seems like a fairly ill-defined ad-hoc flag to add to a multi-platform standard. "cuda_enabled" would be easier to define but is even more ad-hoc - I suspect the suggestion would be thrown out even if we did try to get it into the spec.

Also it turns out "extras" would not help here - that just allows a package to specify optional dependencies.

Basically, I can already tell pipenv to only install a package if it's running on Windows. The best example of this in our codebase is:

numpy = { version = "==1.13.3", os_name = "!='nt'", index="pypi" }
"c500f7b" = {path = "numpy-1.13.3+mkl-cp36-cp36m-win_amd64.whl", os_name = "=='nt'"}

My original request was simply a way to do the same thing, except instead of looking at os_name, it would look at a user-defined variable.

@ed-alertedh: Why do you say that "cuda_enabled" or "cuda_installed" is ad hoc? I think GPU computing is fairly common, and dependence on CUDA is very prevalent.

@kennethreitz: Can you please reopen this issue until there is a proper fix or workaround for this use case?

For me the use case is something like:
pipenv run python -m some_module.some_submodule --training_data_location=path --parameter_location=otherpath
and corresponding test examples.

Typically, many of the development platforms and the Jenkins machines will not have CUDA installed, but the code still needs to work.

I would also like this feature to be added. In my use case, I have some extra packages that need to be installed only in certain deployment environments.

It's a Django app, and I import extra features depending on the purpose of the server being installed. I would like to set an environment variable that flags the packages for installation. Maybe this is already available with PEP 508 support, but I can't figure out how to do it. If it is doable, I think the current documentation could be extended with this example.

you have to accept the constraints of the system.

I don't understand what the constraint is. Is it the PEP 508 specification?

@kennethreitz as a matter of constructive feedback (which you can feel free to disregard), it may have been easiest to just (a) confirm that this use-case is not currently possible and then (b) state that it's not something you plan to implement. As a bonus (c) indicate whether contributions would be welcome or not.

Thanks for all your work on this tool, it's a huge improvement over wrestling with requirements.txt

edit: To avoid a double post, a technical detail to add here: I seem to be having trouble generating a single Pipfile.lock that works on multiple platforms anyway. An example is the jupyter package, which appears to pull in "pexpect" as a dependency on Linux but doesn't on Windows. I think I'm going to have to bite the bullet and have multiple Pipfiles (possibly using a template to reduce duplication).

I'm just saying that you need to read the documentation, here:

https://docs.pipenv.org/advanced/#specifying-basically-anything

And work within those constraints.

Sorry to comment in a closed issue, but it seems the most appropriate place for me.

In addition to @ed-alertedh

Is it possible to specify extra flags based on a condition? I would like to pass some pip flags (in this case --no-binary) depending on a condition such as the current OS:

somoclu = { version = "*", sys_platform = "== 'win32'" }
somoclu = { version = "*", sys_platform = "== 'linux'", no-binary = ":all:" }

Is there any possibility or workaround to this except specifying the wheel file directly?

Pip arguments get dropped in from the environment currently rather than being configured in the Pipfile. Just set PIP_NO_BINARY

Thanks for the quick answer. Actually, I just want to compile the somoclu package, but not all the other packages from the Pipfile.

(Side question: is my no-binary=":all:" entry in the Pipfile just ignored?)

Right, the Pipfile doesn't interpret no-binary, and pipenv doesn't either. I believe you can set the PIP_NO_BINARY variable and then set PIP_ONLY_BINARY to the packages you want to get as wheels.
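
As a minimal sketch of that approach (assuming pip's standard mapping of PIP_NO_BINARY to the --no-binary option), forcing just somoclu to be built from source would look something like:

# build only somoclu from source; everything else still comes as wheels
export PIP_NO_BINARY=somoclu
pipenv install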

Would I have to set PIP_NO_BINARY="somoclu" globally on every Linux machine I am running this on? That seems weird to me.

If yes, can I somehow incorporate PIP_NO_BINARY="somoclu" directly into the Pipfile?

(Thanks for the quick replies and sorry if I don't understand...)

I'm not sure what you need, but I believe this may help:

https://docs.pipenv.org/advanced/#injecting-credentials-into-pipfiles-via-environment-variables

https://docs.pipenv.org/advanced/#automatic-loading-of-env
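
Combining those two, a rough sketch of the .env approach (whether the variable is actually picked up during pipenv install may depend on your pipenv version):

# .env in the project root, loaded automatically by pipenv
PIP_NO_BINARY=somoclu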

Actually, somoclu fails to run smoothly on Linux using the wheel provided by pypi.org (some shared object is not installed; at least on my system), but compiling it myself (using pip install --no-binary :all:) works fine.
On Windows, compiling somoclu (and anything else) is always a mess. So I wanted to have a single Pipfile that installs everything smoothly on any (new) client (with any operating system) when calling pipenv install.
Hence, I wanted to add OS-specific flags to the installation of somoclu.

No, we don't do OS-specific pip-flag interpolation (I'm sure you understand this is a very specific edge case) -- and yeah, I'd probably recommend handling this with an environment variable on your Linux machine, since that's the most straightforward approach. I doubt this is true on _all_ Linux machines, so I don't know that it's really fair to say you'd have to do this globally on every Linux machine you own, but on the ones that can't use the wheel, sure.

At the risk of throwing more "gold-plated" features into the mix... How difficult would it be to make pipenv think it is running on a different platform?

To keep my various lockfiles in sync, I opted to set up a Docker container to lock the Linux packages, which I can run straight after I lock my Windows packages. It would be really cool if I could do a "dry run" and generate the lockfile for another platform simply by changing the env markers, although I guess there would be no check that the packages actually installed successfully by doing it that way.
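
Roughly, the container-based relock step can look like this (the image tag and mount paths here are assumptions, not a recommendation):

docker run --rm -v "$PWD":/app -w /app python:3.6 \
    sh -c "pip install pipenv && pipenv lock"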

Locking results should be the same regardless of your host platform. Pipenv currently can't do this, but is constantly being improved toward that goal. Your proposal would be a step in the wrong direction, hiding the underlying problems instead of solving them.

@uranusjr Unfortunately due to the "extras" field in many packages, you will get a slightly different dependency graph on different platforms. E.g. I noticed jupyter pulls in pexpect on Linux but not on Windows. Trying to restore a lockfile with pexpect on Windows results in an error since the package is not available on that platform.

edit: actually I think that example is caused by different environment markers, but the result is that the lockfile changes.
edit 2: Oh, I see what you are saying now. Yes, ideally locking would work across platforms but I am unsure how to achieve that without making breaking changes to the python packaging system.

@ed-alertedh to be clear I really want this functionality and in many cases we actually do support this — it just depends on how the package maintainer has chosen to indicate their dependencies. If they do it dynamically via actual code then we are SOL.

With the latest release you should be able to keep all the markers, and they will just drop through to the lockfile, in theory.

@techalchemy You're doing a great job with what you have to work with 😄

@techalchemy Yes, it is an edge case and I also agree: Great work!
Just wanted to know whether it is possible somehow.
Thanks a lot!

I've read everything thrice, yet I fail to get anything constructive from this post.

I have exactly the same issue as OP, an application that runs with TF-CPU on dev machines and TF-GPU on production machines. I need to package it into a docker container (the ones from nvidia with TF already installed), and basically I need to tell pipenv to NOT install tensorflow since it is already provided inside the base container.

The environment markers just don't cut it, since Docker containers have the same markers as Linux machines (sys.platform == 'linux', os.name == 'posix').

What is the recommended workaround? How can I tell pipenv not to install a given package in Docker?

@Overdrivr Whilst this is far from an ideal setup, this is how I'm currently handling it: https://github.com/ed-alertedh/multiple-pipenv-example

Hey, that's pretty clever, thanks for putting that example together! That really helps me understand also. One side comment about something that wasn't possible when this was originally discussed: if you give each Pipfile and lockfile its own folder, you can just set PIPENV_PIPFILE to point at the right Pipfile and it _should_ figure things out for you. I'm not totally sure about that, but I'm guessing.
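
For example, something along these lines (the per-platform folder layout is hypothetical):

# hypothetical layout: linux/Pipfile and windows/Pipfile, each with its own Pipfile.lock
PIPENV_PIPFILE=linux/Pipfile pipenv install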

@techalchemy Oh, that's good to know; when I get a chance I might update the scripts to take advantage of that new feature! Really the biggest issue I have with this is that we can't take advantage of pipenv's CLI tools for editing the Pipfile (e.g. pipenv install), otherwise it is working reasonably well for us. We also find that updating the lockfiles on Windows is significantly slower than on Linux, but I'm guessing a lot of that probably relates to pip's performance on Windows.

@kingmoduk please stop spamming the project.

@ed-alertedh Thanks for sharing.

Another solution could be to move TensorFlow to [dev-packages]. That way it won't be installed with a --deploy installation of the Pipfile packages. Then, on each platform, manually install the appropriate TF (none on Docker, CPU-only on CPU machines, and GPU on GPU machines).
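
A rough sketch of that arrangement (treating the exact --dev behaviour as an assumption about your pipenv version):

[dev-packages]
tensorflow = "*"    # or tensorflow-gpu, depending on the dev machine

pipenv install (with or without --deploy) then installs only [packages], while pipenv install --dev also pulls in [dev-packages].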

I realize this could also be useful between the dev and test environments for modules.

In dev mode, the module must be installed in editable mode (-e .) to make development easy.
However, in the test environment (a CI test runner like Jenkins or GitLab CI), it would be better to install the package directly from the wheel, to be closer to how the package will be used in reality.

AFAIK, it is not possible in a Pipfile to specify whether to install a package in editable mode or from a wheel.
The only solution I see is to manage 2 Pipfiles, which is not convenient.
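
For reference, the two forms a Pipfile can express look roughly like this ("mypackage" and the wheel path are placeholders); the problem is that only one of them can be active at a time:

[packages]
mypackage = { path = ".", editable = true }
# wheel-based alternative:
# mypackage = { path = "./dist/mypackage-1.0.0-py3-none-any.whl" }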

@kennethreitz Any recommendations for this kind of scenario?
