Hacking maya, I learned a few lessons which resulted in the following proposal for recommended usage of pipenv in python libraries. I expect others to review the proposal and, if we reach agreement, the (updated) text could end up in the pipenv docs.
pipenv patterns and antipatterns for a python library project
EDIT: The following is best applicable to general (mostly Open Source) python libraries which are supposed to run on different python versions and OSes. Libraries developed in a strict enterprise environment may be a different case (be sure to review all the Problems sections anyway). END OF EDIT
TL;DR: Adding pipenv files into a python library project is likely to introduce extra complexity and can hide some errors, while not adding anything to library security. For this reason, keep Pipfile, Pipfile.lock and .env out of library source control.
You will be able to use the full power of pipenv regardless of its files living in .gitignore.
By python library I mean a project, typically having a setup.py, that is targeted for distribution and usage on various platforms differing in python version and/or OS. Examples are maya, requests, flask etc.
On the other side (not python libraries) there are applications targeted at a specific python interpreter and OS, often deployed into a strictly consistent environment.
The pipfile project describes these differences very well in its Pipfile vs setup.py.
pipenv (deployment tool)
I completely agree with the statement that pipenv is a deployment tool, as it allows one to lock concrete dependencies (into Pipfile.lock) for deployment of a virtual environment. It helps when one has to deploy an application or to develop in a python environment that is kept very consistent across multiple developers.
To call pipenv a packaging tool is misleading if one expects it to create python libraries or to be deeply involved in their creation. Yes, pipenv can help a lot (in local development of libraries) but it can also do harm (often in CI tests, when used without deeper thought).
TL;DR: pipenv provides a secure environment by applying approved concrete dependencies described in a Pipfile.lock file, while a python library is only allowed to define abstract dependencies (and thus cannot provide a Pipfile.lock).
pipenv shines in deployment scenarios following these steps (a command-level sketch follows the list):
define abstract dependencies (in Pipfile)
generate concrete dependencies from them, resulting in Pipfile.lock
test and approve the Pipfile.lock as the definition of the approved python environment
use pipenv sync to apply "the golden" Pipfile.lock elsewhere, getting an identical python environment
With the development of a python library one cannot achieve such security, because libraries must not define concrete dependencies. Breaking this rule (i.e. trying to declare concrete dependencies in a python library) results in problems such as:
Problem: hiding broken setup.py defined dependencies
setup.py shall define all abstract dependencies via install_requires.
If Pipfile defines those dependencies too, it may easily hide problems such as:
a dependency missing in install_requires
Pipfile defines specific rules (version ranges etc.) for a dependency and install_requires does not
To prevent it, follow these rules:
dependencies of the library shall not be duplicated in Pipfile
the [packages] section in Pipfile shall be either empty or define only a single dependency on the library itself
Problem: Pipfile.lock in repository
Keeping Pipfile.lock (typically for "security reasons") in a library repository is wrong, because:
the locked dependencies are likely to be invalid for different python versions or on another OS
developers are forced to keep regenerating it whenever the abstract dependencies change
To prevent it, one should:
remove Pipfile.lock from the repository and add it into .gitignore
Problem: competing with tox (hiding usedevelop)
If tox.ini contains in its commands section entries such as:
pipenv install
pipenv install --dev
pipenv lock
it is often a problem, because:
pipenv install shall install only the library itself, and tox is (by default) doing that too. Apart from the duplicity it also prevents the use of usedevelop=True and usedevelop=False in tox.ini, because Pipenv is able to express it only in one variant (while tox.ini allows differences across environments).
To prevent it, one should:
avoid using pipenv in tox.ini (see requests' tox.ini); a minimal sketch follows.
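A minimal tox.ini sketch without pipenv (the env names, the "tests" extra and the pytest command are assumptions tied to the setup.py example later in this post, not taken from any particular project):
[tox]
envlist = py27,py36

[testenv]
usedevelop = True
extras = tests
commands = pytest {posargs:tests}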
Problem: pipenv fails
pipenv is under heavy development and things break sometimes. If such an issue breaks your CI build, there is a failure which could have been prevented by not using pipenv and using traditional tools instead (which are often a bit more mature).
To prevent it, one should:
think twice before adding pipenv into a CI build script, tox.ini or a similar place. Do you know what value you get from adding it? Could the job be done with existing tooling?
Key questions regarding pipenv's role in the development of a python library are:
Q: What is the added value pipenv really brings? A: A virtualenv management tool.
Q: What is the proper way to use pipenv? A: Manage the virtualenv.
A few more details and tricks follow.
pipenv will not add any security to your package
Do not push it into a project just because everybody does it or because you expect extra security. It will disappoint you.
Securing by using concrete (and approved) dependencies shall take place in a later phase, in the application that is going to use your library.
Keep Pipfile, Pipfile.lock and .env files out of the repository
Put the files into .gitignore.
Pipfile is easy to recreate, as demonstrated below, because most or all requirements are already defined in your setup.py. And the .env file probably contains private information, which shall not be shared.
Keeping these files out of the repository will prevent all the problems which may happen with CI builds when pipenv is used in situations where it is not appropriate.
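For example, the corresponding .gitignore entries are simply:
Pipfile
Pipfile.lock
.env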
Use pipenv as the developer's private toolbox
pipenv may simplify a developer's work as a virtualenv management tool.
The trick is to learn how to quickly recreate your (private) pipenv related files, e.g.:
$ cd <project_repository>
$ # your library will bring the dependencies (via install_requires in setup.py)
$ pipenv install -e .
$ # add more dev tools you prefer
$ pipenv install --dev ipython pdbpp
$ # start hacking
$ pipenv shell
...
Use .env
file if you need convenient method for setting up environment variables.
Remember: Keep pipenv
usage out of your CI builds and your life will be simpler.
Use setup.py ability to declare extras dependencies
In your setup.py use the extras_require section:
from setuptools import setup

setup(
    name='mypackage',
    ....,
    install_requires=["jinja2", "simplejson"],
    extras_require={
        'tests': ['pytest', 'pyyaml'],
        'pg': ['psycopg2'],
    },
    ....
)
To install all dependencies declared for the tests extra:
$ pipenv install -e .[tests]
Note that it will always include the install_requires dependencies.
This method does not allow splitting dependencies into default and dev sections, but this shall not be a real problem in the expected scenarios.
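Multiple extras can also be combined in one call (using the hypothetical extras from the setup.py sketch above):
$ pipenv install -e .[tests,pg]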
This is very impressive, thanks a ton for compiling. Will definitely review in more detail in a bit
/cc @uranusjr @jtratner @ncoghlan
Some references to maya issues:
pendulum>=1.0 in setup.py (the version was in Pipfile but was missing in setup.py)
(a pipenv issue broke a whole Travis run)
I love this too. Maybe we should add this to Pipenv's documentation, or even the Python Packaging User Guide.
The corollary of the above advice appears to be "forego deterministic/reproducible CI builds", which strikes me as a very large anti-pattern.
What are you proposing as an alternative which would still allow for determinism?
@tsiq-oliverc Deterministic builds have their place at the moment an application is to be built.
Imagine the following attempt to perform really deterministic builds of a python library:
maintain one Pipfile.lock per supported execution context
have each Pipfile.lock result from the library's abstract dependencies defined in Pipfile
run CI against the Pipfile.lock instances defined in the repository. Note that building Pipfile.lock automatically during a CI build does not add any determinism.
This is a lot of extra effort. And what you get is a library which will be installed in a different context anyway (e.g. a week later a standard installation will pick up an upgraded dependency or two) and which will not gain anything from the fact that you used a Pipfile.lock, which is, at that moment, already obsolete.
The conflict is in the fact that a library must never define strict dependencies inside.
If you think there is another alternative to gain deterministic builds for a python library - describe it.
@vlcinsky - If a consumer of your library uses different versions of dependencies, etc. then that's out of your control. So I agree there's no feasible way for a library maintainer to manage that.
But the goal here is presumably of much smaller scope. In particular, I'd see the goals for a library maintainer as the following (which are roughly equivalent):
If any of those three things don't hold, it strikes me as antithetical to quality control.
So yes, I'd say that if you guarantee to support Python variants A, B and C to your consumer, and they behave differently enough that one lockfile (etc.) doesn't cut it, then you should have three lockfiles (or whatever).
I haven't used Pipenv enough to know how easy that would be in practice, though.
I'm currently considering adding Pipfiles to a few library projects for the CI system as well.
I absolutely need the dependency locking (+hashing) for complying with company-wide security guidelines and I currently don't need to test with different Python versions, since there's only one that's officially supported. And the fact that pipenv simplifies setting up a local development environment, including the virtualenv, is a nice side-effect.
And what you get is a library, which will be installed in different context (e.g. a week later standard installation will pick up upgraded dependency or two) and which will not get anything from the fact, you used
Pipfile.lock
, which is at the moment obsolete.
This is not universally true. In the world of enterprise software, you still have very specific environments that are officially supported and a security issue in a dependency results in your product being updated rather than the customer updating the dependency themselves.
(Yes, I'm talking about a library, not an application here...)
@Moritz90 your scenario is for a python library in an enterprise environment, and there pipenv may help because it is a much more deterministic environment.
My description is aiming at general python libraries such as flask, requests, maya etc. where the context is much more variable. Trying to fix a couple of things in maya, I got frustrated learning that in many cases the usage of pipenv introduced real problems (typically hiding problems which would normally be detected) while not providing much or any added value.
Getting deterministic builds is a good thing, but it incurs costs. And if done wrong, you may pay extra for a lower quality result - and this is what I wanted to prevent.
I'd argue this is one of the instances where we don't want the builds to be absolutely deterministic. If you don't pin your dependencies with ==, you're committing to maintain support for multiple versions by default, and should design the library that way. A dependency upgrade breaking the build on CI is actually a good thing because it exposes a bug in the library. Completely deterministic dependencies (as managed by Pipenv) would mask that. It would still be beneficial to be able to be deterministic when you want it, but it is generally not the best default.
@uranusjr - Sure. I agree that if the desire is "non-deterministic builds", then the advice up top may well make sense. In fact, it's almost a logical equivalence, and could be stated much more succinctly: "If you don't want deterministic builds, then don't use a tool (pipenv) whose purpose is to ensure deterministic builds".
But that's certainly not a desirable goal in general.
@tsiq-oliverc nice scope definition - it supports focused discussion. I would add one more requirement: the CI determinism shall not hide possible issues within the tested library.
If we use Pipfile.lock, create a virtualenv based on it and run the CI test of the library, we have already done part of the functionality the library is supposed to do - installing the proper dependencies. If the library is somehow broken in this regard, the preinstalled environment would hide this problem.
To me it seems more important to detect issues within a library than to run CI in a deterministic way. If there is a way to do both (e.g. running the test behind a private pypi index, which could also support determinism) I have no problem, but if there is a conflict, I have my priorities.
Do not take me wrong: there is no desire to run non-deterministic builds; my desire is to run CI builds which will detect as many issues as possible.
@vlcinsky Sure, I just wanted to share my experience to make sure that the updated documentation reflects it as well. The current documentation does a great job at explaining the tradeoffs:
For libraries, define abstract dependencies via install_requires in setup.py. [...]
For applications, define dependencies and where to get them in the Pipfile and use this file to update the set of concrete dependencies in Pipfile.lock. [...]
Of course, Pipfile and pipenv are still useful for library developers, as they can be used to define a development or test environment.
And, of course, there are projects for which the distinction between library and application isn't that clear. In that case, use install_requires alongside pipenv and Pipfile.
(Highlighted the part that applies in my case.)
I just want to make sure it stays that way. I think your original post contains too many blanket statements without a disclaimer that you're talking about an open-source project that's going to be published on PyPI.
@Moritz90 I completely agree. I was trying to highlight that focus but I can make it even more visible.
@Moritz90 I added introductory note reflecting your comment.
@vlcinsky - That makes sense. I understand that you don't explicitly want non-deterministic builds, but I think that it's unavoidably equivalent to what you do want (i.e. to catch issues when your upstream dependencies update).
Thinking out loud, what's the best way to resolve these two conflicting goals? One possibility is to have a two-phase CI process:
The first phase uses the Pipfile.lock in your repo, so it's entirely reproducible.
The second phase runs pipenv update and then runs the tests, so that it pulls in the latest of all your dependencies (which is basically the same as the behaviour with no lockfile, I think?).
@tsiq-oliverc To get deterministic builds, I would think of the following setup: install the library into a virtualenv managed by pipenv. Using pipenv to do the installation is similar to what installing the library itself shall do, but it is definitely different, because it is different code doing the work.
$ git clone <repo_url> <project_dir>
$ cd <project_dir>
$ pip install pipenv
$ # clean pypi cache and make it ready to cache somehow - not described here
$ pipenv install -e .[test]
$ # if we need extra testing packages in pipenv
$ pipenv install <extra_test_packages>
$ # record current requirements expressed in `Pipfile.lock`
$ pipenv lock
$ # if needed, record the `Pipfile.lock` somewhere
Outputs of such a job are:
test results
Pipfile.lock with the recorded dependencies (it may help developers to reproduce the environment easily)
There are two phases:
installation and test runs driven by tox, pip etc. using only our local pypi cache
recording of the resulting environment (via pipenv)
The Pipfile.lock records the pypi packages which were used to install the library. It can be used to reproduce the environment at the developer's site.
Another advantage is that this setup does not require developers to maintain Pipfile or Pipfile.lock. Also, running the tests in different contexts is always the same (Pipfile.lock is always rebuilt in the given context).
The pypi cache is the part which needs some research. I guess a simple directory would be sufficient and maybe pipenv is already ready to help with that. Maybe issue #1731 is the missing part.
As a package that does dependency resolution, many of our own tests rely on deterministic builds, that is, taking known stuff and expecting a resolved graph. We use pytest-pypi for this.
Love the lively discussion on this topic. I think the nuance is important and you should always test against known dependencies as well as unpinned ones.
you should always test against known dependencies as well as unpinned ones
I second this suggestion. It's a good idea to always have an explicit "known good state" for reproducible builds and to simplify debugging in case an update breaks something in addition to making sure that newer minor/bugfix versions work as well.
(In my very personal opinion, the ideal situation would be that the package manager installs the latest minor versions by default so that libraries can always specify the concrete dependency versions that they were tested with, but I realize that's a highly controversial opinion and requires everyone to follow semver.)
@Moritz90 @techalchemy @uranusjr @tsiq-oliverc
Here is my summary from the previous discussion.
Who shall create and maintain the Pipfile.lock file(s)?
Each supported OS and python interpreter contributes to the matrix of possible execution contexts.
E.g. Flask supports (at least per the CI stuff visible in the repository):
It makes 9 different execution contexts which may differ.
Each execution context may have a different Pipfile.lock.
Who shall maintain them?
Options are:
maintain only the Pipfile.lock for the main development platform (which platform enjoys being ignored?)
Proposal: let CI generate the file by pipenv install -e .. Do not include it in the repo; help developers to pick the proper Pipfile.lock as a result of automated builds.
When fixing an issue which may be caused by changes of dependencies on pypi, a developer may need a simple means to reproduce the environment from the failing test.
Proposal:
let CI generate the Pipfile.lock by pipenv install -e . followed by pipenv lock
archive it as a build artifact, so the developer can pick the Pipfile.lock from the failing test
do not hardwire a particular Pipfile.lock in tox.ini
Broken setup.py
A library's setup.py may be broken (missing dependency in install_requires, missing version specifier etc.) and the CI test must not hide such a problem (by preinstalling the omitted dependencies on its own).
Proposal:
expect pipenv install -e . to provide the same result as a plain installation (there are currently some issues with that)
install the library also without pipenv and possibly compare that the resulting pip freeze output is a subset of what is installed by pipenv
Some dependency update may break a library using it. CI shall detect a failure on such a problem.
Proposal:
as long as we do not apply a pre-existing Pipfile.lock, this is a non-issue (as we run in unpinned mode anyway)
In all proposed modes I tried to avoid keeping pipenv files in the repository, saving developers from maintaining this really complex stuff (automation!!!).
In contrast to my original text, the 2nd and 3rd modes do use pipenv in CI scripts.
Mode: "Run, Forrest, Run"
For a simple package with a smaller number of dependencies which are not changing often.
Simply run as before the pipenv era and keep things simple for us.
The rare cases when dependencies make trouble are easy to fix and do not justify making CI more complex.
Mode: "Generate and seal"
Each time the CI test is run, generate a new Pipfile.lock which completely describes the environment used at the moment.
The Pipfile.lock shall become a CI artefact.
If things go wrong, a developer can pick the Pipfile.lock from the broken build, apply it locally and do the testing and fixing.
If someone wants to deploy, the Pipfile.lock from the last successful build can be used. A sketch of such a CI step follows.
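A rough sketch of the "generate and seal" CI step (the artifacts/ directory is an assumption; every CI system archives build artifacts differently):
$ pip install pipenv
$ pipenv install -e .[tests]   # resolve abstract dependencies from setup.py
$ pipenv run pytest            # run the test suite inside the generated virtualenv
$ pipenv lock                  # "seal" the environment into Pipfile.lock
$ cp Pipfile.lock artifacts/   # archive it for later reproduction or deployment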
Mode: "Ice Age"
When changing dependencies are a real problem, CI shall create a Pipfile.lock once in a while and keep using it for a certain period (a month?).
This makes the CI setup more difficult, as there must be at least two different jobs (one generating the Pipfile.lock, the other one applying it and using it in tests).
Warning: the Pipfile.lock must also be updated at the moment setup.py changes its dependencies.
Note that the Ice Age requires a Scrat-the-squirrel type of test which ignores the frozen status and checks against unpinned versions.
As seen, the determinism and complexity grow mode by mode.
My proposal would be:
All gains cost something.
If the goal here is to update the advice in the docs, then honestly it feels irresponsible to say something dramatically different to "Follow best practice (reproducible builds) by default, until you have no choice."
@vlcinsky Under the headline "Mode: Generate and seal", it might make sense to mention that the last successful Pipfile.lock
should always be kept around, e.g. by declaring it as a Jenkins artifact. With that change, it would be fine to recommend that setup for most projects. Like @tsiq-oliverc, I wouldn't recommend the first mode, ever.
The more I think about it, the more I feel like this documentation will become a section on why using pipenv
for CI builds is a great idea, even if you're developing a library.
@tsiq-oliverc the vast majority of general python packages are in the "Run, Forrest, Run" mode. I have helped a few of these packages with introducing tox and pytest, because I felt it would contribute to the given package's quality and because I had a quite clear idea of how it could be done well.
Now there is another great tool and I wonder how to use pipenv properly in general python projects to contribute to their quality. I want to find one or two well working recipes which are justified and easy to follow.
What would I say to the Flask project?
Shall they maintain a set of Pipfile.lock files and set up a policy for updating them?
Or shall CI generate a Pipfile.lock artefact for cases when someone needs to reproduce a failing test on their own computer?
The goal is to find a functional working style. If it ends up in the docs, nice; if not, no problem.
@vlcinsky I'd say (1) and (4) should be the recommendation for such projects. While without a pre-existing Pipfile.lock you won't know the versions used in the build in advance (which is fine outside corporate environments), you'll still get a reproducible result if you generate and archive the lock file during the build.
Edit: The tl;dr version of my recommendation would be:
Make sure your builds are reproducible; pipenv can help you achieve this goal.
If you develop an application, commit your Pipfile.lock to your repository and use it for deployment. (This is already covered by the existing documentation.)
If you develop a library, generate the Pipfile.lock on-the-fly in your CI build and archive it for later.
(Of course, the actual documentation should have a bit more detail and examples.)
@Moritz90 I modified the "Generate and Seal" mode as you proposed.
Re (1): easy to say, impossible to execute without being more specific.
Re (4): yes, I also think that "Generate and Seal" is the most feasible mode. But in the case of Flask I will not dare (at least not at the moment).
Re a pre-existing Pipfile.lock in an enterprise environment: it has to be created somehow, either (semi)manually or automatically. I guess in a corporate environment you do not install directly from the public pypi but use some private one which provides only approved packages (devpi-server provides great service in that - multiple indexes, controlled volatility of published packages, approvals for external packages etc.). If the process of building Pipfile.lock runs in such an environment, it can only use what is approved, so if a new version is to appear there, someone has to stand up and get it approved. The following CI build will test that it does not break things. And with pipenv check, testing for security issues may also be automated.
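That check is a one-liner which can simply be added to the build:
$ pipenv check   # audits installed packages against a database of known vulnerabilities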
I guess such a workflow would be more secure compared to someone creating it (semi)manually. But my knowledge of enterprise environment is very limited.
Hello pipenv team. I share a lot of what is said in this text; it helps any developer better understand the limitations of Pipfile/pipenv when developing a library. I would like to see this text, or part of it, integrated into the official pipenv documentation.
I do have the following amendment I would like to discuss:
For our internal python packages, fully reusable, published on our internal pypi, etc., and even for my own python packages (ex: cfgtree, txrwlock, pipenv-to-requirements), I use a package that some may already know or even use, which abstracts these details and makes the life of a python developer easier: PBR.
PBR basically reads the requirements.txt found at the root folder of a distribution package and injects it into the install_requires of the setup.py. The developer simply needs to maintain a requirements.txt with loose dependency declarations. Until support for Pipfile is officially integrated inside PBR, I have to use pipenv-to-requirements, which automatically generates requirements.txt from Pipfile so that they are both synchronized and both committed in source, and PBR does the injection correctly after the distribution package has been built. I think one could use pipenv itself to generate this requirements.txt.
I work on support of Pipfile for PBR, so that it will be able to read the Pipfile (and not the lock file) and inject it into install_requires like it does with requirements.txt.
I do not know if other similar packages exist, because it also does other things people might not want (version from git history, auto-generation of AUTHORS and ChangeLog).
But in the end, I really feel it makes it so much easier to write, maintain and handle versioning of a Python library that I would be sad not to share this experience. I am promoting it as the "recommended" way of writing modern python libraries in my company.
I do reckon that it is like "cheating" on all the difficulties around libraries and pipenv, but in the end the work is done and developers are happy to use it so far. Part of the python training I am giving to new python developers in my company involves first writing a python library maintaining install_requires manually, and then switching to PBR to see how it becomes easier (and frankly I am a fan of the semantic commit feature of pbr to automatically create the right semver version tag).
Part of the reason to declare the library's dependencies in a dedicated file, also for libraries, is to be able to use tools such as readthedocs or pyup (even if pyup makes more sense when linked to an application).
I do not necessarily want to promote this method as the "standard" way of doing python packages - it is actually the "OpenStack" way - but I would like to share my experience, and if others have similar or contradictory experiences, I'll be happy to hear them and update my point of view.
Team, what do you think of a kind of "community" section in the documentation? So that users like me can share their experience on how they use pipenv, without necessarily the full endorsement of the pipenv team?
PS: I can move this to a dedicated issue if you do not want to pollute this thread.
@vlcinsky (1) is very easy to execute - put your lockfile in your repo.
I think what you instead mean is: it's impossible to give specific advice once this basic strategy is no longer sufficient. That's certainly true, but that's because the specific problem probably differs on a case-by-case basis.
Or to put it another way, the solution depends on what additional guarantees you want your CI workflow to provide.
@gsemet you know what? All my python packages created in the last two years are based on pbr - it is really great. And I follow your attempts to support Pipfile in pbr whenever I can (some thumbs up, votes etc).
In the case of this issue (searching for pipenv patterns and antipatterns for general python libraries) I intentionally omitted pbr for two reasons, one being that pbr is used for other reasons as well (you mentioned them) and it would probably sidetrack the discussion.
On the other hand, I am really looking forward to a recipe of yours for pbr-lovers. I will read it.
@tsiq-oliverc you hit the nail on the head: put your lockfile in your repo.
This is exactly the problem which motivated me to start this issue. If you reread the start of this issue, you will find a description of a few cases where adding a Pipfile.lock can break your CI tests (either breaking the build run, or hiding issues which would otherwise be detected, or installing the wrong dependencies for the given context...).
If you show me a repo where this is done properly (a general python library), I would be happy. Or I would demonstrate what risks there are or what things are unfinished.
Cool ! I also maintain this cookiecutter :)
@vlcinsky Right, so let's enumerate the specific problems and find solutions for them. (I don't know of any high-quality library that uses Pipenv, but that's mainly because I haven't looked.)
As best as I can tell, these are the specific symptoms in your original post:
Hiding broken setup.py dependencies - the fix is to install via pipenv install -e ., right?
@tsiq-oliverc I have to say, your comments inspired me and I know they contributed to a higher level of reproducibility of the proposed solution.
The following is related to your proposal to put the lockfile (Pipfile.lock) into the repo to ensure repeatability:
re Hiding broken setup.py dependencies: the pipenv install -e . follows what I propose, but note that this is not usage of Pipfile.lock, it is a method to (re)create it. If someone keeps the Pipfile.lock and uses it to create the virtualenv before installing the package, the problem is present.
re Dependencies are likely to be invalid for different python versions or in another OS: examples are many. doit installed for Python 2.7 must be an older version, as the newer one dropped support for Python 2.x. The watchdog dependency requires platform dependent libraries: inotify on Linux, something else on Windows, something else on OSX. My former client used to say "This will never happen", and in 50% of situations it happened within 2 weeks. This is not the best practice for CI scripts.
re Developers are forced to update: imagine an Open Source library with 15 contributors. It is so easy for a newcomer or a tired core developer to forget regenerating Pipfile.lock. E.g. in the maya package I was asked to regenerate the Pipfile.lock as a new dependency was added to setup.py. Was that necessary? Did I update it properly? Did I update it for all supported execution contexts? The answers are no, not sure, no. Anyway, thanks for your proposal (it inspired me for the solution described next to your comment).
re Competing with tox: tox allows the creation of multiple virtualenvs and automation of running tests within them. A typical tox.ini defines different virtualenvs for python 2.7, 3.4, 3.5, 3.6 and any other you need, and allows installing the package there and running the test suite. It is an established power tool of serious testers. pipenv is not the tool for this purpose, but may interfere with installing the things needed. In a way I followed your advice and proposed to use the superior tool (tox) over pipenv where possible.
re Pipenv fails: this is really unfortunate. I had a CI test (tox based) which ran well on localhost, but when run via Travis, it failed due to a pipenv issue. If I want to use it now, pinning does not help until a fix is released. But this is how it goes - I will wait.
Note that some parts of my original post will have to be updated, as it seems using pipenv in CI scripts has its justified place ("sealing" the virtualenv configuration for possible later use).
@tsiq-oliverc While I initially liked your suggestion of testing against both the "known good" and the latest versions, I find it harder and harder to justify the effort the more I think about it. I think you should decide to do either one or the other, not both.
The only thing you gain is that you'll immediately know whether a failure was caused by a dependency update or a code change. But you can achieve the same by simply making separate commits (when manually updating locked dependencies) or trying to reproduce the bug with the latest lock file produced by a successful build (when always using the latest versions). And in restricted environments, you cannot "just update" anyway...
@vlcinsky While I agree with your general point about differences between environments, the "one lock file per configuration" argument sounds like a straw man to me. In practice, you will be able to share the lock files between at least some of the environments.
One remaining open question that nobody has answered yet is how to deal with the case where you both need to test in different environments and lock your dependencies. I have to admit that I don't know anything about tox other than that it exists, but it seems like there's a need for some kind of glue between tox and pipenv that solves this problem somehow.
@Moritz90
Regarding too many variants of Pipfile.lock serving as a straw man (to keep others off my field):
I took the flask project (considering it very mature) and ran the tox tests.
Here you see the list of variants tested (just locally on Linux; multiply it by 3, as Windows and OSX will do the same set of tests but may result in different environments).
There are 16 different test runs on one OS; 5 of them failed as I do not have the interpreters installed (that is fine), one is dealing with building the docs (it requires an importable library) and another one with coverage (which also requires an importable library):
coverage-report: commands succeeded
docs-html: commands succeeded
py27-devel: commands succeeded
py27-lowest: commands succeeded
py27-simplejson: commands succeeded
py27: commands succeeded
py35: commands succeeded
py36-devel: commands succeeded
py36-lowest: commands succeeded
py36-simplejson: commands succeeded
py36: commands succeeded
ERROR: py34: InterpreterNotFound: python3.4
ERROR: pypy-devel: InterpreterNotFound: pypy
ERROR: pypy-lowest: InterpreterNotFound: pypy
ERROR: pypy-simplejson: InterpreterNotFound: pypy
ERROR: pypy: InterpreterNotFound: pypy
For each of the created virtualenvs I created a requirements.txt file by pip freeze > {venv_name}.txt, roughly as sketched below.
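A rough sketch of that collection step (the exact virtualenv paths are assumptions; tox keeps its envs under .tox/):
$ for venv in .tox/py27 .tox/py36 .tox/py36-lowest; do
$     "$venv/bin/pip" freeze > "$(basename "$venv").txt"
$ done
$ sha256sum *.txt | sort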
Then I calculated hashes for the files and sorted by hash value, so all identical ones group together. Here comes the straw man:
b231a4cc8f30e3fd1ca0bfb0397c4918f5ab5ec3e56575c15920809705eb815e py35.txt
b231a4cc8f30e3fd1ca0bfb0397c4918f5ab5ec3e56575c15920809705eb815e py36.txt
cdf69aa2a87ffd0291ea65265a7714cc8c417805d613701af7b22c8ff2b5c0e4 py27-devel.txt
dfe27df6451f10a825f4a82dfe5bd58bd91c7e515240e1b102ffe46b4c358cdf py36-simplejson.txt
e48cd24ea944fc9d8472d989ef0094bf42eb55cc28d7b59ee00ddcbee66ea69f py36-lowest.txt
f8c745d16a20390873d146ccb50cf5689deb01aad6d157b77be203b407e6195d py36-devel.txt
053e107ac856bc8845a1c8095aff6737dfb5d7718b081432f7a67f2125dc87ef docs-html.txt
45b90aa0885182b883b16cb61091f754b2d889036c94eae0f49953aa6435ece5 py27-simplejson.txt
48bd0f6e66a6374a56b9c306e1c14217d224f9d42490328076993ebf490d61b5 coverage-report.txt
564580dad87c793c207a7cc6692554133e21a65fd4dd6fc964e5f819f9ab249c py27.txt
8b8ff4633af0897652630903ba7155feee543a823e09ced63a14959b653a7340 py27-lowest.txt
Scary, isn't it? Of all the tests, only two share the same frozen dependencies.
This is the reality of a general python library with a good test suite. You will now probably admit this is something quite different from a python library tested in an enterprise environment.
Checking jinja2, which seems to be a much simpler beast:
coverage-report: commands succeeded
py26: commands succeeded
py27: commands succeeded
py33: commands succeeded
py35: commands succeeded
py36: commands succeeded
ERROR: docs-html: commands failed
ERROR: py34: InterpreterNotFound: python3.4
ERROR: pypy: InterpreterNotFound: pypy
Seeing the checksums, I am surprised that py26.txt and py27.txt differ:
047a880804009107999888a3198f319e5bbba2fa461b74cfdfdc81384499864e py26.txt
047a880804009107999888a3198f319e5bbba2fa461b74cfdfdc81384499864e py33.txt
047a880804009107999888a3198f319e5bbba2fa461b74cfdfdc81384499864e py35.txt
047a880804009107999888a3198f319e5bbba2fa461b74cfdfdc81384499864e py36.txt
48bd0f6e66a6374a56b9c306e1c14217d224f9d42490328076993ebf490d61b5 coverage-report.txt
743ad9e4b59d19e97284e9a5be7839e39e5c46f0b9653c39ef8ca89c7b0bc417 py27.txt
@vlcinsky That is indeed scary. I'm wondering whether Flask is a special case or whether that's actually the norm, but you've definitely proven me wrong.
I'm now hoping our Python library will not suffer from the same problem someday and that the differences will be more manageable there.
@Moritz90 Your internal library is serving a completely different audience, so you can afford to keep the execution context much narrower.
General python libraries are often flexible and configurable, e.g. Flask allows alternative json parsers to be installed and used, which is covered by a separate test run.
One can learn a lot about testing and tox from Flask's tox.ini:
the lowest test variants take care to test against the oldest dependency versions.
devel is testing against the development version of core dependencies.
I would say Flask is at a higher level of complexity and exhibits a careful test suite.
pyramid's tox.ini shows a similar number of environments (they aim for 100% code coverage too).
maya's tox.ini is very fresh (2 days) and simple; even here there are 4 different environments, and py27 differs in frozen requirements from py35 and py36.
@Moritz90
Regarding glue between pipenv and tox:
pipenv --man shows some instructions on how to use pipenv within tox.ini commands.
The tox.ini file allows running arbitrary commands, so this includes pipenv.
pipenv has the great feature that when run in an already activated virtualenv (which is the case within a tox based test), it installs into the given virtualenv. This is really nice.
As we probably need the Pipfile.lock generated, some extra effort must be made to get it and move it to a proper place (e.g. into .tox/py36/Pipfile.lock) to prevent it being overwritten by the following test. This shall be possible, but some simplification would be welcome. Maybe some trick with an environment variable for the location of the Pipfile would make it even simpler.
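A rough, untested sketch of such glue (PIPENV_PIPFILE is an existing pipenv environment variable pointing at an alternative Pipfile location; whether pipenv creates the Pipfile there when it is missing is an assumption worth verifying):
[tox]
envlist = py27,py36

[testenv]
deps = pipenv
setenv =
    PIPENV_PIPFILE = {envdir}/Pipfile
commands =
    pipenv install -e .[tests]
    pipenv run pytest {posargs:tests}
    pipenv lock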
@vlcinsky
Run pipenv install -e . once so that setup.py is now tracked via your lockfile. And then run pipenv install whenever you add new packages to setup.py.
pipenv --deploy is designed to catch this. Run it in your CI!
Run something like update-all-lockfiles.sh locally, and run pipenv --deploy on your CI to catch errors.
@Moritz90 - Agreed, the "two phase" approach may be overkill in most cases. In particular, if you're making deliberate/intentional "manual" updates to your lockfile, it's completely unnecessary.
More generally, it would be good to ensure this "proposal" focuses on the things that are actually hard problems (in my view, that's (A) serving multiple envs, (B) wanting to catch changes in upstream dependencies). It shouldn't be based on transient things (bugs in Pipenv) or potential misunderstandings of how the tool is intended to be used.
But even for those "hard" problems, the framing should be like "in some complex edge cases, you may find that a basic Pipenv workflow is insufficient, so here are some things to think about". IMO, it should not be framed as the default approach (because most people won't have those concerns).
The documentation example @vlcinsky provided would become simpler and less confusing if Pipenv/Pipfile allowed handling of lib-dependencies, app-dependencies and dev-dependencies. The docs could look something like this:
Use the lib-dependencies option if your package is a shared library. Example Pipfile:
[lib-dependencies]
some-lib = "*"
another-lib = "*"
yet-another-one = ">=1.0"

[dev-dependencies]
some-dev-tool = "1.1"
For shared libraries it's important to keep the version ranges under [lib-dependencies] as wide as possible, to prevent version conflicts on the consumer's system.
If your package is an application (intended to be installed by pipenv on the target system) that requires exact dependency versions, you should use the [app-dependencies] option. Example Pipfile:
[app-dependencies]
some-lib = "1.0.12"
another-lib = "1.*"
yet-another-one = "2.0"

[dev-dependencies]
some-dev-tool = "1.1"
/End doc example
Another approach could be a Pipfile.lib and a Pipfile.app.
I think something like this would omit the need for a chunk of anti-pattern sections and third-party tools to fill the gap.
To call pipenv packaging tool is misleading if one expects it to create python libraries or to be deeply involved in creation of them.
I think this is a real problem, which leads to a lot of confusion, especially among people that are used to package managers in other programming languages (e.g. JS, Rust, Elm). It took me several months and occasional reading of GitHub issues until I realized that I was using Pipenv and setup.py the wrong way.
@feluxe
Your [lib-dependencies] or Pipfile.lib is what we have today in Pipfile (as abstract dependencies - being as wide as possible).
Your [app-dependencies] or Pipfile.app is what we have in Pipfile.lock (as specific dependencies).
pipenv and its files can be used in two different situations - developing a library or preparing an application deployment - but probably not for both at once. For this reason I do not see strong reasons for adding extra sections into the Pipfile. It is the developer's responsibility to know what type of purpose the Pipfile is going to serve.
I think this is a real problem, which leads to a lot of confusion, especially among people that are used to package managers in other programming languages (e.g. JS, Rust, Elm). It took me several months and occasional reading of GitHub issues until I realized that I was using Pipenv and setup.py the wrong way.
Agreed. The three-section solution is also a very interesting one I've never considered, and it seems to be correct and (surprisingly!) simple.
Coming from a Python background myself, I've always felt Node's package.json is doing it wrong (Rust is better because it has a compiler and linker, and can resolve this at a later stage). Treating app and lib dependencies the same way simply won't work for a scripting language like Python, at least in an abstract sense; i.e. it might work for you, but a generic tool like Pipenv can't do it because it needs to be generic.
While I do like the three-section solution in concept, it is still a rather incompatible change to the existing ecosystem. There are already setup.py, setup.cfg, and (potentially) pyproject.toml filling this space. If Pipenv (Pipfile, to be exact) wants to move into the space, it needs to consolidate with related projects, such as pip (library support should ideally be supported directly by it) and flit.
As I mentioned in other issues regarding lib/app dependency handling, this discussion needs to be escalated to pypa-dev (the mailing list) and/or the PEP process, so it can be better heard by other parties and relevant persons, before Pipenv (Pipfile) can move in any directions.
@vlcinsky
Your [lib-dependencies] or Pipfile.lib is what we have today in Pipfile (as abstract dependencies - being as wide as possible).
Sorry if this wasn't clear. My lib-dependencies are meant to be what people currently put into setup.py / install_requires. Maybe pypi-dependencies would be a better name for what I meant.
@uranusjr
There are already setup.py, setup.cfg, and (potentially) pyproject.toml filling this space.
Pipenv (the command line tool) could interface with setup.py. Just the dependency section of setup.py would have to move to the Pipfile. At least in my imagination :)
As I mentioned in other issues regarding lib/app dependency handling, this discussion needs to be escalated to pypa-dev (the mailing list) and/or the PEP process, so it can be better heard by other parties and relevant persons, before Pipenv (Pipfile) can move in any directions.
Ok, sorry for bothering ;) If I find some time I'll write something up for the mailing list.
Within the scope of this proposal, however, I would suggest it focus on the currently possible best practices, instead of going into the rabbit hole of working out a new workflow for the whole Python packaging community. It'd be more productive to propose a best practice within the current constraints, and then start the discussion for improvements.
@uranusjr - I come from a "compiled" background, so I'm curious why this is the case?
Treating app and lib dependencies the same way simply won't work for a scripting language like Python
@tsiq-oliverc Since the best practice for apps requires you to pin your dependencies, libraries would start to pin theirs as well if they used the same source of requirement files. This would lead to problems in dependency resolution.
Say my app has two dependencies A and B, both of them depend on C, but A pins v1, while B pins v2. Compiled languages allow the toolchain to detect this at compile time, and resolve it in many ways. Rust, for example, does this at linking time: the end executable would contain two copies of C (v1 and v2), with A and B linking to each of them. In C++ land this would be solved with dynamic libraries; the symbol lookup is done even later (at runtime), but the idea is the same: the compiler knows what you need (from the interface you use), and can act accordingly.
Scripting languages can't do this because the interpreter doesn't know what you really want to do until it actually reaches the call. Node works around this by always assuming the dependencies are incompatible (A and B always get their own C, even if the two copies are identical), but that leads to a new class of problems, and results in awkward hacks like peer dependencies that everyone (I hope?) agrees are terrible. Python probably doesn't want to go there (it can't, anyway, since that would likely break all existing Python installations).
Another way to work around this is to do something clever in the packaging tools that "unpins" the dependency version. Bundler (of Ruby) sort of does this, by recommending people not to include the lock file in the gem, so Bundler can use the unpinned versions in the Gemfile instead of the pinned versions in Gemfile.lock. But people tend to ignore advice and do whatever they want, so you still get pinned versions everywhere.
I was probably a bit too strong to say that it simply won't work. But at least all previous tries have failed, and many of those who tried are very smart people, much smarter than myself. I don't think this can be done, personally, and I'd continue to think this way until I see the very brilliant proposal that actually does it.
@tsiq-oliverc Pieter Hintjens wrote somewhere about the concept of "Comments are welcome in the form of pull requests".
I like that because it moves the focus from philosophical advice to really tangible and practical things. And it also limits the number of comments, because a commenter often learns on the way that the idea is incomplete or somehow broken in real use.
I asked you for an example of a python library where pipenv is used properly (or at least used) and you did not provide any.
You comment on tox's qualities but admit you are not familiar with it, while still repeating something about best practices in the world of python package development.
You say Flask is possibly a special case. So I searched GitHub for python projects using the word "library", sorted according to number of forks (as it probably reflects how many people are doing some development with it), ignored all "curated list of something" entries and counted the number of environments for one OS (typically Linux):
The real number of environments to run tests in will be mostly 2 (+Windows) or 3 (+OSX) times higher.
tox is used in 2 projects out of 3 (I do not compare it to Travis or Appveyor as they do another level of testing besides).
The number of environments to test in is rather high; Flask is definitely not the wildest one.
The number of environments to define fixed dependencies for is really not manageable manually.
Simply dropping a Pipfile.lock into a repository is rather easy, but it does no magic improvement (if yes, show me a real scenario where it will improve the situation).
Maybe you know the golden rule from the "compiled" world and feel that determinism (or repeatability) is a must for Python too. As you can see, really many Python projects live without it rather well, so maybe the golden rule does not apply so strictly here.
I will be happy if we find a usage of pipenv for python libraries which will improve the situation. And I want to prevent usage which would harm overall quality.
To reach that goal, my approach is to iterate over questions:
@feluxe
Sorry if this wasn't clear. My lib-dependencies are meant to be what people currently put into setup.py / install_requires. Maybe pypi-dependencies would be a better name for what I meant.
See the pbr discussion in this issue. It is the effort to support library dependencies via Pipfile.
I think that one Pipfile shall not be used for two purposes (lib and app); these things shall be done separately. If you feel it is really needed, could you describe the purpose of a project using it? I usually try to keep library development and deployment projects separated, as they have quite different usage in time.
@vlcinsky I'm not really sure where you want to take this (I'm not sure what kind of PR you're asking for!), so I'm going to bow out of this conversation for now.
To restate the TL;DR of my position:
@uranusjr Got it. Though I don't think there's anything language-specific here, it's simply that different communities have settled on different heuristics for dealing with a problem with no generic solution - if you have version conflicts, you have a problem.
Maven/Java (for example) forces you to think about it at build time. The NPM way means you have runtime issues if the mismatched versions cross an interface. Runtime resolution (e.g. Python, dynamic libraries) means that a dependent may crash/etc. if the dependency version is not what it expected.
@vlcinsky
See pbr discussion in this issue. It is the effort to support library dependencies by Pipfile.
pbr seems nice and all, but it falls under the category that I was trying to address with this:
I think something like this would omit the need for a chunk of anti-pattern sections and third-party tools to fill the gap.
I think such tools shouldn't be necessary in the first place.
If you feel it is really needed, could you describe purpose of a project using it? I usually try to keep library development and deployment projects separated as they have quite different usage in time.
When it comes to pypi packages, I ended up using Pipenv for handling dev-dependencies, Pipfile to describe dev-dependencies, setup.py to describe lib dependencies with install_requires, and setuptools in setup.py to publish my package by running pipenv run python setup.py bdist_wheel upload. This is what I consider complicated.
In other modern languages I have to learn one command line tool (package manager) plus one dependency file format. Documentation is in one place and easier to follow, and a newcomer will get all this sorted out in a couple of hours. It's a matter of npm init, npm install foo --dev, npm publish. Pipenv/Pipfile can do most of it already; if it could do all of it, issues such as this one would not exist.
I reiterate my call for a kind of "community" section/wiki for this discussion. There are several "patterns" that are legit, and some of us might want to share their "way of doing python libraries", some like me with pbr, and others might have a very good pattern too. But I am not sure whether a page inside the pipenv documentation is a good idea.
PS: to prepare the migration to the new pypi, you should use twine and not python setup.py upload. Using "upload" should be considered an antipattern.
Maybe pipenv can grow a "publish" command?
@feluxe You might want to take a look at poetry. I just stumbled across it and it seems that it's what you are looking for.
It does what pipenv does and more, and it seems that they do it better, especially regarding dependency management (at least that's what they claim). It does dependency management, packaging and publishing, all in a single tool: poetry.
I wonder if pipenv and poetry could join efforts to finally give Python a true package manager.
I want to reiterate myself again before this discussion goes too far. Pipenv cannot simply grow a publish command, or do anything that tries to take over the packaging duty. This would only fragment the ecosystem more, because not everyone does it this way, and with app and lib dependencies being theoretically different, you cannot tell someone to merge them back together once the distinction is made in their workflow.
It may seem almost everyone is on board with this merge, but the truth is there are a lot more people not joining this discussion because things work for them and they are doing something else. I've repeatedly said it: discussion about improving the design of toolchains and file formats should happen somewhere higher in the Python packaging hierarchy, so it receives more exposure to the people designing the more fundamental things that Pipenv relies on. Please take the discussion there. There is no use suggesting it here, because Pipenv is not in the position to change it.
I've repeatedly said it: discussion about improving the design of toolchains and file formats should happen somewhere higher in the Python packaging hierarchy, so it receives more exposure to the people designing the more fundamental things that Pipenv relies on.
I agree that the discussion on this bug spirals out of control now that packaging and publishing came up (this bug is only about dependency management!), but could you please point us at the right place to have this discussion? People are having it here because pipenv is seen as a much-needed step in the right direction, not because they want to impose additional responsibilities upon the pipenv maintainers.
Edit: Sorry, I must have missed the post in which you did exactly that when reading the new comments the first time.
Within the scope of this proposal, however, I would suggest it focus on the currently possible best practices, instead of going into the rabbit hole of working out a new workflow for the whole Python packaging community. It'd be more productive to propose a best practice within the current constraints, and then start the discussion for improvements.
I very much agree with this. We should first figure out what the best possible workflow for library maintainers is right now before we come up with big plans. So let's focus on that again, as we did at the start of this thread. I don't think we've reached a conclusion yet.
Back to topic: Quoting @uranusjr's post about why dependencies should be defined in a different file for libraries:
Another way to work around this is to do something clever in the packaging tools that "unpins" the dependency version. Bundler (of Ruby) sort of does this, by recommending people not to include the lock file in the gem, so Bundler can use the unpinned versions in the Gemfile instead of the pinned versions in Gemfile.lock. But people tend to ignore advice and do whatever they want, so you still get pinned versions everywhere.
I was probably a bit too strong to say that it simply won't work. But at least all previous tries have failed.
I still don't see why the official recommendation for libraries for now cannot be to use pipenv
for their CI builds, but keep the Pipfile.lock
out of source control. Since, as a few people pointed out, pipenv
doesn't currently have anything to do with the packaging process, we shouldn't run into the problem you outlined above.
And I also don't see why this is an argument against defining your abstract dependencies in the same file that applications use to define their abstract dependencies. It's okay if pipenv
doesn't want to implement an elaborate solution for integrating the Pipfile
with setup.py
, but I don't see why that's a bad idea in general.
@vlcinsky
I think, that one Pipfile shall not be used for two purposes (lib and app), these things shall be done separately.
See my above post. Could you please elaborate on why you think that? I simply cannot see any downside in principle. Right now, it might be a bad idea to include a Pipfile
, since you'll then have to define the dependencies in the same way in two different files, but I haven't yet seen any argument that explains why it would be a bad idea to use the Pipfile
for dependency declarations in general.
Note that I've already agreed that Pipfile.lock
should not be in source control for libraries unless you're in the same situation I'm in.
Edit: Also, if it turns out that pipenv
itself actually needs to know about the difference, you might just introduce something like cargo's crate-type
field before you start introducing app-dependencies
and lib-dependencies
- that sounds overly complicated.
@Moritz90 Several of Python's mailing lists would be good venues to hold this discussion.
pypa-dev is the most relevant for discussions centred on Python packaging and the ecosystem around it. I'd probably start here if I were to post a similar discussion.
python-ideas is a place to get ideas discussed, and has quite high visibility to the whole Python community. It would also be a good starting point if you want to push this to the PEP level (eventually you would, I think).
@tsiq-oliverc
By PR I mean: show an example proving your concept viable.
So pick some existing library, fork it, apply your (1) - you say it shall be easy with pipenv - and show me. I tried pretty hard and had difficulties.
If your (2) means "someone else has to do the work", your PR will not exist.
In (3) you talk about a "small subset of cases" without giving any real number. Are all the top libraries I described with regard to the number of virtualenvs considered a "small subset"?
To conclude this discussion, I created a short summary of what was found during the discussion.
pipenv (anti)patterns for python libraries and applications
I changed the focus a bit: it talks not only about (general) python libraries, but also about applications, as it was rather cheap to include them and it demonstrates the differences well.
I intentionally excluded anything proposing changes in existing tooling such as pipenv, tox etc.
What pipenv is and what it is not
pipenv is a virtualenv management tool which can lock the resulting environment into a Pipfile.lock; it is not a packaging tool.
Library or application?
The (python software) product is either ready to be used in another product (thus a library) or it is a final application ready to be run.
Personally I think, even "enterprise libraries" fall into library category (the same rules apply, only number of execution contexts is smaller).
pipenv install <package>
thus "get the package into play (resolving versions for other libraries around)"pipenv sync
thus "apply concrete dependencies"concrete dependencies: must pin versions, ideally with hashes of used libraries
pipenv
artefacts:Pipfile
: abstract dependencies
Pipfile.lock
: concrete (locked) dependencies"tox
or similar testing SWpipenv
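To make the abstract vs. concrete distinction tangible, here is a minimal, purely illustrative sketch (the package name and versions are made up; Pipfile.lock is really JSON and is shown here only as a comment excerpt):

```toml
# Pipfile: abstract dependencies -- acceptable ranges, no exact pins
[packages]
requests = ">=2.18"

# Pipfile.lock (excerpt, JSON in reality): concrete dependencies -- exact pins plus hashes
#   "requests": {
#       "version": "==2.18.4",
#       "hashes": ["sha256:<hash of the downloaded artifact>"]
#   }
```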
pipenv and concrete dependencies (can be fine for libraries)

- Pipfile in repository; Pipfile.lock created by pipenv install -e .
- Pipfile.lock documents (seals) the environment and allows later reproduction of the virtualenv for analysing issues.
- Mode "Ice Age": when setup.py dependencies (install_requires) change or a dependent package on pypi is updated, regenerate Pipfile.lock by pipenv install -e . (a command sketch follows this list).
- pipenv sync then applies that Pipfile.lock elsewhere.
- Pipfile.lock created manually by the developer (may work for applications)
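A minimal command sketch of the "Ice Age" mode described above, assuming the library's abstract dependencies live in setup.py (commands only; the project layout is illustrative):

```sh
# Re-resolve the abstract dependencies from setup.py and refresh Pipfile.lock;
# run this only when install_requires changes or a dependent pypi package is updated.
pipenv install -e .

# Elsewhere (another machine, or later analysis of an issue), reproduce the
# sealed environment exactly as recorded in Pipfile.lock.
pipenv sync
```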
- application: determinism (run in exactly the same way within the selected execution context)
- public pypi alone: low determinism, packages are updated at any time
- public pypi plus an applied Pipfile.lock: total determinism
- Pipfile.lock: deploy the application into production
- the pbr library allows definition of abstract library dependencies via requirements.txt; an update reading Pipfile is on the way.
- the poetry package tries something similar to pyenv.
- "drop the lockfile into the repo and you get deterministic builds": for a library this would mean one committed Pipfile.lock per execution context. Seriously: flask shows, on its 11 different virtual envs (on one OS), 10 different locked dependencies. Who is going to create and commit them?
- Pipfile.lock (but generated by a CI script) allows regenerating the virtualenv elsewhere (a CI sketch follows this list).
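A sketch of what such a CI step could look like, assuming tests run inside the pipenv-managed virtualenv and the generated lock file is archived as a build artifact rather than committed (paths and the test command are illustrative):

```sh
# Build the virtualenv from the library's setup.py (abstract dependencies).
pipenv install -e .

# Run the test suite inside that environment.
pipenv run pytest

# Archive the generated Pipfile.lock so this exact environment can be
# reproduced later when analysing a failure; it is never committed.
cp Pipfile.lock "artifacts/Pipfile.lock.${BUILD_ID:-local}"
```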
Antipatterns:

- Pipfile.lock in the library repository: a library shall declare only abstract dependencies, and those belong in setup.py.
- Pipfile in the library repository duplicating dependencies already declared in setup.py: it may hide a broken setup.py dependency declaration. Generate the Pipfile by pipenv install -e ., or pipenv install -e .[tests] if you also need test dependencies and they are declared as "tests" extras in the setup.py (see the sketch after this list).
- pipenv install <something> in CI scripts (dependencies shall be installed via the library's setup.py, not via pipenv).
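To make the recommended alternative concrete, here is a minimal sketch, assuming the library keeps all abstract dependencies in setup.py and exposes test dependencies as a "tests" extra (the project name and versions are made up):

```python
# setup.py (sketch): the only place where dependencies are declared.
from setuptools import setup, find_packages

setup(
    name="mylib",                                  # illustrative project name
    version="0.1.0",
    packages=find_packages(),
    install_requires=["requests>=2.18"],           # abstract runtime dependencies
    extras_require={"tests": ["pytest", "tox"]},   # test-only dependencies
)

# In CI, instead of `pipenv install <something>`, install the library itself:
#   pipenv install -e .[tests]
```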
Python libraries (especially general ones) exhibit an unexpectedly high number of execution contexts. The reason is that with libraries the goal is proven flexibility under different conditions. The flexibility seems more important than deterministic builds. For people coming from the "compile" world this might feel like a very bad antipattern. The fact is, most (possibly all) python libraries do not provide deterministic builds (if you are aware of some, let me know) and Python is still doing very well. One reason Python applications are still alive might be that python as a scripting language differs from the compiled world. The other reason could be that determinism can (and shall) be resolved a step later, as soon as an application (built from a set of libraries) has to resolve the (natural and justified) requirement for determinism.
For applications the situation is just the opposite, and there determinism is really easy to reach with a tool such as pipenv.
What to do next?
Thanks everyone for a very inspiring discussion. It feels to me like the message "I am totally lost in this topic" refactored three times, which naturally means we got better.
@vlcinsky poetry
has nothing to do with pyenv
. It's a lot like pipenv (with a much better implementation regarding the management of libraries and applications, IMO), but it also covers the packaging and publishing part.
You have a pyproject.toml file that defines your project and its dependencies (abstract dependencies), and a pyproject.lock which describes the pinned dependencies, pinned for every python version and platform the pyproject.toml file specifies, so there is only one deterministic lock file, avoiding the problems that pipenv is facing. Only when installing does poetry check which packages to install, by checking them against the environment.
And when it packages your library, it will use the abstract dependencies (and not the pinned ones) so you keep the flexibility when distributing your package (via PyPI for example).
The advantage of this is that it will use abstract dependencies for libraries and the lock file for applications. This is the best of both worlds.
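For readers unfamiliar with poetry's format, a minimal, illustrative pyproject.toml sketch follows (section names use poetry's [tool.poetry] convention; the exact schema may differ between poetry versions, and the project metadata is made up):

```toml
[tool.poetry]
name = "mylib"
version = "0.1.0"
description = "Illustrative library"
authors = ["Jane Doe <jane@example.com>"]

[tool.poetry.dependencies]
# Abstract dependencies: these ranges go into the published package metadata.
python = "~2.7 || ^3.5"
requests = "^2.18"

[tool.poetry.dev-dependencies]
pytest = "^3.5"
```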
@zface poetry not using pinned dependencies is literally defeating the entire purpose. Pipenv is _idempotent_ and this requires _reproduction_ of an environment. Please stop using this issue as a platform to try and sell everyone something that has listed as its first reason for why to use it over pipenv that the author doesn't like the cli. At the end of the day, our software is deployed across hundreds of thousands of machines and actually acknowledges and uses the best practices around packaging. If you don't want an idempotent environment and you do want to blur the lines between development and packaging please don't participate in this discussion because we are not moving in that direction and it will not be productive.
Essentially we spend a lot of time and effort on resiliency that small projects which make lofty claims don't have to spend as much effort on because people aren't hitting edge cases. If you truly believe that another tool offers you the best of all worlds then I encourage you to use it; pipenv itself is not going to handle packaging for you in the near term, if ever.
@techalchemy I am not selling anything, really, I am merely directing towards ideas that could be used in pipenv
.
And poetry
does pin dependencies in the pyproject.lock
, just like pipenv
does in Pipfile.lock
. So you have reproduction just like pipenv
provides. If you have a lock file it will be used to install pinned dependencies and, if I am not mistaken, that is also what pipenv
does.
The only time it uses abstract dependencies is when it packages the project for distribution (so basically for libraries) since in this case you do not want pinned dependencies.
@vlcinsky There are still a few points that need to be sorted out, corrected, or expanded on, but I am still very keen on this going into documentation form, Pipenv or otherwise. Would you be interested in sending in a pull request? I'd be more than happy to help flesh out the article.
Regarding poetry, I am not personally a fan as a whole, but it does do many correct things. It should probably not be mentioned in Pipenv docs because it violates a few best practices Pipenv devs want to push people towards, but it should be mentioned if the discussion is held in pypa-dev or similar, to provide a complete picture of how the packaging ecosystem currently is.
poetry can also use more attention and contribution. This would be the best for the community, including Pipenv. With viable choices, people can weigh their options instead of going into Pipenv head first and complaining that it does not do what they expect. Good competition between libraries can also spur forward technical improvements in the dependency resolution front, which Pipenv and poetry both do (and neither perfectly). We can learn a lot from each other.
@uranusjr Yes, I think a few things were clarified and deserve sharing with a wider audience. Your assistance is really welcome.
What about "pair documentation drafting"? I think that at this moment it would be most effective to work on it at the small scale of two people only.
Things to do (possibly with one or two iterations):
If you feel like writing it on your own (based on what was discussed) and have me as a reviewer, I would not complain.
I will contact you by e-mail to agree on next actions.
@vlcinsky Also I'm available as @uranusjr on PySlackers (a Slack workspace) if you prefer realtime interaction. Pipenv has a channel there (#pipenv
).
@uranusjr That's what I meant by gathering effort. Python desperately needs a good package manager like cargo. The Python ecosystem pales in comparison with other languages due to the lack of a standard way to do things. And pipenv
will not help with that, I think.
What bothers me is that pipenv
advertises itself as the officially recommended Python packaging tool
while it's not a packaging tool, far from it, which is misleading for users. It's merely a dependency manager coupled with a virtualenv manager.
Also, you say that it was inspired by cargo, npm and yarn, which are packaging tools as well as dependency managers, while pipenv is not.
And here is the flaw of pipenv
, it just muddies the water since people will still make the same mistakes as before with requirements.txt
vs setup.py
. Projects will still be badly packaged, with badly defined dependencies in their setup.py, because of that. That's what projects like cargo did right: they handle all the aspects of developing projects/applications to ensure consistency, while a project like pipenv
does not.
And when you say:
which Pipenv and poetry both do (and neither perfectly)
What do you mean? From what I have seen, their dependency manager is much more resilient than the one provided by pipenv
. The only downside is that they use the PyPI JSON API which sometimes does not have dependency information due to badly published packages.
Anyway, I think, like you said, that both projects can learn from each other.
And, one more thing, what's the future of pipenv if, ultimately, pip handles the Pipfile
? Will it just be a virtualenv manager?
If the poetry dependency manager relies on the json api it's not only sometimes wrong due to "badly published packages", it's going to be very limited in what it can actually resolve correctly. The warehouse json api posts the _most recent_ dependencies even if you're dealing with an old version, and that's if it has that info at all. We used to incorporate the json api too, it was great because it was fast, but the infrastructure team told us not to trust it. It seems a bit disingenuous to call something resilient if it relies on an unreliable source to start off with.
Ultimately the challenges are around actually building a dependency graph, which requires executing a setup file, because currently that's how packaging works. There is just no way around it. A dependency graph that resolves on my machine may be different from one that resolves on your machine, even for the same package.
It's easy to hand wave and say "well doesn't that just make pipenv a virtualenv manager if pip can read a pipfile?" No. Pipenv is a dependency manager. It manages idempotent environments and generates a reproducible lockfile. I realize this must seem trivial to you because you are waving it away and reducing this tool to a virtualenv manager, but it isn't. We resolve lockfiles and include markers for python versions that you don't have, aren't using, and keep that available so that you can precisely deploy and reproduce across platforms and python versions. We use several resolution methods including handling local wheels and files, vcs repositories (we resolve the graph there too), remote artifacts, pypi packages, private indexes, etc.
At the end of the day pip _will_ handle pipfiles, that's the plan, it's been the plan since the format was created. But that is the same as asking "but what about when pip can handle requirements files?" The question is basically identical. Pip can install that format. It's not really relevant to any of the functionality I described other than that we also install the files (using pip, by the way).
@techalchemy
The warehouse json api posts the most recent dependencies even if you're dealing with an old version, and that's if it has that info at all
This is just plain wrong; you can get a specific version's dependencies by calling https://pypi.org/pypi/{project}/{release}/json. If you just call https://pypi.org/pypi/{project}/json you will indeed only get the latest dependencies, but you can actually get the right set of dependencies.
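A minimal sketch of that call (the project and release used here are only an example; requires_dist may be missing for badly published packages, as discussed):

```python
import requests

# Per-release PyPI JSON endpoint: returns metadata for exactly this version.
resp = requests.get("https://pypi.org/pypi/requests/2.18.4/json")
resp.raise_for_status()

# requires_dist holds the abstract dependencies declared by that release
# (it may be None if the package was published without that metadata).
print(resp.json()["info"]["requires_dist"])
```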
And the packaging/publishing side of python projects really needs to be improved, because in the end it will benefit everyone, since it will make it possible to use the JSON API reliably.
It manages idempotent environments and generates a reproducible lockfile.
We resolve lockfiles and include markers for python versions that you don't have, aren't using, and keep that available so that you can precisely deploy and reproduce across platforms and python versions.
And so does poetry
. And you can make it not use the JSON API, so that it uses the same resolution method as pipenv
(using pip-tools). See https://github.com/sdispater/poetry/issues/37#issuecomment-379071989 and it will still be more resilient than pipenv
(https://github.com/sdispater/poetry#dependency-resolution)
@zface I will say this one final time, please take this to somewhere higher in the hierarchy. Pipenv does not self-proclaim to be the officially recommended Python packaging tool; it says that because it is. If you feel that is inappropriate, tell it to the officials that recommend Pipenv. Please do not put these things on Pipenv dev. This is the wrong place to complain, and you cannot possibly get resolutions for your complaints here. You can also get better answers on technical questions you have there. This is an issue tracker for Pipenv, not a discussion board for Python packaging tools and how Python packaging is done.
Pipenv doesn't just rely on pip-tools for resolution, please stop reducing our software to one liners that demonstrate a lack of understanding. I know very well how the PyPI api works, I talked directly to the team that implemented it.
This just plain wrong,
This kind of attitude is not welcome here. Do not assume we don't understand what we are talking about. Please practice courtesy.
it will still be more resilient than pipenv (https://github.com/sdispater/poetry#dependency-resolution)
Pipenv does not currently flatten dependency graphs. Pointing to one specific issue where a tree has been flattened and claiming the entire tool is therefore both better and more resilient is foolish, you are proving over and over again that you are simply here to insult pipenv and promote poetry. Please be on your way, this behavior is not welcome.
I agree the discussion is way off-topic; it was meant to capture the "good practices" around pipenv.
However,
[...] will still make the same mistakes as before with requirements.txt vs setup.py. Projects will still be badly packaged with badly defined dependencies in their setup.py because of that.
I share this opinion; getting new developers to successfully package their own Python code is actually complex, too complex, and requires reading way too much online documentation.
But it's not up to pipenv or any other packaging or dependency tool to deal with that entirely. We cannot rewrite history. We, as a community, need to find a way to modernize the python tool chain, step by step.
And pipenv (and probably poetry) is a very good step forward.
Having to maintain a Pipfile for applications on one side and setup.py for libraries on the other side is a no-brainer. No matter how hard we explain with lots of words, long articles and good-practice guides, it's too complex for what it is. I completely agree it is like this for the moment, but it should not prevent us from imagining a better and safer way.
In the end, as a developer, I want a single tool, maybe with two different modes, to help me and make my life as easy as possible.
There should be a way of extracting just the part that does the requirements.txt/Pipfile handling from libs such as PBR, to propose a kind of "easy setup.py": a Pipfile-aware wrapper around install_requires, without all the unwanted behavior pbr brings, packaged as a dedicated setuptools wrapper that only does that.
So we would be able to have the best of both worlds:

- Pipfile (versioned in both libraries and applications)
- Pipfile.lock (versioned only for applications)
- a helper package (pipfile_setuptools, install_requires_pipfile?) that would be a first-level dependency whose only job is to inject Pipfile into install_requires.

This is another project that would not be related to pipenv, but it still needs a generic Pipfile parser library. What do you think?
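A rough sketch of what such a helper could look like. Everything here is hypothetical: the function name is invented for illustration, the Pipfile is parsed with the toml package, and only the simplest specifier forms are handled:

```python
import toml  # Pipfile is TOML; the `toml` package is enough for a sketch


def pipfile_to_install_requires(path="Pipfile"):
    """Turn the [packages] section of a Pipfile into install_requires entries."""
    data = toml.load(path)
    requires = []
    for name, spec in data.get("packages", {}).items():
        if isinstance(spec, str):
            # "*" means any version; otherwise keep the specifier, e.g. ">=2.18".
            requires.append(name if spec == "*" else name + spec)
        else:
            # Table form, e.g. {version = ">=2.18", extras = [...]}: keep only
            # the version constraint in this sketch.
            version = spec.get("version", "*")
            requires.append(name if version == "*" else name + version)
    return requires


# Usage inside setup.py (sketch):
#   from setuptools import setup
#   setup(name="mylib", install_requires=pipfile_to_install_requires())
```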
@gsemet From my understanding PyPA has been trying to fill that with pyproject.toml instead, led by flit. You'll need to talk to them first (at pypa-dev or distutils-sig) about this before proceeding to use Pipfile as the source format. As for parsing Pipfile (and the lock file), that is handled in pypa/pipfile (which Pipenv vendors to provide the core parsing logic).
Edit: Please drop me a message if you decide to start a discussion about this in either mailing list. I do have some ideas how we can bring the two parts of Python packaging distribution together.
I must admit I am a bit sad to see dependencies declared in pyproject.toml (which takes over the role setup.cfg plays with PBR), while PyPA also supports Pipfile...
Thanks for the pointer to flit and pipfile. There is also Kenneth Reitz's pipenvlib, which seems lighter.
PBR's setup.cfg support seems more complete compared to the official documentation (e.g. data_files), and it reuses a file already shared with several tools (flake8, pytest, ...), reducing the number of files at the root of a python project.