Pipenv: Possible legal problems with vendor/patched libraries

Created on 12 Apr 2018 · 44 Comments · Source: pypa/pipenv

tl;dr

Pipenv probably violates a couple of free software licenses by not shipping their text. It is also probably "infected" by the GPL.

Details

Hello, we (@mcyprian, me and the Fedora Python SIG @fedora-python) are trying to finally package pipenv for Fedora, so users can just run sudo dnf install pipenv. The Fedora package review request is in Red Hat Bugzilla 1564500.

While we are trying to unbundle (unvendor?) most of the 3rd party libraries shipped with pipenv, we are in a bit of a hurry, so we decided to leave the libraries that are

  • not yet packaged for Fedora,
  • or patched

bundled for now. The first category is a TODO for the future; the second will probably remain bundled forever.

As part of the review of the Fedora package, the reviewer is obligated to check whether the package is licensed under an approved free software/content license and whether the licensing information for the package is correct.

Missing licenses

This is where I found out that all the vendored 3rd party libraries are shipped without their LICENSE/COPYING/etc. files and the NOTICES file is shipped instead.

The contents of the vendor and patched directories are subject to different licenses
than the rest of this project. Their respective licenses can be looked
up at pypi.python.org.

This is unacceptable for Fedora* and IMHO should not be acceptable for @pypa either. Most of the libraries are licensed with licenses that require the license text to be shipped. See MIT:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

Or BSD:

Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

This is similar for most of the permissive licenses. You should not strip the license file, in fact you need to ship it with the code unconditionally. You may put the license text and copyrights inside another file, such as NOTICES, but a link to pypi is IMHO not enough.

* This is currently only my opinion. It has not yet been reviewed by the Fedora legal team.

Copyleft

I also found out that strict_rfc3339 is shipped under the terms of the GNU General Public License version 3 or later. Needless to say that GPL is a copyleft license. By bundling this part of code inside pipenv, pipenv is "infected" with this license and shall be GPLv3 as well (which I think is undesired).

IANAL, however I'm quite confident that pipenv now violates a couple of free software licenses, including the GPL. This currently blocks us from including it in Fedora. Since pipenv is the recommended tool, I think this should be brought to @pypa.

Conclusion

What I believe shall be done:

  • The license texts shall be reintroduced, either individually or in one file.
  • Pipenv shall lose any copyleft licensed dependencies if it wishes to remain licensed under MIT.

I offer my help with collecting the licenses back, if that's agreed upon by pipenv maintainers. I can also try to replace strict_rfc3339 with rfc3339, however I haven't looked into it yet. There might also be other copylefted files (without a header that makes it obvious).
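
Collecting the license texts back into one file could look roughly like the sketch below. The function name collect_licenses and the rule of matching any filename containing LICENSE or COPYING are assumptions made for illustration; this is not existing pipenv code, and the output would still need a manual check for libraries whose upstream ships no license file at all.

```python
import os


def collect_licenses(vendor_dir):
    """Concatenate every LICENSE/COPYING file below vendor_dir into one text.

    Hypothetical helper: the filename matching rule is an assumption.
    """
    sections = []
    for dirpath, dirnames, filenames in os.walk(vendor_dir):
        dirnames.sort()  # sort in place so the traversal order is stable
        for name in sorted(filenames):
            upper = name.upper()
            if "LICENSE" not in upper and "COPYING" not in upper:
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read().strip()
            # Use the path relative to vendor_dir as the section heading.
            rel = os.path.relpath(path, vendor_dir).replace(os.sep, "/")
            sections.append("%s\n%s\n\n%s\n" % (rel, "=" * len(rel), text))
    return "\n".join(sections)
```

Writing the returned string to a file such as NOTICES would keep the full texts with the code instead of pointing at PyPI.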

Discussion Vendored Dependencies

All 44 comments

Before I agree (though I can say I've been thinking about this now that our bigger problems are addressed) I'd want to hear @ncoghlan's thoughts. My initial instinct is that this seems like a valid concern and we would appreciate help. Thanks for documenting it!

Aye, we should definitely fix this, and I think @hroncok's suggested resolution is a good one.

Awesome. Let's sort this out

Regarding a strict-rfc3339 replacement: unfortunately rfc3339 does not do string parsing, only formatting. utcdatetime might be a viable alternative. It wouldn't be an astronomical effort to build a helper library from the ground up based on datetime either.
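
To give an idea of the size of such a datetime-based helper, here is a minimal sketch. The names parse_rfc3339 and format_rfc3339 are made up for this example and are not the strict_rfc3339 API. It also assumes Python 3 (datetime.timezone) and rejects leap seconds, because datetime does not accept second 60.

```python
import re
from datetime import datetime, timedelta, timezone

# date "T" time, optional fraction, then "Z" or a numeric UTC offset
_RFC3339 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})",
    re.ASCII,
)


def parse_rfc3339(text):
    """Parse an RFC 3339 string into a timezone-aware datetime."""
    match = _RFC3339.fullmatch(text)
    if match is None:
        raise ValueError("not an RFC 3339 timestamp: %r" % (text,))
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction, offset = match.group(7), match.group(8)
    # Keep at most microsecond precision; pad short fractions with zeros.
    microsecond = int(fraction[1:7].ljust(6, "0")) if fraction else 0
    if offset in ("Z", "z"):
        tzinfo = timezone.utc
    else:
        sign = 1 if offset[0] == "+" else -1
        delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6]))
        tzinfo = timezone(sign * delta)
    # datetime() raises ValueError for out-of-range fields such as month 13.
    return datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tzinfo)


def format_rfc3339(moment):
    """Format a timezone-aware datetime as an RFC 3339 string in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous, refusing to format")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


moment = parse_rfc3339("2018-04-12T10:30:00+02:00")
print(format_rfc3339(moment))  # prints 2018-04-12T08:30:00Z
```

Calling .timestamp() on the parsed value gives the Unix timestamp, which may cover part of what the timestamp conversion is needed for.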

OK. I'll start with the license files. If anyone is able to do the strict-rfc3339 thing, that would be great. If not, I'll look into it after the first thing is done.

Note that pipenv bundles pip as notpip and that bundles a lot as well and might have the very same problem.

But since pip itself is licensed under MIT I assume there shouldn't be a GPL problem?

I haven't checked all the bundled libraries in pip for GPL. Still, there is a licensing problem.

There is, indeed. What I meant is that pip (notpip) wouldn't be a source of licensing problems unless pip itself has them, since pip itself is licensed under MIT, and Pipenv is MIT as well.

Pip itself is a source of a licensing problem now, because pipenv violates pip's license terms. However, the license of pip (MIT) is compatible with the license of pipenv (also MIT).

Fun fact, pipenv bundles requests and colorama 3 times:

$ find -name requests
./vendor/pip9/_vendor/requests
./vendor/requests
./patched/notpip/_vendor/requests
$ find -name colorama
./vendor/pip9/_vendor/colorama
./vendor/colorama
./patched/notpip/_vendor/colorama

The copies are identical.
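
One way to confirm that bundled copies really are identical is to hash each directory tree and compare the digests. The sketch below assumes this approach; tree_digest is a hypothetical helper, not part of pipenv, and it skips compiled bytecode because that can differ between otherwise identical copies.

```python
import hashlib
import os


def tree_digest(root):
    """Return a SHA-256 digest over relative file paths and file contents."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune bytecode caches and fix the traversal order.
        dirnames[:] = sorted(d for d in dirnames if d != "__pycache__")
        for name in sorted(filenames):
            if name.endswith(".pyc"):
                continue
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()
```

Two copies are byte-for-byte identical (ignoring bytecode) when, for example, tree_digest("vendor/requests") equals tree_digest("patched/notpip/_vendor/requests").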

Problems:

  • blindspin: No license file in upstream! Metadata says MIT @kennethreitz
  • delegator: No license file in upstream! Metadata says MIT @kennethreitz
  • parse: No license file in upstream! Metadata says BSD https://github.com/r1chardj0n3s/parse
  • timestamp: NO LICENSE AT ALL, PROPRIETARY. Assuming https://github.com/jarvys/timestamp
  • Levenshtein: GPLv2+!
  • strict_rfc3339: GPLv3+!
  • requirements: where is this from?

I need help here. Should pipenv be GPLv3 or should Levenshtein and strict_rfc3339 go away? Also timestamp is a big no no.

Bundled libraries inside bundled libraries need to be addressed as well, however, maybe on pip level mostly?

Note that I cannot find a place where Levenshtein is used. It has been added in here: https://github.com/pypa/pipenv/commit/aeaabf42f16e8167ca67af5ab7a34d864e7b358d but AFAIK it is not used at all.

requirements is from https://github.com/davidfischer/requirements-parser. I don't think(?) either Kenneth or PyPA would want Pipenv to be GPL, so GPL dependencies probably need to go away.

aeaabf4 was an old feature where Pipenv suggested package names during install to warn people about possible name squatting. The feature has since been removed, and if there's no mention of Levenshtein in the code right now it is probably safe to remove it, I think.

Gosh, timestamp is virtually a one-liner. Pipenv's pad-left.

@hroncok I believe that we stopped using Levenshtein recently and can remove it (I thought about doing that yesterday even) -- and agree we do need to un-bundle the pip stuff most likely although that's more a time issue (we don't have any)

timestamp -- where are we using this? I'll have to look

strict_rfc3339 -- we need an alternative, @uranusjr was looking at this?

requirements -- https://github.com/davidfischer/requirements-parser/blob/master/LICENSE.rst (BSD)

GPL stuff needs to be axed

@hroncok sorry for the delay on this but it's safe to say if we can find replacements for GPL'ed libraries or unlicensed code we can definitely axe it. Also I'm sure we can find something we currently have to convert to timestamps if we really need it

I can write a new library from the ground up to replace strict-rfc3339. The timestamp stuff should be easy to replace as well.

Will merge shortly

Thanks again for your effort, I'm sure this has been a hassle

It's mostly boring :D

(I'd rather write code, but well, I want pipenv in Fedora.)

Thanks for working through this @hroncok! I can't send RH rewards any more, but if I could, I would :)

A quick check of current situation in vendor (notes added manually):

$ for lib in $(ls -d * | egrep -v '(__init__|LICENSE|README)'); do ls ${lib%.py}.LICENSE* 2>/dev/null || ls ${lib}/LICENSE* 2>/dev/null || echo $lib '<-----------------------' missing; done
Levenshtein <----------------------- missing (needs removal)
appdirs.LICENSE
backports/LICENSE
blindspin <----------------------- missing (needs upstream issue)
click/LICENSE
click_completion.LICENSE
click_didyoumean/LICENSE
colorama/LICENSE.txt
delegator.py <----------------------- missing (needs upstream issue)
docopt.LICENSE
dotenv/LICENSE
first.LICENSE
iso8601/LICENSE
jinja2/LICENSE
markupsafe/LICENSE
parse.py <----------------------- missing (needs upstream issue)
pathlib2.LICENSE
pexpect/LICENSE
pip9/LICENSE.txt
pipdeptree.LICENSE
pipreqs/LICENSE
ptyprocess/LICENSE
pytoml/LICENSE
requests/LICENSE
requirements/LICENSE.rst
semver.LICENSE
shutilwhich/LICENSE
six.LICENSE
strict_rfc3339.py <----------------------- missing (removal in #1982)
timestamp.py <----------------------- missing (removal in #1982)
toml.LICENSE
yarg/LICENSE

And in patched:

$ for lib in $(ls -d * | egrep -v '(__init__|LICENSE|README)'); do ls ${lib%.py}.LICENSE* 2>/dev/null || ls ${lib}/LICENSE* 2>/dev/null || echo $lib '<-----------------------' missing; done
contoml/LICENSE
crayons.py <----------------------- missing (and is OK, but we might add it for consistency)
notpip/LICENSE.txt
pew/LICENSE
pipfile/LICENSE  pipfile/LICENSE.APACHE  pipfile/LICENSE.BSD
piptools/LICENSE
prettytoml/LICENSE
safety.zip <----------------------- missing (bogus, is inside zip)
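
For anyone who would rather keep this check in a script, the shell loop above can be approximated in Python as follows. This is a sketch: missing_licenses is a hypothetical name, and it follows the same assumptions as the one-liner (a license is either a sibling NAME.LICENSE* file or a LICENSE* file inside the package directory).

```python
import glob
import os

# Entries whose name contains one of these markers are not libraries.
SKIP_MARKERS = ("__init__", "LICENSE", "README")


def missing_licenses(vendor_dir):
    """List entries in vendor_dir that have no license file next to or inside them."""
    missing = []
    base = glob.escape(vendor_dir)
    for entry in sorted(os.listdir(vendor_dir)):
        if any(marker in entry for marker in SKIP_MARKERS):
            continue
        # Single-file modules keep their license as e.g. six.LICENSE.
        stem = entry[:-3] if entry.endswith(".py") else entry
        sibling = glob.glob(os.path.join(base, glob.escape(stem) + ".LICENSE*"))
        # Packages keep it inside the directory, e.g. click/LICENSE.
        nested = glob.glob(os.path.join(base, glob.escape(entry), "LICENSE*"))
        if not sibling and not nested:
            missing.append(entry)
    return missing
```

Like the shell version, this reports archives such as safety.zip as missing, because it does not look inside them.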

We can upstream issues to the mentioned libraries but you'll have more success just PR'ing them I'd guess, they are marked MIT and I assume that's how Kenneth wants them.

I think we covered everything else, but let me know. I began working on some logic to automatically rebuild the license files if we ever lose them, loosely based on pip 10's new vendoring code.

Also see https://github.com/pypa/pip/pull/5213 the code is there.

oh your implementation is much cleaner, if it's not too much trouble can you PR it back this way?

Once it's landed in pip, I'll do what I can to make that work here. It has been easy with pip given that the vendored libs are tracked there (I recommend adopting a similar workflow here and possibly only having one layer of bundled libs).

@hroncok already began that with my hacked implementation which uses the same approach you took

See

  • Delegator: kennethreitz/delegator.py#53 (merged)
  • blindspin: kennethreitz/blindspin#4
  • parse.py: r1chardj0n3s/parse#64

OK, AFAIK the only thing that is missing license files now is pip9's and notpip's _vendor.

Any idea what is bundled there? The upstream vendor.txt file is missing, so the tracking is lost.

I'll start with https://github.com/pypa/pip/blob/9.0.3/pip/_vendor/vendor.txt and see where that leads me.

https://github.com/pypa/pipenv/pull/2094 should be the last needed batch.

I suggest adapting https://github.com/pypa/pip/pull/5213 as much as possible later, so this does not have to be dealt with manually. Sharing the deps between pip9 and notpip would also be nice, if possible.

@hroncok I already adapted and merged that code and used it to re-vendor everything.

@hroncok one additional note -- we only have both versions of pip in the first place in order to accommodate pip-tools, which has a hard dependency on pip 9 and which handles our dependency resolution (and which we have made modifications to, which are now visible in the patches subdirectory of the vendoring script).

This will no longer be necessary after they accept my PR over at https://github.com/jazzband/pip-tools/pull/657 which adds a compatibility layer so that pip-tools can use pip 8, 9 or 10.

As a sidenote I believe these licenses will get picked up whenever we re-vendor pip assuming they get bundled in a release at some point, and if we don't automate them they will get wiped next time we re-vendor.

As a sidenote I believe these licenses will get picked up whenever we re-vendor pip...

Assuming they get installed with pip install. I've never tried that.

and if we don't automate them they will get wiped next time we re-vendor.

Right, however I have no idea when you'll switch to pip 10, so I'd rather see this go in manually for 9 first, so we can finally package this for Fedora, which was our primary motivation for this.

@techalchemy Thank you very much! Could I please have a release?

Yes we have 11.10.2 planned so we have a few things left
