Pip: Updating remote links with new URLs for PEP 508 functionality

Created on 13 Sep 2018  ·  38 Comments  ·  Source: pypa/pip

What's the problem this feature will solve?
Installing a wheel file from a URL will not install a new wheel if the link changes.

Example:
setup.py

install_requires=[
    "package@http://domain:port/link-0.1.3.whl"
]

If you change that link and try to re-install or upgrade the parent package, pip ignores the new requirement.

If there is no way for the url/version to be checked when installing dependencies, the URL support added in PEP 508 seems problematic at best.

Describe the solution you'd like
Requirements given as remote links should always be reinstalled, the same way editable links are. That seems to be the simplest approach.

Alternative Solutions
We could instead try to guess/determine/specify versions in the requirement string.
Ex:

"package=0.1.3@http://domain:port/link-0.1.3.whl"

which would imply that the link provides version 0.1.3 that we could check against the currently installed package version

direct url PEP implementation awaiting PR enhancement

All 38 comments

What's the use case for the link changing? In particular, what's the situation where you'd have:

  1. The URL changes
  2. The project name and version remain the same
  3. The code is different

I'm particularly uncomfortable with assuming that the code could be different even though the name/version is the same, so I'd like to see a strong justification for why that's the case (the assumption that if the name and version are the same, the built project will be too, is pretty pervasive throughout the pip codebase).

The problem is the name is the same, but the version changes.

If I change "package@http://domain:port/link-0.1.3.whl" to "package@http://domain:port/link-0.2.5.whl", it won't install the new version of the package when I re-install or upgrade the current package.

The package@remote notation will never install a newer version of the specified package as it's written right now.

The problem is PEP 508 direct URLs don't allow for version specifiers (otherwise you'd just be able to update the version, like for other requirements), so the only way pip would know the package version changed is to "prepare" it.

Exactly

Frankly, they really are a poor substitute for dependency links.

Something I noticed is that pip install behaves a bit differently when given the URL directly. It looks like it compares the remote URL:

pip install http://server:port/path/package-0.1.whl
(installs package-0.1)
pip install http://server:port/path/package-0.1.whl
(already installed message)
pip install http://server:port/path/package-0.2.whl
(installs new version)
pip install http://server:port/path/package-0.2.whl
(already installed message)
pip install http://server:port/different/path/package-0.2.whl
(already installed message)

AFAIK, the requirement is prepared, but not installed, because the version is already present.
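A minimal sketch of the kind of name-and-version check that would produce the "already installed" results above. The helper `already_satisfied` is hypothetical and not pip's code; it assumes plain string equality between versions and uses only `importlib.metadata` from the standard library.

```python
from importlib import metadata


def already_satisfied(name: str, candidate_version: str) -> bool:
    """Return True if `name` is installed at exactly `candidate_version`.

    Hypothetical simplification: plain string equality, no PEP 440
    normalization, and the URL the candidate came from is never consulted.
    """
    try:
        installed_version = metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return installed_version == candidate_version
```

Because the URL never enters the comparison, a wheel served from a different path but carrying the same version is reported as already installed.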

How about something along those lines:

 src/pip/_internal/req/constructors.py | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git i/src/pip/_internal/req/constructors.py w/src/pip/_internal/req/constructors.py
index 4c4641dc..50292ab8 100644
--- i/src/pip/_internal/req/constructors.py
+++ w/src/pip/_internal/req/constructors.py
@@ -176,7 +176,7 @@ def install_req_from_editable(

 def install_req_from_line(
     name, comes_from=None, isolated=False, options=None, wheel_cache=None,
-    constraint=False
+    constraint=False, package_name=None,
 ):
     """Creates an InstallRequirement from a name, which might be a
     requirement, directory containing 'setup.py', filename, or URL.
@@ -263,6 +263,8 @@ def install_req_from_line(
                 "Invalid requirement: '%s'\n%s" % (req, add_msg)
             )

+    assert package_name is None or req.name == package_name
+
     return InstallRequirement(
         req, comes_from, link=link, markers=markers,
         isolated=isolated,
@@ -292,6 +294,11 @@ def install_req_from_req(
             "which are not also hosted on PyPI.\n"
             "%s depends on %s " % (comes_from.name, req)
         )
+    if req.url:
+        return install_req_from_line(req.url, comes_from=comes_from,
+                                     isolated=isolated,
+                                     wheel_cache=wheel_cache,
+                                     package_name=req.name)

     return InstallRequirement(
         req, comes_from, isolated=isolated, wheel_cache=wheel_cache

So basically package @ URL is treated like URL.

That works perfectly for my test cases. I'll close my PR if you can open one from that patch.

OK, cool. Thanks for clarifying.

I prefer the alternative proposal of determining the version from the filename in the URL, if possible, when parsing a direct URL, just like we do for a dependency picked up from PyPI (getting the version from a .whl filename is well-specified, and we have standard heuristics for doing so from sdists). That sounds like a far better solution than just always installing direct URLs.
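As a rough illustration of the filename approach, the sketch below pulls the project name and version out of a wheel URL using only the standard library. The helper name `version_from_wheel_url` is hypothetical and not part of pip; it assumes the last path segment follows the wheel naming convention (name-version[-build]-python-abi-platform.whl) and returns None otherwise.

```python
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit


def version_from_wheel_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (name, version) parsed from a wheel URL, or None.

    Hypothetical helper for illustration; not pip's implementation.
    """
    # Take the last path segment; the query string and fragment are ignored.
    filename = unquote(urlsplit(url).path.rsplit("/", 1)[-1])
    if not filename.endswith(".whl"):
        return None
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after version
    if len(parts) not in (5, 6):
        return None
    name, version = parts[0], parts[1]
    if not name or not version:
        return None
    return name, version
```

With this sketch, a URL ending in package-0.6-py3-none-any.whl yields ("package", "0.6"), while a name such as link-0.1.3.whl yields None because it lacks the Python, ABI and platform tags.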

It still needs some work (particularly with respect to extras handling), but I'll see if I can make a proper PR.

@benoit-pierre do you think it would be possible to add this to the 18.1 milestone?

I added this to 18.1, but it seems like there's additional discussion + work wanted here. I doubt this would be ready in time for 18.1, which is due early next month.

I'm waiting for #5788 to be merged. My intent was then to PR this change: package@url_to/package-0.6-py3-none-any.whl would be handled like url_to/package-0.6-py3-none-any.whl on the command line (in other instances, using package@url would still be equivalent to using url#egg=package on the command line).

But really, I've come to the conclusion that the lack of a version specifier when using direct URLs is crippling; they certainly are no replacement for dependency links.

Ping? What's the next step here?

Is there anything that can be done about this? pip not reinstalling PEP 508 dependencies when the URL changes makes that feature much less useful 😕

I've been using a patched pip from https://github.com/pypa/pip/issues/5780#issuecomment-421092322

I opened a PR (#6402) based on the patch above (https://github.com/pypa/pip/issues/5780#issuecomment-421092322), with tests

This could be related to python/peps#1145, which will introduce a file into which we record the direct reference URL.

Would the patches proposed above also make this work when installing from GitHub like so:

install_requires=[
    "package@https://github.com/user/package/archive/0.1.3.zip"
]

@akaihola Yes, according to my understanding. The first example in Example direct_url.json uses a GitHub zip archive like yours.

Looks like the PR is stuck and it might take some time to get into a public release. Is any workaround possible with the existing version of pip? Or is it now accepted that if you use git URLs, package updates simply will not work, and there is no viable workaround as long as you keep using git URLs?

@sytelus does installing with --upgrade work for you?

I'm using this in setup.py dependencies. Do I tell my users to do pip install --upgrade -e .? Wouldn't that try to upgrade a lot of things instead of just detecting the updated version on git URLs?

pip install --upgrade --no-deps -e . might work (no, it’s not ideal)

pip install --upgrade . does not work. The only thing I found which does work is pip install --upgrade --force-reinstall ..

This of course forces reinstallation of every single dependency, which is awful, but the only workaround I could find.

Any recommended workaround to solve this issue ?
Thanks.

I did not check, but the new resolver should automatically do the correct thing (if implemented correctly). See #8371 for a summary of how you can access the new resolver right now, and in the upcoming pip releases while we gradually roll it out. Feel free to report here whether it works or not.

Can anyone confirm whether this works or not in the 2020 resolver?

@uranusjr the new resolver does things differently but I don't think we are there yet.
One problem, IMO, is that it seems to consider the version behind the url to decide to reinstall or not, and not the url itself.

So for instance:

$ export PIP_USE_FEATURE=2020-resolver
$ pip install packaging
... installs 20.4
$ pip install "packaging @ git+https://github.com/pypa/[email protected]"
... reinstalls 20.3, ok
$ pip install "packaging @ git+https://github.com/pypa/packaging@db291c7bdac5c6684b6256562903b361baf518fa"
... reinstalls 20.4dev0, ok
$ pip install "packaging @ git+https://github.com/pypa/packaging@fbe51442f084d553e83c1a586f1b44e215657f3c"
... requirement already satisfied, surprising

The last one does not reinstall because the version metadata is 20.4dev0 too.

I'm quite convinced that a good solution will involve comparing the requested direct URL to the URL found in direct_url.json, but we'd certainly need to analyze different scenarios to see what is the most intuitive default.
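A minimal sketch of that comparison, assuming the installed distribution carries a PEP 610 direct_url.json with a `url` key and, for VCS installs, a `vcs_info` object holding `vcs` and `commit_id`. The function names `recorded_direct_url` and `needs_reinstall` are hypothetical, and the policy shown is only one of the scenarios that would need analysis, not pip's behaviour.

```python
import json
from importlib import metadata
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def recorded_direct_url(dist_name: str) -> Optional[dict]:
    """Load the PEP 610 direct_url.json of an installed distribution, if any."""
    try:
        text = metadata.distribution(dist_name).read_text("direct_url.json")
    except metadata.PackageNotFoundError:
        return None
    return json.loads(text) if text else None


def needs_reinstall(requested_url: str, recorded: Optional[dict]) -> bool:
    """Hypothetical policy: reinstall when the request differs from the record."""
    if recorded is None:
        # Not installed from a direct URL (or not installed at all).
        return True
    vcs_info = recorded.get("vcs_info")
    if vcs_info is None:
        # Archive or local directory: compare the URLs as given.
        return requested_url != recorded.get("url")
    # VCS request such as "git+https://host/repo@rev": drop the "git+" prefix
    # and split the revision off the end of the path.
    prefix = vcs_info["vcs"] + "+"
    bare = requested_url[len(prefix):] if requested_url.startswith(prefix) else requested_url
    scheme, netloc, path, query, _fragment = urlsplit(bare)
    rev = None
    if "@" in path:
        path, rev = path.rsplit("@", 1)
    repo_url = urlunsplit((scheme, netloc, path, query, ""))
    if repo_url != recorded.get("url") or rev is None:
        # Different repository, or a moving target with no pinned revision.
        return True
    # A branch or tag name never equals a commit ID, so it also reinstalls.
    return rev != vcs_info.get("commit_id")
```

Under these assumptions, the last command in the session above would trigger a reinstall, because the requested commit differs from the recorded `commit_id` even though the version metadata is identical.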

Ah, that makes sense. There are two relevant sections in the new resolver.

https://github.com/pypa/pip/blob/370322eacfb698391d1f02c32e644cfb4de85eb0/src/pip/_internal/resolution/resolvelib/factory.py#L186-L196

This one checks whether there’s a matching dist already installed

https://github.com/pypa/pip/blob/370322eacfb698391d1f02c32e644cfb4de85eb0/src/pip/_internal/resolution/resolvelib/resolver.py#L142-L152

And this one decides whether the distribution we got needs to be installed.

With PEP 610, we can add one more elif branch to each of them to handle the case where the URL spec changes but the version the URL points to does not.

I’m marking this as awaiting PR since I’m not interested enough in the issue to work on it myself. Feel free to ping me if anyone needs some eyes to review the changes.

If the URL points to a branch (not a fixed hash or tag), should pip consider fetching the newer commits on pip install or pip install --upgrade even if the URL has not changed?

Maybe? This is a design decision without either answer being “correct” and can be argued either way. Personally I feel it should be the job for --force-reinstall since it is bad practice to set the VCS URL to a moving target in the first place.

since it is bad practice to set the VCS URL to a moving target in the first place

In my practice it is common to specify a branch in a constraints file, then pin the resulting commit in requirements.txt.

So in itself I would not say that specifying a branch is bad practice.

Having a way to ask pip to upgrade to the latest commit of that branch is a useful feature. I'm very aware that finding the right mechanism to ask pip to do that is a difficult topic. I personally think it should be part of any reflection about pip upgrade strategies.

I thought a bit about what a reasonable strategy would look like:

  • Always re-install when the URL changes, even if the downloaded distribution has the same version as the current installation.

    • The fragment (things after #) counts as a part of the URL.

  • Conversely, never re-install (unless --force-reinstall, of course) if a URL pointing to an archive does not change.

    • Redirects are not followed. pip only compares the URLs passed to the install command.

  • Always re-clone if the URL points to a VCS since pip cannot determine whether the target has changed.

    • pip should compare the values of commit_id from PEP 610, and re-install if and only if the values don’t match.

  • Always re-install editable requirements (this is the current behaviour).
  • --upgrade always triggers re-installation of URL specs (but not version specs so this can be used for development tools instead of --force-reinstall).
  • Always re-install a non-PEP-508 path.

Does the above look reasonable?
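One possible reading of the list above as a single decision function. This is a sketch, not pip code: the classes `UrlRequirement` and `InstalledRecord` and all of their fields are hypothetical, and the ordering of the checks (in particular, letting the commit-ID rule decide for VCS URLs) is an assumption where the bullets leave room for interpretation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlRequirement:
    """Hypothetical, simplified view of a requested URL requirement."""
    url: str                                  # as passed to pip, fragment included
    is_vcs: bool = False
    editable: bool = False
    is_bare_path: bool = False                # a path given without "name @"
    fetched_commit_id: Optional[str] = None   # known only after re-cloning


@dataclass
class InstalledRecord:
    """Hypothetical summary of what was recorded at install time (PEP 610)."""
    url: str
    commit_id: Optional[str] = None


def should_reinstall(
    req: UrlRequirement,
    installed: Optional[InstalledRecord],
    upgrade: bool = False,
    force_reinstall: bool = False,
) -> bool:
    # Editables, bare paths and --force-reinstall always reinstall.
    if force_reinstall or req.editable or req.is_bare_path:
        return True
    # Nothing recorded for this project: install.
    if installed is None:
        return True
    # --upgrade always reinstalls URL specs (as the list is written).
    if upgrade:
        return True
    # VCS: reinstall if and only if the freshly cloned commit differs.
    if req.is_vcs:
        return req.fetched_commit_id != installed.commit_id
    # Archives: compare the URLs exactly as given, without following redirects.
    return req.url != installed.url
```

Since the fragment is part of `url`, changing only the part after # counts as a URL change for archives in this sketch.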

/cc @gaborbernat since this would affect #8711 but I’m not sure if you’re following this thread.

@uranusjr that sounds reasonable, although my head hurts trying to understand all implications.

Could you clarify the "Always re-clone part"? I'm not sure what it means.

--upgrade always triggers re-installation of URL specs

This one should probably be adjusted in case of VCS references pointing to a commit id.
If the rev in the URL is the same as the commit id recorded per PEP 610, it should not reinstall, because that is the same as an exact version match.

I did not write this accurately. What I meant to say was that --upgrade would always result in pip fetching the URL again (but the fetched result may or may not be installed into the environment). In the case of (non-editable) VCS URLs, pip would re-clone the repository (since the original clone that resulted in the current installation is likely not available anymore) and compare the commit IDs of the installation and the newly cloned repository to determine what to do.

Does this make sense?
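A rough sketch of the commit comparison step. It substitutes `git ls-remote` for the full re-clone described above, purely to keep the example short; that substitution is an assumption of this sketch and not something proposed in the thread. The function names are hypothetical, and annotated tags are not peeled to the commit they point at.

```python
import subprocess
from typing import Optional


def ls_remote(repo_url: str, ref: str) -> str:
    """Ask the remote which object `ref` points to (needs git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url, ref],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def first_commit_id(ls_remote_output: str) -> Optional[str]:
    """Return the object ID on the first line of `git ls-remote` output, or None."""
    for line in ls_remote_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[0]:
            return fields[0]
    return None


def commit_changed(installed_commit_id: str, ls_remote_output: str) -> bool:
    """Compare the commit recorded at install time with the remote's answer."""
    remote_commit_id = first_commit_id(ls_remote_output)
    # If the ref cannot be resolved, err on the side of reinstalling.
    return remote_commit_id is None or remote_commit_id != installed_commit_id
```

The installed commit would come from the `commit_id` field of direct_url.json; if it matches what the remote reports for the requested ref, the fetched result does not need to be installed.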
