Pip: Don't print full paths for cached wheels when using them

Created on 3 Mar 2020  路  11Comments  路  Source: pypa/pip

Environment

  • pip version: 20.0.2
  • Python version: 3.8.0
  • OS: MacOS

Description

Currently, during installation, occasionally prints "Processing \"

Expected behavior

pip should say "Processing \ (cached)" instead of printing the entire path.

How to Reproduce

Not sure what exact cache state needs to be but I can consistently reproduce this with: pip install 'django-rest-swagger[reST]==0.3.10'.

Output

Collecting django-rest-swagger[reST]==0.3.10
  Using cached django_rest_swagger-0.3.10-py2.py3-none-any.whl (212 kB)
  Ignoring docutils: markers 'extra == "reST"' don't match your environment
Processing /Users/pradyunsg/Library/Caches/pip/wheels/e4/76/4d/a95b8dd7b452b69e8ed4f68b69e1b55e12c9c9624dd962b191/PyYAML-5.3-cp38-cp38-macosx_10_14_x86_64.whl
Collecting djangorestframework>=2.3.8
  Using cached djangorestframework-3.11.0-py3-none-any.whl (911 kB)
Collecting Django>=1.8
  Using cached Django-3.0.3-py3-none-any.whl (7.5 MB)
Collecting sqlparse>=0.2.2
  Using cached sqlparse-0.3.1-py2.py3-none-any.whl (40 kB)
Collecting asgiref~=3.2
  Using cached asgiref-3.2.3-py2.py3-none-any.whl (18 kB)
Collecting pytz
  Using cached pytz-2019.3-py2.py3-none-any.whl (509 kB)
Installing collected packages: PyYAML, sqlparse, asgiref, pytz, Django, djangorestframework, django-rest-swagger
Successfully installed Django-3.0.3 PyYAML-5.3 asgiref-3.2.3 django-rest-swagger-0.3.10 djangorestframework-3.11.0 pytz-2019.3 sqlparse-0.3.1
deferred till PR needs discussion bug

Most helpful comment

how that only happens to that particular wheel

Probably because PyYAML has no wheel for linux on PyPI and that cached wheel is the result of a local build.

All 11 comments

Could I take this one? I think I know how to fix it.

Go ahead! I鈥檇 be very interested in learning what the problem is.

Unless I miss something (which is likely), I don't think the PyYAML wheel should be logged with Processing at all, i.e. it cannot appear as a (local) file to the preparer. However, there is some sort of black magic happening behind the scene that caused a cached wheel to be seen as a file, and I'm not sure how that only happens to that particular wheel.

how that only happens to that particular wheel

Probably because PyYAML has no wheel for linux on PyPI and that cached wheel is the result of a local build.

I think this issue is a bit deeper than simply changing the file name displayed. @McSinyx has a point here.

Looking at the code where Processing... is logged:

https://github.com/pypa/pip/blob/4381493c528973dee4ccc49b2c1b7442b4b410cb/src/pip/_internal/operations/prepare.py#L375-L382

We see that it tests if req.link.scheme is a file. When the original link was a direct url, link may point to a local wheel in cache which is why we end up with the cache entry path in the log.

Is it not better to test for req.original_link.scheme == 'file' to diplay Processing ...? In that case the full path makes sense. And it also makes sense to display Collecting... if the original link is not a local file, and then display Using cached wheel ....

It could look like this:

        assert req.link
        link = req.link

        # TODO: Breakup into smaller functions
        if req.original_link and req.original_link.is_file:
            path = req.original_link.file_path
            logger.info('Processing %s', display_path(path))
        else:
            logger.info('Collecting %s', req.req or req)

        if link.is_file and link.is_wheel and link != req.original_link:
            with indent_log():
                logger.info("Using cached wheel %s", link.filename)
                logger.debug("Cached wheel is located at %s", link.file_path)

The user-visible result looks much more coherent like this.

Also I note that I personally sometimes find useful to have the full path of the cached wheel displayed. That could remain as a debug-level log though.

Since the info is to notice users on what's being carried out, does it make sense to separate the two cases where

  1. The wheel is downloaded and cached
  2. The wheel is build from source distribution, then cached

It does make sense, but the current code base does not make the distinction. It will be some significant refactoring to achieve this 馃槥

Is it a welcoming change? I mean not just to solve this particular issue but to improve the codebase in general. To handle GH-825 I may touch many things done before/during dependency resolution.

Code base improvement is definitely welcomed. I would suggest filing the refactorings as a separate PR from the actual feature, so the feature does not unnecessarily block the refactoring.

@uranusjr would it make sense to first move the cache check from populate_link to prepare_linked_requirement? That would make it easier to do meaningful logging in prepare_linked_requirement. Relatedly, it seems populate_link (and thus the cache check) is currently not called in the new resolver code path?

BTW there is a significant refactoring of the cache access in #7612 so I hope we can get that one in before going further in this area of the code base.

Actually I think it should be moved further out, either when the resolver is sure the link will be installed, or when the link is actually used to install. See #7801 and its linked issues for discussion. The reason is unrelated to this issue, but would incidentally provide a solution 馃槈

Was this page helpful?
0 / 5 - 0 ratings