Pip: Cross-platform command to return pip's cache directory

Created on 13 Nov 2019 · 9 Comments · Source: pypa/pip

What's the problem this feature will solve?

From the command-line, I'd like a cross-platform method to get pip's cache directory, which by default is different per OS.

There's currently no supported way to do this.

Describe the solution you'd like

PR https://github.com/pypa/pip/pull/6391 is adding pip cache info, which returns the wheels directory plus some extra info:

$ pip cache info
Cache info:
  Location: /Users/hugo/Library/Caches/pip/wheels
  Packages: 471

So something like pip cache dir could be a simplified version of that:

$ pip cache dir
/Users/hugo/Library/Caches/pip

This would be useful for caching with GitHub Actions CI. Right now, the config needs repeating three times, once per OS, which is rather cumbersome (https://github.com/actions/cache/pull/86):

      - name: Ubuntu cache
        uses: actions/cache@v1
        if: startsWith(matrix.os, 'ubuntu')
        with:
          path: ~/.cache/pip
          key:
            ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
            }}
          restore-keys: |
            ${{ matrix.os }}-${{ matrix.python-version }}-

      - name: macOS cache
        uses: actions/cache@v1
        if: startsWith(matrix.os, 'macOS')
        with:
          path: ~/Library/Caches/pip
          key:
            ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
            }}
          restore-keys: |
            ${{ matrix.os }}-${{ matrix.python-version }}-

      - name: Windows cache
        uses: actions/cache@v1
        if: startsWith(matrix.os, 'windows')
        with:
          path: ~\AppData\Local\pip\Cache
          key:
            ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
            }}
          restore-keys: |
            ${{ matrix.os }}-${{ matrix.python-version }}-
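
Until pip can report the location itself, the three per-OS paths in the config above can be collapsed into one lookup. This is only a minimal sketch: the function name is hypothetical, it hard-codes the defaults listed above, and it ignores overrides such as --cache-dir or PIP_CACHE_DIR, so it is no substitute for asking pip.

```python
import os
import platform
from pathlib import PurePosixPath, PureWindowsPath


def default_pip_cache_dir(system: str, home: str) -> str:
    """Return the default pip cache dir for an OS, per the paths listed above.

    `system` takes platform.system() values: "Linux", "Darwin" or "Windows".
    Hypothetical helper: it hard-codes the defaults and ignores any overrides.
    """
    if system == "Windows":
        return str(PureWindowsPath(home) / "AppData" / "Local" / "pip" / "Cache")
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library" / "Caches" / "pip")
    if system == "Linux":
        return str(PurePosixPath(home) / ".cache" / "pip")
    raise ValueError(f"unknown system: {system!r}")


if __name__ == "__main__":
    # Resolve the path for the machine this script runs on.
    print(default_pip_cache_dir(platform.system(), os.path.expanduser("~")))
```

Passing the OS name and home directory in as arguments keeps the mapping easy to check on any machine.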

Other people also want a cross-platform method:

Alternative Solutions

  • The wrong way: use pip's internal API

Use pip's private, internal API, which has changed in the past and may change in the future:

$ python -c "from pip._internal.locations import USER_CACHE_DIR; print(USER_CACHE_DIR)"
/Users/hugo/Library/Caches/pip
  • Another way: change pip's cache dir

Provide --cache-dir or set the PIP_CACHE_DIR environment variable to whatever path you like and cache that. Or use --user to install into the user directory, and cache from there.

However, ideally I'd like not to change pip's behaviour in any way.

We also test on other CIs, and locally, and I'd like pip to use its defaults as much as possible across the board, and have fewer differences across envs.
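
For comparison, the "change pip's cache dir" alternative could be scripted roughly as below. It is a sketch only: the helper name is made up for illustration, and the only pip-specific detail it relies on is the PIP_CACHE_DIR environment variable mentioned above.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_pip_cache(
    cache_dir: str, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with PIP_CACHE_DIR set to cache_dir.

    Hypothetical helper; `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["PIP_CACHE_DIR"] = cache_dir
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking pip, which keeps the override scoped to that one call. It still changes pip's behaviour, which is the drawback described above.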

needs discussion feature request

All 9 comments

Alternatively, a --json/--format=json could be added to the upcoming pip cache info, then anyone can extract the desired fields using something like jq?

Yes, that's one option, and would allow the total number of packages to be consumed programmatically, if anyone needed it.

Although I had suggested this, which doesn't include the wheels subdirectory:

$ pip cache dir
/Users/hugo/Library/Caches/pip

pip cache info includes the wheels subdir:

$ pip cache info
Cache info:
  Location: /Users/hugo/Library/Caches/pip/wheels
  Packages: 508

Would that be the wrong thing to cache on a CI? Should http/, selfcheck/ and selfcheck.json be included or skipped when caching?

```console
$ ls -l /Users/hugo/Library/Caches/pip/
total 48
drwx------ 6 hugo staff 192B 13 Nov 12:09 .
drwx------+ 155 hugo staff 4.8K 14 Nov 16:57 ..
drwx------ 18 hugo staff 576B 8 Nov 2017 http
drwxr-xr-x 46 hugo staff 1.4K 14 Nov 12:20 selfcheck
-rw-r--r-- 1 hugo staff 22K 13 Nov 12:09 selfcheck.json
drwxr-xr-x 214 hugo staff 6.7K 12 Nov 10:22 wheels
```

(That's on macOS, I've not checked the structure for Linux/Windows.)


Here's the 14 lines needed for pip cache dir, done before you suggested --json:

With unlimited space, fast cache upload, and a cache that gets re-uploaded after each run, all of the directories you mentioned (except maybe selfcheck, but it should be small) would probably be worth caching. If any of those aren't true, then it depends. Based on the limits described in the docs it's probably not worth thinking about until you hit the limit.
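
To judge whether a cache is anywhere near a CI size limit, one option is to total each top-level entry of the cache root. A rough sketch, assuming the root path is already known; the helper name is hypothetical and nothing here is pip API:

```python
from pathlib import Path
from typing import Dict


def cache_entry_sizes(root: str) -> Dict[str, int]:
    """Map each top-level entry of `root` to its total size in bytes."""
    sizes = {}
    for entry in Path(root).iterdir():
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
        else:
            # Sum every regular file below this directory.
            sizes[entry.name] = sum(
                p.stat().st_size for p in entry.rglob("*") if p.is_file()
            )
    return sizes
```

Run against a cache root like the one listed earlier, this would give one number each for http, selfcheck, selfcheck.json and wheels.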

Here's an example of how pip cache info --json could be implemented:

$ pip cache info
Cache info:
  Root: /Users/hugo/Library/Caches/pip
  Wheels: /Users/hugo/Library/Caches/pip/wheels
  Packages: 508
$ pip cache info --json
{"root": "/Users/hugo/Library/Caches/pip", "wheels": "/Users/hugo/Library/Caches/pip/wheels", "packages": 508}

One thing that would potentially affect #6391 is including the root cache path in addition to the wheels path. And renaming the wheels one from "Location" to "Wheels" and adding "Root" (or other names).
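
If output like the proposed --json line above existed, the root could be read without jq. The sketch below parses that exact sample string. Since --json is only a proposal in this thread, the input is hard-coded rather than obtained from pip, and the function name is illustrative.

```python
import json

# Sample output copied from the proposal above; not produced by a real pip flag.
SAMPLE = (
    '{"root": "/Users/hugo/Library/Caches/pip", '
    '"wheels": "/Users/hugo/Library/Caches/pip/wheels", "packages": 508}'
)


def cache_root(info_json: str) -> str:
    """Extract the "root" field from the proposed JSON document."""
    return json.loads(info_json)["root"]


print(cache_root(SAMPLE))
```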

I'm in favor of adding this, and think it makes sense as part of the pip cache command, but I think it'd probably be best as a follow-up PR to #6391.

I agree, I'd like to see #6391 land first and don't wish to delay it.

Just one thing to consider: perhaps rename "Location:" to "Wheels:" in #6391 to avoid doing it later here. But again, bikeshedding/naming also takes time so we can do it here to avoid delaying #6391 :)

And I've created https://github.com/pypa/pip/issues/7372 for https://github.com/pypa/pip/pull/6391#issuecomment-554368620.

https://github.com/pypa/pip/pull/6391 has been merged! 🎉


Two options have been suggested here for this issue: my initial pip cache dir, and @chrahunt's suggestion of pip cache list --json/--format=json.

I don't have a preference, and have updated these branches for each option.

Which one is preferred? I'll open a PR for it and we can discuss specifics. Thank you!

Honestly, I think they're both useful in different contexts.

  • In general, having machine-readable versions of things (as the --json/--format=json flag provides) can be very useful.
  • However, it simultaneously feels a bit silly to require a separate tool (e.g. jq) to parse data out for what is likely the most commonly-wanted value (the cache directory).

Thanks for the comments! I agree.

To start things off, I've created PR https://github.com/pypa/pip/pull/8095 for pip cache dir. It's simpler to implement, and is easier to use for my actual use case: checking the cache dir on a CI.
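
For scripts that would rather stay in Python than shell out by hand, the command could be wrapped as in the sketch below. It assumes a pip version in which pip cache dir is available. The wrapper name and its injectable `run` parameter are illustrative choices, not pip API.

```python
import subprocess
import sys


def pip_cache_dir(run=subprocess.run) -> str:
    """Ask pip for its cache directory by running `pip cache dir`.

    `run` is injectable (a hypothetical convenience) so the call can be
    faked in tests; it defaults to subprocess.run.
    """
    result = run(
        [sys.executable, "-m", "pip", "cache", "dir"],
        check=True,  # raise CalledProcessError if pip exits non-zero
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Invoking pip as `sys.executable -m pip` queries the pip that belongs to the running interpreter, not whichever pip happens to be first on PATH.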
