What's the problem this feature will solve?
From the command-line, I'd like a cross-platform method to get pip's cache directory, which by default is different per OS.
There's currently no supported way to do this.
Describe the solution you'd like
PR https://github.com/pypa/pip/pull/6391 is adding pip cache info
to return the wheels directory, plus some extra info:
$ pip cache info
Cache info:
Location: /Users/hugo/Library/Caches/pip/wheels
Packages: 471
So something like pip cache dir
could be a simplified version of that:
$ pip cache dir
/Users/hugo/Library/Caches/pip
This would be useful for caching with GitHub Actions CI. Right now, the config needs repeating three times, once per OS, which is rather cumbersome (https://github.com/actions/cache/pull/86):
- name: Ubuntu cache
  uses: actions/cache@v1
  if: startsWith(matrix.os, 'ubuntu')
  with:
    path: ~/.cache/pip
    key:
      ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
      }}
    restore-keys: |
      ${{ matrix.os }}-${{ matrix.python-version }}-
- name: macOS cache
  uses: actions/cache@v1
  if: startsWith(matrix.os, 'macOS')
  with:
    path: ~/Library/Caches/pip
    key:
      ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
      }}
    restore-keys: |
      ${{ matrix.os }}-${{ matrix.python-version }}-
- name: Windows cache
  uses: actions/cache@v1
  if: startsWith(matrix.os, 'windows')
  with:
    path: ~\AppData\Local\pip\Cache
    key:
      ${{ matrix.os }}-${{ matrix.python-version }}-${{ hashFiles('**/setup.py')
      }}
    restore-keys: |
      ${{ matrix.os }}-${{ matrix.python-version }}-
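To make the per-OS difference concrete, here is a minimal Python sketch of the lookup that the three workflow steps above encode by hand. The paths are the ones listed in the workflow; the function name, the table, and the OS labels are hypothetical and exist only for illustration. This is not how pip itself resolves its cache directory.

```python
import ntpath
import posixpath

# Default pip cache locations as listed in the workflow above, relative to
# the user's home directory. The keys are hypothetical labels for this sketch.
DEFAULT_CACHE_SUBDIRS = {
    "ubuntu": (posixpath, (".cache", "pip")),
    "macos": (posixpath, ("Library", "Caches", "pip")),
    "windows": (ntpath, ("AppData", "Local", "pip", "Cache")),
}


def default_pip_cache_dir(os_name: str, home: str) -> str:
    """Return the default cache path for a runner OS name such as 'ubuntu-latest'."""
    lowered = os_name.lower()
    for prefix, (path_module, parts) in DEFAULT_CACHE_SUBDIRS.items():
        if lowered.startswith(prefix):
            # Join with the path flavour of the target OS, not the host OS.
            return path_module.join(home, *parts)
    raise ValueError(f"no known default pip cache dir for {os_name!r}")


print(default_pip_cache_dir("ubuntu-latest", "/home/runner"))
```

A single pip cache dir command would remove the need to maintain a table like this in every project.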
Other people also want a cross-platform method:
Alternative Solutions
Use pip's private, internal API, which has changed in the past and may change in the future:
$ python -c "from pip._internal.locations import USER_CACHE_DIR; print(USER_CACHE_DIR)"
/Users/hugo/Library/Caches/pip
Provide --cache-dir
or set the PIP_CACHE_DIR
environment variable to whatever path you like and cache that. Or use --user
to install into the user directory, and cache from there.
However, ideally I'd like not to change pip's behaviour in any way.
We also test on other CIs, and locally, and I'd like pip to use its defaults as much as possible across the board, and have fewer differences across envs.
Alternatively, a --json
/--format=json
could be added to the upcoming pip cache info
, then anyone can extract the desired fields using something like jq
?
Yes, that's one option, and would allow the total number of packages to be consumed programmatically, if anyone needed it.
Although I had suggested this, which doesn't include the wheels
subdirectory:
$ pip cache dir
/Users/hugo/Library/Caches/pip
pip cache info
includes the wheels
subdir:
$ pip cache info
Cache info:
Location: /Users/hugo/Library/Caches/pip/wheels
Packages: 508
Would that be the wrong thing to cache on a CI? Should http/
, selfcheck/
and selfcheck.json
be included or skipped when caching?
```console
$ ls -l /Users/hugo/Library/Caches/pip/
total 48
drwx------ 6 hugo staff 192B 13 Nov 12:09 .
drwx------+ 155 hugo staff 4.8K 14 Nov 16:57 ..
drwx------ 18 hugo staff 576B 8 Nov 2017 http
drwxr-xr-x 46 hugo staff 1.4K 14 Nov 12:20 selfcheck
-rw-r--r-- 1 hugo staff 22K 13 Nov 12:09 selfcheck.json
drwxr-xr-x 214 hugo staff 6.7K 12 Nov 10:22 wheels
```
(That's on macOS, I've not checked the structure for Linux/Windows.)
Here are the 14 lines needed for pip cache dir
, done before you suggested --json
:
With unlimited space, fast cache upload, and a cache that gets re-uploaded after each run, all of the directories you mentioned (except maybe selfcheck, but it should be small) would probably be worth caching. If any of those aren't true, then it depends. Based on the limits described in the docs it's probably not worth thinking about until you hit the limit.
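If the cache limits do become a concern, one way to decide what is worth caching is to measure each top-level entry first. The following is a rough sketch using only the standard library; the function name is hypothetical, and it assumes the cache root contains entries like http, selfcheck and wheels as in the macOS listing above.

```python
import os
from pathlib import Path


def entry_sizes(cache_root):
    """Map each top-level entry under cache_root to its total size in bytes."""
    sizes = {}
    for entry in Path(cache_root).iterdir():
        if entry.is_dir():
            total = 0
            # Walk the whole subtree and add up regular file sizes.
            for dirpath, _dirnames, filenames in os.walk(entry):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat().st_size
    return sizes
```

Calling entry_sizes("/Users/hugo/Library/Caches/pip") on the machine above would return one size per entry, which can then be compared against the CI's cache limit.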
Here's an example of how pip cache info --json
could be implemented:
$ pip cache info
Cache info:
Root: /Users/hugo/Library/Caches/pip
Wheels: /Users/hugo/Library/Caches/pip/wheels
Packages: 508
$ pip cache info --json
{"root": "/Users/hugo/Library/Caches/pip", "wheels": "/Users/hugo/Library/Caches/pip/wheels", "packages": 508}
One thing that would potentially affect #6391 is including the root cache path in addition to the wheels path. And renaming the wheels one from "Location" to "Wheels" and adding "Root" (or other names).
I'm in favor of adding this, and think it makes sense as part of the pip cache
command, but I think it'd probably be best as a follow-up PR to #6391.
I agree, I'd like to see #6391 land first and don't wish to delay it.
Just one thing to consider: perhaps rename "Location:" to "Wheels:" in #6391 to avoid doing it later here. But again, bikeshedding/naming also takes time so we can do it here to avoid delaying #6391 :)
And I've created https://github.com/pypa/pip/issues/7372 for https://github.com/pypa/pip/pull/6391#issuecomment-554368620.
https://github.com/pypa/pip/pull/6391 has been merged! 🎉
Two options have been suggested here for this issue: my initial pip cache dir
and @chrahunt's suggestion of pip cache list --json
/--format=json
.
I don't have a preference, and have updated these branches for each option.
pip cache dir
https://github.com/pypa/pip/compare/master...hugovk:pip-cache-dir2
pip cache list --json
https://github.com/pypa/pip/compare/master...hugovk:pip-cache-json2
Which one is preferred? I'll open a PR for it and we can discuss specifics. Thank you!
Honestly, I think they're both useful in different contexts.
[...] (which the --json/--format=json flag provides) can be very useful, [...] jq) to parse data out for what is likely the most commonly-wanted value (the cache directory).
Thanks for the comments! I agree.
To start things off, I've created PR https://github.com/pypa/pip/pull/8095 for pip cache dir
. It's simpler to implement, and is easier to use for my actual use case: checking the cache dir on a CI.
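As a sketch of that CI use case from a Python script, the helper below shells out to pip and returns the printed directory. It assumes the pip cache dir subcommand from the PR above is available in the installed pip; the function name and the injectable run parameter are hypothetical and are there only to keep the example testable without pip.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def pip_cache_dir(run: Callable[[Sequence[str]], str] = _run) -> Optional[str]:
    """Return pip's cache directory, or None if the subcommand is unavailable."""
    try:
        # Assumes `pip cache dir` exists in the pip tied to this interpreter.
        out = run([sys.executable, "-m", "pip", "cache", "dir"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.strip() or None
```

Returning None instead of raising lets a CI script fall back to an explicit --cache-dir or PIP_CACHE_DIR when running against an older pip.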