I use pip a lot and had never thought about caching, and I find I have a 1.7 GB pip cache.
It would be useful if there were a command that could clear it of items beyond a specified age.
That way I could create a script to run every day to delete anything in the pip cache that is older than a month (and to do the same for unrelated things like yarn etc.).
The pip cache command, which was recently introduced, can list all cached wheel files via pip cache list.
IIUC, this feature would use the file creation time of these wheel files and remove cached wheel files whose creation time is more than x days/weeks/months etc. before the current time.
I am thinking about implementing this by adding a --keep-max-age option to the pip cache command.
It can take a number of days X, and then only removes an item from the cache if its created/accessed date is older than the current date minus X days.
What do you think, @deveshks @pradyunsg?
It's not a whole lot of code, but I am not sure that allowing only days is flexible enough. There may also be a cache item that you use a lot but that was created a while ago, so maybe the access date is preferable.
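To make the flexibility concern concrete, here is a minimal sketch of how an age value with units could be parsed and compared against a timestamp. The `30d` / `2w` / `12h` syntax and the names `parse_max_age` and `is_older_than` are assumptions made for illustration; none of them are part of pip. Months are left out because they have no fixed length.

```python
import re
from datetime import datetime, timedelta

# Hypothetical age syntax: an integer followed by h, d or w
# (hours, days, weeks). This is an illustration, not a pip feature.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_max_age(spec: str) -> timedelta:
    """Turn a string such as '30d' or '2w' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", spec.strip().lower())
    if match is None:
        raise ValueError(f"unsupported age: {spec!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def is_older_than(timestamp: float, max_age: timedelta, now: datetime) -> bool:
    """Return True if a POSIX timestamp lies before now - max_age.

    `now` is expected to be a naive local datetime, e.g. datetime.now().
    """
    return datetime.fromtimestamp(timestamp) < now - max_age
```

For example, `is_older_than(path.stat().st_atime, parse_max_age("30d"), datetime.now())` would flag a file that has not been accessed for more than 30 days.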
Actually I was planning to attempt this, and hence I commented here to get a response from the poster, but it's perfectly fine if you want to go ahead and take a stab at this.
Sorry, maybe I didn't read your post properly because I didn't know what IIUC stands for.
Yes, both proposals seem reasonable.
Apologies for the slang. Just to clarify, by IIUC I meant If I Understand Correctly.
Thanks, strangely this became clear as I was writing the reply. TBH I use a lot of these acronyms myself, so maybe I should be aware that people may be as time-starved as I am when they read my messages :)
You can actually do this if you want, I won't have much time I can devote to this :(
Thanks for that, I will wait for @pradyunsg's thoughts on whether this is something we want to support before attempting it.
Sounds like a good idea. There's a related issue about adding a time-based sort order for pip cache, so it might make sense to tackle that as well!
Is there an issue filed for time-based sort order? I couldn't find a similar issue in the pip tracker.
Also, what would be the preferred approach for trying this out? Should I just start by adding a --keep-max-age option to pip cache purge that takes a number of days to begin with, as suggested by @NoahGorny?
We can delete based on either the created time or the access time, but I am not sure which one to pick. We could play it safe, though, and support both and ask the user to choose between them. Let me know what you think.
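A rough sketch of the "support both and let the user choose" idea follows. The function name `find_stale_files` and the `by` parameter are hypothetical. Note the stated assumption that modification time stands in for creation time, since a true creation time is not exposed on every platform.

```python
import time
from pathlib import Path
from typing import Iterator

# Map a user-facing choice to an os.stat_result field. "modified" stands in
# for "created" here, because a true creation time is not available on every
# platform (on Unix, st_ctime is the metadata-change time).
_STAT_FIELDS = {"accessed": "st_atime", "modified": "st_mtime"}


def find_stale_files(
    root: Path, max_age_days: float, by: str = "accessed"
) -> Iterator[Path]:
    """Yield files under root whose chosen timestamp is older than max_age_days."""
    field = _STAT_FIELDS[by]  # raises KeyError for an unknown choice
    cutoff = time.time() - max_age_days * 86400
    for path in root.rglob("*"):
        if path.is_file() and getattr(path.stat(), field) < cutoff:
            yield path
```

This only selects files and deletes nothing, so a caller can inspect the result before acting on it.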
FWIW https://github.com/pypa/pip/pull/3146 implemented something like that :)
Thanks, I see that there are --not-accessed-since and --remove flags which could be helpful. Let me look at those and see what I can come up with.
@xavfernandez, I have gone ahead and created https://github.com/pypa/pip/pull/8474 to address this issue, based on your PR. I would really appreciate it if you could take a look at it as well.
To respond to some comments on #8474:
1) I agree that we should not add new options lightly
2) But we should also be cautious of the "the same thing can be done with a carefully constructed shell recipe" argument.
Otherwise, with that logic, pip cache dir would have been sufficient and we shouldn't have provided info / list / remove / purge, which are all variants of some clever find `pip cache dir`/wheels -type f | xargs -n1 ... command.
The need to purge the ever-growing wheel cache seems quite natural, and the two currently built-in solutions, purge (the whole cache) and remove, both seem sub-optimal.
Hence the need for a better alternative.
I personally think that a Least Recently Used cache purge strategy is a pragmatic solution: it is both simple to implement and easy for the user to understand (but I'm clearly biased since I was already pushing for this 5 years ago ;).
And I'm open to other proposals ^^
In the meantime, find `pip cache dir`/wheels -type f -atime +90 -delete should remove wheels that have not been accessed in the last 90 days.
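For platforms without find, the one-liner above can be roughly mirrored in Python. This is a sketch under assumptions: the helper names are hypothetical, the `wheels` subdirectory is taken from the command above, and the cutoff is a plain "older than N days" comparison rather than find's exact rounding of `-atime`. On filesystems mounted with noatime, access times may not be updated, so results can differ.

```python
import subprocess
import sys
import time
from pathlib import Path
from typing import List


def pip_wheel_cache_dir() -> Path:
    """Ask pip where its cache lives and return the wheels subdirectory."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "cache", "dir"],
        check=True,
        capture_output=True,
        text=True,
    )
    return Path(result.stdout.strip()) / "wheels"


def delete_not_accessed_since(
    root: Path, days: float, dry_run: bool = True
) -> List[Path]:
    """Delete (or, with dry_run, only report) files not accessed for `days` days."""
    cutoff = time.time() - days * 86400
    stale = [
        p for p in root.rglob("*") if p.is_file() and p.stat().st_atime < cutoff
    ]
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling `delete_not_accessed_since(pip_wheel_cache_dir(), 90)` only reports candidates; passing `dry_run=False` actually removes them.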
Thanks for the response, @xavfernandez. As a middle ground between all the suggestions made there, I was thinking of going with the suggestion in https://github.com/pypa/pip/pull/8474#issuecomment-653445378: output the wheels with their absolute paths, so that a consumer can use those paths to implement their own deletion behaviour.
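As an illustration of what such an output could look like, the sketch below lists wheel files with absolute paths, least recently accessed first, so that a consumer can pick what to delete. The function names and the tab-separated output format are assumptions, not what pip prints.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Tuple


def wheels_by_access_time(root: Path) -> List[Tuple[float, Path]]:
    """Return (atime, absolute path) pairs for *.whl files, oldest access first."""
    entries = [
        (p.stat().st_atime, p.resolve())
        for p in root.rglob("*.whl")
        if p.is_file()
    ]
    return sorted(entries, key=lambda entry: entry[0])


def format_listing(entries: List[Tuple[float, Path]]) -> str:
    """Render one 'YYYY-MM-DD<TAB>absolute-path' line per wheel."""
    return "\n".join(
        f"{datetime.fromtimestamp(atime).date().isoformat()}\t{path}"
        for atime, path in entries
    )
```

A script consuming this listing could split each line on the tab character and decide per path whether to remove the file.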
I'm also biased, as I also want this ticket.
Agree on not getting users to delete things via shell commands, they will find some example on the Web that may not support their OS, and they tend to add sudo :)