Pants: Add a goal to introspect and garbage collect caches.

Created on 12 Nov 2020 · 4 Comments · Source: pantsbuild/pants

This might look like:

  • Adding a goal that reports the disk usage in all cache locations (the LMDB store and named caches in particular)
  • Exposing a way to manually run GC for the cache locations (in alignment with the GC that pantsd does) and/or to clear them completely.
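
As a rough illustration of the reporting half, here is a minimal sketch that sums file sizes under a set of cache directories. The function names (`dir_size_bytes`, `report_cache_usage`) are hypothetical and not part of Pants, and the paths under `~/.cache/pants` are an assumption about the default cache layout. It reports apparent file sizes, which can differ from the blocks actually allocated on disk.

```python
from __future__ import annotations

import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the apparent sizes of all files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable (caches change underneath us); skip it.
                continue
    return total


def report_cache_usage(cache_dirs: dict[str, Path]) -> dict[str, int]:
    """Map each label to its directory size in bytes (0 if the directory is missing)."""
    return {
        label: dir_size_bytes(path) if path.is_dir() else 0
        for label, path in cache_dirs.items()
    }


if __name__ == "__main__":
    # Assumed default locations; adjust if the caches are configured elsewhere.
    base = Path.home() / ".cache" / "pants"
    usage = report_cache_usage(
        {
            "lmdb_store": base / "lmdb_store",
            "named_caches": base / "named_caches",
        }
    )
    for label, size in usage.items():
        print(f"{label}: {size / 1_000_000:.1f} MB")
```

A real goal would presumably read the cache locations from Pants' own options rather than hard-coding them.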

One possibility is a single goal with an optional preview mode, e.g. ./pants gc.

Q42020-idea

All 4 comments

Managing pantsd status is a bit abstract to me: maybe someone else has a clearer idea of what that means.

But the "how much disk space is pants using" point is a bit clearer. I imagine it looking like:

  • Adding a goal that reports the disk usage in all cache locations (the LMDB store and named caches in particular)
  • Exposing a way to manually run GC for the cache locations (in alignment with the GC that pantsd does) and/or to clear them completely.

I've added some information about clearing caches to https://www.pantsbuild.org/v2.3/docs/troubleshooting#cache-or-pantsd-invalidation-issues.

Following on from https://pantsbuild.slack.com/archives/C0105PY6BM5/p1612892422002100

It could be handy to have some goal (or standalone utility, or flag, or something) to garbage collect the LMDB store that, rather than being time-based, uses local action cache entries as GC roots and keeps only the digests referenced by those roots. That would let us prune "we made a copy of every input file" while preserving "we can avoid doing expensive work". Ideally we could also order the action cache entries by time, so that there's some LRU cut-off.
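
To make the idea concrete, here is a minimal in-memory sketch of root-based collection with an LRU cut-off. Everything in it is hypothetical: `ActionCacheEntry`, `plan_gc`, and the `keep_newest` parameter are illustrative names, not Pants APIs, and the sketch assumes each action cache entry already lists every digest it needs (a real store would have to walk directory digests transitively). It only computes a plan, which is roughly what a preview mode would print.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ActionCacheEntry:
    key: str                 # identifier of the cached action result
    last_used: float         # timestamp of the most recent hit
    digests: frozenset[str]  # every digest this entry references (assumed flattened)


def plan_gc(
    entries: Iterable[ActionCacheEntry],
    stored_digests: Iterable[str],
    keep_newest: int,
) -> tuple[list[str], set[str]]:
    """Return (keys of entries kept as roots, digests that are safe to delete)."""
    # LRU cut-off: only the most recently used entries act as GC roots.
    by_recency = sorted(entries, key=lambda e: e.last_used, reverse=True)
    roots = by_recency[: max(keep_newest, 0)]

    # Mark: everything referenced by a surviving root is live.
    live: set[str] = set()
    for entry in roots:
        live |= entry.digests

    # Sweep (planned only): anything stored but not live is garbage.
    garbage = set(stored_digests) - live
    return [entry.key for entry in roots], garbage
```

For example, with entries `a` (recent, referencing `d1` and `d2`) and `b` (older, referencing `d3`), a store holding `d1` through `d4`, and `keep_newest=1`, the plan keeps `a` and marks `d3` and `d4` for deletion. The action cache entries that fall past the cut-off would need to be dropped as well, since their outputs are no longer guaranteed to exist.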

Great idea.

I'm going to go ahead and drop the pantsd aspect from this ticket, because we haven't gotten any feedback elucidating what that might look like.
