The Caddy 2 caching layer is currently a work in progress. It seems like the right time to suggest a cache invalidation API that supports cache tags.
A caching layer improves throughput by reducing bandwidth, latency, and workload. It can apply to less frequently updated content such as a blog, or to more critical components such as an API. The on-demand TLS feature coupled with a great caching and invalidation API would be a set of killer features for SaaS companies and I think would bring a lot of new adopters.
There are three main methods for cache invalidation:
1) By URL: requires a map data structure to identify matching entries to purge.
2) By URL and recursive: requires a prefix tree.
3) By surrogate key/cache tag: requires a multimap.
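To make the third method concrete, here is a minimal sketch of the multimap it requires, written in Go; the type and method names are my own for illustration and don't come from any Caddy code:

```go
package main

import "fmt"

// TagIndex is a sketch of the multimap needed for surrogate-key
// invalidation: each tag maps to the set of cache keys tagged with it.
type TagIndex struct {
	byTag map[string]map[string]struct{}
}

func NewTagIndex() *TagIndex {
	return &TagIndex{byTag: make(map[string]map[string]struct{})}
}

// Add associates a cache key with one or more tags.
func (t *TagIndex) Add(key string, tags ...string) {
	for _, tag := range tags {
		if t.byTag[tag] == nil {
			t.byTag[tag] = make(map[string]struct{})
		}
		t.byTag[tag][key] = struct{}{}
	}
}

// Purge returns every cache key associated with the tag and drops the
// index entry; the caller would then evict those keys from the cache.
func (t *TagIndex) Purge(tag string) []string {
	keys := make([]string, 0, len(t.byTag[tag]))
	for k := range t.byTag[tag] {
		keys = append(keys, k)
	}
	delete(t.byTag, tag)
	return keys
}

func main() {
	idx := NewTagIndex()
	idx.Add("/products/42", "product-42", "tenant-acme")
	idx.Add("/products", "tenant-acme")
	fmt.Println(len(idx.Purge("tenant-acme"))) // both entries share this tag
}
```

A single purge by tag then invalidates every response that was tagged with it, regardless of URL, which is what makes this method fit multi-tenant setups.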
The third method is the most interesting since it supports everything from basic to advanced use cases while not requiring much extra work.
It would be really convenient to have a REST API exposed by Caddy for cache invalidation. Requests could be formulated using the commonly accepted HTTP PURGE method.
The only workaround today is using a solution other than Caddy.
Here are some resources for an introduction to surrogate keys/cache tags:
Yep, this is a good idea, it definitely needs to be a part of Caddy's cache module.
Right now, the cache key is just the request URI: https://github.com/caddyserver/caddy/blob/19e834cf36ed757738ff7504eb8b3fff45f67a79/modules/caddyhttp/httpcache/httpcache.go#L85-L88 (keep in mind that the current implementation is just a PoC)
Since groupcache has no explicit cache invalidation features, all we need to do is encode the information related to cache expiration into the key.
And yes, an admin endpoint would be needed. These are pluggable, as admin endpoints are themselves Caddy modules: https://github.com/caddyserver/caddy/blob/19e834cf36ed757738ff7504eb8b3fff45f67a79/dynamicconfig.go#L29-L54
> The on-demand TLS feature coupled with a great caching and invalidation API would be a set of killer features for SaaS companies and I think would bring a lot of new adopters.
Agreed! We've talked to some CDNs who would go crazy for this combination.
Anyone want to post an end-to-end example of something being added to the cache (including what the key is), an API request that invalidates it, and then the next version being added to the cache and used in its place?
Also looping @maruel in on this conversation
My 2¢ from having worked on fairly distributed systems.
For scalable infrastructures, requiring cache invalidation is generally an anti-pattern. I'd recommend using one of the following in the key:
Using these patterns removes the need for cache coherence between nodes in the caching infrastructure. A good example of what failed for us:
Sure, you could use TTLs (my preference) and/or explicit cache invalidation to work around those issues once they are detected, but then the whole system becomes eventually consistent, especially since all the nodes in the system must have a coherent view of the cache data.
@maruel Interesting thoughts and experience.
My original message did not mention TTL-based eviction since it seems to be a mandatory feature for supporting HTTP Cache-Control headers on an HTTP server. Do you plan to rely on the cache module to handle Cache-Control headers?
What about a basic abstraction layer (maybe one for eviction policies and one for invalidation) and different implementations (e.g. local disk-based, local memory-based, and distributed)? Not all use cases require a distributed setup, nor coordination between nodes. It would be interesting to get more feedback from Caddy users and their use cases. Maybe you already have?
Just to emphasize: my main motivation for creating this issue was a cache invalidation API with modern, actionable purge methods (i.e. surrogate keys). That's a really common and recurring need when working with multi-tenant architectures in the SaaS world.
Even supporting max-age correctly is tricky: How long should the proxy cache it? Should the max-age value be reduced as time passes?
I'm not familiar with proxy implementations but I tend to favor safety over performance: cache for a small amount of time (like 1% of the cache max-age statement) so we don't need to answer the question above.
Not sure if you folks have already looked at https://github.com/mailgun/groupcache
It is a fork of groupcache and supports TTL-based eviction and a way to remove entries (sort of).
@varun06 I did see that once upon a time, but the theoretical guarantees are less strong. I think if we went that route we'd have to do some careful profiling and benchmarking before we decide to go with it. It has to be better than simply encoding expiration into the keys themselves. Also, this issue is a little unsettling (a common gotcha that I've done myself, and something easy to fix, but let's see what the response time is): https://github.com/mailgun/groupcache/issues/14
In the meantime, I've moved the cache handler to a separate repo for further development: https://github.com/caddyserver/cache-handler
I might transfer this issue as well. (or just link to it)
Thanks for the explanation @mholt. I am working on a similar exercise (a distributed cache, although we offload much of the caching to the CDN layer) for a reverse proxy used by a big retailer. It is in a very early phase. If I learn something that might help Caddy, I will contribute or let you folks know.
Moving the discussion to the new repo dedicated to the caching layer: https://github.com/caddyserver/cache-handler/issues/1
Would be happy to have people work on it!