We are missing a caching layer for Querier.
There are multiple design choices we need to make:
AC:
My plan is to start a yolo PoC for this while reusing the awesome code that @tomwilkie created for Cortex: https://sourcegraph.com/github.com/cortexproject/cortex@1d0ff216199e43b7b221774b5cd56936e7d22440/-/blob/pkg/querier/frontend/frontend.go#L103 I hope I can just import it and "run" it =D, but I will probably bump into import issues; we will see.
Initial thoughts? Feedback? This issue is mostly for tracking; a proper proposal will come after a short spike.
Initial thoughts:
Should it be built into Querier, or be a separate proxy?
Built-in as a first step, as we might want to make it more complex in the future, plus it already alters a query a bit (chops it, aligns it) - e.g. mixed caching of results for the Query API and Store API. We can always produce a proxy-like component in the future.
Should we use a memcached backend? Should we support any others?
I have had a very good experience with Memcached so far; we used it everywhere. Again, there will be dependency hell and code scope creep if we allow ANY backend, so we need to be careful.
How should we structure cache items?
:man_shrugging: Need to dive into the Cortex and Trickster caches.
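Pending the dive into Cortex and Trickster, one plausible shape is to key cached results by tenant, query expression, aligned bucket start, and step, hashed to stay within memcached key limits. A hedged sketch; the layout and the `cacheKey` helper are purely illustrative:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// cacheKey derives a short, deterministic key from the query identity.
// Hashing keeps arbitrary-length PromQL expressions within backend key
// limits. All field choices here are hypothetical.
func cacheKey(tenant, query string, bucketStart, step int64) string {
	raw := fmt.Sprintf("%s\x00%s\x00%d\x00%d", tenant, query, bucketStart, step)
	h := sha256.Sum256([]byte(raw))
	return fmt.Sprintf("qr:%x", h[:8])
}

func main() {
	fmt.Println(cacheKey("tenant-a", `sum(rate(http_requests_total[5m]))`, 86400, 60))
}
```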
Should we cache Query API results or actually Store API results?
Results are an easy win for now, but I feel like something in the middle (caching PromQL evaluations) might be better. I think we should start with Query API results, benchmark, and iterate. It is also worth syncing with the Cortex folks on this - they are solving the same problem.
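Caching Query API results usually implies splitting a long range into fixed-size sub-ranges (Cortex splits by day), so that partial cache hits are possible and only missing pieces go upstream. A rough sketch of the splitting step, using plain second timestamps; `splitByInterval` is a hypothetical helper:

```go
package main

import "fmt"

// splitByInterval splits a [start, end) range into sub-ranges of at most
// `interval` width, so each piece can be cached and fetched independently.
// Illustrative sketch of the idea, not the actual Cortex implementation.
func splitByInterval(start, end, interval int64) [][2]int64 {
	var out [][2]int64
	for s := start; s < end; s += interval {
		e := s + interval
		if e > end {
			e = end
		}
		out = append(out, [2]int64{s, e})
	}
	return out
}

func main() {
	day := int64(24 * 3600)
	// A range a little over two days long yields two full-day pieces
	// plus one short remainder.
	for _, r := range splitByInterval(0, 2*day+100, day) {
		fmt.Println(r[0], r[1])
	}
}
```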
Should we do it near the Query API, or on the federated Querier as well?
Query API for now, as we care about caching results as a first step.
Should we just NOT do it and leave this fully to Trickster: https://github.com/Comcast/trickster?
IMO, no, as Trickster is not working well for users, mostly because it lacks an understanding of the partial-response strategies Querier allows. We would also be forced to use results caching only.
From my observation the two most popular external network/cluster cache protocols right now are memcached and Redis. Both have support for self-hosting and cloud providers offer them as a service.
I would suggest sticking to external caching of data that could be shared between multiple query instances.
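One way to get shared external caching without dragging every client library into scope is a small cache interface: a memcached or Redis client would satisfy it and let multiple Querier instances share results. A hypothetical sketch, with an in-memory stand-in so the example runs without a real server; the `Cache` interface and `mapCache` names are made up for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Cache is a minimal interface that both an in-process map and an external
// memcached/Redis client could satisfy. Only the external backends allow
// sharing between query instances.
type Cache interface {
	Set(key string, value []byte) error
	Get(key string) ([]byte, error)
}

// ErrMiss signals a cache miss, regardless of backend.
var ErrMiss = errors.New("cache miss")

// mapCache is an in-memory stand-in used here so the example runs
// without a memcached server.
type mapCache struct {
	mu sync.Mutex
	m  map[string][]byte
}

func newMapCache() *mapCache { return &mapCache{m: map[string][]byte{}} }

func (c *mapCache) Set(key string, value []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key] = value
	return nil
}

func (c *mapCache) Get(key string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.m[key]
	if !ok {
		return nil, ErrMiss
	}
	return v, nil
}

func main() {
	var c Cache = newMapCache()
	c.Set("qr:abc", []byte(`{"status":"success"}`))
	v, err := c.Get("qr:abc")
	fmt.Println(string(v), err == nil)
}
```

Restricting support to one or two backends behind such an interface keeps the dependency surface small while leaving room to add others later.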
Trickster works reasonably OK. There is a `next` branch that should in theory improve a number of things. But, I agree, it doesn't really understand the data model, plus as a "dumb" cache, it can't do cache eviction.

Not bad in terms of deps I guess.
The work to embed caching in Querier is potentially no longer needed, as you can run the Cortex query-frontend on top of any Prometheus query-range API (: You can learn more about this in the meetup video here
It's definitely the way to go; we have already started to run this in production with Thanos (: We are now discussing the possibility of moving query-frontend to a separate, neutral project: here
This will have many benefits, e.g. it will allow us to properly document this and recommend using it. We can also definitely discuss the possibility of embedding this logic inside Querier, but that would be much easier if query-frontend were a separate project (dependencies).