Hi, colleagues!
What is missing?
At the moment there is no target sharding support in the Prometheus Operator.
It would be great to add it.
Why do we need it?
Currently we have thousands of targets in each Prometheus instance, and it seems we're reaching the performance limit of a single node.
Possible solutions that I see are:
1) Prometheus per namespace
2) Use sharding
Both solutions have their own advantages, but at the moment the sharding approach seems a bit better to me.
My proposal is to add a sharding attribute to the ServiceMonitor, e.g. `shard_by: <label>`. This label (or list of labels) would be used as the source label for sharding with `action: hashmod`. The modulus could be configured automatically based on the number of Prometheus instances.
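Purely for illustration, a rough sketch of what that might look like on a ServiceMonitor (note that `shard_by` does not exist today; the field name and placement are only my suggestion):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                 # hypothetical example
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
  # Proposed field (does not exist yet): the operator would generate
  # relabeling rules with action: hashmod on this label, deriving the
  # modulus from the number of Prometheus instances.
  shard_by: __address__
```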
What do you think?
If you need a solution quickly, you can already use additional relabeling rules on your ServiceMonitor via the `hashmod` action and create multiple ServiceMonitors, one per "shard". Your use case makes a lot of sense. I'd like to think it through a little further and arrive at a solution that would eventually allow us to autoscale sharding based on metric ingestion (I'm thinking of a general-purpose way, where a Prometheus object would become a shard, and maybe a ShardedPrometheus object that orchestrates these and can be autoscaled via the HPA). What I'm saying is, maybe the sharding decision should ultimately be configured in the Prometheus object instead of the ServiceMonitor (where it's already possible today, albeit a little manual).
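For anyone who wants to try that workaround right away, here is a minimal sketch of one shard's ServiceMonitor (the `my-app` name, the `metrics` port, and the choice of 3 shards are just assumptions for illustration). Each shard gets its own ServiceMonitor that keeps only the targets whose hashed `__address__` falls into its bucket:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-shard-0        # one ServiceMonitor per shard (0, 1, 2, ...)
  labels:
    shard: "0"                # used by the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: my-app             # hypothetical target Service
  endpoints:
    - port: metrics
      relabelings:
        # Hash the target address into one of 3 buckets.
        - sourceLabels: [__address__]
          modulus: 3
          targetLabel: __tmp_hash
          action: hashmod
        # Keep only the targets that fall into this shard's bucket.
        - sourceLabels: [__tmp_hash]
          regex: "0"
          action: keep
```

The Prometheus object for shard 0 would then select this ServiceMonitor via `serviceMonitorSelector` on the `shard: "0"` label, and you repeat the pair for each shard, changing only the bucket `regex` and the label value.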