Thanos, Prometheus and Golang version used:
thanos: v0.12.0
Object Storage Provider:
private CEPH (S3)
What happened:
Look at the end_input time and resolution:


The staleness handling in the Prometheus library drops some of the points returned by thanos-stores.
What you expected to happen:
All data from the store is returned for any time_range.
How to reproduce it (as minimally and precisely as possible):
See the screenshots.
Full logs to relevant components:
Anything else we need to know:
I think we have a few ways to resolve the problem:
LookbackDelta parameter > 5 min (needs checking)
What is the ceph version?
CEPH does not matter, because thanos-stores return all the data to the querier; the points are only dropped as stale inside the Prometheus library.
Nice, thanks for this. Funnily enough we just talked about this exact problem with @juliusv (:
We need a different lookbackDelta for each resolution I think, right? @juliusv
ref: https://matrix.to/#/!WaUKIfoqfiyWQhenET:matrix.org/$1589300436157120ZqIrC:matrix.org?via=matrix.org
At a minimum, it would be good to add the --query.lookback-delta that we have in Prometheus to Thanos as well. However, since it's a global setting, it would apply to all time series, even the ones that are scraped at intervals <5m. Normally you wouldn't want to set this lookback delta higher than needed for everything, as that will result in old samples being returned for quite a long time (although explicit staleness markers already help with that).
I think we need a dynamic lookback delta that depends on the resolution, for example resolution/2.
Well. The main problem is that we can use different resolution in single PromQL eval (:
So it can be [1h of raw data, 2w of 1h resolution, and 5h of 5m resolution] combined.
So I think we might need to think of something in PromQL itself. @brian-brazil do you know how hard that would be?
Also we can temporarily add lookback delta per query as well :thinking:
Varying resolution within one query is unlikely to work. What I'd do is present PromQL with something that looks real, built from the downsampled data - e.g. here you might provide interpolated samples every 1m.
It kinda depends on what the query is though.
@bwplotka @brian-brazil
Can we choose a solution as soon as possible? I'm working on this problem now, and can implement both solutions...
Can you elaborate more @brian-brazil ? So essentially you would actually for each downsampled data, actually expand it to have samples every 1m, fake interval? :thinking:
What would be the corner cases? Why it depends on query?
Alternatively we could have 3 PromQL engines in Querier and choose which one to use based on the returned data. Then we can evaluate for the given periods and concatenate the results. However, for large steps and intervals it would most likely be bad....
So essentially you would actually for each downsampled data, actually expand it to have samples every 1m, fake interval
Yes, something like that.
Why it depends on query?
For e.g. sum_over_time you need different data than count_over_time to produce the desired result.
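To illustrate why the function matters, here is a small Go sketch that assumes each downsampled window carries count, sum, min and max aggregates of the raw samples it replaced. `aggr` and `overTime` are hypothetical names for this example only, not the Thanos iterator API.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// aggr is a hypothetical per-window summary of the raw samples that were
// collapsed into a single downsampled point.
type aggr struct {
	count float64
	sum   float64
	min   float64
	max   float64
}

// overTime answers a *_over_time function from downsampled windows by
// combining the aggregate that matches the requested function.
func overTime(fn string, windows []aggr) (float64, error) {
	if len(windows) == 0 {
		return 0, errors.New("no data in range")
	}
	var count, sum float64
	lo, hi := windows[0].min, windows[0].max
	for _, w := range windows {
		count += w.count
		sum += w.sum
		lo = math.Min(lo, w.min)
		hi = math.Max(hi, w.max)
	}
	switch fn {
	case "count_over_time":
		return count, nil
	case "sum_over_time":
		return sum, nil
	case "min_over_time":
		return lo, nil
	case "max_over_time":
		return hi, nil
	case "avg_over_time":
		if count == 0 {
			return 0, errors.New("no raw samples in range")
		}
		return sum / count, nil
	}
	return 0, fmt.Errorf("unsupported function %q", fn)
}

func main() {
	windows := []aggr{
		{count: 2, sum: 10, min: 3, max: 7},
		{count: 3, sum: 30, min: 5, max: 15},
	}
	for _, fn := range []string{"count_over_time", "sum_over_time", "avg_over_time"} {
		v, err := overTime(fn, windows)
		fmt.Println(fn, v, err)
	}
}
```

A single stream of fake samples cannot serve both functions at once: feeding the per-window sums to count_over_time, or the counts to sum_over_time, gives the wrong answer, so the aggregate has to be chosen per function.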
Looks like @IKSIN we could try that in querier.go
Ok! I'll try to do it )
I am pretty sure we need a special iterator for downsampled chunks.
For e.g. sum_over_time you need different data than count_over_time to produce the desired result.
This is already well handled.
Hello 👋 Looks like there was no activity on this issue for last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.
Closing for now as promised, let us know if you need this to be reopened! 🤗
We're still working on this, let's reopen.
BTW do you know we can now configure the staleness lookback delta?
However we might want to adjust it for different resolutions indeed
@bwplotka As I remember, the staleness lookback delta is not something new. Or was it changed recently somehow?
We just allow users to configure it on the Querier via a flag, that's it.
BTW do you know we can now configure the staleness lookback delta?
However we might want to adjust it for different resolutions indeed
Well, here's my attempt at it: https://github.com/thanos-io/thanos/pull/3277