Thanos: compact: Ensure downsampled chunks are not larger than 120 samples; stream downsampling more

Created on 30 Apr 2020 · 18 comments · Source: thanos-io/thanos

During bug fixing on https://github.com/thanos-io/thanos/pull/2528 I found that downsampling always encodes whatever is given in the block into huge chunks. This can lead to inefficiency at query time when only a small part of the chunk data is needed, but the store gateway has to fetch and decode everything.

See: https://github.com/thanos-io/thanos/blob/55cb8ca38b3539381dc6a781e637df15c694e50a/pkg/compact/downsample/downsample.go#L141

AC:

  • Downsampled chunks are not larger than 120 samples
  • Chunks are expanded on demand in the iterator.
Labels: medium, feature request / improvement, help wanted

All 18 comments

If no one is working on this issue, can I have a try? @bwplotka

I assume you are not on it right? (: Just making sure, so others can help as well (:

Yeah, apologies for keeping this blocked, I am not working on it currently 😅

I am happy to work on this.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗

Still relevant AFAIK.

Hi, I still want to work on this. I will get back to you guys soon.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗


_Is this still valid? If not, which PR fixed it?_

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

_I think this is still valid_

Hi, two questions I have about this issue.

  1. There is already a limit of 140 samples per chunk for downsampled chunks here. I guess the performance with 140 should be similar to 120 samples?

  2. What's the benefit of "Chunks are expanded on demand in iterator"? I see that we need to expand all the chunks before the downsampling anyway. So what is the point of doing it on demand?

Thanks!

  1. There is already a limit of 140 samples per chunk for downsampled chunks here. I guess the performance with 140 should be similar to 120 samples?

Yes, performance should be similar and 140 should not be a problem. However, I'm wondering if that 140 value was a "typo" of 120 or intentional 🤔

  1. It does not matter whether it's 120 or 140. If there is a check, I might have missed it and all good (: Let's double-check.

  2. What's the benefit of "Chunks are expanded on demand in iterator"? I see that we need to expand all the chunks before the downsampling anyway. So what is the point of doing it on demand?

What I mean is to not expand all the chunks first and only then process the query. Instead, expand one chunk, process its samples, then expand the next one and process those, and so on (streaming).
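The streaming idea above can be sketched as an iterator that decodes one chunk only when the previous one is exhausted. This is a simplified illustration, not the actual Thanos iterator API; `chunk` and `chunkIterator` are hypothetical names:

```go
package main

import "fmt"

// chunk is a stand-in for an encoded chunk; Iterator decodes it.
type chunk struct{ samples []float64 }

func (c chunk) Iterator() []float64 { return c.samples }

// chunkIterator walks samples across many chunks, decoding each chunk
// lazily instead of expanding all chunks up front.
type chunkIterator struct {
	chunks []chunk
	cur    []float64
	i      int
}

// Next returns the next sample and false when all chunks are exhausted.
func (it *chunkIterator) Next() (float64, bool) {
	for len(it.cur) == 0 {
		if it.i >= len(it.chunks) {
			return 0, false
		}
		it.cur = it.chunks[it.i].Iterator() // expand only this chunk
		it.i++
	}
	v := it.cur[0]
	it.cur = it.cur[1:]
	return v, true
}

func main() {
	it := &chunkIterator{chunks: []chunk{{[]float64{1, 2}}, {[]float64{3}}}}
	for v, ok := it.Next(); ok; v, ok = it.Next() {
		fmt.Println(v) // prints 1, 2, 3 one line at a time
	}
}
```

The benefit is bounded memory: at any moment only one decoded chunk is held, rather than the expansion of every chunk in the block.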
