Thanos: compact: Ensure downsampled chunks are not larger than 120 samples; stream downsampling more

Created on 30 Apr 2020 · 18 comments · Source: thanos-io/thanos

During bug fixing on https://github.com/thanos-io/thanos/pull/2528 I found that downsampling always encodes whatever is given in the block into huge chunks. This can lead to inefficiency at query time when only a small part of the chunk data is needed, but the store gateway has to fetch and decode everything.

See: https://github.com/thanos-io/thanos/blob/55cb8ca38b3539381dc6a781e637df15c694e50a/pkg/compact/downsample/downsample.go#L141

AC:

  • Downsampled chunks are not larger than 120 samples
  • Chunks are expanded on demand in the iterator.
Labels: medium, feature request / improvement, help wanted

All 18 comments

If no one is working on this issue, can I have a try? @bwplotka

I assume you are not on it right? (: Just making sure, so others can help as well (:

Yeah, apologies for keeping this blocked, I am not working on it currently 😅

I am happy to work on this.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗

Still relevant AFAIK.

Hi, I still want to work on this. I will get back to you guys soon.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗


_Is this still valid? If not, which PR fixed it?_

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

_I think this is still valid_

Hi, two questions I have about this issue.

  1. There is already a limit of 140 samples per chunk for downsampled chunks here. I guess the performance with 140 should be similar to 120 samples?

  2. What's the benefit of "Chunks are expanded on demand in iterator"? I see that we need to expand all the chunks before the downsampling anyway. So what is the point of doing it on demand?

Thanks!

  1. There is already a limit of 140 samples per chunk for downsampled chunks here. I guess the performance with 140 should be similar to 120 samples?

Yes, performance should be similar and 140 should not be a problem. However, I'm wondering if that 140 value was a "typo" of 120 or intentional 🤔

  1. It does not matter whether it's 120 or 140. If there is a check, I might have missed it and all good (: Let's double-check.

  2. What's the benefit of "Chunks are expanded on demand in iterator"? I see that we need to expand all the chunks before the downsampling anyway. So what is the point of doing it on demand?

What I mean is to not expand all the chunks first and only then process the query. Instead, expand one chunk, process its samples, then expand the next one and process those, and so on (streaming).
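The streaming idea above can be sketched as an iterator that decodes one chunk only when the previous one is exhausted. This is a simplified illustration, not the actual Thanos iterator API; `chunk` and `chunkIterator` are hypothetical names:

```go
package main

import "fmt"

// chunk is a stand-in for an encoded chunk; Iterator decodes it.
type chunk struct{ samples []float64 }

func (c chunk) Iterator() []float64 { return c.samples }

// chunkIterator walks samples across many chunks, decoding each chunk
// lazily instead of expanding all chunks up front.
type chunkIterator struct {
	chunks []chunk
	cur    []float64
	i      int
}

// Next returns the next sample and false when all chunks are exhausted.
func (it *chunkIterator) Next() (float64, bool) {
	for len(it.cur) == 0 {
		if it.i >= len(it.chunks) {
			return 0, false
		}
		it.cur = it.chunks[it.i].Iterator() // expand only this chunk
		it.i++
	}
	v := it.cur[0]
	it.cur = it.cur[1:]
	return v, true
}

func main() {
	it := &chunkIterator{chunks: []chunk{{[]float64{1, 2}}, {[]float64{3}}}}
	for v, ok := it.Next(); ok; v, ok = it.Next() {
		fmt.Println(v) // prints 1, 2, 3 one line at a time
	}
}
```

The benefit is bounded memory: at any moment only one decoded chunk is held, rather than the expansion of every chunk in the block.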
