Thanos: compact: Add a warning to the compact docs that --retention.resolution-raw=xd removes raw objects from the object store, keeping only x days of raw data.

Created on 19 Mar 2020 · 10 Comments · Source: thanos-io/thanos

Thanos version: https://github.com/thanos-io/thanos/releases/tag/v0.11.0-rc.1

Object Storage Provider:
s3

What happened:
Setting --retention.resolution-raw=xDay removed all raw data older than x days from the object store.

What you expected to happen:
From how I read the retention docs, I got the impression that all raw data would remain in the object store, and that queries would use raw data for the first xDay, then 5m data up to yDay, then 1h data up to zDay.

How to reproduce it (as minimally and precisely as possible):
Seed the object store with 60d of data
Add the following config to compact
--retention.resolution-raw=7d
--retention.resolution-5m=14d
--retention.resolution-1h=30d
Deploy compact
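
To illustrate the effect of the flags above, here is a minimal sketch that treats each retention flag as "delete blocks of this resolution once they are older than the limit". The function name and the day-based model are assumptions made for illustration; the sketch ignores when downsampled blocks are actually created and is not the Compactor's implementation.

```python
# Illustrative model only: each retention value is the maximum age (in days)
# at which blocks of that resolution are still kept.
RETENTION_DAYS = {"raw": 7, "5m": 14, "1h": 30}  # values from the flags above


def not_yet_expired(age_days, retention=RETENTION_DAYS):
    """Return the resolutions whose retention has not expired for data of this age."""
    return [res for res, limit in retention.items() if age_days <= limit]


if __name__ == "__main__":
    for age in (3, 10, 20, 45):
        print(f"data {age}d old -> {not_yet_expired(age)}")
```

Under this model, data that is 10 days old is no longer available as raw samples, which matches what was observed in the object store.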

feature request / improvement good first issue help wanted stale

All 10 comments

This could be the same issue as https://github.com/thanos-io/thanos/issues/1674.

I don't think that triggering that warning in that situation would get the correct message across. When I was implementing Thanos, I would have shrugged at such a message and thought "yup, that's exactly what I want".

The 'brokenness' does not occur when you set some arbitrary _raw_ retention value, the problem occurs when the values for 'raw', '5m' and '1h' _differ_. If 'raw' retention is lower than '5m' or '1h', then short-span graphs will be unexpectedly empty beyond the raw retention point. Root cause: the Thanos Querier chooses data granularity based on the width of the timespan, not the age of the data.

Suggestion: warn if retention values differ between raw, 5m, 1h. The message should be something like this:

Downsampling is not intended to save on storage space by removing raw data after a short period. Using this configuration will break short timespan graphs beyond the raw retention period because the Querier chooses data granularity based on the timespan, not the age of the data. The purpose of downsampling is to reduce the number of data points for long timespans (e.g. weeks, months).
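
A check along these lines could be sketched as follows. This is only an illustration, not Thanos code: `parse_duration` and `retention_warning` are hypothetical helpers, only single-unit durations such as `7d` are handled, and the sketch warns when raw retention is shorter than either downsampled retention (the case described above) rather than on any difference.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text):
    """Parse a single-unit duration such as '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def _effective(seconds):
    # Assumption: a retention of 0 means "keep forever".
    return float("inf") if seconds == 0 else seconds


def retention_warning(raw, five_m, one_h):
    """Return a warning string if raw retention is shorter than 5m or 1h retention."""
    raw_s, five_s, one_s = (_effective(parse_duration(v)) for v in (raw, five_m, one_h))
    if raw_s < five_s or raw_s < one_s:
        return (
            "raw retention is shorter than downsampled retention: "
            "short timespan graphs may be empty beyond the raw retention period"
        )
    return None


if __name__ == "__main__":
    print(retention_warning("7d", "14d", "30d"))
    print(retention_warning("30d", "30d", "30d"))
```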

As @bwplotka pointed out, this is already documented here: https://thanos.io/components/compact.md/#downsampling-resolution-and-retention

Keep in mind, that the initial goal of downsampling is not saving disk space (Read further for elaboration on storage space consumption). The goal of downsampling is providing an opportunity to get fast results for range queries of big time intervals like months or years. In other words, if you set --retention.resolution-raw less than --retention.resolution-5m and --retention.resolution-1h - you might run into a problem of not being able to “zoom in” to your historical data.

I think the documentation is good. Raw data is not required if you want to zoom in to your data; you can also zoom in to 5m resolution data.
If you're working with Grafana, you should disable "auto-downsampling" in the Querier and work with several Grafana Prometheus datasources. Each Prometheus datasource should have a different value for max_source_resolution in CustomQueryParameters. CustomQueryParameters requires at least Grafana 6.5; take a look at this nice feature (https://github.com/grafana/grafana/pull/19121).
Grafana dashboard templates should help you get to this setup. The time value for a rate function should also be variable.

We should warn the user if retention.resolution-raw is too low to create other resolutions.

@Reamer would you kindly clarify why the Compactor auto-downsampling option doesn't work with Grafana?

Is it only about being specific about the resolution you want to query? So I set up several Grafana datasources to the same thanos, each with different max_source_resolution.

But dashboards have variable time ranges. So the query author doesn't know if it will be used for 1 day or 1 year...

Hi @belm0
auto-downsampling is an option of the Querier component (see the relevant documentation).
Grafana uses this component when you query Prometheus data stored with Thanos.
Grafana also sends step information so that a suitable resolution can be chosen (see the Prometheus datasource).

If you do not want to keep raw data for months because you have limited storage space, but still want to zoom in to your data at a resolution of 5m or 1h, the step / 5 calculation used by auto-downsampling does not work.
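
The step / 5 calculation mentioned above can be sketched roughly like this. It is a simplified model under stated assumptions: the function names are hypothetical, and the rule "pick the coarsest resolution that does not exceed step / 5" is an approximation of the Querier's behaviour, not its actual code.

```python
# Available resolutions in seconds (raw is modelled as 0).
RESOLUTIONS_S = [("raw", 0), ("5m", 300), ("1h", 3600)]


def auto_max_source_resolution(step_s):
    """With auto-downsampling, the maximum source resolution is step / 5."""
    return step_s / 5


def preferred_resolution(step_s):
    """Pick the coarsest resolution that does not exceed step / 5."""
    limit = auto_max_source_resolution(step_s)
    chosen = "raw"
    for name, resolution_s in RESOLUTIONS_S:
        if resolution_s <= limit:
            chosen = name
    return chosen


if __name__ == "__main__":
    for step in (60, 1500, 18000):
        print(f"step={step}s -> {preferred_resolution(step)}")
```

In this model a small step (a zoomed-in graph) always asks for raw data, regardless of how old the data is, so a zoomed-in view of a period older than the raw retention comes back empty.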

Is it only about being specific about the resolution you want to query? So I set up several Grafana datasources to the same thanos, each with different max_source_resolution.

Yes, you are correct.

But dashboards have variable time ranges. So the query author doesn't know if it will be used for 1 day or 1 year...

The time ranges in functions such as irate or rate should also be made variable via Grafana dashboard templating.

Hello 👋 Looks like there was no activity on this issue for last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity for next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

help wanted, very needed (:

Closing for now as promised, let us know if you need this to be reopened! 🤗
