Timescaledb: Automatic rollup table selection

Created on 26 Feb 2019 · 3Comments · Source: timescale/timescaledb

I would like to zoom in Grafana from several months timespans to seconds and back with data sampled several times per second.

Given a table with raw data I can create one or several rollup tables to speed things up, but the query should fetch data from the correct table based on the timespan.

Grafana provides variable $__interval_ms that we can use in the query:

...
FROM (
  SELECT ...
    FROM data_raw
    WHERE $__interval_ms < use_rollup_threshold
  UNION
    SELECT ...
    FROM data_rolledup
    WHERE $__interval_ms >= use_rollup_threshold
)

This works, but grows unwieldy.

How about infastructure that would allow you to specify a rollup table family:

table_name| bucket_size
=======================
foo       | NULL
foo_1min  | 60000
foo_1hour | 3600000
foo_1day  | 86400000
foo_1week | 604800000

and a way to target it in a query

...
FROM
  rollup_table('foo', $__interval_ms)

and it would automatically select the appropriate table to query.

Most helpful comment

TLDR

It is possible to use a Grafana query variable, refreshed On Time Range Change, that calls a PL/pgSQL function to automatically determine the best aggregate view to query. This seems like a fairly effective workaround for this issue (and is IMHO better than alternative suggested in both this blog post and this video using UNION ALL) but still has at least one drawback (increased latency) compared to how @tkurki framed the feature request.

Inspiration for this workaround was found in the comments on a Grafana issue relating to InfluxDB https://github.com/grafana/grafana/issues/4262

View and Table Names

To avoid maintaining a mapping from bucket sizes to view names, this examples code encodes the aggregate view's bucket sizes in the name. All aggregate views are postfixed with an underscore, a quantity (a number) and a multiplier (the letters: s, m, h, d) representing the bucket width. A view with rides aggregated in 1 hour buckets (hourly) would be called _rides_1h_. A view with rides aggragated every 10 minutes would be called _rides_10m_. The postfix should basically agree with the bucket_width argument to the time_bucket function.

The naming scheme can be altered by modifying the regular expressions in the example code.

PL/pgSQL functions

The function get_aggregate_view_for_interval_ms can be called from Grafana to determine the correct view name for a given interval (i.e. the interval defined in the Grafana variable $__interval_ms).

Example:

SELECT get_aggregate_view_for_interval_ms('rides', $__interval_ms);

Arguments:

name the base name of the aggregate views
interval_ms the desired bucket width in ms, e.g. Grafana's $__interval_ms variable

CREATE OR REPLACE FUNCTION public.get_aggregate_view_for_interval_ms(name information_schema.sql_identifier, interval_ms bigint)
 RETURNS information_schema.sql_identifier
 LANGUAGE plpgsql
STABLE
AS $function$
DECLARE
  result character varying := (SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_type = 'VIEW'
      AND table_name ~ (name || '_\d+[smhd]')
      AND interval_ms >= get_bucket_width_ms(table_name)
    ORDER BY get_bucket_width_ms(table_name) DESC
    LIMIT 1);
BEGIN
  -- If a aggragate view was found that has an appropriate
  -- bucket width for this interval, return it
  IF result IS NOT NULL THEN
    RETURN result;
  END IF;   

  -- Otherwise return the view with the smallest bucket width
  RETURN (SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_type = 'VIEW'
      AND table_name ~ (name || '_\d+[smhd]')
    ORDER BY get_bucket_width_ms(table_name)
    LIMIT 1);
END; $function$;

get_bucket_width_ms is used by the above function to extract the bucket width (in milliseconds) from a view name.

CREATE OR REPLACE FUNCTION public.get_bucket_width_ms(name information_schema.sql_identifier)
 RETURNS bigint
 LANGUAGE plpgsql
IMMUTABLE
AS $function$
DECLARE
  quantity integer := SUBSTRING(name FROM '(\d+)[smhd]$');
  unit char := SUBSTRING(name FROM '\d+([smhd])$');
  mult bigint := CASE
    WHEN unit = 's' THEN 1
    WHEN unit = 'm' THEN 1 * 60
    WHEN unit = 'h' THEN 1 * 60 * 60
    WHEN unit = 'd' THEN 1 * 60 * 60 * 24
  END;
BEGIN
  RETURN quantity * mult * 1000;
END; $function$;

Example

Assuming you have the ride data from the tutorial, create the same views as in the blog post (but using the naming scheme described above):

CREATE VIEW rides_1d
WITH (timescaledb.continuous, timescaledb.refresh_interval = '1 day')
AS
    SELECT time_bucket('1 day', pickup_datetime), COUNT(*) AS ride_count
    FROM rides
    GROUP BY 1;

CREATE VIEW rides_1h
WITH (timescaledb.continuous, timescaledb.refresh_interval = '1 hour')
AS
    SELECT time_bucket('1 hour', pickup_datetime), COUNT(*) AS ride_count
    FROM rides
    GROUP BY 1;

CREATE VIEW rides_10m
WITH (timescaledb.continuous, timescaledb.refresh_interval = '10 minutes')
AS
    SELECT time_bucket('10 minutes', pickup_datetime), COUNT(*) AS ride_count
    FROM rides
    GROUP BY 1;

The get_aggregate_view_for_interval_ms can now be used to select the view name most appropriate for a given interval. E.g.: with an interval of one minute, 59 minutes, one hour, two hours, and one day:

nyc_data=# SELECT get_aggregate_view_for_interval_ms('rides', 1 * 60 * 1000);
-[ RECORD 1 ]----------------------+----------
get_aggregate_view_for_interval_ms | rides_10m

nyc_data=# SELECT get_aggregate_view_for_interval_ms('rides', 59 * 60 * 1000);
-[ RECORD 1 ]----------------------+----------
get_aggregate_view_for_interval_ms | rides_10m

nyc_data=# SELECT get_aggregate_view_for_interval_ms('rides', 60 * 60 * 1000);
-[ RECORD 1 ]-----------------+---------
aggregate_view_for_resolution | rides_1h

nyc_data=# SELECT get_aggregate_view_for_interval_ms('rides', 2 * 60 * 60 * 1000);
-[ RECORD 1 ]-----------------+---------
aggregate_view_for_resolution | rides_1h

nyc_data=# SELECT get_aggregate_view_for_interval_ms('rides', 24 * 60 * 60 * 1000);
-[ RECORD 1 ]-----------------+---------
aggregate_view_for_resolution | rides_1d

Grafana

To use the function to select the correct view in Grafana:

Create the above functions
Ensure that the continuous aggregates are named as above and that the user Grafana is using has at least the SELECT privilege on the continuous aggregates, e.g.:
```
GRANT SELECT ON rides_1d TO grafana;
GRANT SELECT ON rides_1h TO grafana;
GRANT SELECT ON rides_10m TO grafana;
```
(Views that the user does not have permissions to access will not be returned by get_aggregate_view_for_interval_ms)

Then, to make a template variable to use in queries:

Go to the _Dashboard settings_ and select _Variables_
Add a new _Query variable_ (see the Grafana docs for details)
Give the variable a _Name_ that will be used in queries, e.g.: rides_view_name
Select the appropriate _Data source_, e.g.: PostgreSQL
Set _Refresh_ to On Time Range Change
Add the _Query_, e.g: SELECT get_aggregate_view_for_interval_ms('rides', $__interval_ms);
Optionally set _Hide_ to Variable to avoid showing the template variable in the dashboard (though it can be useful for debugging, so you might want to wait with that until everything works as expected)

Finally, update the panel to use the new template variable:

Edit the panel
Use the template variable (e.g. $rides_view_name) in the From field in the query builder, or edit the SQL and use the template variable in the raw SQL, e.g.:
```
SELECT
 time_bucket AS "time",
 ride_count AS "ride_count"
FROM $rides_view_name
WHERE
 $__timeFilter(time_bucket)
ORDER BY 1
```

Note: for the example 'rides' dataset there really isn't enough data for the rides_1d aggregate view to be very useful given the intervals that Grafana (by default) requests. The rides_1d is only used when zoomed out to a time range of around 5 years or more.

Drawbacks

The major issue with this approach is that it introduces some latency for each template variable set up to refresh _On Time Range Change_ (as these are performed before any queries for actual data are issued). This problem will be compounded for dashboards that use many different continuous aggregates and therefore require many template variables that refresh _On Time Range Change_ (Grafana might parallelise these requests, but I haven't checked if this is the case).

When using the query builder, if you change the From field in an existing panel to use the template variable, the query builder resets (even though the value of the template variable is the same as the view that is being replaced).

The example code here assumes that continuous aggregates at all resolutions contain the full dataset. If data is dropped out of (for example) high resolution continuous aggregates then the get_aggregate_view_for_interval_ms would need to grow the ability to ignore views that don't contain data for the currently viewed time period.

clj on 22 Dec 2020

👍3

All 3 comments

@tkurki I agree that this type of functionality would be particularly useful, and we hope to offer it sometime in the next several releases. Thanks!

mfreed on 27 Feb 2019

Hi,
Any progress on this?

k1ng440 on 8 Jun 2019