Timescaledb: time_bucket - support local time zones / calendar days

Created on 28 Mar 2018  路  6Comments  路  Source: timescale/timescaledb

time_bucket is a really useful and quick function, but for time periods of a day or more, it's important that the local time zone is considered.

For example, I've a series of measurements with timestamps recorded in UTC for a location in timezone Canada/Eastern (Toronto) e.g. 2018-03-28 03:00:00 and I want to aggregate them by day and week.

By local day
select time_bucket('1 day', '2018-03-26 03:00:00'::timestamptz)::date as local_date
gives 2018-03-26, but in Toronto's timezone it should be 2018-03-25.

Currently I have to store an additional column with the calculated local date rather than rely on the output of time_bucket.
select timezone('Canada/Eastern', '2018-03-26 03:00:00'::timestamptz)::date as local_date
gives 2018-03-25

By local week
In terms of week numbers, the timezone also is important.
select to_char('2018-03-26 03:00:00'::timestamptz, 'IYYY/IW') as week_num
gives 2018/13

Where as in the local timezone, it should be the week before.
select to_char(timezone('Canada/Eastern', '2018-03-26 03:00:00'::timestamptz), 'IYYY/IW') as iso_week
gives 2018/12

Suggested syntax
Provide an optional timezone - defaulting to UTC if not specified.

If 'day' provided, return a date type result.
select time_bucket('day', ts_col, 'Canada/Eastern') from meas;

If 'week' provided, return a text type result formatted as 'IYYY/IW'
select time_bucket('week', ts_col, 'Canada/Eastern') from meas;

If the time_bucket function can't return different types, then add day_bucket and week_bucket functions instead.

These changes to time_bucket would avoid storing lots of pre-calculated local date and week values, and simplify aggregation query generation.

Thanks!

Time Bucket community-request enhancement

Most helpful comment

@andrew-blake I just wanted to point out two possibilities while we consider this feature:
1) Use SET TIMEZONE which sets the time zone for the session (overriding the server setting for the session). Then you could use

SET TIMEZONE 'Canada/Eastern';
SELECT time_bucket('2 hours', timetz::TIMESTAMP) AS five_min,
    avg(cpu)
  FROM metrics
  GROUP BY five_min
  ORDER BY five_min DESC LIMIT 10;

and it would convert the time to the right timezone before passing it to time_bucket.
2) You can convert the timestamptz to a timestamp at a particular timezone with AT TIME ZONE

SELECT time_bucket('2 hours', timetz AT TIME ZONE 'Canada/Eastern') AS five_min,
    avg(cpu)
  FROM metrics
  GROUP BY five_min
  ORDER BY five_min DESC LIMIT 10;

All 6 comments

@andrew-blake just to make sure, are the timestamps you are storing TIMESTAMP or TIMESTAMPTZ types? Also have you looked at date_trunc that function is meant for operations on locals time in contrast time_bucket operates on UTC when give a TIMESTAMPTZ (https://docs.timescale.com/latest/api#time_bucket). If you can provide more context about how the underlying data is stored (with or without timezone) I can perhaps provide more help.

Yes, the timestamps are all expressed in UTC timezone and stored in the timezone aware TIMESTAMPTZ data type.

Btw, the server time is UTC, so can't use the suggestion from the docs, as I need to perform the date bucketing in an arbitrary time zone that differs from UTC.

Bucketing a TIMESTAMPTZ at local time instead of UTC(see note above):

SELECT time_bucket('2 hours', timetz::TIMESTAMP) AS five_min,
    avg(cpu)
  FROM metrics
  GROUP BY five_min
  ORDER BY five_min DESC LIMIT 10;

Note that the above cast to TIMESTAMP converts the time to local time according to the server's timezone setting.

@andrew-blake I just wanted to point out two possibilities while we consider this feature:
1) Use SET TIMEZONE which sets the time zone for the session (overriding the server setting for the session). Then you could use

SET TIMEZONE 'Canada/Eastern';
SELECT time_bucket('2 hours', timetz::TIMESTAMP) AS five_min,
    avg(cpu)
  FROM metrics
  GROUP BY five_min
  ORDER BY five_min DESC LIMIT 10;

and it would convert the time to the right timezone before passing it to time_bucket.
2) You can convert the timestamptz to a timestamp at a particular timezone with AT TIME ZONE

SELECT time_bucket('2 hours', timetz AT TIME ZONE 'Canada/Eastern') AS five_min,
    avg(cpu)
  FROM metrics
  GROUP BY five_min
  ORDER BY five_min DESC LIMIT 10;

Sorry for the delay in responding.

Unfortunately, I can't use option 1) as we use PgBouncer in front of Postgres with transaction pooling and queries for multiple timezones occurring concurrently. However, 2) works - thanks!

      to_char(timezone('Canada/Eastern', time_tz), 'IYYY/IW') as week_num_local
    , time_bucket('24 hours'::interval, timezone('Canada/Eastern', time_tz))::date as date_local
    , count(*)
from meas
where time_tz >= (timezone('Canada/Eastern', '2018-05-03'::timestamp))::timestamptz
    and time_tz < (timezone('Canada/Eastern', '2018-05-09'::timestamp))::timestamptz
group by 1, 2
order by 1, 2;

In terms of performance, would it be quicker if the time_bucket function performed the timezone conversion directly rather than calling the timezone function in the SQL ? Similarly with the date casting on the result ?

I still think it would be useful to provide a variant of the time_bucket function to return a date type, and possibly also week numbers.

Duplicate of #414

This also works with time_bucket_gapfill

Was this page helpful?
0 / 5 - 0 ratings

Related issues

thaxy picture thaxy  路  4Comments

ya-jeks picture ya-jeks  路  3Comments

sanpa1977 picture sanpa1977  路  5Comments

100milliongold picture 100milliongold  路  5Comments

shane-axiom picture shane-axiom  路  4Comments