time_bucket is a really useful and quick function, but for time periods of a day or more, it's important that the local time zone is considered.
For example, I've a series of measurements with timestamps recorded in UTC for a location in timezone Canada/Eastern (Toronto) e.g. 2018-03-28 03:00:00 and I want to aggregate them by day and week.
By local day
select time_bucket('1 day', '2018-03-26 03:00:00'::timestamptz)::date as local_date
gives 2018-03-26, but in Toronto's timezone it should be 2018-03-25.
Currently I have to store an additional column with the calculated local date rather than rely on the output of time_bucket.
select timezone('Canada/Eastern', '2018-03-26 03:00:00'::timestamptz)::date as local_date
gives 2018-03-25
By local week
In terms of week numbers, the timezone also is important.
select to_char('2018-03-26 03:00:00'::timestamptz, 'IYYY/IW') as week_num
gives 2018/13
Where as in the local timezone, it should be the week before.
select to_char(timezone('Canada/Eastern', '2018-03-26 03:00:00'::timestamptz), 'IYYY/IW') as iso_week
gives 2018/12
Suggested syntax
Provide an optional timezone - defaulting to UTC if not specified.
If 'day' provided, return a date type result.
select time_bucket('day', ts_col, 'Canada/Eastern') from meas;
If 'week' provided, return a text type result formatted as 'IYYY/IW'
select time_bucket('week', ts_col, 'Canada/Eastern') from meas;
If the time_bucket function can't return different types, then add day_bucket and week_bucket functions instead.
These changes to time_bucket would avoid storing lots of pre-calculated local date and week values, and simplify aggregation query generation.
Thanks!
@andrew-blake just to make sure, are the timestamps you are storing TIMESTAMP or TIMESTAMPTZ types? Also have you looked at date_trunc that function is meant for operations on locals time in contrast time_bucket operates on UTC when give a TIMESTAMPTZ (https://docs.timescale.com/latest/api#time_bucket). If you can provide more context about how the underlying data is stored (with or without timezone) I can perhaps provide more help.
Yes, the timestamps are all expressed in UTC timezone and stored in the timezone aware TIMESTAMPTZ data type.
Btw, the server time is UTC, so can't use the suggestion from the docs, as I need to perform the date bucketing in an arbitrary time zone that differs from UTC.
Bucketing a TIMESTAMPTZ at local time instead of UTC(see note above):
SELECT time_bucket('2 hours', timetz::TIMESTAMP) AS five_min,
avg(cpu)
FROM metrics
GROUP BY five_min
ORDER BY five_min DESC LIMIT 10;
Note that the above cast to TIMESTAMP converts the time to local time according to the server's timezone setting.
@andrew-blake I just wanted to point out two possibilities while we consider this feature:
1) Use SET TIMEZONE which sets the time zone for the session (overriding the server setting for the session). Then you could use
SET TIMEZONE 'Canada/Eastern';
SELECT time_bucket('2 hours', timetz::TIMESTAMP) AS five_min,
avg(cpu)
FROM metrics
GROUP BY five_min
ORDER BY five_min DESC LIMIT 10;
and it would convert the time to the right timezone before passing it to time_bucket.
2) You can convert the timestamptz to a timestamp at a particular timezone with AT TIME ZONE
SELECT time_bucket('2 hours', timetz AT TIME ZONE 'Canada/Eastern') AS five_min,
avg(cpu)
FROM metrics
GROUP BY five_min
ORDER BY five_min DESC LIMIT 10;
Sorry for the delay in responding.
Unfortunately, I can't use option 1) as we use PgBouncer in front of Postgres with transaction pooling and queries for multiple timezones occurring concurrently. However, 2) works - thanks!
to_char(timezone('Canada/Eastern', time_tz), 'IYYY/IW') as week_num_local
, time_bucket('24 hours'::interval, timezone('Canada/Eastern', time_tz))::date as date_local
, count(*)
from meas
where time_tz >= (timezone('Canada/Eastern', '2018-05-03'::timestamp))::timestamptz
and time_tz < (timezone('Canada/Eastern', '2018-05-09'::timestamp))::timestamptz
group by 1, 2
order by 1, 2;
In terms of performance, would it be quicker if the time_bucket function performed the timezone conversion directly rather than calling the timezone function in the SQL ? Similarly with the date casting on the result ?
I still think it would be useful to provide a variant of the time_bucket function to return a date type, and possibly also week numbers.
Duplicate of #414
This also works with time_bucket_gapfill
Most helpful comment
@andrew-blake I just wanted to point out two possibilities while we consider this feature:
1) Use
SET TIMEZONEwhich sets the time zone for the session (overriding the server setting for the session). Then you could useand it would convert the time to the right timezone before passing it to time_bucket.
2) You can convert the timestamptz to a timestamp at a particular timezone with
AT TIME ZONE