CREATE TABLE hypertable (
"whatever" float8 NOT NULL,
"timestamp" timestamptz NOT NULL
);
SELECT create_hypertable('hypertable', 'timestamp');
INSERT INTO hypertable
VALUES
(9.6,'2019-01-01 13:00'),
(8.8,'2019-01-01 13:15');
create view aggregatedView with
(timescaledb.continuous, timescaledb.refresh_interval = '5m', timescaledb.refresh_lag='5m')
as
SELECT
sum("whatever")::BIGINT as "whatever",
time_bucket('1 day', "timestamp" AT TIME ZONE 'Europe/Madrid') as "day"
FROM hypertable
GROUP BY time_bucket('1 day', "timestamp" AT TIME ZONE 'Europe/Madrid');
SQL Error [XX000]: ERROR: time_bucket function for continuous aggregate query should be called on the dimension column of the hypertable
I'm getting this error maybe because what i want to do is not possible, the select alone works fine.
So is it possible to aggregate by day in a specific time zone with a continuous aggregation?
I have the same problem. Can麓t convert to local time. Original table.fecha_hora is a TIMESTAMPTZ :
CREATE VIEW va_ftp_pxd_datos
WITH ( timescaledb.continuous )
AS
SELECT time_bucket('1 day', fecha_hora::TIMESTAMP)
AS dia,
avg(ftp_pxd_datos.activa_i) AS activa,
avg(ftp_pxd_datos.reactiva_i) AS reactiva,
ftp_pxd_datos.id_cups
FROM ftp_pxd_datos
GROUP BY ftp_pxd_datos.id_cups, dia
We currently don't support timezone conversions inside of time_bucket with continuous aggregates. There are two possible workarounds:
1) Do the timestamptz -> timestamp conversion when querying out of the view. or
2) store the time in the underlying table as a timestamp instead of timestamptz.
This turns out to be a very complex issue to implement because of daylight savings time issues so we would like to hear from the community if the above workarounds are sufficient or if you need a fuller solution. Please upvote and leave comments with your usecases.
A #3 solution can be calculate an hourly aggregation and then the client can make the aggregate outside the db, manually slicing the data in days for each time zone interested (although some countries have 15min or 30 shift time zone)
A fuller solution would come late for me anyway, so i guess i'm fine for now knowing that it is not supported
This is a must-have for me.
I can't find a good work around for this. Any suggestions?
I could not find a case where the non-unique local time would cause a problem for daily aggregates, since in that case only the date matters.
The only case I could find without a workaround is places with sub-hour shifted time zones and daylight saving:
SET TIMEZONE TO 'Australia/Adelaide';
SELECT time_bucket(
'1 hour',
ts AT TIME ZONE 'Australia/Adelaide'
) AT TIME ZONE 'Australia/Adelaide' AS ts_hour, ts
FROM generate_series(
'2019-04-07 01:00:00+10:30'::timestamptz,
'2019-04-07 03:59:59+09:30'::timestamptz,
'00:15:00'::interval
) AS ts;
ts_hour | ts
---------------------------+---------------------------
2019-04-07 01:00:00+10:30 | 2019-04-07 01:00:00+10:30
2019-04-07 01:00:00+10:30 | 2019-04-07 01:15:00+10:30
2019-04-07 01:00:00+10:30 | 2019-04-07 01:30:00+10:30
2019-04-07 01:00:00+10:30 | 2019-04-07 01:45:00+10:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:00:00+10:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:15:00+10:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:30:00+10:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:45:00+10:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:00:00+09:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:15:00+09:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:30:00+09:30
2019-04-07 02:00:00+09:30 | 2019-04-07 02:45:00+09:30
2019-04-07 03:00:00+09:30 | 2019-04-07 03:00:00+09:30
2019-04-07 03:00:00+09:30 | 2019-04-07 03:15:00+09:30
2019-04-07 03:00:00+09:30 | 2019-04-07 03:30:00+09:30
2019-04-07 03:00:00+09:30 | 2019-04-07 03:45:00+09:30
(16 rows)
Here the rows 5 - 8 are obviously wrong.
There are not many places like this.
I'm not sure how time_bucket is implemented, but maybe the new time zone aware date_trunc could be used as an inspiration.
The non-unique local time is always an issue. I can't store Nov. 6, 2am and then store a second record with the same local time without overwriting the first.
I hope to see a timezone aware time_bucket function.
As timescale is specifically targeted towards time series data with the intent of simplifying overhead and data analysis I was surprised to find out, that this feature is not supported. I agree that a timezone aware time_bucket function would be very helpful and necessary.
In my use case I'm receiving sensor data from different farms around the world. So for each of the sensors I want to store the daily average from 0:00-23:59 in its local timezone. However as the time bucket function would calculate the daily average based on UTC time, it is making a "mistake" in hours of the timezone the sensor is in. In the worst case this could be +/- 12h.
And none of the workarounds seem really satisfying:
So all in all I would be really happy to see this feature added in the future.
Any news on this issue @cevian ?
I believe adding an optional time_zone argument to time_bucket to mimic the signature of current postgresql date_trunc seems like a good plan.
What are your thoughts ?
Just adding my vote for this feature.
This would be game-changing for our infrastructure.
We currently perform aggregation (15 minutes to monthly) on 5-minute data on the fly and it is killing our database. We had to heavily rate-limit our API for those granularity-requests otherwise our database just dies...but it's preventing us from scaling up.
also voting for this feature. We have a use-case very similar to the one described by @lely475
We discovered this limitation today and it's also an issue for us, so adding my vote as well.
I'd also like to chime in and request for a timezone aware way to use continuous aggregates.
The proposed solutions really create a lot of overhead and make it significantly harder to use an otherwise great product.
Possibly related to #414
I also +1 for this features too. This really a game-changing features. I try to create cagg for a table so I can get 1 day of sum of specific metrics but it always start at 0:00 UTC time and this introduce incorrect calculation
Most helpful comment
As timescale is specifically targeted towards time series data with the intent of simplifying overhead and data analysis I was surprised to find out, that this feature is not supported. I agree that a timezone aware time_bucket function would be very helpful and necessary.
In my use case I'm receiving sensor data from different farms around the world. So for each of the sensors I want to store the daily average from 0:00-23:59 in its local timezone. However as the time bucket function would calculate the daily average based on UTC time, it is making a "mistake" in hours of the timezone the sensor is in. In the worst case this could be +/- 12h.
And none of the workarounds seem really satisfying:
1 is not useful for reasons stated above
2 enforces the saving of another column for all sensors and all system which (in the long run) creates a constantly growing, unnecessary overhead and just makes life more complicated in general
3 works well if I'm analyzing a time frame in the month scale, but creates a larger and larger overhead for longer (yearly) periods: instead of extracting 1 value/day I have to calculate avg(24 values)/day, thus slowing down the SELECT query
So all in all I would be really happy to see this feature added in the future.