Relevant system information:
Describe the bug
A clear and concise description of what the bug is.
To Reproduce
CREATE TABLE conditions (
time TIMESTAMPTZ NOT NULL,
location TEXT NOT NULL,
temperature DOUBLE PRECISION NULL,
humidity DOUBLE PRECISION NULL
);
SELECT create_hypertable('conditions', 'time', chunk_time_interval => INTERVAL '1 hour');
INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);
INSERT INTO conditions(time, location, temperature, humidity)
VALUES (NOW(), 'office', 70.0, 50.0);
create view cagg_conditions_view with
(timescaledb.continuous, timescaledb.refresh_interval = '5m')
AS
select location,time_bucket('30m', time) as half_hour,avg(temperature) AS avg_temp,avg(humidity) AS avg_humi from conditions group by location ,half_hour;
REFRESH MATERIALIZED VIEW cagg_conditions_view;
Expected behavior
cagg_conditions_view can have aggregated data
Actual behavior
my_cagg=# REFRESH MATERIALIZED VIEW cagg_conditions_view;
INFO: new materialization range not found for public.conditions (time column time): not enough new data past completion threshold (1578542400000000)
INFO: materializing continuous aggregate public.cagg_conditions_view: no new range to materialize
INFO: materializing continuous aggregate public.cagg_conditions_view: no new range to materialize or invalidations found, exiting early
REFRESH MATERIALIZED VIEW
my_cagg=# select * from cagg_conditions_view ;
location | half_hour | avg_temp | avg_humi
----------+-----------+----------+----------
(0 rows)
Screenshots


Additional context
Add any other context about the problem here.
Thank you for the detailed description.
As I see you don't provide refresh_lag parameters, which is twice the bucket width by default, see the doc of creating the view. Thus the refresh doesn't find the data to materialise, since the inserted data is within the lag range. See the doc of refresh.
Can you insert data that they are older than the lag, or change the refresh_lag? (note that setting refresh_lag too small or negative might affect performance during high load.)
Let us know if you are able to get it to work.
Good shot. I get it. (I get a timer to generate the data for 2 hours, then i figure out the usage of refresh_lag and bucket_width).
Can you explain why we need timescaledb.max_interval_per_job parameter ? when should i take care of this parameter?
And after refresh_interval, what timescaledb would do, as refresh_interval is smaller than the bucket_width, would it aggregate the refresh_interval data? where would it save the intermediate result ?
Can you explain why we need
timescaledb.max_interval_per_jobparameter ? when should i take care of this parameter?
This parameter is used to manage how much data will be materialised, since if materialisation goes for entire data set, it might take too much resources and affect production. So it is good to set this parameter to be not too big and not too small (if small, then it might not catch up with materialising data). The parameter might be better explained in this tutorial.
Let me know if you have more questions, or I can close the issue.
One more question:
Why the data from from 15:30 ~ 16:49 didn't be aggregated.
select * from cagg_conditions_view order by half_hour desc limit 10;


what'more, i run cagg manually and i get the follow logs
my_cagg=# REFRESH MATERIALIZED VIEW cagg_conditions_view;
INFO: new materialization range not found for public.conditions (time column time): not enough new data past completion threshold (1578556800000000) 【2020-01-09 16:00:00】
INFO: materializing continuous aggregate public.cagg_conditions_view: no new range to materialize
INFO: materializing continuous aggregate public.cagg_conditions_view: no new range to materialize or invalidations found, exiting early
REFRESH MATERIALIZED VIEW
I would pay attention that the data needs to lag behind by bucket_width + refresh_lag, which will be 1 hour in your case if I don't mistake.
yes, bucket_width + refresh_lag=1h
data from from 15:30 ~ 16:49 didn't be aggregated. So non-aggregated data will more than the value(1hour), and no matter what time is it currently, only the time of the latest record inserted, isn't it ?
no matter what time is it currently, only the time of the latest record inserted, isn't it ?
Yes, it uses the maximum time value in the data. This might be changed in future, to use now instead.
I am closing issue. Welcome to reopen if necessary.
Most helpful comment
Yes, it uses the maximum time value in the data. This might be changed in future, to use
nowinstead.