Timescaledb: Continuous aggregate not scheduling refresh

Created on 2 Aug 2019  路  19Comments  路  Source: timescale/timescaledb

Relevant system information:

  • OS: Timescaledb official docker image running in NixOS
  • PostgreSQL version (output of postgres --version): 10.8
  • TimescaleDB version (output of \dx in psql): 1.3.0
  • Installation method: Docker

Describe the bug
After creating a continuous aggregate, the view is not being refreshed at all. Manually updating the view with REFRESH MATERIALIZED VIEW does work as expected.

Running select * from timescaledb_information.continuous_aggregate_stats; produces:

-[ RECORD 1 ]----------+--------------------
view_name              | event_log_summary
completed_threshold    | 2019-07-31 03:35:45
invalidation_threshold | 2019-07-31 03:35:45
job_id                 | 1000
last_run_started_at    |
job_status             |
last_run_duration      |
next_scheduled_run     |

Then select * from _timescaledb_config.bgw_job where id = 1000; produces:

 select * from _timescaledb_config.bgw_job where id = 1000;
-[ RECORD 1 ]-----+------------------------------------
id                | 1000
application_name  | Continuous Aggregate Background Job
job_type          | continuous_aggregate
schedule_interval | 00:00:30
max_runtime       | 00:00:00
max_retries       | -1
retry_period      | 00:00:30

Whereas select * from _timescaledb_internal.bgw_job_stat wherejob_id = 1000 returns 0 rows.

I created the view with the following query:

CREATE VIEW event_log_summary
WITH (timescaledb.continuous)
AS
SELECT
 time_bucket('15s',at) AS "time",
  type AS metric,
  count(id) AS "count"
FROM event_log
WHERE
  type != 'invalid_json' AND
  type != 'worker_crashed'
GROUP BY 1,2;

There are no other continuous aggregates on this database.

need-more-info

Most helpful comment

Could this be reopened?

I'm glad I stumbled across this issue because it was affecting me too.
However needing to restart the whole database service just because you've added a new continuous aggregate is obnoxious and at the very minimum should be represented in the documentation.

All 19 comments

@ljmarks sounds like you've figured out how to get this working through our Slack channel. Do you mind if I close out the issue if you've figured it out?

@dianasaur323 After a restart everything seems to be working - I'll close the issue.

I have the exact same problem looks like I'll just have to restart the server?

After restart it didn't seem to work either

2019-10-16 21:25:01.998 CST [123449] LOG:  the materialize continuous aggregate job is scheduled to run again immediately
2019-10-16 21:27:22.326 CST [124637] LOG:  the materialize continuous aggregate job is scheduled to run again immediately
2019-10-16 21:30:52.739 CST [125606] LOG:  the materialize continuous aggregate job is scheduled to run again immediately
2019-10-16 21:35:25.728 CST [128221] ERROR:  interval out of range
2019-10-16 21:35:25.737 CST [83212] LOG:  background worker "Continuous Aggregate Background Job" (PID 128221) exited with exit code 1
2019-10-16 21:35:25.737 CST [83223] ERROR:  interval out of range
2019-10-16 21:35:25.737 CST [126762] LOG:  terminating TimescaleDB background job "Continuous Aggregate Background Job" due to administrator command
2019-10-16 21:35:25.737 CST [126762] CONTEXT:  SQL statement "INSERT INTO _timescaledb_internal._materialized_hypertable_17 SELECT * FROM _timescaledb_internal._partial_view_17 AS I WHERE I.tb >= '2019-10-11 11:00:00+08' AND I.tb < '2019-10-12 07:00:00+08';"
2019-10-16 21:35:25.737 CST [126762] FATAL:  terminating connection due to administrator command
2019-10-16 21:35:25.737 CST [126762] CONTEXT:  SQL statement "INSERT INTO _timescaledb_internal._materialized_hypertable_17 SELECT * FROM _timescaledb_internal._partial_view_17 AS I WHERE I.tb >= '2019-10-11 11:00:00+08' AND I.tb < '2019-10-12 07:00:00+08';"
2019-10-16 21:35:25.739 CST [83212] LOG:  background worker "TimescaleDB Background Worker Scheduler" (PID 83223) exited with exit code 1
2019-10-16 21:35:26.089 CST [83212] LOG:  background worker "Continuous Aggregate Background Job" (PID 126762) exited with exit code 1

Could this be reopened?

I'm glad I stumbled across this issue because it was affecting me too.
However needing to restart the whole database service just because you've added a new continuous aggregate is obnoxious and at the very minimum should be represented in the documentation.

Just to clarify: Is it really necessary to restart the complete database service for each continuous aggregate? If so, this would seem like a clear bug by design to me.

@holgerbrandl No, there is no reason to restart the db after adding a cont. aggregate. The jobs should get scheduled automatically.

Is anyone experiencing this on 1.6.x?

Yes, I seemingly experienced this on 1.6, I moved to a cluster setup on Kubernetes and after a DB restore, I was not getting any data (or only a small amount of data) in my continuous aggregates.

I tried rebooting, which didn't help. I instead made a transaction, created a new table, made it a hyper table, copied the data from the old table to the new hypertable, dropped the old table and then recreated the continuous aggregates.

After that, all worked fine.

@Firaenix Thanks for reporting. Did you happen to check the bgw_job_stat table to see if jobs were failing? Did any of the jobs show up ? I am trying to gather information to repro the issue.

Hi, same thing is happening to me, timescale db version 1.4.1. When creating new views I have to manually refresh them, auto refresh doesn't work

Sorry, I completely missed this comment.
No I did not get to check the bgw_job_stat table unfortunately, sorry.

if it happens again, I'll make sure to check it out.

Hi, same thing is happening to me, timescale db version 1.4.1. When creating new views I have to manually refresh them, auto refresh doesn't work.
@simaosantos Could you move to 1.6.x? Please report if you still see the issue on 1.6.x.

Some fixes related to bgw scheduler are in 1.6.1. Please upgrade to 1.6.1. if you are running into this issue.
https://github.com/timescale/timescaledb/releases

I just upgraded from 1.3.2 to 1.6.1, hopefully CA will be much smoother. Everything under _timescaledb_internal.bgw_job_stat looking good so far.

I have experienced this issue twice now in versions 1.7.1 and 1.7.2. Both times have been after I have deleted an aggregate and then recreated it again with the same definition/query. Even recreating it with a new name does not fix the issue. Restarting postgres fixed the issue.

I can second that the issue still exists in 1.7.3.

The CA gets created successfully and it shows up in continuous_aggregates and continuous_aggregate_stats, but it is never refreshed automatically. It is refreshable via refresh materialized view.

I am not able to reproduce the issue locally, so I need more data to figure out what is going on.

@shadev Since you ran into this in 1.7.3, could you
1) post information from timescaledb_information.continuous_aggregate_stats
2) check postgres log for errors
3) Verify that the background worker scheduler is up
ps -ef | grep -i timescale | grep -i background
4) Are other background jobs running ?

After checking the postgres-log we found entries like:

2020-09-28 13:29:49 CEST [8477]: [24757-l] user=,db=,app=,client= WARNING: failed to launch job 1044 "Continuous Aggregate Background Job": out of background workers

Actually, the bg-jobs did never run because the number of bg-workers was too low. After adjusting them everything works.

Was this page helpful?
0 / 5 - 0 ratings