Influxdb: Continuous Queries should respect fill() function in downsampled measurements

Created on 2 Jul 2015 · 6Comments · Source: influxdata/influxdb

When using fill(0) while creating a continuous query, it is expected that missing values during the aggregation period would be injected with a 0 instead of not being returned at all.

_original series:_

SELECT * FROM "mydb"."my_retention_policy".example_series

time value

2015-05-12T21:00:02Z 0
2015-05-14T18:22:06Z 1
2015-05-15T00:20:46Z 1
2015-05-15T16:08:46Z 1
2015-05-19T21:17:35Z 1

_Select from original series with aggregation and properly filled empty values:_

SELECT sum(value) AS value FROM "mydb"."my_retention_policy".example_series where time >= '' GROUP BY time(1d), * fill(0)

time value

2015-05-12T00:00:00Z 0
2015-05-13T00:00:00Z 0
2015-05-14T00:00:00Z 1
2015-05-15T00:00:00Z 2
2015-05-16T00:00:00Z 0
2015-05-17T00:00:00Z 0
2015-05-18T00:00:00Z 0
2015-05-19T00:00:00Z 1
2015-05-20T00:00:00Z 0

Unexpected result when selecting from CQ measurement:

CREATE CONTINUOUS QUERY daily_counts ON mydb BEGIN SELECT sum(value) AS value INTO "mydb"."my_retention_policy".daily_counter FROM "mydb"."example_retention_policy".my_series GROUP BY time(1d), * fill(0) END

SELECT * FROM my_retention_policy.daily_counter;

time value

2015-05-12T00:00:00Z 0
2015-05-14T00:00:00Z 1
2015-05-15T00:00:00Z 2
2015-05-19T00:00:00Z 1

1.x arecontinuous queries kinbug preview wontfix

Source

jamebr

👍1

Most helpful comment

+1
So frustrating that queries behave differently at the CLI, via the API, in Kapacitor and in CQs. Particularly egregious for Kapacitor and CQs because the obvious work flow is to develop your query using CLI or API, then deploy into a Kapacitor task or a CQ, wait a few days for the results to accumulate and then discover the query did something different! Makes for a long develop, test, fix cycle.

My reproduction of the issue:

influx -execute "drop database d"
influx -execute "create database d"
TIME_START=`date -u +%s%N`
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
TIME_END=`date -u +%s%N`
influx -execute "SELECT count(a) FROM m WHERE a>=3 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d

Which produces the expected output: a count of 0 for the first two periods and a count of 1 for the last two:

name: m
time                count
----                -----
1492133235000000000 0
1492133236000000000 0
1492133237000000000 1
1492133238000000000 1

Then, as a CQ:

influx -execute "CREATE CONTINUOUS QUERY cq ON d BEGIN SELECT count(a) AS count_a INTO d.autogen.cq_results FROM d.autogen.m WHERE a>=3 GROUP BY time(1s) FILL(0) END" -database d
sleep 1
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
sleep 1
influx -execute "DROP CONTINUOUS QUERY cq ON d" -database d
influx -execute "SELECT * FROM cq_results" -database d

This time, the 0 results are missing:

name: cq_results
time                count_a
----                -------
1492133167000000000 1
1492133168000000000 1

I suspect the issue is to do with the difference between these two queries:

influx -execute "SELECT count(a) FROM m WHERE a>=3 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d
influx -execute "SELECT count(a) FROM m WHERE a>=5 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d

The former produces the two 0's and two 1's shown earlier, but the latter produces nothing at all, instead of four 0's. Since the CQs only run for one time period at a time, they're probably producing nothing at all instead of 0's.

Sure enough, it's actually spelled out quite plainly here:

Issue 1: Handling time intervals with no data

CQs do not write any results for a time interval if no data fall within that time range. Note that the basic syntax does not support using fill() to change the value reported for intervals with no data. Basic syntax CQs ignore fill() if it’s included in the CQ query. A possible workaround is to use the advanced CQ syntax.

That leads me to investigate Advanced CQs. Since all the examples seem to show the "benefit" is redundant entries that are overwritten anyway, I hadn't paid much attention to it before. But maybe we can trick the CQ into returning the 0's by including some non-zero's in the time period...

influx -execute "DROP MEASUREMENT cq_results" -database d
influx -execute "CREATE CONTINUOUS QUERY cq ON d RESAMPLE FOR 10s BEGIN SELECT count(a) AS count_a INTO d.autogen.cq_results FROM d.autogen.m WHERE a>=5 GROUP BY time(1s) FILL(0) END" -database d
sleep 11
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
sleep 1
influx -execute "DROP CONTINUOUS QUERY cq ON d" -database d
influx -execute "SELECT * FROM cq_results" -database d

Looks good:

name: cq_results
time                count_a
----                -------
1492134838000000000 0
1492134839000000000 0
1492134840000000000 0
1492134841000000000 0
1492134842000000000 0
1492134843000000000 0
1492134844000000000 0
1492134845000000000 0
1492134846000000000 0
1492134847000000000 1
1492134848000000000 1

But of course, it requires that the SELECT does actually return at least one result for the FOR period. If not, the entries are skipped just like in the basic syntax.

There's another big hairy gotcha with the FOR syntax - note the sleep 11. If you don't wait at least the FOR period before inserting a point that satisfies the CQ, it somehow goes back and adds points for every second back to 10s before the last satisfying point, which can be well before the CQ was created! Sounds like it could be useful, if it wasn't so unpredictable.

So what's the resolution? Just specify a FOR period that you know is going to be long enough to find some matches? Means a lot more work for the CQ - if the FOR period is, say, 10x longer than the GROUP BY period, then the CQ will be doing 10x more work than necessary, right?

hraftery on 14 Apr 2017

👍2

All 6 comments

srikara on 5 Jul 2015

This is a real problem for us. It makes continuous queries very difficult to handle

jdfergason on 17 Sep 2016

My reproduction of the issue:

influx -execute "drop database d"
influx -execute "create database d"
TIME_START=`date -u +%s%N`
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
TIME_END=`date -u +%s%N`
influx -execute "SELECT count(a) FROM m WHERE a>=3 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d

Which produces the expected output: a count of 0 for the first two periods and a count of 1 for the last two:

name: m
time                count
----                -----
1492133235000000000 0
1492133236000000000 0
1492133237000000000 1
1492133238000000000 1

Then, as a CQ:

influx -execute "CREATE CONTINUOUS QUERY cq ON d BEGIN SELECT count(a) AS count_a INTO d.autogen.cq_results FROM d.autogen.m WHERE a>=3 GROUP BY time(1s) FILL(0) END" -database d
sleep 1
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
sleep 1
influx -execute "DROP CONTINUOUS QUERY cq ON d" -database d
influx -execute "SELECT * FROM cq_results" -database d

This time, the 0 results are missing:

name: cq_results
time                count_a
----                -------
1492133167000000000 1
1492133168000000000 1

I suspect the issue is to do with the difference between these two queries:

influx -execute "SELECT count(a) FROM m WHERE a>=3 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d
influx -execute "SELECT count(a) FROM m WHERE a>=5 AND time >= $TIME_START AND time <= $TIME_END GROUP BY time(1s) FILL(0)" -database d

Sure enough, it's actually spelled out quite plainly here:

Issue 1: Handling time intervals with no data

CQs do not write any results for a time interval if no data fall within that time range. Note that the basic syntax does not support using fill() to change the value reported for intervals with no data. Basic syntax CQs ignore fill() if it’s included in the CQ query. A possible workaround is to use the advanced CQ syntax.

influx -execute "DROP MEASUREMENT cq_results" -database d
influx -execute "CREATE CONTINUOUS QUERY cq ON d RESAMPLE FOR 10s BEGIN SELECT count(a) AS count_a INTO d.autogen.cq_results FROM d.autogen.m WHERE a>=5 GROUP BY time(1s) FILL(0) END" -database d
sleep 11
influx -execute "INSERT m a=1" -database d
sleep 1
influx -execute "INSERT m a=2" -database d
sleep 1
influx -execute "INSERT m a=3" -database d
sleep 1
influx -execute "INSERT m a=4" -database d
sleep 1
influx -execute "DROP CONTINUOUS QUERY cq ON d" -database d
influx -execute "SELECT * FROM cq_results" -database d

Looks good:

name: cq_results
time                count_a
----                -------
1492134838000000000 0
1492134839000000000 0
1492134840000000000 0
1492134841000000000 0
1492134842000000000 0
1492134843000000000 0
1492134844000000000 0
1492134845000000000 0
1492134846000000000 0
1492134847000000000 1
1492134848000000000 1

But of course, it requires that the SELECT does actually return at least one result for the FOR period. If not, the entries are skipped just like in the basic syntax.

hraftery on 14 Apr 2017

👍2

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] on 23 Jul 2019

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.

stale[bot] on 30 Jul 2019

This issue is one of the blockers for CQ's for us. It's too risky for us to use them when they behave differently to the query without the CQ.

hraftery on 30 Jul 2019

Was this page helpful?

0 / 5 - 0 ratings