Timescaledb: Request: return periods that are missing data

Created on 3 Jan 2018  路  7Comments  路  Source: timescale/timescaledb

Hi, thank you for your amazing work :)

Sorry to post this here but I get no response on my threads on Stackoverflow and this feature could really be helpful (to me at least) : https://stackoverflow.com/questions/48064901/timescaledb-time-bucket-fetch-periods-with-no-row

Here's a copy of it :

I have sensors emitting datas each ~20 seconds. However sometimes these are in trouble and doesn't emit datas for minutes or hours.

I would like to fetch those disconnection and know how long these last.

So I tried with the function time_bucket to count number of records by bucket of 5 minutes (for example), but this function only affect the record itself. So it isn't possible to fetch bucket HAVING COUNT(*) = 0.

I tried this :

SELECT time_bucket('5 minutes', datetime) AS bucket, COUNT(*) AS nb_datas
FROM measures
WHERE id_sensor = 123456
GROUP BY bucket
HAVING COUNT(*) = 0
ORDER BY bucket DESC;

But logically it return nothing.

Sorry again to post here but there is absolutely no TimescaleDB community on Stackoverflow.

enhancement

Most helpful comment

All 7 comments

Hey @Raphiki44, absolutely not a problem to post that here, we absolutely take feature requests/enhancements here.

We did get an alert late last night about the Stackoverflow post but hadn't gotten a chance yet to respond ourselves, so thank you for following up here. We also have a community slack that can help for future questions/suggestions:
http://slack-login.timescale.com/

We'll also try to keep a closer eye on SO too :)

@Raphiki44 You can look at the "gap filling" section of our docs: http://docs.timescale.com/latest/using-timescaledb/reading-data#advanced-analytics

I think it may answer your question

This is really nice thanks :) just what I looked for. I didn't know about this function generate_series

PS : In the exemple of this section you have different date used in the script and in the result

I maybe will have powerfull analytics on my sensors to set up, so I was wondering if there was a more gentle way to do what I'm trying to do.

The objectives is to get ranges of times where a sensor has not emit datas more than 5 minutes for example.

We can imagine a design like this :

SELECT empty_periods('5 minutes', datetime) AS period,
FROM measures
WHERE id_sensor = 123456
ORDER BY period DESC;

And the result could be an an array just like histogram with the start date and end date of the outage

--------
{2017-07-28 11:42:42.846621+00, 2017-07-28 15:12:33.456321+00} 
{2017-07-18 12:31:42.846621+00, 2017-07-18 18:55:33.456321+00}

This is maybe too much specific but really helpful to me :)

@Raphiki44 You might be interested in our new first-class support for gapfilling:

https://blog.timescale.com/sql-functions-for-time-series-analysis/

Then the values with NULLs are your empty periods, you can message into the various forms as above if you want.

Anything similar to gapfill for apache license? I am using timescaledb on Azure and it throws error that the time_bucket_gapfill is not available in apache license.

Was this page helpful?
0 / 5 - 0 ratings