Influxdb: Aggregate function with low GROUP BY and low LIMIT takes inordinately long on 10-point measurement

Created on 7 Oct 2016  路  6Comments  路  Source: influxdata/influxdb

I expect this query to return very quickly because it has a LIMIT 3:

SELECT MEAN(water_level) FROM h2o_feet WHERE location='coyote_creek' AND time > 0ms GROUP BY time(1s) LIMIT 3

Instead, the query takes an obviously long amount of time. If I GROUP BY time(1h), the query completes in ~10 seconds, but all returned mean values are null.

Here's the kicker: I've only imported the first 10 values from the sample data.

> SELECT count(*) FROM h2o_feet;
{
    "results": [
        {
            "series": [
                {
                    "name": "h2o_feet",
                    "columns": [
                        "time",
                        "count_level description",
                        "count_water_level"
                    ],
                    "values": [
                        [
                            0,
                            8,
                            10
                        ]
                    ]
                }
            ]
        }
    ]
}
1.x areperformance arequeries kinbug wontfix

Most helpful comment

Influx team, not to go all pronouns & stuff on you, but the wording of these stalebot messages suggests that it's upon us the contributors to provide more information. If we don't keep adding comments, the issue gets closed, even though it's still an ongoing problem.

If there is actual information I can supply, please let me know what debugging info I can add in addition to the repro steps I've submitted.

All 6 comments

My hunch is that Influx samples the actual time span between epoch (1970) and now() in groups of 1 second. That's about 1.5 billion seconds.

How about reducing that interval to MIN(time)..MAX(time) for the measurement?

Still seeing this a year later in v1.4. If a fix is non-trivial, some documentation about this sort of gotcha would be welcome. Something like

SELECT MEAN(level) FROM water WHERE (location = 'xxx') ORDER BY time DESC LIMIT 1000

looks intuitive and can easily trip beginners because it doesn't do what it looks like it would do at first sight ("select the average of the most recent 1000 water levels at location xxx").

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.

Influx team, not to go all pronouns & stuff on you, but the wording of these stalebot messages suggests that it's upon us the contributors to provide more information. If we don't keep adding comments, the issue gets closed, even though it's still an ongoing problem.

If there is actual information I can supply, please let me know what debugging info I can add in addition to the repro steps I've submitted.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

TechniclabErdmann picture TechniclabErdmann  路  80Comments

pauldix picture pauldix  路  89Comments

srfraser picture srfraser  路  90Comments

dmke picture dmke  路  45Comments

corylanou picture corylanou  路  42Comments