Influxdb: Add support for 'having' queries

Created on 15 Jan 2014  Â·  42Comments  Â·  Source: influxdata/influxdb

Should we support having in queries? Like this:

select count(value) from some_series
group by time(5m)
having count(value) > 23
revisit in the future

Most helpful comment

+2

All 42 comments

Having is particularly useful for top queries as well. For instance:

select * from cpu
group by time(1h), host having top(value, 10)
where time > now() - 1d

That would give you the top 10 hosts in each hour for the last 24 hours. Whereas this query:

select top(value, 10) from cpu
group by time(1h), host

Would give you the top 10 measurements from each host in a 1 hour interval. "Having" is what we need to actually pick up the top hosts for a given group by interval. For instance this query:

select max(value) as max_value, host from cpu
group by time(1h), host
having top(max_value, 10)

Would give you the top 10 hosts per hour.

@pauldix Could you set approximate milestone for this? I'd like to reference that for planning our schedule.

@chobie, how important is this one to you? We have a number of other things we're working on at the moment so we were going to push this one out by 30 days or so.

But, if it's something you really need we can reassess. Any chance that we can help you through a few of the bits to write yourself and submit a PR?

I also would like to see this feature. I want to use it for sorted and limited output of frequency counts on column values with many different values. For example, what are the values and counts of the top 10 most frequent values? Without this feature the output would be very large, making it very difficult to sort and limit on the application side.

With the help of some pointers in the code I might be able to help.

Again,

I've challenged this but I couldn't figure out how to implement complete having clause features. Engine and boolean filters are difficult for me :cry:
I guess having clause might need some engine api change. (or, just I don't realized good solution)

For now, I only use top and bottom function for my use case. I'll use this patch until InfluxDB support having clause.
https://github.com/chobie/influxdb/compare/master...incomplete-top-bottom-having-clause

Thanks @chobie for giving it a shot. We'll come back to this soon. I think there are some other features that are higher up on our priority list for the next 2-3 weeks, but we'd love to get to this.

+1

I see. I'll take some time to improve this patch in next weekend. probably I can implement it.

I guess mixing aggregate function and conditions makes things complicated.
So I'd like to choose below syntax for having clause at this time.

#Query Syntax:
GROUP BY VALUES 
  [HAVING 
         (AGGREGATE_FUNCTION_FOR_HAVING_CLAUSE | CONDITIONS )
  ]

The only other thing I'd note about having is that it should also support where style syntax. For example:

select count(value) as valCount from foo
group by time(10m) having valCount > 3
where time > now() - 6h

I'd like this functionality also.

I have a series with a lot of grouping, and I want to group all small groups into a single "other" group. I believe I can have 2 queries, one for all groups with sufficiently large group sum, and another one for all "others"... I am not sure if it's even possible with "having" though.

+1

any updates on this feature?

yep, +1, very useful, and relieves some of the subquery use cases

+1

+1, this can be avoided by creating a continious query which not seems to be a dificult solution.

+1

+1

+1

+1

+1

+1

Any news on this feature?

+1

This would be very much needed.

:+1: this would be extremely useful!

How's going on?

+1
This would be very helpful to get top, bottom items after aggregations

Hi @pauldix, any update on the status of the having feature?

I noticed you and @chobie had a discussion in the pull request, but it is not clear whether the feature has made it in and if not, do you have any estimate of when it would be available?

The use case you mention in one of your comments at the top of this thread is very powerful and necessary for exploring time series data – how the top n items instead of top 10 values for every item.

+1

+1

+2

+1. Need this badly! Makes GROUP * much more efficient for querying.

+1

+1

On my first foray with Influx and very quickly ran into multiple cases where I need this kind of behaviour. How are people working round this currently, with continuous queries?

+1

+1

+1
I need this. Need to exclude certain values in a continuous query and can't find another way to do it.

+1 please support HAVING clause

I encountered another issue in Grafana https://github.com/grafana/grafana/issues/4623

The problem is Grafana displays all GROUP BY results, lots of zeros can not be hidden.

We hope Influxdb can filter those zeros out, the best method so far is to have a GROUP BY ... HAVING clause.

+1

+1

+1

+2019

+2020

+2021

Was this page helpful?
0 / 5 - 0 ratings

Related issues

udf2457 picture udf2457  Â·  3Comments

airyland picture airyland  Â·  3Comments

jonira picture jonira  Â·  4Comments

Witee picture Witee  Â·  3Comments

bigKS picture bigKS  Â·  3Comments