Elasticsearch: Aggregations on Range Fields

Created on 19 Oct 2018 · 7 Comments · Source: elastic/elasticsearch

Meta-issue to encompass adding aggregation support for Range fields (date, numeric and IP).

Pre-work todo:

  • [ ] Document that aggs do not currently work with range fields

Common aggregations:

  • ~Min~
  • ~Max~
  • [x] Cardinality - done (#44502)
  • [x] Value Count - done (#44502)
  • [x] Histogram - done (#41545)
  • [ ] Range
  • [x] Missing - done (#44502)
  • [x] Terms - Doesn't need to support ranges but should gracefully handle rejecting range fields (#44502)

Date Specific:

IP Specific:

  • [ ] IP Range

Considerations

Dealing with ranges opens up some interesting usability issues. For example, min/max could deal with either the start or the end of the range. Similarly, for bucketing aggs like histogram and range, we may need to support "relations" such as contains, intersect, etc.
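To make the "relations" question concrete, here is a minimal sketch in plain Python (not Elasticsearch code) of how a fixed-interval histogram over ranges could behave under two relations. The function name `histogram_buckets`, the `relation` parameter, and its values `"intersects"` and `"contains"` are hypothetical names chosen for illustration; they are not the API of any existing aggregation.

```python
def histogram_buckets(ranges, interval, relation="intersects"):
    """Count inclusive (start, end) ranges per fixed-width bucket.

    Hypothetical relations, for illustration only:
      "intersects": a range is counted in every bucket it overlaps.
      "contains":   a range is counted only when a single bucket
                    contains it entirely.
    """
    counts = {}
    for start, end in ranges:
        # Bucket keys are the lower bounds of the buckets holding each endpoint.
        first = (start // interval) * interval
        last = (end // interval) * interval
        if relation == "intersects":
            key = first
            while key <= last:
                counts[key] = counts.get(key, 0) + 1
                key += interval
        elif relation == "contains":
            if first == last:
                counts[first] = counts.get(first, 0) + 1
        else:
            raise ValueError(f"unknown relation: {relation}")
    return dict(sorted(counts.items()))


ranges = [(1, 12), (3, 4), (10, 10)]
intersecting = histogram_buckets(ranges, 5, "intersects")
contained = histogram_buckets(ranges, 5, "contains")
```

With these sample ranges, the wide range (1, 12) lands in three buckets under "intersects" and in none under "contains", so the total of the bucket counts no longer equals the number of documents in either case. That is the usability question raised above.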

Related issues:

Use Cases:

#37642 - Weighting data across buckets

:AnalyticAggregations >enhancement Meta Analytics

All 7 comments

Pinging @elastic/es-search-aggs

Getting this feature would be significant in making use of the datasets we are creating. Is there any traction?

@chrisbeckc1 we're discussing internally how to implement this, but we don't have a concrete roadmap or timeline yet (and we generally don't state public timelines anyway, since features can slip and we don't want to get people prematurely excited by accident).

It is under active consideration and development though, so once we start making progress we'll update this meta issue with links to PRs, etc. :)

Can someone please at least do the "Pre-work" and make a quick update to the documentation with a warning that aggs don't work with range fields? This has just bitten me - date range seemed perfect for a document that has start date and end date values, but now I cannot aggregate on the minimum start date for example.

This would be very useful for many of our use cases, which involve generating metrics on "active" records, which have a start and end date. A date_range_histogram would be great.

I've taken min and max off the to-do list. After discussion, we've decided not to include them at this time. There's no natural ordering for ranges, so we can't rely on the existing min & max aggregation API; if we were to include min & max, we'd need to add a new API allowing users to specify the ordering for the ranges. We don't have a specific use case in mind for min & max on ranges, so expanding the API like that doesn't seem justified. If a specific use case for min & max aggregations over ranges comes up in the future, we can revisit that choice.
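As a small illustration of the ordering problem, the sketch below (plain Python, not Elasticsearch code) picks the "minimum" of the same two ranges under two different orderings. The helper `min_range` and its `order_by` argument are hypothetical and only stand in for the kind of extra API parameter described above.

```python
def min_range(ranges, order_by):
    """Return the smallest (start, end) range under the chosen ordering.

    order_by is a hypothetical knob: "start" compares lower bounds,
    "end" compares upper bounds.
    """
    index = {"start": 0, "end": 1}[order_by]
    return min(ranges, key=lambda r: r[index])


ranges = [(1, 10), (2, 3)]
by_start = min_range(ranges, "start")
by_end = min_range(ranges, "end")
```

Ordering by start selects (1, 10), while ordering by end selects (2, 3), so a single "min" answer does not exist without the caller choosing an ordering.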

Closing as we're done with our initial set of range aggs, and don't have plans for more range-based aggs in the near future. Any future additions/enhancements can be dealt with on a per-agg basis, no need for this meta anymore :tada:
