Elasticsearch: [Histogram data type] Allow average, sum, count, and histogram aggs

Created on 9 Mar 2020 · 9 Comments · Source: elastic/elasticsearch

Describe the feature:

It would be great if we could support average, sum, count, and histogram aggregations on the histogram data type. We would need these for the APM UI. We could possibly make do with the 50th percentile instead of avg as a start.

:AnalyticAggregations Analytics

All 9 comments

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@roncohen I've taken the liberty of adding the "histogram" aggregation to the list, as we'll also need this in the APM UI for the "Transaction duration distribution" chart.

Hey @roncohen, thank you for submitting this feature request. Those aggregations on the histogram data type sound very useful and we would be happy to add them to our roadmap.

However, can you please elaborate a bit more on how the count aggregation should work on histograms and what information it would return?

thanks @csoulios!

I'll let @axw drive the conversation forward here, but here's my immediate thinking:

By "count" here, I mean sum of the counts of the buckets (HDR) or sum of weights (tdigest). For a histogram of response times, this is "total number of requests".

For "sum" I'd want the sum of each bucket's value multiplied by its count. For a histogram of response times, this is an approximate "total time spent". I don't know if the equivalent works for tdigest centroids.

Let me know if that helps
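
To make the arithmetic described above concrete, here is a minimal Python sketch. It assumes a pre-aggregated histogram is represented as two parallel arrays, `values` and `counts`; the function names and the sample data are hypothetical and are not part of any Elasticsearch API. Average is taken as sum divided by count, which is an assumption about how "avg" would be derived.

```python
from typing import Optional, Sequence


def histogram_count(counts: Sequence[int]) -> int:
    """'count': the sum of the per-bucket counts (total number of observations)."""
    return sum(counts)


def histogram_sum(values: Sequence[float], counts: Sequence[int]) -> float:
    """'sum': each bucket value multiplied by its count, added up (approximate)."""
    if len(values) != len(counts):
        raise ValueError("values and counts must have the same length")
    return sum(v * c for v, c in zip(values, counts))


def histogram_avg(values: Sequence[float], counts: Sequence[int]) -> Optional[float]:
    """'avg': approximate sum divided by count; None for an empty histogram."""
    total = histogram_count(counts)
    if total == 0:
        return None
    return histogram_sum(values, counts) / total


# Hypothetical response-time histogram: bucket values in ms and how many
# requests fell into each bucket.
values = [10.0, 20.0, 50.0]
counts = [3, 5, 2]

print(histogram_count(counts))        # total number of requests
print(histogram_sum(values, counts))  # approximate total time spent
print(histogram_avg(values, counts))  # approximate mean response time
```

The results are only as precise as the bucket values, since the original observations are no longer available.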

FWIW, I've been thinking of the "count" aggregation here as if it were a value_count agg over the original values that made up the pre-aggregated histogram. Would it make sense to special-case value_count for histogram fields?

@axw a value_count would simply return the number of values extracted from the documents. If we special-case the value_count agg, we lose that behavior. I would prefer the creation of a specific histo_value_count aggregation that would return the sum of the counts of the buckets.

Let me discuss this with the team and I will get back to you.
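
The distinction between the two semantics can be sketched in Python as follows. This is one possible reading of the comment above, not a description of how Elasticsearch behaves: it assumes a generic value count treats each histogram field as a single extracted value, while the proposed `histo_value_count` (a name from this thread, not an existing aggregation) sums the bucket counts. The document layout and the `latency` field name are hypothetical.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]

# Hypothetical documents, each optionally holding a pre-aggregated histogram.
docs: List[Doc] = [
    {"latency": {"values": [10.0, 20.0], "counts": [3, 5]}},
    {"latency": {"values": [50.0], "counts": [2]}},
    {},  # document without the field
]


def generic_value_count(docs: List[Doc], field: str) -> int:
    """Assumed generic semantics: one extracted value per document that has the field."""
    return sum(1 for d in docs if field in d)


def histo_value_count(docs: List[Doc], field: str) -> int:
    """Proposed semantics: sum of bucket counts across all histograms."""
    return sum(sum(d[field]["counts"]) for d in docs if field in d)


print(generic_value_count(docs, "latency"))  # number of histogram fields seen
print(histo_value_count(docs, "latency"))    # number of original observations
```

Under these assumptions the two numbers differ whenever a histogram summarizes more than one observation, which is why special-casing one aggregation would hide the other result.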

@csoulios Did the discussion take place? Does the team need any more information that we could provide?

Hi @graphaelli, we have scheduled implementations of value_count, sum and average aggs on histogram field types. Currently there are no blockers.

Once we have those aggs done, we will discuss the histogram aggs.
