Describe the feature:
It would be great if we could support average, sum, count, and histogram aggregations on the histogram data type. We would need these for the APM UI. We could possibly make do with the 50th percentile instead of avg as a start.
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
@roncohen I've taken the liberty of adding the "histogram" aggregation to the list, as we'll also need this in the APM UI for the "Transaction duration distribution" chart.
Hey @roncohen, thank you for submitting this feature request. Those aggregations on the histogram datatype sound very useful and we would be happy to add them to our roadmap.
However, can you please elaborate a bit more on how the count aggregation should work on histograms and what information it would return?
thanks @csoulios!
I'll let @axw drive the conversation forward here, but here's my immediate thinking:
By "count" here, I mean sum of the counts of the buckets (HDR) or sum of weights (tdigest). For a histogram of response times, this is "total number of requests".
For "sum" I'd want the sum of each bucket's value multiplied by its count. For a histogram of response times, this is an approximate "total time spent". I don't know if the equivalent works for tdigest centroids.
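To make the "count" and "sum" definitions above concrete, here is a minimal Python sketch. It assumes a pre-aggregated histogram is represented as two parallel lists, `values` and `counts`; the `Histogram` class and the helper names are hypothetical and only illustrate the arithmetic, not how Elasticsearch implements it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Histogram:
    """Hypothetical stand-in for one pre-aggregated histogram."""
    values: List[float]  # representative value of each bucket
    counts: List[int]    # number of original observations in each bucket


def hist_count(h: Histogram) -> int:
    # "count": sum of the bucket counts, i.e. total number of requests
    return sum(h.counts)


def hist_sum(h: Histogram) -> float:
    # "sum": each bucket value multiplied by its count, then added up,
    # i.e. an approximate total time spent
    return sum(v * c for v, c in zip(h.values, h.counts))


def hist_avg(h: Histogram) -> float:
    # "average": approximate mean of the original observations
    total = hist_count(h)
    if total == 0:
        raise ValueError("cannot average an empty histogram")
    return hist_sum(h) / total


# Response times in ms: 2 requests near 10, 3 near 20, 1 near 50
h = Histogram(values=[10.0, 20.0, 50.0], counts=[2, 3, 1])
print(hist_count(h), hist_sum(h), hist_avg(h))
```

The results are approximate because every observation in a bucket is treated as if it had the bucket's representative value.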
Let me know if that helps
FWIW, I've been thinking of the "count" aggregation here as if it were a value_count agg over the original values that made up the pre-aggregated histogram. Would it make sense to special-case value_count for histogram fields?
@axw a value_count would simply return the number of values extracted from the documents. If we special-case the value_count agg, we miss this feature. I would prefer the creation of a specific histo_value_count aggregation that would return the sum of the counts of the buckets.
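The distinction can be illustrated with a small Python sketch. It assumes one possible reading of "number of values extracted from the documents", namely one value per document that has the field; the document layout, the field name `latency`, and both helper functions are hypothetical and are not Elasticsearch APIs.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]

# Two hypothetical documents, each holding one pre-aggregated histogram
docs: List[Doc] = [
    {"latency": {"values": [10.0, 20.0], "counts": [2, 3]}},
    {"latency": {"values": [50.0], "counts": [1]}},
]


def per_document_value_count(docs: List[Doc], field: str) -> int:
    # Assumed reading: count one extracted value per document with the field
    return sum(1 for d in docs if field in d)


def histo_value_count(docs: List[Doc], field: str) -> int:
    # Proposed behavior: sum of the bucket counts across all histograms
    return sum(sum(d[field]["counts"]) for d in docs if field in d)


print(per_document_value_count(docs, "latency"))  # histogram fields seen
print(histo_value_count(docs, "latency"))         # original observations
```

Under these assumptions the first function answers "how many histograms are there" and the second answers "how many original values do they represent", which is why folding both into a single aggregation would hide one of the two numbers.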
Let me discuss this with the team and I will get back to you.
@csoulios Did the discussion take place? Does the team need any more information that we could provide?
Hi @graphaelli, we have scheduled implementations of value_count, sum and average aggs on histogram field types. Currently there are no blockers.
Once we have those aggs done, we will discuss the histogram aggs.
Merging https://github.com/elastic/elasticsearch/pull/58930 closes this issue.