We are planning to port most of Stack Monitoring's current aggregations to use auto_date_histogram in lieu of date_histogram as part of https://github.com/elastic/kibana/issues/37246. We noticed that auto_date_histogram supports most of the aggregations that date_histogram does, but fails (in the fetch phase) with bucket_script. As a first pass, I tried replacing the existing date_histogram usage with auto_date_histogram and it broke:
GET .monitoring-logstash-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "format": "epoch_millis",
              "gte": 1563551274281,
              "lte": 1563554874281
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "check": {
      "auto_date_histogram": {
        "field": "logstash_stats.timestamp",
        "buckets": 10
      },
      "aggs": {
        "pipelines_nested": {
          "nested": {
            "path": "logstash_stats.pipelines"
          },
          "aggs": {
            "by_pipeline_id": {
              "terms": {
                "field": "logstash_stats.pipelines.id",
                "size": 1000
              },
              "aggs": {
                "by_pipeline_hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash",
                    "size": 1000
                  },
                  "aggs": {
                    "by_ephemeral_id": {
                      "terms": {
                        "field": "logstash_stats.pipelines.ephemeral_id",
                        "size": 1000
                      },
                      "aggs": {
                        "events_stats": {
                          "stats": {
                            "field": "logstash_stats.pipelines.events.out"
                          }
                        },
                        "throughput": {
                          "bucket_script": {
                            "script": "params.max - params.min",
                            "buckets_path": {
                              "min": "events_stats.min",
                              "max": "events_stats.max"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
This query fails, whereas the same query succeeds if you replace auto_date_histogram with date_histogram (and use interval instead of buckets). Anyone attempting to run this on an existing cluster will likely need to shift the date range in the query, since the timestamps above are fixed.
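For reference, here is the working variant in full. Only the check aggregation changes; the 5m interval is an illustrative stand-in (use whatever interval fits your time range), and everything else is identical to the failing request above:

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "format": "epoch_millis",
              "gte": 1563551274281,
              "lte": 1563554874281
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "check": {
      "date_histogram": {
        "field": "logstash_stats.timestamp",
        "interval": "5m"
      },
      "aggs": {
        "pipelines_nested": {
          "nested": {
            "path": "logstash_stats.pipelines"
          },
          "aggs": {
            "by_pipeline_id": {
              "terms": {
                "field": "logstash_stats.pipelines.id",
                "size": 1000
              },
              "aggs": {
                "by_pipeline_hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash",
                    "size": 1000
                  },
                  "aggs": {
                    "by_ephemeral_id": {
                      "terms": {
                        "field": "logstash_stats.pipelines.ephemeral_id",
                        "size": 1000
                      },
                      "aggs": {
                        "events_stats": {
                          "stats": {
                            "field": "logstash_stats.pipelines.events.out"
                          }
                        },
                        "throughput": {
                          "bucket_script": {
                            "script": "params.max - params.min",
                            "buckets_path": {
                              "min": "events_stats.min",
                              "max": "events_stats.max"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}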
/cc @pcsanwald
Pinging @elastic/es-analytics-geo
@igoristic Do you have a stack trace or error from Elasticsearch when it fails to work with the bucket script?
@polyfractal I've been looking into this a bit already, so I assigned it to myself. Working with @pickypg, I do have access to a cluster where we can repro this error.
Do you have a stack trace or error from Elasticsearch when it fails to work with the bucket script?
{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "unsupported_operation_exception",
      "reason": "Not supported"
    }
  },
  "status": 500
}
ES output:
path: /.monitoring-logstash-6-*,.monitoring-logstash-7-*/_search, params: {pretty=, index=.monitoring-logstash-6-*,.monitoring-logstash-7-*}
org.elasticsearch.action.search.SearchPhaseExecutionException:
  at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:305) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:91) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  at java.lang.Thread.run(Thread.java:835) [?:?]
Caused by: java.lang.UnsupportedOperationException: Not supported
  at org.elasticsearch.search.aggregations.pipeline.InternalSimpleValue.doReduce(InternalSimpleValue.java:80) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.pipeline.InternalSimpleValue.doReduce(InternalSimpleValue.java:34) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms$Bucket.reduce(InternalTerms.java:142) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms.doReduce(InternalTerms.java:286) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.bucket.InternalSingleBucketAggregation.doReduce(InternalSingleBucketAggregation.java:108) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogram$Bucket.reduce(InternalAutoDateHistogram.java:131) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogram.mergeBuckets(InternalAutoDateHistogram.java:394) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
Hi @pcsanwald. Any updates on this? This is a blocker for a high-priority bug in the Stack Monitoring application that makes it unusable under certain conditions, and we're hoping to get a fix pushed quickly. We're happy to do whatever we can to help. :)
@cachedout do you have a link to an issue for the high-priority bug that's blocking you? Or is it the Kibana bug referenced above in this issue?
@cachedout apologies for the late response. We're discussing how to deal with this; it's due to some of the specifics of pipeline aggs and isn't a trivial fix. @polyfractal will be looking into this.
@pcsanwald Thank you for the update.
@nknize Yes, it is the issue you linked to. @igoristic can help provide more context as well.
@nknize Yeah, it's the issue you linked to, and it's already affecting some customers in production.
The auto_date_histogram feature/fix will help us fix two issues:
- too_many_buckets_exception errors
- the date_histogram port tracked in elastic/kibana#37246
Just merged #45359, which should resolve this issue. :)
False alarm, sorry. :( The bugfix is mostly correct, I think, but there is a failing test that makes me think it isn't fully fixed yet. I'm going to revert the commit and try again, and will keep this thread updated.
I have opened a new PR to fix this issue (which I thought I had reopened, but apparently not). Will update again when the new PR merges :)
Accidentally closed this one; thanks @pickypg for the catch. Apologies!
Luckily, I just merged #45796 so we can go ahead and close again :)
(So sorry for the delay on this, the PR got LGTM but it fell through the cracks and I never merged it :sob: )
Hey @polyfractal,
I'm seeing some unexpected behavior that I think is related to this PR. It seems that the bucket_script aggregation isn't working properly with the nested aggregation.
See this example here: https://gist.github.com/chrisronline/836dbbdb594d5848538e412236f32147
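The gist isn't inlined here, so as a sketch only: judging from the response shapes below, the request presumably has roughly the following structure. The index name, interval, fields, and the stats metrics feeding each bucket_script are hypothetical placeholders on my part, not the gist's actual contents. The relevant shape is one bucket_script beneath nested > terms (bs_value) next to another bucket_script outside the nested scope (high_level_bs_value):

GET hypothetical-index/_search
{
  "size": 0,
  "aggs": {
    "date_histo": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "1d"
      },
      "aggs": {
        "in_nested": {
          "nested": {
            "path": "items"
          },
          "aggs": {
            "types": {
              "terms": {
                "field": "items.type"
              },
              "aggs": {
                "inner_stats": {
                  "stats": {
                    "field": "items.value"
                  }
                },
                "bs_value": {
                  "bucket_script": {
                    "script": "params.max - params.min",
                    "buckets_path": {
                      "min": "inner_stats.min",
                      "max": "inner_stats.max"
                    }
                  }
                }
              }
            }
          }
        },
        "outer_stats": {
          "stats": {
            "field": "other_value"
          }
        },
        "high_level_bs_value": {
          "bucket_script": {
            "script": "params.max - params.min",
            "buckets_path": {
              "min": "outer_stats.min",
              "max": "outer_stats.max"
            }
          }
        }
      }
    }
  }
}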
For 7.5 and below, the output is:
{
  "aggregations" : {
    "date_histo" : {
      "buckets" : [
        {
          "in_nested" : {
            "types" : {
              "buckets" : [
                {
                  "bs_value" : {
                    "value" : 1.0
                  }
                }
              ]
            }
          },
          "high_level_bs_value" : {
            "value" : 0.0
          }
        }
      ]
    }
  }
}
For 7.6 and above, the output is:
{
  "aggregations" : {
    "date_histo" : {
      "buckets" : [
        {
          "high_level_bs_value" : {
            "value" : 0.0
          }
        }
      ]
    }
  }
}
Is this intended? (As a side note for @elastic/stack-monitoring folks, this is causing this failure: https://github.com/elastic/kibana/issues/52470)