Elasticsearch: auto_date_histogram fails where date_histogram does not

Created on 26 Jul 2019 · 16 comments · Source: elastic/elasticsearch

We are planning to port most of Stack Monitoring's current aggregations to use auto_date_histogram in lieu of date_histogram as part of https://github.com/elastic/kibana/issues/37246.

We noticed that auto_date_histogram supports most of the aggregations that date_histogram does, but fails (in the fetch phase) with bucket_script. As a first pass, I tried replacing the existing date_histogram usage with auto_date_histogram and it broke:

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "format": "epoch_millis",
              "gte": 1563551274281,
              "lte": 1563554874281
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "check": {
      "auto_date_histogram": {
        "field": "logstash_stats.timestamp",
        "buckets": 10
      },
      "aggs": {
        "pipelines_nested": {
          "nested": {
            "path": "logstash_stats.pipelines"
          },
          "aggs": {
            "by_pipeline_id": {
              "terms": {
                "field": "logstash_stats.pipelines.id",
                "size": 1000
              },
              "aggs": {
                "by_pipeline_hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash",
                    "size": 1000
                  },
                  "aggs": {
                    "by_ephemeral_id": {
                      "terms": {
                        "field": "logstash_stats.pipelines.ephemeral_id",
                        "size": 1000
                      },
                      "aggs": {
                        "events_stats": {
                          "stats": {
                            "field": "logstash_stats.pipelines.events.out"
                          }
                        },
                        "throughput": {
                          "bucket_script": {
                            "script": "params.max - params.min",
                            "buckets_path": {
                              "min": "events_stats.min",
                              "max": "events_stats.max"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This query fails, whereas the same query succeeds if you replace auto_date_histogram with date_histogram (and use interval instead of buckets). Anyone attempting to run this on an existing cluster will likely need to shift the date range in the query.
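
For reference, the only change on the working side is the outer aggregation clause. A minimal sketch with the sub-aggregations omitted for brevity (the 30s interval is illustrative, not from the original report; the full aggs tree from above can be reattached unchanged):

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "aggs": {
    "check": {
      "date_histogram": {
        "field": "logstash_stats.timestamp",
        "interval": "30s"
      }
    }
  }
}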

Labels: :Analytics/Aggregations, >bug

All 16 comments

/cc @pcsanwald

Pinging @elastic/es-analytics-geo

@igoristic Do you have a stack trace or error from Elasticsearch when it fails to work with the bucket script?

@polyfractal I've been looking into this a bit already, so I assigned it to myself. Working with @pickypg, I have access to a cluster where we can reproduce this error.

> Do you have a stack trace or error from Elasticsearch when it fails to work with the bucket script?

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "unsupported_operation_exception",
      "reason": "Not supported"
    }
  },
  "status": 500
}

ES log output:

path: /.monitoring-logstash-6-*,.monitoring-logstash-7-*/_search, params: {pretty=, index=.monitoring-logstash-6-*,.monitoring-logstash-7-*}
   │      org.elasticsearch.action.search.SearchPhaseExecutionException: 
   │            at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:305) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:91) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
   │            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
   │            at java.lang.Thread.run(Thread.java:835) [?:?]
   │      Caused by: java.lang.UnsupportedOperationException: Not supported
   │            at org.elasticsearch.search.aggregations.pipeline.InternalSimpleValue.doReduce(InternalSimpleValue.java:80) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.pipeline.InternalSimpleValue.doReduce(InternalSimpleValue.java:34) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms$Bucket.reduce(InternalTerms.java:142) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.bucket.terms.InternalTerms.doReduce(InternalTerms.java:286) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.bucket.InternalSingleBucketAggregation.doReduce(InternalSingleBucketAggregation.java:108) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregation.reduce(InternalAggregation.java:135) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.InternalAggregations.reduce(InternalAggregations.java:123) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogram$Bucket.reduce(InternalAutoDateHistogram.java:131) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
   │            at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogram.mergeBuckets(InternalAutoDateHistogram.java:394) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
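
Reading the trace bottom-up: the failure happens while InternalAutoDateHistogram merges adjacent buckets down to the target count, which re-reduces each bucket's sub-aggregations, and InternalSimpleValue (the materialized result of a bucket_script) throws UnsupportedOperationException when reduced a second time. If that reading is right, the nested and terms levels are incidental; a hedged minimal repro would be any bucket_script directly under auto_date_histogram, assuming the time range spans enough raw buckets to force a merge:

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "aggs": {
    "check": {
      "auto_date_histogram": {
        "field": "logstash_stats.timestamp",
        "buckets": 10
      },
      "aggs": {
        "doc_value": {
          "bucket_script": {
            "buckets_path": { "count": "_count" },
            "script": "params.count"
          }
        }
      }
    }
  }
}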

Hi @pcsanwald. Any updates on this? This bug blocks the fix for a high-priority bug in the Stack Monitoring application that makes it unusable under certain conditions, and we're hoping to get a fix pushed quickly. We're happy to do whatever we can to help. :)

@cachedout do you have a link to an issue for the high priority bug that's blocking you? Or is it the kibana bug referenced in this issue above?

@cachedout apologies for the late response. We're discussing how to deal with this, it's due to some of the specifics of pipeline aggs. It's not a trivial fix, @polyfractal will be looking into this.

@pcsanwald Thank you for the update.

@nknize Yes, it is the issue you linked to. @igoristic can help provide more context as well.

@nknize Yeah, it's the issue you linked to, and this issue is already affecting some customers in live production.

The auto_date_histogram feature/fix will help us fix two issues (the first is sketched below):

  • Max bucket error too_many_buckets_exception
  • OOM because of pipeline aggs using date_histogram

elastic/kibana#37246
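
For context on the first bullet: a fixed-interval date_histogram over a long window can produce more buckets than search.max_buckets allows (for example, a 30-day window at a 10s interval is roughly 259,000 buckets against a default limit of 10,000 at the time) and fail with too_many_buckets_exception, while auto_date_histogram takes a target count and widens the interval itself, so it never returns more than that target. A minimal sketch of the two shapes side by side (the interval and target values are illustrative):

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "aggs": {
    "fixed_shape": {
      "date_histogram": {
        "field": "logstash_stats.timestamp",
        "interval": "10s"
      }
    },
    "auto_shape": {
      "auto_date_histogram": {
        "field": "logstash_stats.timestamp",
        "buckets": 10
      }
    }
  }
}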

Just merged #45359, which should resolve this issue. :)

False alarm, sorry. :( The bugfix is mostly correct I think, but there is a failing test that makes me think it isn't fully fixed yet. I'm going to revert the commit and try again, will keep this thread updated.

I have opened a new PR to fix this issue (which I thought I had reopened, but apparently not). Will update again when the new PR merges :)

Accidentally closed this one, thanks @pickypg for the catch. Apologies.

Luckily, I just merged #45796 so we can go ahead and close again :)

(So sorry for the delay on this, the PR got LGTM but it fell through the cracks and I never merged it :sob: )

Hey @polyfractal,

I'm seeing some unexpected behavior that I think is related to this PR.

It seems that the bucket_script aggregation isn't working properly inside a nested aggregation.

See this example here: https://gist.github.com/chrisronline/836dbbdb594d5848538e412236f32147
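
In case the gist goes away: judging from the response bodies below, the repro appears to be a date_histogram holding both a sibling bucket_script and a nested > terms > bucket_script chain. A hypothetical reconstruction (the index name, field names, nested path, metric, and scripts are all guesses inferred from the output keys, not the actual gist contents):

GET test/_search
{
  "size": 0,
  "aggs": {
    "date_histo": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1d"
      },
      "aggs": {
        "in_nested": {
          "nested": { "path": "entries" },
          "aggs": {
            "types": {
              "terms": { "field": "entries.type" },
              "aggs": {
                "entry_count": {
                  "value_count": { "field": "entries.type" }
                },
                "bs_value": {
                  "bucket_script": {
                    "buckets_path": { "c": "entry_count" },
                    "script": "params.c"
                  }
                }
              }
            }
          }
        },
        "high_level_bs_value": {
          "bucket_script": {
            "buckets_path": { "c": "_count" },
            "script": "params.c"
          }
        }
      }
    }
  }
}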

For 7.5 and below, the output is:

{
  "aggregations" : {
    "date_histo" : {
      "buckets" : [
        {
          "in_nested" : {
            "types" : {
              "buckets" : [
                {
                  "bs_value" : {
                    "value" : 1.0
                  }
                }
              ]
            }
          },
          "high_level_bs_value" : {
            "value" : 0.0
          }
        }
      ]
    }
  }
}

For 7.6 and above, the output is:

{
  "aggregations" : {
    "date_histo" : {
      "buckets" : [
        {
          "high_level_bs_value" : {
            "value" : 0.0
          }
        }
      ]
    }
  }
}

Is this intended? (As a side note for @elastic/stack-monitoring folks, this is causing this failure: https://github.com/elastic/kibana/issues/52470)
