Elasticsearch: Derivative aggregation does not work with the new auto_date_histogram as a parent

Created on 15 Nov 2018 · 8 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version):
6.5.0
Plugins installed: []

JVM version (java -version):
1.8.0_191
OS version (uname -a if on a Unix-like system):
Centos 7
Description of the problem including expected versus actual behavior:
The derivative aggregation must have a histogram or date_histogram parent, but the new auto_date_histogram does not appear to be accepted as one. The request fails with:

_derivative aggregation [slope] must have a histogram or date_histogram as parent_
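
For context, a derivative here is the difference between a metric's value in consecutive buckets, which is why it needs an ordered, histogram-style parent. The following is only an illustrative sketch in plain Java: `DerivativeSketch` is a hypothetical name, not Elasticsearch code, and it ignores gap handling and unit normalization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not an Elasticsearch class.
final class DerivativeSketch {

    private DerivativeSketch() {}

    /**
     * Returns the first difference of the per-bucket metric values.
     * The first bucket has no predecessor, so its entry is null.
     */
    static List<Double> derivative(double[] bucketValues) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < bucketValues.length; i++) {
            if (i == 0) {
                result.add(null); // nothing to subtract from the first bucket
            } else {
                result.add(bucketValues[i] - bucketValues[i - 1]);
            }
        }
        return result;
    }
}
```

With two buckets, as in the query in the steps to reproduce, a call such as `DerivativeSketch.derivative(new double[] {0.50, 0.62})` yields a single non-null value, the change between the two buckets.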

Steps to reproduce:
Here's a Kibana/Metricbeat-specific example.

"aggs": {
  "host": {
    "terms": {
      "field": "host.name",
      "size": 10
    },
    "aggs": {
      "disk_usage": {
        "auto_date_histogram": {
          "field": "@timestamp",
          "buckets": 2
        },
        "aggs": {
          "newest_reported": {
            "top_hits": {
              "size": 1
            }
          },
          "top_hit": {
            "max": {
              "field": "system.filesystem.used.pct"
            }
          },
          "slope": {
            "derivative": {
              "buckets_path": "top_hit"
            }
          }
        }
      }
    }
  }
}

Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.

  1. Create an index in ES >= 6.5 containing some documents that have a date field and a numeric value.
  2. Query with the aggregations above, with the appropriate field names substituted.

Labels: :AnalyticAggregations, >bug, help wanted

All 8 comments

Pinging @elastic/es-search-aggs

@elastic/es-search-aggs not sure if this is a limitation or a lack of documentation, so I'm marking this issue as team-discuss so that you can take the appropriate action.

This is something we should fix. We should ensure that the auto_date_histogram (and the auto_histogram, when we add it) works with the relevant pipeline aggregations, including the derivative aggregation.

For anyone who may consider working on this in the future, the easy fix is to update the doValidate() method of the DerivativePipelineAggregationBuilder to accept auto_date_histogram as a parent, and to do the same in the doValidate() methods of related aggs (cusum, etc.). The same applies to auto_histogram in the future.

We may also want to refactor some of these aggs for better maintainability. A number of sequential, time-based pipelines (deriv, serial diff, cusum, mov avg, mov fn) all have the same requirements, but each defines its own validation. And it looks like serial diff doesn't even do that, which is probably a bug in its own right.

Rather than updating each individually, we may want to centralize those validations in a superclass, a static helper method, or something similar. I haven't looked closely, so it may not be practical (or may introduce more complexity), but it's something we could consider.
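
As a rough illustration of the static-helper idea, a shared check might look like the sketch below. Everything in it is an assumption made for the example: `SequentialPipelineValidation`, `validateHistogramParent`, and the use of plain type-name strings are hypothetical stand-ins, and the real builders may identify the parent differently and check further conditions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of the Elasticsearch code base.
final class SequentialPipelineValidation {

    // Parent types assumed to produce ordered, histogram-like buckets.
    // "auto_date_histogram" is the entry this issue asks to have accepted.
    static final Set<String> HISTOGRAM_PARENT_TYPES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("histogram", "date_histogram", "auto_date_histogram")));

    private SequentialPipelineValidation() {}

    /**
     * One shared check that each sequential pipeline agg (derivative,
     * cumulative sum, ...) could call instead of defining its own.
     */
    static void validateHistogramParent(String pipelineType, String pipelineName, String parentType) {
        if (parentType == null || !HISTOGRAM_PARENT_TYPES.contains(parentType)) {
            throw new IllegalStateException(
                    pipelineType + " aggregation [" + pipelineName + "] must have a histogram, "
                            + "date_histogram or auto_date_histogram as parent");
        }
    }
}
```

Under these assumptions, `validateHistogramParent("derivative", "slope", "auto_date_histogram")` passes, while a `terms` parent is rejected with a message in the same style as the error reported in this issue. Supporting a future auto_histogram would then mean adding one entry to the set.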

Can I work on this? I would love to help out.

@jklancic sure! Let me know if you have any questions. If you open a PR go ahead and tag me and I'll help out reviewing the PR :)

I'm pretty sure the list of pipeline aggs that need updating is: deriv, serial diff, cusum, mov avg, mov fn (basically all the aggs that perform some function on time-based buckets)

I will take a look first thing tomorrow, @polyfractal. Thanks for letting me know. If questions arise, I will let you know.

@polyfractal I took a look at the code and read up on aggregations. AutoDateHistogramAggregatorFactory does not have a minimum document count like DateHistogramAggregatorFactory or HistogramAggregatorFactory. Would it be correct for the validation method to check that numBuckets is not equal to zero, or should the validation be based on some other state?

Here is the current change in the PR. I ran all tests against this change and they passed. Let me know if you want a different condition to be validated for AutoDateHistogramAggregatorFactory.
