Elasticsearch version (bin/elasticsearch --version):
Version: 6.2.1, Build: 7299dc3/2018-02-07T19:34:26.990113Z, JVM: 1.8.0_25
Plugins installed: []
None?
JVM version (java -version):
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
OS version (uname -a if on a Unix-like system):
Darwin mynode.local 14.5.0 Darwin Kernel Version 14.5.0: Sun Jun 4 21:40:08 PDT 2017; root:xnu-2782.70.3~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
The idea is to select a set of documents from the index that have interesting data in a few of their fields (I've removed some of the fields below for simplicity), group those documents by the contents of the sourceId field, and then discard buckets that are empty or otherwise drawn from data that doesn't match.
I had a working query similar to the one below. I then changed the document structure so that most of the interesting data moved into a nested mapping. Attempting to modify the query to match results in this error:
{
"error": {
"root_cause": [],
"type": "search_phase_execution_exception",
"reason": "",
"phase": "fetch",
"grouped": true,
"failed_shards": [],
"caused_by": {
"type": "class_cast_exception",
"reason": "org.elasticsearch.search.aggregations.bucket.nested.InternalNested cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation"
}
},
"status": 503
}
I tried several variations on the theme, and the commonality seems to be that a buckets_path cannot route through a nested aggregation.
New query:
{
"size": 0,
"query": {
"nested": {
"path": "eventData.6_2",
"query": {
"dis_max": {
"queries": [
{ "term": { "eventData.6_2.6_2_1_2": "093" } },
{ "exists": { "field": "eventData.6_2.6_2_3" } },
{ "range": { "eventData.6_2.6_2_3": { "lt": "1000" } } }
]
}
}
}
},
"aggs": {
"flights": {
"terms": {
"size": 100000,
"field": "sourceId"
},
"aggs": {
"subA": {
"nested": { "path": "eventData.6_2" },
"aggs": {
"TargetCount": {
"cardinality": {
"field": "eventData.6_2.6_2_1_2",
"precision_threshold": 10
}
},
"MaxCC": { "max": { "field": "eventData.6_2.6_2_3" } },
"FindIt": {
"bucket_selector": {
"buckets_path": { "foundRecs": "TargetCount" },
"script": "params.foundRecs > 0"
}
}
}
}
}
},
"CC": { "max_bucket": { "buckets_path": "flights>subA>MaxCC" } }
}
}
Steps to reproduce:
I'm willing to go scrape this together, but first I'd like confirmation that (a) it's not a fault in my query and (b) it's not just an implementation restriction.
Thanks!
Pinging @elastic/es-search-aggs
@webbnh There is a bug here, but the bug is that we should catch the problem at parsing time, and output a much better error, instead of failing when we try to run the pipeline aggregation.
The problem is that the request is trying to run the bucket_selector aggregation on the nested aggregation, which is a single-bucket aggregation, and the bucket_selector agg only works on multi-bucket aggregations. I think what you intend is to remove the entire terms bucket if the TargetCount of subA is 0? If so, you need to move the bucket selector up one level so that it is a direct sub-agg of the terms aggregation, and then modify the buckets_path. Something like the following:
{
"size":0,
"query":{
"nested":{
"path":"eventData.6_2",
"query":{
"dis_max":{
"queries":[
{
"term":{
"eventData.6_2.6_2_1_2":"093"
}
},
{
"exists":{
"field":"eventData.6_2.6_2_3"
}
},
{
"range":{
"eventData.6_2.6_2_3":{
"lt":"1000"
}
}
}
]
}
}
}
},
"aggs":{
"flights":{
"terms":{
"size":100000,
"field":"sourceId"
},
"aggs":{
"subA":{
"nested":{
"path":"eventData.6_2"
},
"aggs":{
"TargetCount":{
"cardinality":{
"field":"eventData.6_2.6_2_1_2",
"precision_threshold":10
}
},
"MaxCC":{
"max":{
"field":"eventData.6_2.6_2_3"
}
}
}
}
},
"aggs": {
"FindIt":{
"bucket_selector":{
"buckets_path":{
"foundRecs":"subA>TargetCount"
},
"script":"params.foundRecs > 0"
}
}
}
},
"CC":{
"max_bucket":{
"buckets_path":"flights>subA>MaxCC"
}
}
}
}
One unrelated thing to note is that your max_bucket aggregation will also not work. Pipeline aggregations need to be inside multi-bucket aggregations and cannot live at the top level. There is a separate issue for this: https://github.com/elastic/elasticsearch/issues/14600. For now you will need to calculate the max bucket on the client side.
@colings86, thanks for the quick reply!
Your suggestion has a duplicate aggs key under flights, but when I remove that and place FindIt in the aggs with subA, then it seems to work! Thanks!!
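A sketch of the merged request described above, reconstructed from the suggested query and not copied from a running cluster: the two aggs objects under flights are folded into one, so FindIt sits next to subA and reaches into it through the buckets_path subA>TargetCount. The CC aggregation is carried over unchanged from the original request.

```json
{
  "size": 0,
  "query": {
    "nested": {
      "path": "eventData.6_2",
      "query": {
        "dis_max": {
          "queries": [
            { "term": { "eventData.6_2.6_2_1_2": "093" } },
            { "exists": { "field": "eventData.6_2.6_2_3" } },
            { "range": { "eventData.6_2.6_2_3": { "lt": "1000" } } }
          ]
        }
      }
    }
  },
  "aggs": {
    "flights": {
      "terms": { "size": 100000, "field": "sourceId" },
      "aggs": {
        "subA": {
          "nested": { "path": "eventData.6_2" },
          "aggs": {
            "TargetCount": {
              "cardinality": { "field": "eventData.6_2.6_2_1_2", "precision_threshold": 10 }
            },
            "MaxCC": { "max": { "field": "eventData.6_2.6_2_3" } }
          }
        },
        "FindIt": {
          "bucket_selector": {
            "buckets_path": { "foundRecs": "subA>TargetCount" },
            "script": "params.foundRecs > 0"
          }
        }
      }
    },
    "CC": { "max_bucket": { "buckets_path": "flights>subA>MaxCC" } }
  }
}
```

With this layout the bucket_selector is a direct child of the multi-bucket terms aggregation, so it drops whole flights buckets instead of being evaluated inside the single-bucket nested aggregation.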
I ran across #14600 looking for other reports of the problem I was encountering, but with your suggested change I'm not hitting the problem reported there. (I can't tell yet whether the query is actually working properly, as I don't have enough data in the new format yet, but my corrected query is producing values and no errors...so that seems positive! ;-) )
Thanks again for your help!
@webbnh OK, glad it's working for you. I'll leave this issue open to fix the validation problem so that a clearer error is returned at parsing time.
Hi Team,
I am also facing a similar issue. I'm pasting my code here; it would be a great help if someone could help me out. Thanks in advance.
"aggs": {
"business": {
"composite": {
"sources" : [
{ "competency_name": { "terms" : { "field": "busn_competency_name.keyword" } }
},
{ "component_name": { "terms" : { "field": "busn_component_name.keyword" } }
},
{ "busn_srvc_name": { "terms" : { "field": "busn_srvc_name.keyword" } }
}
]
},
"aggs" : {
"comp" : {
"filter" : { "term": { "automata_status.keyword": "Completed" } },
"aggs" : {
"sum1" : { "sum": { "field" : "p_manual_exe_time" } },
"sum2" : { "sum": { "field" : "a_actual_exe_time" } },
"effort_saved": {
"bucket_selector": {
"buckets_path": {
"var1": "sum1",
"var2": "sum2"
},
"script": "params.var1 - params.var2"
}
}
}
}
}
}}
The error I am receiving is:
{
"error": {
"root_cause": [],
"type": "search_phase_execution_exception",
"reason": "",
"phase": "fetch",
"grouped": true,
"failed_shards": [],
"caused_by": {
"type": "class_cast_exception",
"reason": "org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation"
}
},
"status": 503
}
@biji-padhy Known limitation, unfortunately. See: https://github.com/elastic/elasticsearch/issues/14600
Usually you can get around this by using a filters agg instead of filter. Irritating but it's a quirk of how the framework works at the moment :(
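A rough, untested sketch of that workaround applied to the query above. The filter agg is swapped for a filters agg with a single named filter, which makes comp a multi-bucket aggregation; the filter name "completed" is made up for illustration. One further assumption: the posted script returns a number, while bucket_selector expects a boolean, so this sketch guesses that the goal is to compute the difference and uses bucket_script for effort_saved. If the goal is to drop buckets, keep bucket_selector and use a boolean script.

```json
{
  "aggs": {
    "business": {
      "composite": {
        "sources": [
          { "competency_name": { "terms": { "field": "busn_competency_name.keyword" } } },
          { "component_name": { "terms": { "field": "busn_component_name.keyword" } } },
          { "busn_srvc_name": { "terms": { "field": "busn_srvc_name.keyword" } } }
        ]
      },
      "aggs": {
        "comp": {
          "filters": {
            "filters": {
              "completed": { "term": { "automata_status.keyword": "Completed" } }
            }
          },
          "aggs": {
            "sum1": { "sum": { "field": "p_manual_exe_time" } },
            "sum2": { "sum": { "field": "a_actual_exe_time" } },
            "effort_saved": {
              "bucket_script": {
                "buckets_path": { "var1": "sum1", "var2": "sum2" },
                "script": "params.var1 - params.var2"
              }
            }
          }
        }
      }
    }
  }
}
```

In the response, the results for each composite bucket would then appear under comp.buckets.completed instead of directly under comp.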