Elasticsearch: Aggregations: nested filter aggregation with nested filter returning no results

Created on 18 Jun 2015 · 17 Comments · Source: elastic/elasticsearch

Hello,

I am not able to combine a normal filter (e.g., a term filter) with a nested filter inside a filter aggregation that is itself placed under a nested aggregation.

Mapping: tasks with nested events, events with nested parameters.

curl -XPUT 'localhost:9200/nestingtest/' -d '
{
  "mappings": {
    "tasks": {
      properties: {
        task_id: {
          type: "long",
          doc_values: true
        },
        events: {
          type: "nested",
          properties: {
            id: {
              type: "long",
              doc_values: true
            },
            parameters: {
              type: "nested",
              properties: {
                name: {
                  type: "string",
                  index: "not_analyzed",
                  doc_values: true
                },
                value: {
                  type: "string",
                  index: "not_analyzed",
                  doc_values: true
                }
              }
            }
          }
        }
      }
    }
  }
}
'

Data:

curl -XPOST 'localhost:9200/nestingtest/tasks/1' -d '
{
  task_id: 1,
  events: [
    {
      id: 1,
      parameters: {
        name: "attribution",
        value: "campaignX"
      }
    }
  ]
}
'

curl -XPOST 'localhost:9200/nestingtest/tasks/2' -d '
{
  task_id: 2,
  events: [
    {
      id: 21,
      parameters: [
        {
          name: "attribution",
          value: "campaignY"
        }
      ]
    },
    {
      id: 22,
      parameters: {
        name: "attribution",
        value: "campaignY"
      }
    }
  ]
}
'

Query that does not work as expected:

curl -XPOST 'localhost:9200/nestingtest/tasks/_search' -d '
{
  size: 0,
  aggs: {
    "to-events": {
      nested: {
        path: "events"
      },
      aggs: {
        filtered: {
          filter: {
            nested: {
              path: "events.parameters",
              filter: {
                term: {
                  "events.parameters.value": "campaignX"
                }
              }
            }
          }
        }
      }
    }
  }
}
'

Response:

"aggregations": {
  "to-events": {
    "doc_count": 3,
    "filtered": {
      "doc_count": 0 //this is wrong, should be 1
    }
  }
}
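For reference, the expected count can be checked by hand. The following plain-Python sketch (no Elasticsearch involved) models the two documents indexed above and counts the events that have at least one parameter with the value campaignX, which is what the nested filter is meant to select at the events level. The helper name is purely illustrative.

```python
# Plain-Python model of the two documents indexed above (not Elasticsearch code).
tasks = [
    {"task_id": 1, "events": [
        {"id": 1, "parameters": [{"name": "attribution", "value": "campaignX"}]},
    ]},
    {"task_id": 2, "events": [
        {"id": 21, "parameters": [{"name": "attribution", "value": "campaignY"}]},
        {"id": 22, "parameters": [{"name": "attribution", "value": "campaignY"}]},
    ]},
]


def event_has_parameter_value(event, value):
    """Hypothetical helper: True if any nested parameter of the event has this value."""
    return any(p["value"] == value for p in event["parameters"])


# All nested event documents, as seen by the "to-events" nested aggregation.
events = [e for t in tasks for e in t["events"]]
# Events that the nested filter on "events.parameters" should keep.
matching = [e for e in events if event_has_parameter_value(e, "campaignX")]

print(len(events))    # 3, same as the "to-events" doc_count
print(len(matching))  # 1, the count expected for "filtered"
```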

If I change the query to use the nested aggregation, the results are as expected:

curl -XPOST 'localhost:9200/nestingtest/tasks/_search' -d '
{
  size: 0,
  aggs: {
    "to-events": {
      nested: {
        path: "events"
      },
      aggs: {
        "to-parameters": {
          nested: {
            path: "events.parameters"
          },
          aggs: {
            filtered: {
              filter: {
                term: {
                  "events.parameters.value": "campaignX"
                }
              }
            }
          }
        }
      }
    }
  }
}'

Response:

"aggregations": {
      "to-events": {
         "doc_count": 3,
         "to-parameters": {
            "doc_count": 3,
            "filtered": {
               "doc_count": 1
            }
         }
      }
   }

Using nested aggregation is not an ideal solution for me, as I would like to combine that nested filter with other filters acting at the level of events, get one number of matched documents and then dive into sub-aggs for those documents.
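To illustrate the shape of request described here, below is a rough sketch that builds such a body in Python and prints it as JSON. It uses the 1.x filter syntax from the queries above; the event-level term on "events.id" is only an assumed stand-in for "other filters acting at the level of events". On 1.6.0 the nested part of this filter is affected by the problem reported in this issue, so this shows the intended query, not a working workaround.

```python
import json

# Sketch of the desired request body (Elasticsearch 1.x filter syntax).
# The term on "events.id" is an assumed example of an event-level filter.
body = {
    "size": 0,
    "aggs": {
        "to-events": {
            "nested": {"path": "events"},
            "aggs": {
                "filtered": {
                    "filter": {
                        "bool": {
                            "must": [
                                {"term": {"events.id": 1}},
                                {
                                    "nested": {
                                        "path": "events.parameters",
                                        "filter": {
                                            "term": {"events.parameters.value": "campaignX"}
                                        },
                                    }
                                },
                            ]
                        }
                    }
                    # Sub-aggregations over the matched events would be added here.
                }
            },
        }
    },
}

print(json.dumps(body, indent=2))
```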

"version" : {
    "number" : "1.6.0",
    "build_hash" : "cdd3ac4dde4f69524ec0a14de3828cb95bbb86d0",
    "build_timestamp" : "2015-06-09T13:36:34Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  }
Labels: :Analytics/Aggregations, >bug

All 17 comments

@martijnvg please take a look

@crutch The issue is that the nested aggregator doesn't tell the nested filter which path has been set. As a result, the nested filter translates the matching docs incorrectly for the nested aggregator, which yields no results. So this is indeed a bug and should get fixed.

@crutch About the nested aggregator solution, I think that should work for you too? You can just place another filter aggregator under the first nested aggregator:

{
  "size": 0,
  "aggs": {
    "to-events": {
      "nested": {
        "path": "events"
      },
      "aggs": {
        "id_filter": {
          "filter": {
            "term": {
              "events.id": 22
            }
          },
          "aggs": {
            "to-parameters": {
              "nested": {
                "path": "events.parameters"
              },
              "aggs": {
                "filtered": {
                  "filter": {
                    "term": {
                      "events.parameters.value": "campaignY"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

@martijnvg thank you for investigating this.

Using a filter aggregator under the nested one is my current solution. I am able to query for correct numbers, but I struggle to get, e.g., correct terms aggs.

Imagine that my filter is a compound one, matching some fields of the event and some fields of the nested parameters, and that I would like a terms aggregation on a field of the event for all documents matching this compound filter. I can split the filter into two parts, an event-level part and a nested-level part that is used under the nested aggregation... but where do I put the terms aggregation then?

If my terms aggregation is used outside of the nested aggregation, its results are not filtered by my nested filter. And I cannot put this terms aggregation inside the nested one, as it does not have access to its parent's fields.
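One direction that might be worth trying for the conjunction case is the reverse_nested aggregation, which joins back from nested documents to a parent level so that parent fields can be aggregated on. The sketch below builds such a request body in Python; the aggregation names and the filter values are assumptions reused from the example above, and the query has not been verified against 1.6.0 in this thread. It does not help with negations or disjunctions.

```python
import json

# Unverified sketch: filter events, descend to parameters, filter them,
# then use reverse_nested to come back to the events level for a terms agg.
body = {
    "size": 0,
    "aggs": {
        "to-events": {
            "nested": {"path": "events"},
            "aggs": {
                "event_filter": {  # event-level part of the compound filter
                    "filter": {"term": {"events.id": 22}},
                    "aggs": {
                        "to-parameters": {
                            "nested": {"path": "events.parameters"},
                            "aggs": {
                                "param_filter": {  # nested-level part
                                    "filter": {
                                        "term": {"events.parameters.value": "campaignY"}
                                    },
                                    "aggs": {
                                        "back-to-events": {
                                            "reverse_nested": {"path": "events"},
                                            "aggs": {
                                                "by_event_id": {
                                                    "terms": {"field": "events.id"}
                                                }
                                            },
                                        }
                                    },
                                }
                            },
                        }
                    },
                }
            },
        }
    },
}

print(json.dumps(body, indent=2))
```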

@martijnvg I hit this issue once again. I need a not filter on top of a nested filter, in an aggregation. Do you have any idea how to achieve this using not filter and nested aggregation instead of nested filter?

Using the not filter inside the nested aggregation will match on sibling nested documents...
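The difference can be shown with a small plain-Python model (no Elasticsearch involved). It assumes a single event with two nested parameters and compares a "not" evaluated per parameter document, as happens under a nested aggregation, with a "not" wrapped around a nested filter and evaluated per event. The function names are illustrative only.

```python
# Plain-Python model (not Elasticsearch code) of one event with two nested parameters.
event = {
    "id": 1,
    "parameters": [
        {"name": "attribution", "value": "campaignX"},
        {"name": "attribution", "value": "campaignY"},
    ],
}


def not_inside_nested_agg(event, value):
    """'not' evaluated per parameter document, as under a nested aggregation."""
    return [p for p in event["parameters"] if p["value"] != value]


def not_on_top_of_nested_filter(event, value):
    """'not' wrapped around a nested filter, evaluated once per event."""
    return not any(p["value"] == value for p in event["parameters"])


# The campaignY sibling still matches, so the event appears to satisfy "not campaignX".
print(len(not_inside_nested_agg(event, "campaignX")))   # 1
# Evaluated per event, the event is excluded because one parameter is campaignX.
print(not_on_top_of_nested_filter(event, "campaignX"))  # False
```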

Any news about this? I have similar problems and a solution to this would help me a lot.

We also faced this issue during the migration from facets to the aggregation framework. In our case we have a bool filter with a must_not condition which contains a nested query. As described here and in https://github.com/elastic/elasticsearch/issues/12410, it is possible to represent simple conjunction filters with nested aggregations, but that is pretty much it. I don't see any way to represent disjunctions or negations this way (I would appreciate it if somebody could share how to do it, if it's possible).

This makes it very hard for us to migrate to the aggregations framework. Our main use case for facets/aggregations is faceted navigation (which means that we are using aggregation filter buckets in conjunction with the post_filter). I would really appreciate it if somebody could revisit this issue or at least share whether it is planned to be solved (and if yes, then of course it would be very helpful to know when, since we need to plan our migration as well).

:+1:

This is really a blocker for me, and I think there is no possible workaround.
I have 2 levels of nesting, product > [offer] > [invprice]. I want to calculate a price range facet for the products, considering only offers which have a popularity of 5 and have inventory for all provided dates. This is the query I am trying to run:

{
  "query": {
    "match": {
      "productcode": "p1"
    }
  },
  "aggs": {
    "product_offers": {
      "nested": {
        "path": "offers"
      },
      "aggs": {
        "offers": {
          "filter": {
            "bool": {
              "must": [
                {
                  "term": {
                    "popularity": 5
                  }
                },
                {
                  "nested": {
                    "path": "offers.invprice",
                    "query": {
                      "terms": {
                        "offers.invprice.date": [1444501800000]
                      }
                    }
                  }
                },
                {
                  "nested": {
                    "path": "offers.invprice",
                    "query": {
                      "terms": {
                        "offers.invprice.date": [1447093800000]
                      }
                    }
                  }
                }
              ]
            }
          },
          "aggs": {
            "price_ranges": {
              "nested": {
                "path": "offers.invprice"
              },
              "aggs": {
                "ranges": {
                  "range": {
                    "field": "offers.invprice.price",
                    "ranges": [
                      {
                        "from": 50,
                        "to": 300
                      },
                      {
                        "from": 300,
                        "to": 700
                      },
                      {
                        "from": 700,
                        "to": 1000
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Since it is labeled as a bug, is there any ETA on a fix?

This is kind of an important feature for me

Hi there,
are you considering working on this in the next, say, 2-3 months? Otherwise we will have to consider implementing another solution to this problem.

thanks

@ddombrowskii @sebbulon I suggest working on another solution in the meantime.

@clintongormley Can we expect a fix in Elastic Stack 5.0?

It's very unfortunate that this fix is available in 5.x version only as migrating from 2.x to 5.x is not something I can do at the moment...

You can do aggs like this:
"myaggs": {
"filter": {"bool": {"must_not": {"term": {"method.keyword": "POST"}}}},
"aggs": example_aggs
}

Is that what you mean?

@crutch About the nested aggregator solution, I think that should work for you too? You can just place another filter aggregator under the first nested aggregator:

{
  "size": 0,
  "aggs": {
    "to-events": {
      "nested": {
        "path": "events"
      },
      "aggs": {
        "id_filter": {
          "filter": {
            "term": {
              "events.id": 22
            }
          },
          "aggs": {
            "to-parameters": {
              "nested": {
                "path": "events.parameters"
              },
              "aggs": {
                "filtered": {
                  "filter": {
                    "term": {
                      "events.parameters.value": "campaignY"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Based on your answer, I managed to solve my issue, thank you! :)
I had multiple nested documents on one single document.