Hi there,
I was trying to use the exists/missing filters when I stumbled upon this behavior: when I use the missing filter for nested objects, it always returns an empty set if the containing nested object is missing, too.
Here is my document mapping:
{
"site": {
"properties": {
"host": {
"type": "string",
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs"
},
"ip": {
"type": "string",
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs"
},
...
"modules": {
"type": "nested",
"properties": {
"module_id": {
"type": "integer"
},
"name": {
"type": "string",
"index": "not_analyzed",
"omit_norms": true,
"index_options": "docs"
},
...
}
}
}
}
}
My Document looks like this:
{
"host": "6c1bb1fb58e8c48cabbd1e4382e55871f31ad776.com",
"ip" : "0.0.0.0",
...
"modules": [ ]
}
If I now use a query with a nested filter to select every document where modules.name is missing, I only get an empty set.
{
"query": {
"filtered": {
"query": { "match_all": { } },
"filter": {
"nested": {
"path": "modules",
"query": {
"filtered": {
"query": { "match_all": { } },
"filter": {
"missing": { "field": "modules.name" }
}
}
}
}
}
}
}
}
It seems to work if I submit a document which contains a module:
{
"host": "6c1bb1fb58e8c48cabbd1e4382e55871f31ad776.com",
"ip" : "0.0.0.0",
...
"modules": [ { "version" : "foo" } ]
}
With documents where the modules object isn't empty, a missing filter that looks for "deeper" missing attributes seems to work, too.
{
"query": {
"filtered": {
"query": { "match_all": { } },
"filter": {
"nested": {
"path": "modules",
"query": {
"filtered": {
"query": { "match_all": { } },
"filter": {
"missing": { "field": "modules.foo.bar.baz" }
}
}
}
}
}
}
}
}
I was expecting a missing filter to also return documents where the containing nested object is missing or empty.
Update: Wrapping an exists filter in a not filter doesn't return any documents, either.
Yeah, I've also been a little stumped trying to figure out how to find documents without a nested object. In my case the nested object is an array of objects, and I'd like to find the documents where that array is empty.
Ah I found a solution at http://grokbase.com/t/gg/elasticsearch/13bfq5qbse/missing-filter-with-nested-objects
curl -XPOST "http://ocvli-apw602:9200/test2/IR/_search" -d'
{
"filter": {
"not": {
"nested": {
"path": "priosenio",
"filter": {
"match_all": {}
}
}
}
}
}'
I just ran into this. The workaround highlighted by @drewish feels pretty clunky though :confused:
Here's a simple recreation that describes the problem:
PUT t
{
"mappings": {
"t": {
"properties": {
"foo": {
"type": "nested"
}
}
}
}
}
PUT t/t/1
{
"foo": {
"bar": "bar"
}
}
PUT t/t/2
{
"xyz": "xyz"
}
This request matches doc 1, because it has a nested doc which is missing the field, but not doc 2 because it has no nested docs:
GET t/_search
{
"query": {
"nested": {
"path": "foo",
"query": {
"missing": {
"field": "foo.baz"
}
}
}
}
}
This workaround works correctly for both docs:
GET t/_search
{
"query": {
"not": {
"nested": {
"path": "foo",
"query": {
"exists": {
"field": "foo.baz"
}
}
}
}
}
}
@martijnvg @jpountz is this fixable?
It is not fixable unless the missing query can detect that it is being used within a nested query, which I would like to avoid at all costs. We don't index missing fields in documents, only existing fields, so the missing query is internally implemented as the negation of an exists query. This causes the problem described here, because putting the not inside the nested query has a totally different effect than putting it outside, as your workaround does.
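To make the inside/outside difference concrete, here is a small Python sketch. It is only a toy model of the matching semantics, not how Elasticsearch is implemented: the helper names (exists, nested) are made up for illustration, nested docs are plain dicts, and the field is written as "baz" rather than "foo.baz". It assumes a nested query matches a parent when at least one nested doc matches.

```python
# Toy model: each top-level document carries a list of nested documents.
# Mirrors the recreation above: doc 1 has one nested doc without "baz",
# doc 2 has no nested docs at all.
docs = {
    1: {"foo": [{"bar": "bar"}]},
    2: {"foo": []},
}


def exists(nested_doc, field):
    """Hypothetical stand-in for an exists query on a single nested doc."""
    return field in nested_doc


def nested(doc, path, predicate):
    """Assumed semantics: the parent matches if ANY nested doc matches."""
    return any(predicate(child) for child in doc.get(path, []))


# Negation INSIDE the nested query: "some nested doc lacks baz".
missing_inside = [
    doc_id for doc_id, doc in docs.items()
    if nested(doc, "foo", lambda child: not exists(child, "baz"))
]

# Negation OUTSIDE the nested query: "no nested doc has baz".
not_outside = [
    doc_id for doc_id, doc in docs.items()
    if not nested(doc, "foo", lambda child: exists(child, "baz"))
]

print(missing_inside)  # [1]
print(not_outside)     # [1, 2]
```

With no nested docs there is nothing for the inner negation to match, so doc 2 can only be found when the negation wraps the whole nested query.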
I think the way to fix this trap would be to deprecate the missing query in favor of explicit negations of the exists query.
@jpountz ++ makes sense.
Closing in favour of #14112
Since "not" is deprecated, you can use must_not inside a bool instead.
POST /my_index/my_type/_search
{
"filter": {
"bool": {
"must_not": [
{
"nested": {
"path": "path_to_nested_doc",
"query": {
"match_all": {}
}
}
}
]
}
}
}
This works for me
GET /type/_search?pretty=true
{
"query": {
"bool": {
"must_not": [
{
"nested": {
"path": "outcome",
"query": {
"exists": {
"field": "outcome.outcomeName"
}
}
}
}
]
}
}
}
Any update on this? The query from @manuprasanth does not return any documents for me, even though some of my nested elements are empty. What I find even stranger is that if I run the same query with must instead of must_not, I get the correct output (my empty elements are not returned).