Elasticsearch version (bin/elasticsearch --version):
Version: 6.1.3, Build: af51318/2018-01-26T18:22:55.523Z, JVM: 1.8.0_151
Plugins installed: []
OS version (uname -a if on a Unix-like system):
Linux ubuntu 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
When a field is queried with a match query that sets the minimum_should_match parameter,
the field is still highlighted in the results even though too few of its terms match to satisfy that clause.
Steps to reproduce:
PUT test_index
{
  "mappings": {
    "doc": {
      "properties": {
        "field1": {
          "type": "text"
        },
        "field2": {
          "type": "keyword"
        }
      }
    }
  }
}
PUT test_index/doc/1
{
  "field1": "id1 id2 id3",
  "field2": "1"
}
POST test_index/doc/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "field1": {
              "query": "id1 id4 id5",
              "minimum_should_match": "80%"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "order": "score",
    "fields": {
      "field1": {}
    }
  }
}
Result: no hits, as expected, because only one of the three query terms (id1) occurs in field1.
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
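For reference, the arithmetic behind the empty result can be sketched as follows. This is a simplified model, not Elasticsearch's implementation: it assumes a percentage is applied to the number of optional clauses and rounded down, it handles only non-negative integers and percentages, and it treats the analyzed terms as the whitespace-separated tokens of the strings above. The function name required_matches is hypothetical.

```python
def required_matches(optional_clauses: int, spec: str) -> int:
    """Simplified model of minimum_should_match.

    Assumption: only a non-negative integer ("2") or a non-negative
    percentage ("80%") is handled; a percentage is rounded down.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        return (optional_clauses * percent) // 100
    return int(spec)


# Terms from the reproduction above, split on whitespace.
query_terms = "id1 id4 id5".split()
doc_terms = set("id1 id2 id3".split())

needed = required_matches(len(query_terms), "80%")        # 3 * 80% = 2.4, rounded down
matched = sum(1 for term in query_terms if term in doc_terms)
clause_satisfied = matched >= needed

print(needed, matched, clause_satisfied)
```

Under these assumptions two of the three terms are required but only id1 is present, so the match clause on field1 is not satisfied.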
POST test_index/doc/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "field1": {
              "query": "id1 id4 id5",
              "minimum_should_match": "80%"
            }
          }
        },
        {
          "term": {
            "field2": "1"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "unified",
    "order": "score",
    "fields": {
      "field1": {}
    }
  }
}
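Actual result: field1 is highlighted. Expected result: field1 should not be highlighted, because the match clause on field1 is not satisfied and the document matches only through the term clause on field2.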
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "field1": "id1 id2 id3",
          "field2": "1"
        },
        "highlight": {
          "field1": [
            "<em>id1</em> id2 id3"
          ]
        }
      }
    ]
  }
}
Thanks for opening an issue, @NadavHarnik
The Lucene highlighters give a best-effort approximation of where a query has hit, rather than exact matches. Boolean combinations, including minimum should match, are among the query aspects that aren't handled at the moment.
This won't be fixed absent some fairly fundamental reworking of how highlighting works, unfortunately.
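Since the highlighter itself will not take the clause logic into account, one possible client-side workaround is to filter highlights after the search returns. The sketch below assumes the request has been changed to give each clause a name through the _name option, so that each hit carries a matched_queries list. The names field1_match and field2_term, the helper drop_unmatched_highlights, and the sample hit are all hypothetical and are not taken from the report above.

```python
def drop_unmatched_highlights(hit, field_to_query_name):
    """Return a copy of `hit` without highlights for fields whose
    named query does not appear in the hit's matched_queries.

    Fields that have no entry in `field_to_query_name` are left alone.
    """
    matched = set(hit.get("matched_queries", []))
    kept = {
        field: fragments
        for field, fragments in hit.get("highlight", {}).items()
        if field not in field_to_query_name
        or field_to_query_name[field] in matched
    }
    cleaned = dict(hit)  # shallow copy; the original hit is not modified
    if kept:
        cleaned["highlight"] = kept
    else:
        cleaned.pop("highlight", None)
    return cleaned


# Hypothetical hit: the document matched only through the term clause.
sample_hit = {
    "_id": "1",
    "_source": {"field1": "id1 id2 id3", "field2": "1"},
    "highlight": {"field1": ["<em>id1</em> id2 id3"]},
    "matched_queries": ["field2_term"],
}

cleaned_hit = drop_unmatched_highlights(sample_hit, {"field1": "field1_match"})
print("highlight" in cleaned_hit)
```

This only suppresses highlights for a clause that did not match as a whole. It does not make the highlighter more precise about which terms inside a matching clause contributed to the hit.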