Elasticsearch: Unified highlighter: include additional context outside of highlighted sentence to reach target fragment_size

Created on 5 Jan 2018 · 4 Comments · Source: elastic/elasticsearch

Describe the feature

Currently, the unified highlighter can only provide context by including the sentence that contains the highlighted word. This sometimes produces a very short highlight. For example, given text in a field like this:

Some leading context. A short sentence. Some more content. And even more context around that sentence.

Running a query for the term "sentence" using the unified highlighter with fragment_size set to 300 results in a highlight that includes the word we're looking for but provides little context and is nowhere close to the requested target size:

A short <em>sentence</em>.
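
For reference, a request that reproduces this could look roughly like the sketch below. It only builds the search body as a Python dict; the field name `content` and the helper name `build_highlight_request` are assumptions made for illustration, while `type` and `fragment_size` are the highlighter options discussed in this issue.

```python
import json


def build_highlight_request(field, term, fragment_size, highlighter="unified"):
    """Build a search body that highlights `term` in `field`.

    `field` is a hypothetical field name; swap in the real one.
    """
    return {
        "query": {"match": {field: term}},
        "highlight": {
            "fields": {
                field: {
                    # "unified" or "plain", the two highlighters compared here
                    "type": highlighter,
                    # target fragment size in characters
                    "fragment_size": fragment_size,
                }
            }
        },
    }


if __name__ == "__main__":
    body = build_highlight_request("content", "sentence", 300)
    # Print the JSON that would be sent as the body of a _search request
    print(json.dumps(body, indent=2))
```

Passing `highlighter="plain"` gives the body for the plain highlighter comparison.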

In contrast, running the same query with the plain highlighter results in a highlight with much more useful context (and in this case another highlighted word!):

Some leading context. A short <em>sentence</em>. Some more content. And even more context around that <em>sentence</em>.

The unified highlighter should include as much context as possible without going over the target fragment size. This will result in more consistently sized highlights (which is nice for visual consistency) and will provide more useful context in cases where the highlight occurs in a short sentence.

cc: @colings86

:Search/Highlighting >feature

All 4 comments

Thank you @jimczi!

Sorry to revive an old thread. In which version did this fix become available?

@lfplazas10 you can see the version on the linked PR:
https://github.com/elastic/elasticsearch/pull/28132
It has been available since 6.2.0.

Note that the PR only expands context to the right of the match. Any sentences to the left (i.e. leading context) are not included at the moment.
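
To make the right-only behaviour concrete, here is a rough Python sketch of the idea. It is not the Lucene/Elasticsearch implementation: the naive regex sentence splitter, the function names, and the choice to measure length before adding `<em>` tags are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?'
    # followed by whitespace. Real highlighters use a proper break iterator.
    return re.split(r"(?<=[.!?])\s+", text.strip())


def expand_right(text, term, fragment_size):
    """Start at the first sentence containing `term`, then append following
    sentences while the fragment stays within `fragment_size` characters.
    Sentences to the left of the match are never added."""
    sentences = split_sentences(text)
    start = next((i for i, s in enumerate(sentences) if term in s), None)
    if start is None:
        return None  # no match, nothing to highlight
    fragment = sentences[start]
    for following in sentences[start + 1:]:
        candidate = fragment + " " + following
        if len(candidate) > fragment_size:
            break  # adding this sentence would exceed the target size
        fragment = candidate
    # Wrap every occurrence of the term inside the fragment
    return fragment.replace(term, "<em>" + term + "</em>")


if __name__ == "__main__":
    text = ("Some leading context. A short sentence. Some more content. "
            "And even more context around that sentence.")
    print(expand_right(text, "sentence", 300))
```

With the example text from the issue and a target size of 300, this sketch returns everything from "A short" onward but leaves out "Some leading context.", which matches the limitation described above.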
