Elasticsearch: Feature Request: Add The Ability To Normalize Query Scores

Created on 31 Mar 2017 · 2 Comments · Source: elastic/elasticsearch

The ability to normalize the relevance scores of documents on a per-query basis would be incredibly useful. It would let someone meaningfully combine multiple queries in a bool query without a single clause overtaking the others. This would be a huge improvement over boosting the individual queries.

I could see this being done via:

1. A new query type.
2. An addition to the Function Score Query.
3. Access to the mean and standard deviation of a query in the "script_score" function of a function score query.

I personally would prefer option 2, but recognize option 3 would allow individuals to design their own normalization functions to best suit their needs.
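To make the idea concrete, here is a minimal sketch of the kind of normalization I have in mind, done client-side in Python on scores that have already been fetched. It is not an Elasticsearch feature or API: the function names `zscore_normalize` and `combine`, the document IDs, and the score values are all made up for illustration, and treating a document that is missing from one query as sitting at that query's mean (contributing 0) is my own assumption.

```python
from statistics import mean, pstdev


def zscore_normalize(scores):
    """Map one query's raw scores to z-scores (mean 0, unit standard deviation).

    `scores` is a dict of doc_id -> raw score for a single query (hypothetical input).
    """
    if not scores:
        return {}
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # All scores are identical, so no document stands out for this query.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - mu) / sigma for doc_id, s in scores.items()}


def combine(*per_query_scores):
    """Sum z-scores per document across queries.

    Assumption: a document absent from a query contributes 0 for that query.
    """
    total = {}
    for scores in per_query_scores:
        for doc_id, z in zscore_normalize(scores).items():
            total[doc_id] = total.get(doc_id, 0.0) + z
    return total


# Made-up raw scores from two queries that live on very different scales.
title_scores = {"a": 12.0, "b": 8.0, "c": 4.0}
body_scores = {"a": 0.2, "b": 0.6, "c": 0.4}

combined = combine(title_scores, body_scores)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

With this toy data, summing the raw scores would rank `a` first simply because the title scores are on a larger scale. After normalization both queries carry equal weight and `b` comes out on top.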

Note:
I'm not the only one looking for this functionality. A StackOverflow discussion with 20+ votes covers attempts (none successful) to achieve this.

davidklebanoff

Most helpful comment

This is not something that can be done easily. This feature request assumes that we are able to qualify how good a match is, which is not the case. This page gives some background about the issue: https://wiki.apache.org/lucene-java/ScoresAsPercentages.

One way to work around the issue could be to use constant_score queries, or the upcoming boolean similarity, which generate predictable scores.
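As a rough illustration of the constant_score approach, the sketch below builds a bool query body as a plain Python dict, with each clause wrapped in constant_score. The field names (`title`, `body`), the search text, and the boost values are placeholders, and `constant_clause` is just a helper made up for this example.

```python
import json


def constant_clause(field, text, boost):
    """Wrap a match query in constant_score so a matching doc gets a fixed score.

    `field`, `text`, and `boost` are illustrative placeholders.
    """
    return {
        "constant_score": {
            "filter": {"match": {field: text}},
            "boost": boost,
        }
    }


# Each should clause contributes its fixed boost when it matches,
# rather than a variable full-text relevance score.
query = {
    "query": {
        "bool": {
            "should": [
                constant_clause("title", "quick fox", 2.0),
                constant_clause("body", "quick fox", 1.0),
            ]
        }
    }
}

print(json.dumps(query, indent=2))
```

The trade-off is that each clause only reports whether it matched, not how well, so this sidesteps the normalization problem rather than solving it.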

But I agree with you that we need to work on making it easier to combine full-text relevance with other sources of relevance. This is something that has been researched (in particular with geo), and we need to make progress on exposing better ways to combine full-text scores with other kinds of scores.