Elasticsearch: Scoring bug in 2.3

Created on 12 Oct 2016 · 16 Comments · Source: elastic/elasticsearch

Elasticsearch version: 2.3

Plugins installed: []

JVM version: Deployed on AWS

OS version: Deployed on AWS

Description of the problem including expected versus actual behavior:

I have a database deployed in Amazon Elasticsearch with about 20M documents. The document schema defines simple analysers, and all fields are strings. There is 1 replica and 1 shard. _all is disabled.

A query on a field, using exactly the value stored in the document I am looking for, results in other documents being scored higher. Running the query with explain gives a weird result:

Query:

{'query': {'bool': {'must': [{'match': {'title': 'todo el mundo lo sabe'}}]}}}

Result:
http://pastebin.com/vdxzc6nM

As you can see, the last value of the explain output and the final _score are different.

For other documents, this is not the case. The result of explain is the same as the result of _score.
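
One way to spot affected hits is to send the search with `"explain": true` and compare each hit's `_score` with the top-level `value` of its `_explanation`. The following is a minimal sketch, assuming the response has already been parsed into a Python dict. The helper name `find_score_mismatches`, the tolerance, and the sample numbers are made up for illustration and are not taken from the pastebin output.

```python
import math


def find_score_mismatches(response, rel_tol=1e-5):
    """Return (doc_id, _score, explained_score) for hits where the two disagree.

    `response` is a parsed search response for a request sent with
    "explain": true, so each hit carries an `_explanation` object.
    """
    mismatches = []
    for hit in response.get("hits", {}).get("hits", []):
        explanation = hit.get("_explanation")
        score = hit.get("_score")
        if explanation is None or score is None:
            continue  # explain not requested, or the hit was not scored
        explained = explanation["value"]
        if not math.isclose(score, explained, rel_tol=rel_tol):
            mismatches.append((hit["_id"], score, explained))
    return mismatches


# Hypothetical response fragment; the scores are invented for illustration.
sample = {
    "hits": {
        "hits": [
            {"_id": "a", "_score": 1.5, "_explanation": {"value": 1.5}},
            {"_id": "b", "_score": 2.0, "_explanation": {"value": 3.25}},
        ]
    }
}

print(find_score_mismatches(sample))
```

Any hit returned by this check shows the same symptom as the document described above: the ranking used one score while explain computed another.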

Is this a bug?

Thank you.

:Search/Search >bug

All 16 comments

Can you also share the explanation of a document that gets ranked higher even though it should be worse?

Here is the top match of the search:

http://pastebin.com/x3caXrVg

explain returns a different score, which I think is the right one and would have ranked this document better. This is probably due to https://issues.apache.org/jira/browse/LUCENE-7132, a pretty bad bug which is only fixed in recent Lucene versions (Lucene 6.1+). Maybe we should backport it and do a new 5.5 release.

For the record, I _think_ this can be worked around by adding a FILTER clause with a query that matches all docs without being a match_all query, e.g.:

{
  "query": {
    "bool": {
      "must": [
        { "match" : { "title" : "todo el mundo lo sabe" } }
      ],
      "filter": [
        { "exists" : { "field" : "title" } }
      ]
    }
  }
}

I reindexed the data in a local Elasticsearch 2.4 and cannot reproduce the bug. Is the Lucene version updated in 2.4?

Other than that... Looks like that wrong-scored document does not use coord (maybe because it is a perfect match).

I'll try your "patch".

It is unfortunate that due to this bug I cannot use AWS Elasticsearch.

2.4 should have the bug too, but in order to reproduce the bug you need documents to be indexed in a certain order, so it does not necessarily reproduce if you reindex.

I see.

Maybe I'm wrong but... is this a huge problem in general for ES?

It may be indeed. It had been thought of as quite an exotic bug until now, but the fact that you hit it is making us reconsider.

Any chance ES will update the Lucene dependency?

Due to this bug, I may have to reconsider ES for my use case.

@drount The 5.0 rc1 release has the fix (it is on Lucene 6.2), but we still need a plan in order to get this bug fixed on the 2.x branch, which cannot move from Lucene 5.5.

Today I will try the

"filter": [ { "exists" : { "field" : "title" } } ]

workaround.

Would you mind elaborating why this may avoid the error?

I tested the suggested workaround. It now returns the correct _score. However, the query time is 10x longer, so it is basically unacceptable.

Any other idea?

Thank you.

> Would you mind elaborating why this may avoid the error?

The bug is located in an optimization to pure disjunctions (a top-level A OR B OR C OR ...). By adding a filter, the optimization gets disabled since it is not a top-level pure disjunction anymore.
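
In practice the workaround amounts to wrapping the scoring query in a `bool` with one extra non-scoring `filter` clause. Here is a small sketch, assuming the query is built as a Python dict as in the original report. `with_exists_filter` is a hypothetical helper written for this example, not part of any Elasticsearch client.

```python
import json


def with_exists_filter(query, field):
    """Wrap `query` in a bool query with an `exists` filter on `field`.

    The filter clause does not contribute to the score; its only purpose
    here is to make the query something other than a top-level pure
    disjunction.
    """
    return {
        "query": {
            "bool": {
                "must": [query],
                "filter": [{"exists": {"field": field}}],
            }
        }
    }


body = with_exists_filter({"match": {"title": "todo el mundo lo sabe"}}, "title")
print(json.dumps(body, indent=2))
```

Filtering on the same field that the `match` clause searches does not drop any hits, because a document without a `title` could not match the `match` clause anyway. The extra clause is not free, though: the reporter measured roughly 10x slower queries with it.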

> Any other idea?

We need to get the bugfix backported and released in Lucene so that we can do a new Elasticsearch 2.x release on a fixed version of Lucene.

Never mind what I said above. I was reading the change logs more carefully, and this bug should be fixed in Elasticsearch 2.4.0+ since it is based on Lucene 5.5.2, which has the fix. Would you mind giving it a try?
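
To check whether a given cluster runs a fixed Lucene, the `version.lucene_version` field returned by the root endpoint (`GET /`) can be compared against the versions named in this thread. The sketch below is hedged: the thresholds (5.5.2+ on the 5.5 line, 6.1+ otherwise) come only from the comments above, and the function names and the sample payload are illustrative.

```python
def parse_version(text):
    """Turn a version string such as '5.5.2' into a tuple of ints."""
    core = text.split("-", 1)[0]  # drop suffixes such as '-SNAPSHOT'
    parts = tuple(int(part) for part in core.split("."))
    return parts + (0,) * (3 - len(parts))  # pad '6.2' to (6, 2, 0)


def lucene_has_fix(lucene_version):
    """True if the Lucene version is one this thread says contains the fix."""
    version = parse_version(lucene_version)
    if version[:2] == (5, 5):
        return version >= (5, 5, 2)  # backported fix on the 5.5 line
    return version >= (6, 1, 0)  # "Lucene 6.1+" per the earlier comment


# Illustrative fragment of a `GET /` response body.
root_info = {"version": {"number": "2.4.0", "lucene_version": "5.5.2"}}
print(lucene_has_fix(root_info["version"]["lucene_version"]))
```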

I indexed 3 times in 2.4 (locally) and it seems to work.

Shouldn't 2.3 be urgently deprecated?

@drount We only maintain the 2.4 branch in the 2.x series now, and the fix is already released. This bug was highly situation-dependent. It doesn't require an urgent (or non-urgent) deprecation.
