Elasticsearch: inconsistent sorting by score due to differences between primary and replicas

Created on 27 Aug 2013 · 10 Comments · Source: elastic/elasticsearch

not really sure if this is a bug or not.
the first part is:
- is it normal that shards have a different number of max_docs? what could cause that? fast insertion + delete (which, my guess is, won't be replicated to the other shards)? and of course, i guess that if they have a different number of max docs, they will most likely also have different term frequencies and whatnot.

the second, based on if the previous is true:
- is it then possible to have consistent sorting (based on score) in this scenario?

for a simple match query, i currently get completely different result lists depending on which shards the query hits.

lmenezes

Most helpful comment

Well, heck, it's 2019. Why not revive this thread? :-)

We just ran into the same issue, and the cause was a different number of deleted documents between the primary and replica of the same shard. We were already using "dfs_query_then_fetch" which we expected would solve the problem, but it did not.

For calculating doc counts when using dfs_query_then_fetch (at which point you're already incurring a perf penalty to get the stats from every shard), why not just use the active doc count instead of the maxDocs value? And if that is too big a performance hit, then at the very least this should be documented. Neither the blog page about DFS (https://www.elastic.co/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch), the docs about DFS (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-type.html), nor the page about Preference (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-preference.html) explain this problem. The search_after (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-search-after.html) docs could also mention this, as consistent scoring is necessary when using the score with search_after.

rocketraman on 26 Mar 2019

👍2
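
A minimal sketch of how one might spot the situation described in this comment, where copies of the same shard hold the same live documents but a different number of deleted ones. The per-copy numbers are made up for illustration and `find_divergent_shards` is a hypothetical helper, not an Elasticsearch API; it only assumes that max_docs is num_docs plus the deleted count.

```python
from collections import defaultdict

# Hypothetical per-copy stats (illustrative numbers only). In practice these
# would be read from the cluster's shard-level statistics.
copies = [
    {"shard": 0, "role": "primary", "num_docs": 1000, "deleted": 50},
    {"shard": 0, "role": "replica", "num_docs": 1000, "deleted": 400},
    {"shard": 1, "role": "primary", "num_docs": 980, "deleted": 10},
    {"shard": 1, "role": "replica", "num_docs": 980, "deleted": 10},
]


def find_divergent_shards(copies):
    """Return {shard: sorted max_docs values} for shards whose copies disagree."""
    max_docs_by_shard = defaultdict(set)
    for copy in copies:
        # max_docs counts live documents plus documents only marked as deleted.
        max_docs_by_shard[copy["shard"]].add(copy["num_docs"] + copy["deleted"])
    return {
        shard: sorted(values)
        for shard, values in max_docs_by_shard.items()
        if len(values) > 1
    }


print(find_divergent_shards(copies))  # {0: [1050, 1400]}
```

Shard 0 is flagged because its two copies would feed different maxDocs values into scoring, even though num_docs is identical on both.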

All 10 comments

if it is of any help, here are the stats for my shards (the cluster was not receiving updates at that moment, just to be sure the shards were equal): https://gist.github.com/lmenezes/6351739

lmenezes on 27 Aug 2013

hey @lmenezes, max_docs is the total number of documents in the shard including deletes. So if one shard has already merged away a bigger segment you can easily see a big difference here. The frequency you mean is the document_frequency and that is calculated based on the num_docs (without deletions).

> is it then possible to have consistent sorting (based on score) in this scenario?

you mean across requests? well, yes: you can use a _preference based on a session ID or a user's ID, so that you hit the same replicas all the time and get a consistent result set. The problems here might be: 1. tie-breaking, since Lucene by default tie-breaks on doc ID, which is shard dependent (internal doc ID); 2. differences in refreshes, since one replica might already have refreshed...

does this make sense?

s1monw on 27 Aug 2013

yep, that's exactly what i needed. but it's "preference", right? no _ in front of it.

lmenezes on 27 Aug 2013

> yep, that's exactly what i needed. but it's "preference", right? no _ in front of it.

ah yeah maybe :)

s1monw on 27 Aug 2013
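
A minimal sketch of the session-based `preference` idea discussed above, building only the request path and body so it runs without a cluster. The index name, the `title` field and the `build_search_request` helper are assumptions made for illustration; the only Elasticsearch-specific piece is the `preference` query-string parameter, given as a custom string that does not start with an underscore.

```python
import hashlib
import json
from urllib.parse import urlencode


def build_search_request(index, text, session_id):
    """Return (path, body) for a match query pinned to a per-session preference."""
    # Hash the session ID so the preference value is a short, opaque token.
    # The same session ID always produces the same token.
    preference = hashlib.sha1(session_id.encode("utf-8")).hexdigest()[:16]
    params = urlencode({"preference": preference})
    path = f"/{index}/_search?{params}"
    # "title" is a hypothetical field name used only for this example.
    body = json.dumps({"query": {"match": {"title": text}}})
    return path, body


path, body = build_search_request("articles", "shard scoring", "session-42")
print(path)
print(body)
```

Because the token depends only on the session ID, repeated requests from one session carry the same preference value and so should be routed to the same shard copies, which is what keeps the result order stable for that session.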

I face a similar (or maybe exactly the same) issue described in https://groups.google.com/forum/#!topic/elasticsearch/RJAT6MR4sHQ.

In my case, four nodes (one primary shard and three replicas) have the same "num_docs" but huge differences in "deleted" count. This explains why "max_docs" is different, but as far as I understood from the discussion above, the score computation ignores the deleted documents and is only based on those reflected by "num_docs" (which is equal on all nodes).

But if that's true, how come that the score computations differ for identical queries sent to the four nodes?

Tie-breaking is not the issue in my case because the scores are actually different. And I don't think that refreshes are the issue, because the symptom persists even after several hours without changes to the index and explicitly requested refresh.

Maybe I misunderstood something. Anyway I hope you can clarify this once more :-)

peschlowp on 28 Aug 2013

hey @peschlowp
actually i think it's not the num_docs but the max_docs which is used while computing the score. i just got the explanation for a query as an example:

value: 0.99963135,
description: queryWeight, product of:
details: [
    { value: 10, description: boost },
    { value: 3.6819985, description: idf(docFreq=79358, maxDocs=1159774) },
    { value: 0.027149152, description: queryNorm }
]

lmenezes on 28 Aug 2013
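
To make the effect of maxDocs concrete, here is a small worked sketch. It assumes the classic Lucene TF-IDF formula, idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduces the idf value in the explain output above. The second maxDocs value is a made-up number standing in for another copy of the same shard that has not yet merged away its deleted documents; in a real index docFreq may differ between copies as well.

```python
import math


def classic_idf(doc_freq, max_docs):
    """idf under the assumed classic TF-IDF formula: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Values taken from the explain output in the thread.
boost = 10
query_norm = 0.027149152
idf_a = classic_idf(79358, 1159774)
print(round(idf_a, 4))  # 3.682

# Product of the three factors, as in the "queryWeight, product of" explanation.
print(boost * idf_a * query_norm)

# Hypothetical second copy: same docFreq, but more deleted documents still
# counted in maxDocs. The idf, and therefore the score, comes out higher.
idf_b = classic_idf(79358, 1300000)
print(idf_b > idf_a)  # True
```

The same term in the same query gets a different weight on the two copies purely because maxDocs differs, which matches the differing scores reported in this thread.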

@peschlowp well, deleted documents still contribute to the score calculation: they are only marked as deleted and the statistics are not updated, so yes, they contribute to the score. I confused num_docs in my explanation above, though. Sorry about that.

s1monw on 28 Aug 2013

Thanks for clarifying! I have to confess that I don't like the fact that deleted documents still contribute to the score (doesn't seem intuitive from a user's point of view), but on the other hand now at least the observed behavior makes sense :-)

peschlowp on 28 Aug 2013

👍1

> I have to confess that I don't like the fact that deleted documents still contribute to the score

I can see how this is confusing, but in practice this is hardly a problem. In fact, that is how most search indices work. Given the write-once nature of Lucene, it's safe to say this won't change in the near future :)

sorry about the confusion.

s1monw on 28 Aug 2013
