When reducing terms agg results, we check whether we already have a doc count error for a certain bucket by looking at its error (see https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/InternalTerms.java#L245): if it is greater than zero we have already calculated it, while if it is zero it means we have not, hence we ignore the value and use the doc count of the last returned bucket instead.
When performing incremental reductions, though, 0 may mean either that the error was not previously calculated, or that the error was indeed previously calculated and its value happened to be 0. This way we end up discarding genuine error values of 0.
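A minimal sketch of the ambiguity (the method name `currentBehaviour` is illustrative, not the actual `InternalTerms` code):

```java
public class ZeroAmbiguity {
    // Simplified version of the reduce-time check described above: a stored
    // error of 0 is treated as "not calculated" and replaced by the doc count
    // of the last returned bucket, even when 0 was a genuine error value.
    static long currentBehaviour(long storedError, long lastBucketDocCount) {
        return storedError > 0 ? storedError : lastBucketDocCount;
    }

    public static void main(String[] args) {
        // A previously calculated error of exactly 0 is lost:
        System.out.println(currentBehaviour(0, 42)); // prints 42, not 0
        // A positive error is kept as expected:
        System.out.println(currentBehaviour(7, 42)); // prints 7
    }
}
```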
Pinging @elastic/es-analytics-geo
ping @colings86 @polyfractal as we have chatted about this.
IMO the best way to fix this will be to use null as the indication that the error was not previously calculated, and make the variable a Long. Note that -1 already has a meaning here, in that it indicates that the error cannot be calculated because of the sort order used. I think it would be a bad idea to start encoding more negative numbers with meanings (e.g. -2 means not calculated), since that would make what is already a complex series of calculations for the error even more confusing, and it doesn't save us much here since a null check and a check for -2 amount to basically the same thing.
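A rough sketch of what the null-sentinel approach could look like (the method name `resolveError` is hypothetical; the real reduce logic in `InternalTerms` is more involved):

```java
public class DocCountErrorSketch {
    // Proposed encoding: null = error not yet calculated;
    // -1 = error cannot be calculated because of the sort order;
    // >= 0 = a real (possibly zero) error value.
    static long resolveError(Long previousError, long lastBucketDocCount) {
        // With a Long, a previously calculated error of 0 is no longer
        // confused with "not calculated".
        return previousError != null ? previousError : lastBucketDocCount;
    }

    public static void main(String[] args) {
        // A genuine zero error survives the reduction...
        System.out.println(resolveError(0L, 42L));   // prints 0
        // ...while an uncalculated error falls back to the last doc count.
        System.out.println(resolveError(null, 42L)); // prints 42
    }
}
```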
Sounds good @colings86, I can look into this.
Actually, I am not sure I will get to this anytime soon, so I have marked this issue "help wanted".
I'd like to take this.
As far as I can see, there is binary serialization/deserialization of InternalTerms.Bucket values.
Can we assume serialization will always be performed when docCountError is non-null?
If not, we need to change binary format. What about cross-version compatibility?
@javanna, @colings86, can you answer, please?
Is this issue still needed at all?
> As I can see, there is a binary serialization/deserialization of InternalTerms.Bucket values.
> Can we assume serialization will always be performed when docCountError is non-null?
Are you referring to how it is serialized today? Currently, docCountError is a primitive long, so it is not possible for the variable to be null.
> If not, we need to change binary format. What about cross-version compatibility?
We will indeed need to change the serialization, and worry about cross-version compatibility (clusters can be heterogeneous, so a new version might need to serialize a response to an older version and vice versa).
This is done by checking the input/output stream versions and serializing appropriately. The pattern is:
```java
if (in.getVersion().onOrAfter(Version.V_7_3_0)) {
    this.docCountError = in.readOptionalLong();
} else {
    this.docCountError = in.readLong();
}
```
Similar for the output stream. E.g. when talking to an older node, we use the old serialization method. And when talking to newer nodes, we can use the "optional" methods to instantiate a Long instead of long
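To make the "optional" encoding concrete, here is a self-contained simulation using plain `java.io` streams; `writeOptionalLong`/`readOptionalLong` below are simplified stand-ins for Elasticsearch's StreamOutput/StreamInput helpers, and the presence-byte wire format shown is illustrative, not necessarily the exact format used on the wire:

```java
import java.io.*;

public class OptionalLongWire {
    // Write a presence flag followed by the value only when it is non-null.
    static void writeOptionalLong(DataOutputStream out, Long v) throws IOException {
        out.writeBoolean(v != null);
        if (v != null) {
            out.writeLong(v);
        }
    }

    // Read the presence flag; return null when no value was written.
    static Long readOptionalLong(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readLong() : null;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeOptionalLong(out, 0L);   // a real zero error
        writeOptionalLong(out, null); // not calculated

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readOptionalLong(in)); // prints 0
        System.out.println(readOptionalLong(in)); // prints null
    }
}
```

This is why the new format can round-trip "zero error" and "not calculated" as distinct values, which the primitive-long encoding cannot.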
Here's another example, you can find these scattered around the code if you look at the input/output streams: https://github.com/elastic/elasticsearch/blob/44ea7dca2241a748a2c00a4a4d788c9ea74cac60/server/src/main/java/org/elasticsearch/action/admin/indices/close/CloseIndexResponse.java#L47
If I read from an old version stream, there is ambiguity in 0 value meaning.
Should I treat such value as "0 errors" or "not calculated"?
I think we'll want to treat it as "not calculated", since that's the less-bad way to resolve the ambiguity. E.g. if we interpret it as "0 error" we might assign a no-error rate to something that was just not calculated yet and actually has a large error, leading to very incorrect error reports.
(@javanna is on holiday right now, but he can confirm when he's back)
I have two questions regarding this piece of code:
https://github.com/elastic/elasticsearch/blob/44ea7dca2241a748a2c00a4a4d788c9ea74cac60/server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/InternalTerms.java#L242-L243
1. Wouldn't `<=` instead of `<` be better in the first check?
2. How does key order guarantee that there are 0 errors?
I found this mentioned in the documentation:
When the aggregation is ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard does not return a particular term which appears in the results from another shard, it must not have that term in its index.
I don't understand what happens in this case when `terms.getBuckets().size() > getShardSize()`. Wouldn't some terms be skipped regardless of ordering?
> Wouldn't `<=` instead of `<` be better in the first check?
I believe it is `<` because, if the returned size is smaller than what we requested, we know definitively that we have all the terms from that shard. But if it is the same as the requested size, we can't be sure whether that was everything the shard had or whether the list was truncated before being sent to the coordinator.
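The distinction can be sketched as follows (the method name `shardError` is illustrative; the real check lives in the `InternalTerms` reduce code linked above):

```java
public class ShardErrorCheck {
    // If a shard returned fewer buckets than requested, its term list was
    // exhaustive, so it contributes 0 to the doc count error. Otherwise the
    // list may have been truncated, and the smallest returned doc count is
    // the worst-case count of any term the shard left out.
    static long shardError(int returnedBuckets, int requestedShardSize, long smallestReturnedCount) {
        return returnedBuckets < requestedShardSize ? 0 : smallestReturnedCount;
    }

    public static void main(String[] args) {
        System.out.println(shardError(3, 5, 7)); // prints 0 (shard sent everything)
        System.out.println(shardError(5, 5, 7)); // prints 7 (possibly truncated)
    }
}
```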
> How does key order guarantee that there are 0 errors?
Ordering by the term itself ("_key" : "asc") is basically lexicographic sorting. So imagine we have three shards, asking for top-5 from each (and for simplicity each term only has count: 1 on the shard):
| Shard A | Shard B | Shard C |
| ------------- | ------------- | ------------- |
| A: 1 | B: 1 | D: 1 |
| B: 1 | C: 1 | E: 1 |
| C: 1 | D: 1 | F: 1 |
| D: 1 | E: 1 | G: 1 |
| E: 1 | F: 1 | H: 1 |
We are sorting alphabetically, so since Shard B starts with "B" we know it doesn't have an "A" term (otherwise it would have sent a count for "A"). Similarly, we know Shard C doesn't have "A", "B", or "C".
When merging, we take the "top" 5 alphabetically from the returned results, meaning A through E in this case. After merging doc counts we get:
| Top 5 |
| ------------- |
| A: 1 |
| B: 2 |
| C: 2 |
| D: 3 |
| E: 3 |
We know the counts are exact because we're not relying on doc counts for ordering, but on the total lexicographic ordering. "A" is the "top" term because it was returned and sorts to the first position, regardless of document count. Whether only one shard returns a value or all shards do doesn't matter, because it will sort to the "top" regardless of count.
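The merge described above can be simulated in a few lines; `mergeTop` is an illustrative helper, not the actual `InternalTerms` reduce code:

```java
import java.util.*;

public class KeyOrderMerge {
    // Sum per-shard term counts, then keep the top `size` terms by key order.
    static SortedMap<String, Long> mergeTop(List<Map<String, Long>> shards, int size) {
        TreeMap<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> shard : shards) {
            shard.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        // Drop terms beyond the requested size from the end of the key order.
        while (merged.size() > size) {
            merged.remove(merged.lastKey());
        }
        return merged;
    }

    public static void main(String[] args) {
        // The three-shard example from above, each shard's top-5 by key.
        List<Map<String, Long>> shards = List.of(
            Map.of("A", 1L, "B", 1L, "C", 1L, "D", 1L, "E", 1L),
            Map.of("B", 1L, "C", 1L, "D", 1L, "E", 1L, "F", 1L),
            Map.of("D", 1L, "E", 1L, "F", 1L, "G", 1L, "H", 1L));
        System.out.println(mergeTop(shards, 5));
        // prints {A=1, B=2, C=2, D=3, E=3}
    }
}
```

The merged counts match the table above exactly, because every shard that holds a term sorting into the top 5 is guaranteed to have returned it.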
Got it, thanks!
PR is ready for review
https://github.com/elastic/elasticsearch/pull/43874