When reducing terms agg results, we check whether we already have a doc count error for a certain bucket by looking at its error (see https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/InternalTerms.java#L245): if it is greater than zero we have already calculated it, while if it is zero it means we have not, hence we ignore the value and use the doc count of the last returned bucket instead.
When performing incremental reductions, though, 0 may mean either that the error was not previously calculated, or that the error was indeed previously calculated and its value happened to be 0. This way we end up discarding genuine error values of 0.
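A minimal sketch of the ambiguity (the method name `currentBehaviour` is illustrative, not the actual `InternalTerms` code):

```java
public class ZeroAmbiguity {
    // Simplified version of the reduce-time check described above: a stored
    // error of 0 is treated as "not calculated" and replaced by the doc count
    // of the last returned bucket, even when 0 was a genuine error value.
    static long currentBehaviour(long storedError, long lastBucketDocCount) {
        return storedError > 0 ? storedError : lastBucketDocCount;
    }

    public static void main(String[] args) {
        // A previously calculated error of exactly 0 is lost:
        System.out.println(currentBehaviour(0, 42)); // prints 42, not 0
        // A positive error is kept as expected:
        System.out.println(currentBehaviour(7, 42)); // prints 7
    }
}
```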
Pinging @elastic/es-analytics-geo
ping @colings86 @polyfractal as we have chatted about this.
IMO the best way to fix this will be to use null as the indication that the error was not previously calculated, and make the variable a Long. Note that -1 already has a meaning here, in that it indicates that the error cannot be calculated because of the sort order used. I think it would be a bad idea to start encoding more negative numbers with meanings (e.g. -2 means not calculated), since that would make what is already a complex series of calculations for the error even more confusing, and it doesn't save us much here since a null check and a check for -2 amount to basically the same thing.
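A rough sketch of what the null-sentinel approach could look like (the method name `resolveError` is hypothetical; the real reduce logic in `InternalTerms` is more involved):

```java
public class DocCountErrorSketch {
    // Proposed encoding: null = error not yet calculated;
    // -1 = error cannot be calculated because of the sort order;
    // >= 0 = a real (possibly zero) error value.
    static long resolveError(Long previousError, long lastBucketDocCount) {
        // With a Long, a previously calculated error of 0 is no longer
        // confused with "not calculated".
        return previousError != null ? previousError : lastBucketDocCount;
    }

    public static void main(String[] args) {
        // A genuine zero error survives the reduction...
        System.out.println(resolveError(0L, 42L));   // prints 0
        // ...while an uncalculated error falls back to the last doc count.
        System.out.println(resolveError(null, 42L)); // prints 42
    }
}
```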
Sounds good @colings86, I can look into this.
Actually, I am not sure I will get to this anytime soon, so I have marked this issue "help wanted".
I'd like to take this.
As far as I can see, there is binary serialization/deserialization of InternalTerms.Bucket values.
Can we assume serialization will always be performed when docCountError is non-null?
If not, we need to change binary format. What about cross-version compatibility?
@javanna, @colings86, can you answer, please?
Is this issue still needed at all?
> As I can see, there is a binary serialization/deserialization of InternalTerms.Bucket values.
> Can we assume serialization will always be performed when docCountError is non-null?
Are you referring to how it is serialized today? Currently, docCountError is a primitive long, so it is not possible for the variable to be null.
> If not, we need to change binary format. What about cross-version compatibility?
We will indeed need to change the serialization, and worry about cross-version compatibility (clusters can be heterogeneous, so a new version might need to serialize a response to an older version and vice versa).
This is done by checking the input/output stream versions and serializing appropriately. The pattern is:
```java
if (in.getVersion().onOrAfter(Version.V_7_3_0)) {
    this.docCountError = in.readOptionalLong();
} else {
    this.docCountError = in.readLong();
}
```
Similar for the output stream. E.g. when talking to an older node, we use the old serialization method. And when talking to newer nodes, we can use the "optional" methods to instantiate a Long instead of long
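To make the "optional" encoding concrete, here is a self-contained simulation using plain `java.io` streams; `writeOptionalLong`/`readOptionalLong` below are simplified stand-ins for Elasticsearch's StreamOutput/StreamInput helpers, and the presence-byte wire format shown is illustrative, not necessarily the exact format used on the wire:

```java
import java.io.*;

public class OptionalLongWire {
    // Write a presence flag followed by the value only when it is non-null.
    static void writeOptionalLong(DataOutputStream out, Long v) throws IOException {
        out.writeBoolean(v != null);
        if (v != null) {
            out.writeLong(v);
        }
    }

    // Read the presence flag; return null when no value was written.
    static Long readOptionalLong(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readLong() : null;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeOptionalLong(out, 0L);   // a real zero error
        writeOptionalLong(out, null); // not calculated

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readOptionalLong(in)); // prints 0
        System.out.println(readOptionalLong(in)); // prints null
    }
}
```

This is why the new format can round-trip "zero error" and "not calculated" as distinct values, which the primitive-long encoding cannot.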
Here's another example, you can find these scattered around the code if you look at the input/output streams: https://github.com/elastic/elasticsearch/blob/44ea7dca2241a748a2c00a4a4d788c9ea74cac60/server/src/main/java/org/elasticsearch/action/admin/indices/close/CloseIndexResponse.java#L47
If I read from an old version stream, there is ambiguity in 0 value meaning.
Should I treat such value as "0 errors" or "not calculated"?
I think we'll want to treat it as "not calculated", since that's the less-bad way to resolve the ambiguity. E.g. if we interpret it as "0 error" we might assign a no-error rate to something that was just not calculated yet and actually has a large error, leading to very incorrect error reports.
(@javanna is on holiday right now, but he can confirm when he's back)
I have two questions regarding this piece of code:
https://github.com/elastic/elasticsearch/blob/44ea7dca2241a748a2c00a4a4d788c9ea74cac60/server/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/InternalTerms.java#L242-L243
1. Wouldn't `<=` instead of `<` be better in the first check?
2. How does key order guarantee that there are 0 errors?
I found this mentioned in the documentation:
When the aggregation is ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard does not return a particular term which appears in the results from another shard, it must not have that term in its index.
I don't understand what happens in this case when `terms.getBuckets().size() > getShardSize()`. Wouldn't some terms be skipped regardless of ordering?
> Wouldn't `<=` instead of `<` be better in the first check?
I believe it is `<` because, if the returned size is smaller than what we requested, we know definitively that we have all the terms from that shard. But if it is the same as the requested size, we can't be sure whether that was everything the shard had or whether the list was truncated before being sent to the coordinator.
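The distinction can be sketched as follows (the method name `shardError` is illustrative; the real check lives in the `InternalTerms` reduce code linked above):

```java
public class ShardErrorCheck {
    // If a shard returned fewer buckets than requested, its term list was
    // exhaustive, so it contributes 0 to the doc count error. Otherwise the
    // list may have been truncated, and the smallest returned doc count is
    // the worst-case count of any term the shard left out.
    static long shardError(int returnedBuckets, int requestedShardSize, long smallestReturnedCount) {
        return returnedBuckets < requestedShardSize ? 0 : smallestReturnedCount;
    }

    public static void main(String[] args) {
        System.out.println(shardError(3, 5, 7)); // prints 0 (shard sent everything)
        System.out.println(shardError(5, 5, 7)); // prints 7 (possibly truncated)
    }
}
```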
> How does key order guarantee that there are 0 errors?
Ordering by the term itself ("_key" : "asc") is basically lexicographic sorting. So imagine we have three shards, asking for top-5 from each (and for simplicity each term only has count: 1 on the shard):
| Shard A | Shard B | Shard C |
| ------------- | ------------- | ------------- |
| A: 1 | B: 1 | D: 1 |
| B: 1 | C: 1 | E: 1 |
| C: 1 | D: 1 | F: 1 |
| D: 1 | E: 1 | G: 1 |
| E: 1 | F: 1 | H: 1 |
We are sorting alphabetically, so since Shard B starts with "B" we know it doesn't have an "A" term (otherwise it would have sent a count for "A"). Similarly, we know Shard C doesn't have "A", "B", or "C".
When merging, we take the "top" 5 alphabetically from the returned results, meaning A through E in this case. After merging doc counts we get:
| Top 5 |
| ------------- |
| A: 1 |
| B: 2 |
| C: 2 |
| D: 3 |
| E: 3 |
We know the counts are exact because we're not relying on doc counts for ordering, but on the total lexicographic ordering. "A" is the "top" term because it was returned and sorts to the first position, regardless of document count. Whether only one shard returns a value or all shards do doesn't matter, because it will sort to the "top" regardless of count.
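The merge described above can be simulated in a few lines; `mergeTop` is an illustrative helper, not the actual `InternalTerms` reduce code:

```java
import java.util.*;

public class KeyOrderMerge {
    // Sum per-shard term counts, then keep the top `size` terms by key order.
    static SortedMap<String, Long> mergeTop(List<Map<String, Long>> shards, int size) {
        TreeMap<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> shard : shards) {
            shard.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        // Drop terms beyond the requested size from the end of the key order.
        while (merged.size() > size) {
            merged.remove(merged.lastKey());
        }
        return merged;
    }

    public static void main(String[] args) {
        // The three-shard example from above, each shard's top-5 by key.
        List<Map<String, Long>> shards = List.of(
            Map.of("A", 1L, "B", 1L, "C", 1L, "D", 1L, "E", 1L),
            Map.of("B", 1L, "C", 1L, "D", 1L, "E", 1L, "F", 1L),
            Map.of("D", 1L, "E", 1L, "F", 1L, "G", 1L, "H", 1L));
        System.out.println(mergeTop(shards, 5));
        // prints {A=1, B=2, C=2, D=3, E=3}
    }
}
```

The merged counts match the table above exactly, because every shard that holds a term sorting into the top 5 is guaranteed to have returned it.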
Got it, thanks!
PR is ready for review
https://github.com/elastic/elasticsearch/pull/43874