On 2.x versions I used to apply size: 0 to achieve what the docs say:
If set to 0, the size will be set to Integer.MAX_VALUE.
The terms aggregation then gave me as many result buckets as there were terms.
But on 5.x I'm getting size must be positive, got 0
as an error. If I omit the value, I only get 10 results back. The docs say:
By default, the node coordinating the search process will request each shard to provide its own top size term buckets and once all shards respond, it will reduce the results to the final list that will then be returned to the client.
So it's not clear how to achieve the same behavior and get all buckets back instead of only 10.
How do I do that? Thanks
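For reference, this is a minimal sketch of the kind of request that worked on 2.x but is rejected on 5.x; the field name my_field and the aggregation name all_terms are placeholders:

```json
{
  "size": 0,
  "aggs": {
    "all_terms": {
      "terms": {
        "field": "my_field",
        "size": 0
      }
    }
  }
}
```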
You can check here why we decided to remove the size: 0 option:
https://github.com/elastic/elasticsearch/issues/18838
Having size: 0 as an option makes it look like there is a shortcut here and we can handle the "give me all the buckets" case in a different, very efficient way, which we can't. Internally we rewrite size: 0 to Integer.MAX_VALUE.
So you can just specify a size that is bigger than the cardinality of your field. If the cardinality is big, you should consider other options, since returning millions of terms is going to cause problems in your cluster.
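Assuming a field whose cardinality you know is bounded, the 5.x equivalent is to pass an explicit size larger than that cardinality; the field name my_field and the value 10000 here are illustrative:

```json
{
  "size": 0,
  "aggs": {
    "all_terms": {
      "terms": {
        "field": "my_field",
        "size": 10000
      }
    }
  }
}
```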
Also, we reserve GitHub for bugs and feature requests, so the best way to get an answer to questions like this is to use the discuss forum:
https://discuss.elastic.co/c/elasticsearch
Using size parameter for Query and Terms Aggregations on 5.6.x
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "mid": [
              185422,
              13446728
            ]
          }
        }
      ]
    }
  },
  "aggs": {
    "group_mids": {
      "terms": {
        "field": "mid",
        "size": 5,
        "shard_size": 10,
        "order": {
          "max_hot": "desc"
        }
      },
      "aggs": {
        "max_hot": {
          "max": {
            "field": "hotvalue"
          }
        }
      }
    }
  },
  "_source": {
    "includes": [
      "mid",
      "cid",
      "hotvalue"
    ]
  },
  "sort": [
    {
      "lastreptime": "desc"
    }
  ],
  "from": 0,
  "size": 10
}
But the number of buckets returned does not match size = 5. Why?