Elasticsearch version (bin/elasticsearch --version): 7.0.0 - 7.4.2
Plugins installed: []
JVM version (java -version): AdoptOpenJDK/OpenJDK 64-Bit Server VM/13.0.1/13.0.1+9 (bundled with your Docker image)
OS version (uname -a if on a Unix-like system): your Docker image running on macOS
Darwin XXXX 18.7.0 Darwin Kernel Version 18.7.0: Tue Aug 20 16:57:14 PDT 2019; root:xnu-4903.271.2~2/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
With version 6 of Elasticsearch we used the index setting routing_partition_size to mitigate the problem of hot shards with our custom document routing. Now we are trying to upgrade to Elasticsearch 7, but the feature appears to be broken. It is still described in the documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html#routing-index-partition) and we can still configure it as an index setting, but the _search_shards API, as well as analyzing the placement of documents with a routing parameter, shows that everything is placed on a single shard instead of being spread across the number of shards configured in routing_partition_size.
Steps to reproduce:
# start elasticsearch e.g. with docker
docker run -it --rm -p 9200:9200 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.4.2
# delete the index if needed
# curl -XDELETE localhost:9200/test
# create an index with a routing_partition_size set to two
curl -XPUT localhost:9200/test -H "Content-Type: application/json" -d '{
  "settings": {
    "index": {
      "number_of_shards": "20",
      "number_of_replicas": "0",
      "routing_partition_size": "2"
    }
  },
  "mappings": {
    "_routing": {
      "required": true
    }
  }
}'
# searching shards for a routing value should return two shards with different shard ids, but only one is returned
curl -XGET "localhost:9200/test/_search_shards?routing=foo&pretty"
# also creating 100 documents with different ids but the same routing should spread the documents over two shards
for i in {1..100}; do curl -XPUT "localhost:9200/test/_doc/$i?routing=foo" -H "Content-Type: application/json" -d "{ \"description\":\"bar $i \"}"; done
# but all 100 documents end up in the same shard (see the number of matches for this single shard)
curl -XGET "localhost:9200/test/_search?preference=_shards:10&pretty&size=0"
Pinging @elastic/es-distributed (:Distributed/CRUD)
I further debugged the issue, and the good news is that there is a workaround: setting index.number_of_routing_shards equal to index.number_of_shards.
The source of the underlying issue is in OperationRouting.calculateScaledShardId.
Whatever is set as number_of_routing_shards influences the routingFactor, which is then used to divide the hash. In this calculation the partitionOffset is unexpectedly scaled down and is sometimes lost completely.
For example, when I create a new index with number_of_shards = 200 and routing_partition_size = 20, the default number_of_routing_shards is 800 and the routingFactor is 4.
The result is that the partitionOffset is effectively divided by 4 and only a subset of the configured routing_partition_size is reachable.
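The scaling effect can be reproduced with plain shell arithmetic. This is only a sketch of the formula in OperationRouting.calculateScaledShardId: the value hash=1000 stands in for Murmur3HashFunction.hash("foo"), and 640 is the 7.x default number_of_routing_shards for a 20-shard index (20 * 2^5):

```sh
# shard = ((hash + partitionOffset) % routing_num_shards) / routing_factor
hash=1000               # stand-in for the Murmur3 hash of the routing key
num_shards=20
routing_num_shards=640  # 7.x default for 20 shards
routing_factor=$(( routing_num_shards / num_shards ))   # 32

shards=""
for offset in 0 1; do   # routing_partition_size = 2 -> offsets 0 and 1
  shard=$(( ( (hash + offset) % routing_num_shards ) / routing_factor ))
  shards="$shards $shard"
done
echo "shards hit:$shards"   # both offsets land on the same shard
```

Because the offset (0 or 1) is added before the division by 32, it almost never changes the resulting shard id, which matches the single-shard behavior seen in the repro.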
I think this is not fixable for an existing index, because it would change the routing of every document and search.
But for every new index, the default for number_of_routing_shards should be number_of_shards if routing_partition_size is greater than 1. https://github.com/elastic/elasticsearch/blob/ff575a3e941c8f710bfdf7c137cdbde6985e545e/server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataCreateIndexService.java#L818-L834
One more strange thing to note: I am able to configure index.number_of_routing_shards and it is stored and used by the index, but it is never returned by the _settings endpoint for the index.
Hi @jgaedicke, I also debugged this issue and found the cause you describe above, but I also found that if index.number_of_routing_shards is explicitly set to the same value as index.number_of_shards, then everything works.
I also found that in the integration test class PartitionedRoutingIT, the setting index.number_of_routing_shards is set to the value of index.number_of_shards.
https://github.com/elastic/elasticsearch/blob/3b4845490de0cbaea8c95c07cd87b275b760d476/server/src/test/java/org/elasticsearch/routing/PartitionedRoutingIT.java#L43-L49
this bug was introduced with the feature "Automatically prepare indices for splitting" https://github.com/elastic/elasticsearch/pull/27451
@gaobinlong
And it is good to know that setting the number_of_routing_shards is also the preferred workaround of the implementer of this feature. (see test)
But I still think it should be fixed.
For every new index of version 7 I would recommend setting the default for number_of_routing_shards to number_of_shards when the routing_partition_size is greater 1.(MetaDataCreateIndexService.java)
Another solution would be to multiply the partitionOffset with the routingFactor before adding it to the base hash for a routing (https://github.com/elastic/elasticsearch/blob/a4ed7b1ca102267ecdda7c788713675ed3958f3f/server/src/main/java/org/elasticsearch/cluster/routing/OperationRouting.java#L247). As this is influencing the shardId I would introduce this only for new indices created above a new major version. I guess the indices splitting could then work again and there is no need for setting number_of_routing_shards to number_of_shards.
@s1monw as the implementer of #27451 do you have any thoughts around this issue?
@jgaedicke @gaobinlong Thank you for reporting this issue.
We prefer to get rid of index.number_of_routing_shards and routing_factor. It should fix this issue. However, we have to proceed with the proposal carefully as it's a critical change. In the meantime, a workaround is to set index.number_of_routing_shards to the index.number_of_shards as @jgaedicke and @gaobinlong suggested.
@dnhatn tanks for confirming the issue and giving the background information around how you plan to fix it. As index.number_of_routing_shards is used by the split index api, is the plan to remove this as well? Does a github issue exist for removing index.number_of_routing_shards and could we link it to this issue here? Should this here be tagged as bug?
@jgaedicke I've marked this issue as a bug.
As index.number_of_routing_shards is used by the split index api, is the plan to remove this as well?
No, we will keep split and shrink APIs. The proposal suggests making these APIs without index.number_of_routing_shards.
index.routing_partition_size does not appear to work in ES 6.8 either. We create indices using the following template settings:
{
"index": {
"number_of_routing_shards": "999",
"number_of_shards": "27",
"routing_partition_size": "5"
}
}
but when we use the search_shards endpoint or just manually query for all documents for a given routing value, we can see that they all are in a single shard.
not appear to work in 7.4.2 too
Most helpful comment
index.routing_partition_sizedoes not appear to work in ES 6.8 either. We create indices using the following template settings:but when we use the
search_shardsendpoint or just manually query for all documents for a given routing value, we can see that they all are in a single shard.