5.0 introduces a per-node limit on the rate of inline script compilations, which should help catch the anti-pattern of embedding script parameters in the scripts themselves. I wonder if it is worth adding a master-only limit on the rate of index creations, to catch situations where someone accidentally misconfigures an input system and it ends up creating thousands of indices in quick succession. Such a rate limit would cause indexing to fail with a useful error message, creating back pressure in any queueing system. I think this would be better than just creating thousands of indices as fast as we can.
Is this a good idea or a horrible idea?
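To make the idea concrete, here is a minimal sketch of what such a master-side limiter might look like. Everything below is hypothetical: the class, method, and error message are illustrative only and not part of Elasticsearch.

```java
// Hypothetical sketch only: a fixed-window rate limiter the master could
// consult before applying a create-index cluster state update.
final class IndexCreationRateLimiter {
    private final int maxCreationsPerWindow; // budget of creations per window
    private final long windowMillis;         // window length in milliseconds
    private long windowStart = System.currentTimeMillis();
    private int countInWindow = 0;

    IndexCreationRateLimiter(int maxCreationsPerWindow, long windowMillis) {
        this.maxCreationsPerWindow = maxCreationsPerWindow;
        this.windowMillis = windowMillis;
    }

    /**
     * Called once per create-index request. Throws when the current window's
     * budget is exhausted, failing the request with a useful message and
     * creating back pressure in any upstream queueing system.
     */
    synchronized void checkCanCreateIndex() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // start a fresh window
            countInWindow = 0;
        }
        if (++countInWindow > maxCreationsPerWindow) {
            throw new IllegalStateException(
                "too many index creations: more than [" + maxCreationsPerWindow
                    + "] in [" + windowMillis + "ms]; this usually means a "
                    + "misconfigured input system is creating an index per document");
        }
    }
}
```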
Have we run into any situations where someone has actually hit this issue? I don't recall seeing any github issues about it before.
I've seen it come through Elastic's support organization a few times. I expect this hasn't come up on github because Elasticsearch isn't the root cause of the issue.
I think I'd rather limit the total number of indices/shards in a cluster than the creation rate.
Discussed in FixItFriday. There are two issues here: creating too many indices, and creating indices faster than the master can cope with. We suggest adding two safeguards, both sketched in code after this list:

1. **A `max_shards_per_node` cluster setting.** This setting would be checked on user actions such as create index, restore snapshot, and open index. If the total number of shards in the cluster would be greater than `max_shards_per_node * number_of_nodes`, the user action can be rejected. This implementation allows the maximum to be exceeded if (e.g.) a node fails, since that lowers the cluster-wide shard ceiling. We would default to a high number during 5.x (e.g. 1000), giving sysadmins the ability to set it to whatever makes sense for their cluster, and we can look at lowering this value for 6.0.

2. **A limit on concurrent index creations.** This would be a simple counter of in-flight index-creation requests; new requests that would cause the maximum to be exceeded are rejected. The aim is to avoid queueing up potentially thousands of index creations, which could be caused by erroneously trying to create an index per document. Default e.g. 30.
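As a rough illustration of both safeguards, here is a minimal sketch. The class and method names, the exception type, and how the shard and node counts are obtained are all assumptions, not the eventual implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the two proposed safeguards; none of these names
// exist in Elasticsearch.
final class CreateIndexSafeguards {
    private final int maxShardsPerNode;        // e.g. 1000 during 5.x
    private final int maxConcurrentCreations;  // e.g. 30
    private final AtomicInteger inFlightCreations = new AtomicInteger();

    CreateIndexSafeguards(int maxShardsPerNode, int maxConcurrentCreations) {
        this.maxShardsPerNode = maxShardsPerNode;
        this.maxConcurrentCreations = maxConcurrentCreations;
    }

    // Safeguard 1: reject user actions (create index, restore snapshot, open
    // index) that would push the cluster past max_shards_per_node * nodes.
    void checkShardLimit(int currentTotalShards, int newShards, int numberOfNodes) {
        int maxShardsInCluster = maxShardsPerNode * numberOfNodes;
        if (currentTotalShards + newShards > maxShardsInCluster) {
            throw new IllegalStateException("this action would add [" + newShards
                + "] shards, but the cluster is limited to [" + maxShardsInCluster
                + "] shards in total");
        }
    }

    // Safeguard 2: a simple counter of in-flight create-index requests, so a
    // runaway client cannot queue up thousands of pending creations.
    void acquireCreationSlot() {
        if (inFlightCreations.incrementAndGet() > maxConcurrentCreations) {
            inFlightCreations.decrementAndGet();
            throw new IllegalStateException("too many concurrent index creations "
                + "(max [" + maxConcurrentCreations + "])");
        }
    }

    void releaseCreationSlot() {
        inFlightCreations.decrementAndGet();
    }
}
```

Checking the shard limit only on user actions keeps the safeguard out of the shard-allocation path, which is why the limit can be exceeded when a node leaves the cluster.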
The `max_shards_per_node` change will be handled in https://github.com/elastic/elasticsearch/issues/20705
We now have a limit on the number of shards per node in a cluster, thanks to #34892.
I've marked this as team-discuss because I would like to revisit the discussion about limiting the number of concurrent index creations, or applying another rate limit. I question how easy it would be to set such a limit correctly: with time-based indices we sometimes legitimately want to create many indices at the same time. Conversely, even if you could only create a single index at a time, the time it would take to hit the shards-per-node limit is comparable with the time it would take to react to a rogue client that's creating too many indices (at the default of 1000 shards per node, for example, even a client creating one single-shard index per second would need on the order of fifteen minutes per node of headroom to hit the limit). So I don't think the concurrency limit helps much.
In short, I think we can close this.
We discussed this today and agreed to close this for the reasons I described above.