Today every put-template request (both composable and legacy) results in a cluster state update, even if the cluster state already contains exactly the template that we're trying to put. There's no need to update the cluster state if the template is already there, so it would be preferable to skip any such requests without needing to publish a new cluster state version.
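To make the proposal concrete, here is a minimal sketch of the idea, not the actual Elasticsearch implementation. `NoopTemplatePut`, `Template`, `ClusterState`, and `putTemplate` are all hypothetical, simplified stand-ins. The sketch assumes that template metadata can be compared by value and that returning the same state instance signals "nothing to publish".

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class NoopTemplatePut {

    // Hypothetical, simplified stand-in for a template's metadata.
    public record Template(List<String> indexPatterns, int order, Map<String, String> settings) {}

    // Hypothetical stand-in for the cluster state: a version plus the templates by name.
    public record ClusterState(long version, Map<String, Template> templates) {}

    // Returns the SAME instance when the template is already present and identical,
    // so the caller can skip publishing. Otherwise returns a new state with a bumped version.
    public static ClusterState putTemplate(ClusterState current, String name, Template template) {
        Template existing = current.templates().get(name);
        if (Objects.equals(existing, template)) {
            return current; // no-op: nothing changed, nothing to publish
        }
        Map<String, Template> updated = new HashMap<>(current.templates());
        updated.put(name, template);
        return new ClusterState(current.version() + 1, Map.copyOf(updated));
    }

    public static void main(String[] args) {
        Template t = new Template(List.of("logs-*"), 0, Map.of("number_of_shards", "1"));
        ClusterState s0 = new ClusterState(1, Map.of());
        ClusterState s1 = putTemplate(s0, "logs", t); // first put changes the state
        ClusterState s2 = putTemplate(s1, "logs", t); // identical put is skipped
        System.out.println("changed on first put: " + (s1 != s0));
        System.out.println("skipped on repeat put: " + (s2 == s1));
    }
}
```

With a check like this, thousands of clients re-sending an identical template would each cost one comparison rather than one published cluster state version.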
Pinging @elastic/es-core-features (:Core/Features/Indices APIs)
@DaveCTurner this sounds pretty reasonable to me. Out of curiosity, was this causing any problems anywhere?
Yes, we've seen a few cases (and other issues linked to this one) where users unwittingly leave template auto-creation on across thousands of Beats, resulting in a lot of no-op cluster state updates at HIGH priority, preventing the master from doing anything more useful for quite some time. Although the requests do time out after 30s by default, this doesn't really help since clients will often retry until successful.
There are changes in progress on the Beats side too to mitigate this kind of problem, but that won't help other clients.
@dakrone I recently learned that Beats will continue to default to using legacy templates (https://github.com/elastic/beats/pull/21212) for BWC reasons, but #57851 doesn't address that case, so I think this will continue to affect users for a while yet. Could you do something similar to #57851 for legacy templates?
@DaveCTurner I agree with the concern. I opened https://github.com/elastic/elasticsearch/pull/64493 to address this, at least for 7.11+.
Just want to point out this could happen to APM as well.
https://www.elastic.co/guide/en/apm/server/current/configuration-template.html
setup.template.overwrite defaults to false.