Today, new analyzers need to be registered in several places when they are added to Elasticsearch. This should be much simpler - there should be just a single analyzer registry.
Also, custom analyzers today are per index, meaning that many instances of the same analyzer could exist on a single node. Instead, we should have a single per-node repository of analyzers which can be used by any index. While the APIs should remain as they are today, the analyzer configuration should be used to create a unique key to identify the analyzer, and index-level analyzer names should be mapped to entries in this per-node repository.
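To make the per-node repository idea concrete, here is a minimal sketch in Java. It is not Elasticsearch code: `NodeAnalyzerRegistry`, `configKey`, `register`, and `lookup` are hypothetical names, and settings are simplified to a flat `Map<String, String>`. It assumes the unique key can be derived by sorting the settings by name, so that two indices declaring the same configuration in a different order still share one instance.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: none of these names are real Elasticsearch APIs.
final class NodeAnalyzerRegistry<A> {
    // One shared analyzer instance per distinct configuration, node-wide.
    private final Map<String, A> byConfigKey = new ConcurrentHashMap<>();
    // "index/analyzerName" -> configuration key of the shared instance.
    private final Map<String, String> indexNameToKey = new ConcurrentHashMap<>();
    // Builds an analyzer from its settings (assumed to be supplied by the caller).
    private final Function<Map<String, String>, A> factory;

    NodeAnalyzerRegistry(Function<Map<String, String>, A> factory) {
        this.factory = factory;
    }

    // Canonical key: settings sorted by name, so declaration order is irrelevant.
    // A real implementation would need an unambiguous encoding or a hash.
    static String configKey(Map<String, String> settings) {
        return new TreeMap<>(settings).toString();
    }

    // Maps an index-level analyzer name to the shared per-node instance,
    // creating that instance only if no identical configuration exists yet.
    A register(String index, String name, Map<String, String> settings) {
        String key = configKey(settings);
        indexNameToKey.put(index + "/" + name, key);
        return byConfigKey.computeIfAbsent(key, k -> factory.apply(settings));
    }

    // Resolves an index-level name; returns null if it was never registered.
    A lookup(String index, String name) {
        String key = indexNameToKey.get(index + "/" + name);
        return key == null ? null : byConfigKey.get(key);
    }

    // Number of distinct analyzer instances held on this node.
    int distinctAnalyzers() {
        return byConfigKey.size();
    }
}
```

With this shape, registering the same configuration for `logs-2024.01.01` and `logs-2024.01.02` (illustrative index names) yields a single analyzer instance, while each index still refers to it by its own analyzer name, so the existing APIs would not need to change.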
The ability to reload analyzers (e.g. reloading synonym lists from files) would be an optional nice-to-have. See https://github.com/elasticsearch/elasticsearch/issues/1956
Adding to this from https://github.com/elasticsearch/elasticsearch/issues/1848: analyzers defined in config files should be registered at the "node" level, making them visible globally (like the predefined analyzers).
+1 Recreating the same analyzers every time you create a new index is stupid, especially when you use a daily index.
It would be really helpful if search analyzers could be modified dynamically, e.g. to solve the problem of changing query-time synonyms. And no, a custom plugin is not a universal solution, as it's a no-go for many people using cloud-managed ES clusters.
I understand the risk of modifying index-time analyzers. That's why I think ES should know more about the context in which a given analyzer is used: if it's used for indexing, it should not be possible to modify it; otherwise, users should be able to modify or delete it.
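As a rough illustration of what "knowing the context" could look like, here is a hedged Java sketch. `AnalyzerUse`, `AnalyzerUsageTracker`, and their methods are hypothetical names, not existing Elasticsearch classes. It assumes every field mapping reports whether it uses an analyzer at index time or search time, and that modification is refused once any index-time use has been recorded.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: the contexts in which an analyzer can be referenced.
enum AnalyzerUse { INDEX, SEARCH }

// Hypothetical sketch: tracks how each analyzer is used so that only
// analyzers never used for indexing can be modified or deleted.
final class AnalyzerUsageTracker {
    private final Map<String, Set<AnalyzerUse>> uses = new ConcurrentHashMap<>();

    // Called whenever a mapping references an analyzer in a given context.
    void recordUse(String analyzerName, AnalyzerUse use) {
        uses.computeIfAbsent(analyzerName, k -> ConcurrentHashMap.<AnalyzerUse>newKeySet())
            .add(use);
    }

    // An analyzer is modifiable if it is unused or used only at search time.
    boolean canModify(String analyzerName) {
        Set<AnalyzerUse> recorded = uses.get(analyzerName);
        return recorded == null || !recorded.contains(AnalyzerUse.INDEX);
    }

    // Guard to call before applying an update or delete.
    void checkModifiable(String analyzerName) {
        if (!canModify(analyzerName)) {
            throw new IllegalStateException(
                "analyzer [" + analyzerName + "] is used at index time and cannot be modified");
        }
    }
}
```

Under these assumptions, a synonym analyzer referenced only as a search analyzer would pass `checkModifiable`, while one that is also used for indexing would be rejected, because changing it could make already-indexed terms inconsistent with new ones.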
cc @elastic/es-search-aggs