Investigating https://discuss.elastic.co/t/index-creation-slows-down-over-time/230775, I wrote a script that repeatedly creates and deletes indices and left it running for several hours against a 7.6.2 cluster. Each iteration had indeed slowed down as reported, from around 6s to around 10s.
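For illustration, this is the shape of the loop, written here against the low-level REST client (a minimal sketch; the index name, iteration count and timing output are illustrative rather than the exact script I ran):

import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class CreateDeleteLoop {
    public static void main(String[] args) throws Exception {
        // Low-level REST client pointed at the local 7.6.2 node.
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            for (int i = 0; i < 5000; i++) {
                long start = System.nanoTime();
                // Each iteration creates a fresh index and then deletes it again.
                client.performRequest(new Request("PUT", "/test-index-" + i));
                client.performRequest(new Request("DELETE", "/test-index-" + i));
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                System.out.println("iteration " + i + " took " + elapsedMillis + "ms");
            }
        }
    }
}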
The hot threads API often reported a substantial amount of time spent doing things with loggers:
20.3% (101.5ms out of 500ms) cpu usage by thread 'elasticsearch[node-0][clusterApplierService#updateTask][T#1]'
2/10 snapshots sharing following 25 elements
java.base@14/java.util.Collections$UnmodifiableCollection$1.hasNext(Collections.java:1048)
app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:127)
app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
app//org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:127)
app//org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:123)
app//org.elasticsearch.index.IndexModule.<init>(IndexModule.java:151)
app//org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:595)
app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:542)
app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:173)
app//org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:484)
app//org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:246)
app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
java.base@14/java.lang.Thread.run(Thread.java:832)
2/10 snapshots sharing following 24 elements
app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:127)
app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
app//org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:128)
app//org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:123)
app//org.elasticsearch.index.IndexModule.<init>(IndexModule.java:151)
app//org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:595)
app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:542)
app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:173)
app//org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:484)
app//org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:246)
app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
java.base@14/java.lang.Thread.run(Thread.java:832)
3/10 snapshots sharing following 14 elements
app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
java.base@14/java.lang.Thread.run(Thread.java:832)
3/10 snapshots sharing following 10 elements
java.base@14/jdk.internal.misc.Unsafe.park(Native Method)
java.base@14/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
java.base@14/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:505)
java.base@14/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3137)
java.base@14/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1614)
java.base@14/java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:548)
java.base@14/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1056)
java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1116)
java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
java.base@14/java.lang.Thread.run(Thread.java:832)
I took a heap dump and observed almost 150,000 loggers with names of the form index.indexing.slowlog.<INDEX_UUID>, index.search.slowlog.fetch.<INDEX_UUID> and index.search.slowlog.query.<INDEX_UUID> (in roughly equal numbers). There were only 1280 other instances of org.apache.logging.log4j.core.Logger.
I suspect this is a consequence of https://github.com/elastic/elasticsearch/pull/47234, which introduced per-index loggers.
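If that is the cause, the mechanism is simple: every index UUID yields new logger names, and Log4j keeps every named logger it hands out in its logger context. A minimal sketch of the effect, assuming log4j-core on the classpath (the logger name pattern mirrors what I saw in the heap dump; the counting code is purely illustrative):

import java.util.UUID;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.core.LoggerContext;

public class LoggerLeakSketch {
    public static void main(String[] args) {
        // Cast is valid when log4j-core provides the logging implementation.
        LoggerContext context = (LoggerContext) LogManager.getContext(false);
        System.out.println("loggers before: " + context.getLoggers().size());

        // Simulate creating many indices: each one asks for loggers named after its UUID,
        // and Log4j registers a new Logger for every distinct name it sees.
        for (int i = 0; i < 10_000; i++) {
            String indexUuid = UUID.randomUUID().toString();
            LogManager.getLogger("index.indexing.slowlog." + indexUuid);
            LogManager.getLogger("index.search.slowlog.fetch." + indexUuid);
            LogManager.getLogger("index.search.slowlog.query." + indexUuid);
        }

        // The registered loggers are never released, even though nothing references them here.
        System.out.println("loggers after: " + context.getLoggers().size());
    }
}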
Pinging @elastic/es-core-infra (:Core/Infra/Logging)
The underlying problem here is that Log4j never cleans up loggers: once a logger is created, it is registered in the logger registry forever. This is why we introduced PrefixLogger, so that we can have, for example, per-index loggers that are cleaned up when the index is no longer allocated to a node (a conceptual sketch of this distinction follows below). However, using per-index loggers in the slow logs, with a separate log level per such logger, means that we cannot use the PrefixLogger mechanism: it requires Log4j to create a logger for each slow log logger, and that logger will be maintained in the registry forever. Using per-index loggers in the slow log is therefore fatal. We effectively have two options then:
I'm not sure that I see the benefits of per-index slow log settings. If I want to log what is slow, I would want that for all of my indices. Therefore, I advocate that we remove per-index slow log settings.
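To make the distinction above concrete, a conceptual sketch (this is not the actual PrefixLogger implementation): the wrapper approach shares one named Log4j logger across indices and only adds a per-index prefix to each message, so the Log4j registry never grows, while the per-index approach registers a new logger per index UUID that is never released.

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Conceptual stand-in for the PrefixLogger approach: many of these wrappers can exist,
// one per index, but they all share the same underlying Log4j logger, so the Log4j
// registry never grows and the wrappers are garbage collected when the index goes away.
final class IndexPrefixedLogger {
    // A single shared logger, named after a class rather than an index UUID.
    private static final Logger SHARED = LogManager.getLogger(IndexPrefixedLogger.class);

    private final String prefix;

    IndexPrefixedLogger(String indexName) {
        this.prefix = "[" + indexName + "] ";
    }

    void warn(String message) {
        SHARED.warn(prefix + message);
    }
}

// By contrast, the per-index approach registers a brand new logger for every index UUID:
//   LogManager.getLogger("index.search.slowlog.query." + indexUuid)
// and that logger stays in the registry even after the index is deleted.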
There is a LoggerRegistry.WeakMapFactory (https://logging.apache.org/log4j/2.x/log4j-api/apidocs/org/apache/logging/log4j/spi/LoggerRegistry.WeakMapFactory.html) that could possibly help us.
I probably need to educate myself better on this; sadly, the Log4j 2 docs are not very detailed on the subject.
The weak map factory is designed for web servers like Tomcat, where servlets carry various context that is used when creating the logger. Using the weak map factory in Elasticsearch means replacing the logger registry, which means replacing the logger context (since it holds a private reference to the logger registry and cannot be controlled), which in turn means replacing the logger context factory. It's a lot of hoops to jump through, and we end up copying a lot of Log4j behavior. Again, I question the value of per-index slow log settings to begin with. Let's start with what features we want, and then engineer the solution.
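For what it's worth, constructing a registry backed by the weak map factory is the easy part; the hoops are in getting log4j-core to use it. A minimal, free-standing sketch assuming log4j-api on the classpath (this registry is not wired into anything):

import org.apache.logging.log4j.spi.ExtendedLogger;
import org.apache.logging.log4j.spi.LoggerRegistry;

public class WeakRegistrySketch {
    public static void main(String[] args) {
        // A logger registry whose backing map holds loggers weakly, so unreferenced
        // loggers could be garbage collected instead of accumulating forever.
        LoggerRegistry<ExtendedLogger> weakRegistry =
            new LoggerRegistry<>(new LoggerRegistry.WeakMapFactory<ExtendedLogger>());

        // This registry is not the one log4j-core uses: org.apache.logging.log4j.core.LoggerContext
        // creates and holds its own private LoggerRegistry, so swapping this in would mean
        // replacing the logger context, and in turn the logger context factory.
        System.out.println("registered loggers: " + weakRegistry.getLoggers().size());
    }
}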
We discussed this during the core-infra sync on 6th May. The conclusion is that we will deprecate changing the log level per index in 7.x and remove it in 8. We will still allow changing thresholds per index. To fix the memory leak in 7.x, we will revert the change that creates a new Log4j logger per index and do a "manual" index name/package check instead.
In ES 8 we will no longer allow changing the log level per index. That means the index.search.slowlog.level setting (for example index.search.slowlog.level: info) will be removed.
If a slow log is required for a specific index or index pattern, it will have to be configured using thresholds.
The option of maintaining the current behaviour was rejected; it would require maintaining too many classes from Log4j 2.
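For reference, per-index slow logging remains controllable through the threshold settings; a minimal sketch of setting them via the low-level REST client (the index name and threshold values here are arbitrary examples):

import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class SlowlogThresholdExample {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // Per-index slow log thresholds; no per-index log level is involved.
            Request request = new Request("PUT", "/my-index/_settings");
            request.setJsonEntity(
                "{"
                    + "\"index.search.slowlog.threshold.query.warn\": \"10s\","
                    + "\"index.search.slowlog.threshold.query.info\": \"5s\","
                    + "\"index.indexing.slowlog.threshold.index.warn\": \"10s\""
                + "}");
            client.performRequest(request);
        }
    }
}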