Micrometer: Async log4j2 log events are not counted accurately

Created on 3 Jul 2020  路  22Comments  路  Source: micrometer-metrics/micrometer

Micrometer provides the Log4j2Metrics meter binder to add metrics for Log4j2 log events. Unfortunately the implementation is flawed. It is implemented as a filter, but filters can be invoked multiple times and must therefore not have any side effects.

In practice the filter _is_ invoked multiple times by Log4j2 in many circumstances (e.g. in the presence of multiple and/or async loggers). The result is that the counter is increased multiple times for each single log event and the counter values become too high.

bug help wanted

All 22 comments

Note that the issue is not that the filter is added multiple times. The same filter is invoked multiple times. You can for example see this in AsyncLoggingConfig.log(LogEvent, LoggerConfigPredicate). This first calls super.log(...), which invokes the filter, and then it calls logToAsyncDelegate(...), which again invokes the filter, and _again_ when it is invoked on the background thread to perform the actual logging asynchronously.

So this only applies to async loggers?

@bol-com-pschmitz thank you for the report. Just so we're explicit about the scenario and that a fix properly addresses it, would you be able to write a test demonstrating it like other tests in our Log4j2MetricsTest?

So this only applies to async loggers?

I'm not sure, possibly. That's how it occurred in my case. It doesn't matter though; Log4j2Metrics should not be implemented as a filter. Filters must always be able to be invoked multiple times and must never have side effects (such as incrementing metrics) as a general design principle.

@bol-com-pschmitz would you be able to write a test demonstrating it like other tests in our Log4j2MetricsTest?

I'll take a look.

Hi all, hello from Belgium! :)

I just create a test for this issue and indeed the filter have been called multiple times for same event.

Thanks @dennysfredericci ! I was about to start doing that.

There is a method isEndOfBatch on Log4jEventLog which can be used to see if this is the "last" call.

/** * Returns {@code true} if this event is the last one in a batch, {@code false} otherwise. Used by asynchronous * Loggers and Appenders to signal to buffered downstream components when to flush to disk, as a more efficient * alternative to the {@code immediateFlush=true} configuration. * * @return whether this event is the last one in a batch. */ // see also LOG4J2-164 boolean isEndOfBatch();

I changed the implementation to test this idea, but it breaks all other tests.

Hi Guys,

When we have an async logger configured a call to log.info or any other level triggers 2 times the filter, i found a way to use the isEndOfBatch attribute from LogEvent to check if this is the last call and then metrics counter will be incremented only once instead of multiple times.

I also create a simple spring-boot project to test it locally. You can see it here: https://github.com/dennysfredericci/springboot-micrometer-issue-2183

Hi @shakuzen

Just noticed now, I did the PR for master branch should I change it to 1.1.X branch?

Resolved by #2183

@bol-com-pschmitz would you confirm if the issue is fixed for you? The fix is available in versions 1.1.16, 1.3.11, and it will be in the upcoming 1.5.3 release (you can try the latest snapshot version 1.5.3-SNAPSHOT for now).

@bol-com-pschmitz would you confirm if the issue is fixed for you? The fix is available in versions 1.1.16, 1.3.11, and it will be in the upcoming 1.5.3 release (you can try the latest snapshot version 1.5.3-SNAPSHOT for now).

I'm testing it. At first glance it looks like now it might be undercounting. I'm not sure what constitutes a "batch", but if it can result in more than one log entry then this is not going to work.

@bol-com-pschmitz Thanks for testing. Let us know if there's another issue. If we can get a unit test to reproduce, that'll be best so we can make sure it gets fixed properly and stays fixed.

Unfortunately this approach does not work. It counts batches of log events, not log events. This small change to the unit test added by @dennysfredericci demonstrates it:

        assertThat(registry.get("log4j2.events").tags("level", "info").counter().count()).isEqualTo(0);
        logger.info("Hello, world!");
        logger.info("Hello, world!");
        logger.info("Hello, world!");
        assertThat(registry.get("log4j2.events").tags("level", "info").counter().count()).isEqualTo(3);

This means the issue has not been fixed and should be reopened.

Indeed I can see it as well :disappointed:

Sorry @dennysfredericci . I appreciate your effort! But I still think that this should not be implemented as a filter at all. Filters can be invoked any number of times and should not have side effects such as incrementing counters.

I'm no log4j2 expert, so I'm hoping we can get some help on a solution for this from someone more knowledgeable than me on this area. I've marked the issue as help wanted.

So, let's became a log4j2 expert :smile:

I just started a thread on log4j2 discussion list to see if is possible to use a filter for that use case.

Sorry @dennysfredericci . I appreciate your effort! But I still think that this should not be implemented as a filter at all. Filters can be invoked any number of times and should not have side effects such as incrementing counters.

Hi @bol-com-pschmitz this seems an unexpected behavior, I got the message below on log4j mail list

"... Filters that are attached to Loggers, Appender-Refs and Appenders should only be called once ..."

I just included a reconfiguration call and now the filter is called just once, but I am getting the exception below.

2020-08-11 22:24:28,784 Test worker ERROR Could not register mbeans java.lang.NullPointerException
    at org.apache.logging.log4j.core.async.AsyncLoggerConfigDisruptor.createRingBufferAdmin(AsyncLoggerConfigDisruptor.java:417)
    at org.apache.logging.log4j.core.async.AsyncLoggerConfig.createRingBufferAdmin(AsyncLoggerConfig.java:199)
    at org.apache.logging.log4j.core.jmx.Server.registerLoggerConfigs(Server.java:366)
    at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:186)
    at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:141)
    at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:629)
    at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:712)
    at org.apache.logging.log4j.core.config.Configurator.reconfigure(Configurator.java:239)
    at io.micrometer.core.instrument.binder.logging.Log4j2Metrics.bindTo(Log4j2Metrics.java:101)

In summary, this can be an issue in how we add the filters to log4j configuration.

As is not possible to use log4j filters to count log events properly I requested a statistics feature for the log4j team.

With some luck, we can have these counters direct from log4j API.

Was this page helpful?
0 / 5 - 0 ratings