Micrometer: StepCounter does not push incomplete step values on application shutdown

Created on 6 Mar 2020  ·  14 Comments  ·  Source: micrometer-metrics/micrometer

_Hi! Had posted this in Slack, but thought to report it here as well pending response there_

In short: I have a Spring Boot 2.2.5 application with spring-boot-starter-actuator and micrometer-registry-elastic; when I set management.metrics.export.elastic.enabled=true, then my counters do not increment anymore.

Not sure if this should be a micrometer issue or related to Spring Boot; ElasticMeterRegistry is in micrometer whereas ElasticMetricsExportAutoConfiguration is in Spring Boot. But since the latter seems to mostly delegate and is rather minimal otherwise, I thought to report it here first.

To replicate the problem:

  1. Create a new project with: https://start.spring.io/#!type=maven-project&language=java&platformVersion=2.2.5.RELEASE&packaging=jar&jvmVersion=11&groupId=com.example&artifactId=demo&name=demo&description=Demo%20project%20for%20Spring%20Boot&packageName=com.example.demo&dependencies=actuator

  2. Add a dependency on micrometer-registry-elastic

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-elastic</artifactId>
</dependency>
  3. Create a test class containing:
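import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.Test;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Enclosing test class; the name is illustrative. The Lombok annotations
// below require Lombok on the classpath.
class CounterExportTests {

    @Test
    void counterStaysAtZero() {
        DebugApplication.main("", "--management.metrics.export.elastic.enabled=true");
        // DebugApplication : Counter: 0.0 , [Measurement{statistic='COUNT', value=0.0}]
        // DebugApplication : Counter: 0.0 , [Measurement{statistic='COUNT', value=0.0}]
    }

    @Test
    void counterAddsToOne() {
        DebugApplication.main("", "--management.metrics.export.elastic.enabled=false");
        // DebugApplication : Counter: 0.0 , [Measurement{statistic='COUNT', value=0.0}]
        // DebugApplication : Counter: 1.0 , [Measurement{statistic='COUNT', value=1.0}]
    }

    @SpringBootApplication
    @RequiredArgsConstructor
    @Slf4j
    public static class DebugApplication implements ApplicationRunner {
        public static void main(String... args) {
            // Log which export setting this run uses.
            log.info("{}", args[1]);
            // Start the application, then close the context right away.
            SpringApplication.run(DebugApplication.class, args).close();
        }

        private final MeterRegistry registry;

        @Override
        public void run(ApplicationArguments args) throws Exception {
            Counter counter = registry.counter("someCounter");
            // Log the counter before and after a single increment.
            log.info("Counter: {} , {}", counter.count(), counter.measure());
            counter.increment();
            log.info("Counter: {} , {}", counter.count(), counter.measure());
        }
    }
}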

If you then run both tests you will get the log output that's listed in each test method.

I would expect the Counter to increment regardless of whether metrics are exported. What can be done so that the counters increase?

design work

All 14 comments

@timtebeek Thanks for the issue!

ElasticMeterRegistry is a StepMeterRegistry which uses StepCounter for Counter whereas SimpleMeterRegistry, which is used when there's no backend in Spring Boot, uses CumulativeCounter. That's the reason why you see the behavioral difference. StepCounter reports the number of events in the last complete interval as mentioned in its Javadoc, so if you check the value after its step, you can see what you expected.
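The difference described above can be illustrated with a small self-contained model. This is a simplified sketch, not Micrometer's actual implementation: the class names `CumulativeModel` and `StepModel` are hypothetical, time is passed in explicitly instead of read from a clock, and step boundaries are assumed to fall on multiples of the step size.

```java
public class StepVersusCumulative {

    /** Cumulative model: count() reflects every increment immediately. */
    static final class CumulativeModel {
        private double total;

        void increment() {
            total += 1;
        }

        double count() {
            return total;
        }
    }

    /** Step model: count() reports only what the last completed step recorded. */
    static final class StepModel {
        private final long stepMillis;
        private long lastStep;   // index of the step currently being accumulated
        private double current;  // increments recorded in the in-progress step
        private double previous; // value of the last completed step

        StepModel(long nowMillis, long stepMillis) {
            this.stepMillis = stepMillis;
            this.lastStep = nowMillis / stepMillis;
        }

        void increment() {
            current += 1;
        }

        double count(long nowMillis) {
            long step = nowMillis / stepMillis;
            if (step > lastStep) {
                // Only the immediately preceding step counts as "last complete".
                previous = (step == lastStep + 1) ? current : 0.0;
                current = 0.0;
                lastStep = step;
            }
            return previous;
        }
    }

    public static void main(String[] args) {
        long stepMillis = 60_000;
        long now = 0;

        CumulativeModel cumulative = new CumulativeModel();
        StepModel stepCounter = new StepModel(now, stepMillis);

        cumulative.increment();
        stepCounter.increment();

        System.out.println(cumulative.count());                   // 1.0
        System.out.println(stepCounter.count(now));               // 0.0, step still in progress
        System.out.println(stepCounter.count(now + stepMillis));  // 1.0, step has completed
    }
}
```

In this model, a process that exits before the first step boundary never exposes a non-zero value, which matches the log output in the `counterStaysAtZero` test.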

I updated your sample to demonstrate what I said here.

Thanks for the quick response, @izeye! I had no idea StepMeterRegistry and StepCounter were involved in producing this effect. It affects us most for short-lived processes/jobs that increment a counter at the end of their run, just before a last-gasp push of metrics to Elastic. In these cases the counters are created with value 0, but the increment is not yet visible when the final push happens and the process dies.

Are you suggesting we add a step-sized (~1 minute) sleep to the end of our jobs to ensure the incremented counter values are pushed?

Or is there another way to force pushing all "current" values as a program is shutting down, rather than waiting for a step duration?

@timtebeek The step-sized sleep was just for the demonstration, but I couldn't think of any way to push a value in an incomplete step with the current implementation except waiting for the completion of the step.

Can confirm adding the one minute sleep at the end of our jobs resolved the immediate issue we have with counters not showing up incremented, so thanks for that @izeye !
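A possible refinement of this workaround is to wait only until the next step boundary instead of a full step. The sketch below assumes that step boundaries fall on wall-clock multiples of the step size; that is an assumption about the registry's internals, not something confirmed in this thread, and the helper `millisUntilNextStep` is hypothetical. If the assumption does not hold, the full step-sized sleep remains the safe option.

```java
import java.time.Duration;

public class StepBoundaryWait {

    /**
     * Milliseconds from nowMillis until the next multiple of the step.
     * Assumes boundaries are aligned to multiples of the step since the epoch.
     */
    static long millisUntilNextStep(long nowMillis, Duration step) {
        long stepMillis = step.toMillis();
        return stepMillis - (nowMillis % stepMillis);
    }

    public static void main(String[] args) {
        Duration step = Duration.ofMinutes(1);
        long waitMillis = millisUntilNextStep(System.currentTimeMillis(), step);
        System.out.println("Would wait " + waitMillis + " ms before shutting down");
        // A job would call Thread.sleep(waitMillis) here, before closing the context.
    }
}
```

The wait is at most one step and on average half a step, so it reduces the cost of the workaround but does not remove it.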

You've indicated that there is presently no way to push a value in an incomplete step, even when an application is shutting down and metrics are pushed as a result of that shutdown hook. Would it be worthwhile to explore such a use case and keep this ticket open? Or do you suggest we close this one?

@timtebeek Thanks for the confirmation!

I think it would be nice if, on shutdown, we could push a value in an incomplete step rather than waiting for the step to complete, although I don't know how we could achieve that at the moment.
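One way to picture the requested behaviour is a step counter that, on shutdown, treats the in-progress step as if it had completed. This is only a sketch of the idea under stated assumptions: `FlushableStepCounter` and `closingRollover` are hypothetical names used for illustration, and nothing here is Micrometer's API or implementation.

```java
public class ClosingFlushSketch {

    /** Step counter model with a hypothetical shutdown-time rollover. */
    static final class FlushableStepCounter {
        private final long stepMillis;
        private long lastStep;   // index of the step currently being accumulated
        private double current;  // increments recorded in the in-progress step
        private double previous; // value exposed to the publisher

        FlushableStepCounter(long nowMillis, long stepMillis) {
            this.stepMillis = stepMillis;
            this.lastStep = nowMillis / stepMillis;
        }

        void increment() {
            current += 1;
        }

        double count(long nowMillis) {
            long step = nowMillis / stepMillis;
            if (step > lastStep) {
                previous = (step == lastStep + 1) ? current : 0.0;
                current = 0.0;
                lastStep = step;
            }
            return previous;
        }

        /** Hypothetical: expose the partial step so a final publish can send it. */
        void closingRollover() {
            previous = current;
            current = 0.0;
            // Block further rollovers so a later poll cannot overwrite the flushed value.
            lastStep = Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        long now = 5_000; // five seconds into a one-minute step
        FlushableStepCounter counter = new FlushableStepCounter(0, 60_000);

        counter.increment();
        System.out.println(counter.count(now)); // 0.0, step is incomplete

        counter.closingRollover();              // called from the shutdown path
        System.out.println(counter.count(now)); // 1.0, partial step is now visible
    }
}
```

A real design would also have to decide what happens to the value of a step that completed but was not yet published when shutdown began; this sketch simply overwrites it.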

@shakuzen Thoughts?

Perfect, thanks; I've (hopefully) updated the title to reflect this change in scope (feel free to refine as needed).

As a user of Micrometer for jobs, it would be great if, on application shutdown, any values in incomplete metric steps were still pushed to Elastic or whatever else is configured.

I believe we have a previous ticket along these same lines - it came up for Influx as the backend if I remember correctly, but same use case. I do think we should have a better solution for this. I'll link the issues and add it to an upcoming release once I get done with the current release crunch.

Awesome, thanks for considering such a change!

Would really appreciate this one.

We run a big-data processing platform with a large number of processes, both long- and short-lived (short-lived can mean under 10 s, long-lived several hours, and we can't easily tell upfront which a process will be). We essentially have the same issue here: metrics from short-lived processes are not being published, so only a subset of our processes is actually measured.

Adding a thread sleep call is not ideal for us: given the high number of short-lived processes we run, it might raise our costs significantly. I can also see that the step size could be reduced, but then I'm concerned about the potential performance impact on long-running processes.

@shakuzen I'm wondering if you could recommend a preferred approach here. Perhaps we can modify the step at runtime? I'm not sure, however, whether that's realistically achievable. I'd really appreciate your feedback on this.

Dear @shakuzen

Any news about this feature? I think that Timers are in the same situation.

Hi @shakuzen
I have a similar use case, but with the Influx meter registry. On graceful shutdown, it doesn't report metrics because of the incomplete step. I am really interested in sending metrics for short-lived processes and restarted services.
Is there an estimated date for this feature?

Regards

I've made a note for our 1.8 planning to holistically look at our support for metrics in short-lived processes.

Thanks @shakuzen, is it possible to include this support in 1.3.x too? It would be very useful for me to have this feature.
Regards
