Profiling:
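import io.micronaut.http.annotation.Controller;
import io.micronaut.http.annotation.Get;
import io.micronaut.runtime.Micronaut;

import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class App {
    @Controller("/service")
    public static class Control {
        // Echoes the optional "str" query parameter, or "nop" when it is absent.
        @Get
        public CompletionStage<String> handle(Optional<String> str) {
            return CompletableFuture.completedFuture(str.orElse("nop"));
        }
    }

    public static void main(String[] args) {
        Micronaut.run(App.class, args);
    }
}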
With:
ab -c 10 -n 10000000 -k "http://localhost:8080/service?str=asd"
JProfiler with full instrumentation shows:
- handleRouteMatch
- RequestArgumentSatisfier.fullfilArgumentRequirements
- RouteMatch.execute
- RoutingInBoundHandler.prepareRouteForExecution
- AbstractNettyHttpRequest.getPath (2% from getContentType)
- findAllClosest

This is just an example of course.
The real use case is that I am trying to upgrade a pure Netty app to Micronaut. The service is very lightweight: it only has a few GET endpoints and returns data cached in memory. I obviously expect some overhead from Micronaut when "upgrading" from pure Netty, but CPU usage more than doubling (with a CPU distribution very similar to what is presented here) isn't on my list of acceptable criteria.
Using Micronaut 1.3.0.M1 & Oracle JDK 11.

Any code change suggestions are welcome
I wouldn't mind but this seems like a rather big change (no single hot function) and I don't have that kind of time on my hands to deep-dive into the project (unfortunately)... Detailed architecture docs would help understand how it is set up and why.
I also question the heavy use of streams for core functionality in this project. In my experience, using streams over tiny collections does not perform well at scale.
I agree on the fundamental point about streams. They are not meant to replace simple iteration, especially over small collections and when the data is already available.
If there are any places where you think the usage of a stream can be replaced with iteration and it would not impact the readability or maintainability of the code and it can be determined that it is in part responsible for performance issues, please let me know so we can change it.
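For illustration, here is a minimal sketch of the kind of replacement being discussed: the same lookup over a tiny list written once with a stream and once with a plain loop. The class and method names (`SmallCollectionLookup`, `firstWithPrefixStream`, `firstWithPrefixLoop`) and the sample data are hypothetical and not taken from micronaut-core. Whether the difference matters on a given code path would still need to be confirmed with a profiler.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical example; not code from micronaut-core.
final class SmallCollectionLookup {

    // Stream version: builds a new stream pipeline on every call.
    static Optional<String> firstWithPrefixStream(List<String> names, String prefix) {
        return names.stream()
                .filter(n -> n.startsWith(prefix))
                .findFirst();
    }

    // Loop version: same result, without setting up a pipeline.
    static Optional<String> firstWithPrefixLoop(List<String> names, String prefix) {
        for (String n : names) {
            if (n.startsWith(prefix)) {
                return Optional.of(n);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> names = List.of("accept", "content-type", "content-length");
        System.out.println(firstWithPrefixStream(names, "content"));
        System.out.println(firstWithPrefixLoop(names, "content"));
    }
}
```

Both methods return the first matching element, so a swap like this should not change behaviour, only the per-call set-up cost.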
I will look at this when I get back next week. I have some ideas and it was an area on my todo list to optimize anyway
Pretty small optimization, but just overriding the methods in EmptyAnnotationMetadata results in a 50% performance improvement for RequestArgumentSatisfier.fullfilArgumentRequirements
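As a rough sketch of the general technique, and not the actual Micronaut change: when an interface supplies generic default methods, an "empty" implementation can override them to return constants and skip the generic work entirely. The `Metadata` interface, `EmptyMetadata` class, and their methods below are hypothetical stand-ins, and the sketch assumes the real defaults do more work than the ones shown.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for an annotation-metadata interface; not Micronaut's API.
interface Metadata {
    Map<String, String> values();

    // Generic default: goes through the backing map on every call.
    default boolean hasAnnotation(String name) {
        return values().containsKey(name);
    }

    // Generic default: map lookup wrapped in an Optional.
    default Optional<String> getValue(String name) {
        return Optional.ofNullable(values().get(name));
    }
}

// Empty implementation: the answers are known up front, so each default
// is overridden to return a constant instead of doing the lookup.
final class EmptyMetadata implements Metadata {
    static final EmptyMetadata INSTANCE = new EmptyMetadata();

    private EmptyMetadata() {
    }

    @Override
    public Map<String, String> values() {
        return Collections.emptyMap();
    }

    @Override
    public boolean hasAnnotation(String name) {
        return false;
    }

    @Override
    public Optional<String> getValue(String name) {
        return Optional.empty();
    }
}

final class EmptyMetadataDemo {
    public static void main(String[] args) {
        Metadata metadata = EmptyMetadata.INSTANCE;
        System.out.println(metadata.hasAnnotation("javax.inject.Singleton"));
        System.out.println(metadata.getValue("javax.inject.Singleton").isPresent());
    }
}
```

The overrides return exactly what the defaults would compute for an empty map, so callers see no behavioural change.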
With https://github.com/micronaut-projects/micronaut-core/commit/4c6ffe3a4d3bd30638067018d1cd14f17760a5e4 we're currently looking at a 125% performance improvement in RequestArgumentSatisfier.fullfilArgumentRequirements
handleRouteMatch is no longer appearing as a hotspot, at least not in YourKit, so closing this for the moment. There are probably other areas we can improve, but those are separate issues
@graemerocher What code did you use for your benchmark? Is the performance gain still there when the endpoint requires/accepts query parameters, headers or body of various types?
I upgraded to 1.3.0.M2 and ran the same benchmark. This is the new call stack profile for handleRouteMatch:

It seems slightly better but out of the 33% of cpu used in this call stack:
- fulfillArgumentRequirements
- filterPublisher + subscribeToResponsePublisher
- getProduces
- getFirstTypeVariable
- stream
- map
- switchMap

fulfillArgumentRequirements, it seems there is a little bit of overhead everywhere...