Not sure exactly what is going on but getting a lot of those when deployed. I saw there was some issue regarding HTTP2 but this the same code as in 1.3. I have tried different stuff but cannot reproduce locally.
i.m.h.c.e.ReadTimeoutException: Read Timeout
at i.m.h.c.e.ReadTimeoutException.<clinit>(ReadTimeoutException.java:26)
at i.m.h.c.n.DefaultHttpClient.lambda$null$35(DefaultHttpClient.java:1152)
at i.r.i.o.f.FlowableOnErrorNext$OnErrorNextSubscriber.onError(FlowableOnErrorNext.java:103)
at i.m.r.r.RxInstrumentedSubscriber.onError(RxInstrumentedSubscriber.java:66)
at i.r.i.o.f.FlowableTimeoutTimed$TimeoutSubscriber.onTimeout(FlowableTimeoutTimed.java:139)
at i.r.i.o.f.FlowableTimeoutTimed$TimeoutTask.run(FlowableTimeoutTimed.java:170)
at i.r.i.s.ScheduledRunnable.run(ScheduledRunnable.java:66)
at i.r.i.s.ScheduledRunnable.call(ScheduledRunnable.java:57)
at j.u.c.FutureTask.run(FutureTask.java:264)
at j.u.c.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at j.u.c.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at j.u.c.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
Ok, little bit hard to provide a solution without an example that reproduces the issue 馃槩
Having the same issue. It only happens when I the app runs in a pod.
Project that is causing the problem: https://github.com/forbi/slow-mn
@forbi do you have steps to reproduce for that app?
@graemerocher Is it possible this commit https://github.com/micronaut-projects/micronaut-core/commit/1f370c4678116197d2f095f1b707ebff604acdb7 caused something like this?
@dstepanov Could be. If we had a way to reproduce the issue then we could try do a git bisect or revert the commit and see if it is the cause
@forbi do you have steps to reproduce for that app?
Deploy the app to k8s and hit the endpoint between 5-10 times.
@graemerocher With enabled logging, I see log full of messages:
Discarded inbound message EmptyLastHttpContent that reached at the tail of the pipeline. Please check your pipeline configuration.
Discarded message pipeline : [LoggingHandler#0, idle-state, http-server-codec, flow-control-handler, http-keep-alive-handler, http-compressor, http-decompressor, http-streams-codec, chunk-writer, micronaut-http-decoder, micronaut-http-encoder, websocket-upgrade-handler, micronaut-inbound-handler, DefaultChannelPipeline$TailContext#0]. Channel : [id: 0x4229cf2d, L:/100.96.42.204:8080 - R:/100.96.42.67:43418].
Possible solutions https://stackoverflow.com/questions/15242793/netty-pipeline-warning
It looks like it happens on the second request, first passes.
More info:
Jul 8, 2020 @ 09:48:14.496 nioEventLoopGroup-3-3 Read timeout at io.micronaut.http.client.netty.DefaultHttpClient.lambda$null$35(DefaultHttpClient.java:1152) (probably occured before CONNECT)
Jul 8, 2020 @ 09:48:14.494 nioEventLoopGroup-3-3 [id: 0x3bed0818] CONNECT: client request connected to host
Jul 8, 2020 @ 09:48:14.493 nioEventLoopGroup-3-3 Incoming HTTP Server Request
Multiple "Discarded inbound message" + "Discarded message pipeline" on nioEventLoopGroup-3-2 and nioEventLoopGroup-3-3
Jul 8, 2020 @ 09:47:13.489 nioEventLoopGroup-3-3 [id: 0x3bed0818] REGISTERED - incoming client request
multiple server + client requests
To me, it looks like pipeline error is blocking threads that are used for server and client processing.
Is it possible to have separate pools for the server and clients?
From the logs this looks like it is sending a file back since chunked writer is in the pipeline. That may help narrow down things down
It's not a file just a JSON request/response
You could try set micronaut.server.thread-selection: IO which runs all server requests on the IO thread pool.
You could also replace DefaultExecutorSelector with a new bean that runs on the executor service of your choice so yes it is possible to alter the behaviour. Still that doesn't get us closer to the root of the problem, I will hopefully get time soon to look at @forbi's example
Unfortunately @forbi's instructions to reproduce were not particularly helpful. His example application seems to require setup and fails with Unexpected error occurred: Failed to inject value for parameter [typeOfSrvE] of class: uy.com.antel.pi.debttlk.service.DebtFinder out of the box. Maybe if someone can provide a useful set of steps to reproduce this issue we can take a look.
I suspect the problem is related to how k8s (running inside AWS for us) is sending HTTP requests to the Micronaut server.
Simple hello-world with logging might produce the same Netty errors as I posted, maybe using a different HTTP client would also trigger the problem.
Unfortunately @forbi's instructions to reproduce were not particularly helpful. His example application seems to require setup and fails with
Unexpected error occurred: Failed to inject value for parameter [typeOfSrvE] of class: uy.com.antel.pi.debttlk.service.DebtFinderout of the box. Maybe if someone can provide a useful set of steps to reproduce this issue we can take a look.
I think you can delete typeOfSrvE
if I delete that and then hit http://localhost:8080/debttlk-mn/debttlk/foo all I see is:
<head><title>405 Not Allowed</title></head>
<body bgcolor="white">
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx</center>
</body>
</html>
io.micronaut.http.client.exceptions.HttpClientResponseException: <html>
<head><title>405 Not Allowed</title></head>
<body bgcolor="white">
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx</center>
</body>
</html>
at io.micronaut.http.client.netty.DefaultHttpClient$11.channelRead0(DefaultHttpClient.java:2046)
at io.micronaut.http.client.netty.DefaultHttpClient$11.channelRead0(DefaultHttpClient.java:1964)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
But not the problem in question.
And yes I have tried deployment via docker and k8s locally
@graemerocher No netty errors when tracing is enabled?
What I need is a sample application with clear steps to reproduce and maybe we can make some progress
Hello world app with a @Client should be enough. And as I say, in my case it only happens in k8s
@forbi Please create the hello world app and test to verify the problem occurs. If it does, document the steps you took to reproduce the issue in a README.md of the project and upload it to Github. Then create a link to that project here. We can't go back and forth wasting time with projects that aren't setup correctly to reproduce the issue.
@jameskleeh https://github.com/forbi/demo
Edit: I added a fail.log that shows that the request returns a timeout but the request is still processed.
Seems a similar or related error but on 1.3.x https://github.com/micronaut-projects/micronaut-core/issues/3694
@graemerocher I see the same errors in the project attached, so it looks like there is a way to reproduce it 馃憤
if it is the same problem
@graemerocher Read timeouts + "Discarded inbound message" I think it's the same problem
@graemerocher BTW "Discarded" message occurs only in Micronaut 2, I did migrate it to it
Right, but the read timeouts still occur regardless of that problem
I created a PR for the discarded message problem https://github.com/micronaut-projects/micronaut-core/pull/3695
Regarding the problem with the read timeout in the sample application, these were caused by multiple places in the sample application where the event loop was being blocked and blocking calls were happening. I resolved them with
https://github.com/graemerocher/micronaut-http/commit/5d2f0724051acc173f3178c0e112cc69cda0a451
I would suggest that in folks who are reporting this issue review their code for places where the event loop is being blocked and if it is you will need to add @ExecuteOn(TaskExecutors.IO) in Micronaut 2.0 or preferably don't block the event loop
@graemerocher Thanks!
I will try if it helps when it鈥檚 merged.
Not really understand how can it deadlock event loop, only case I would think is if app calls itself.
I don't think the issues are related because the same issue occurs on 1.3.x at least with that sample app. In Micronaut 2.x the problem is probably more likely to happen if you block the event loop since we run all operations by default on the event loop.
Having said that some people are reporting the second request failing (see #2905) which we need to investigate as well, but still waiting for a sample app for that case
Maybe a better reproducer https://github.com/volnei/bugreport
Nope, they are also doing blocking operations in the controller in that report. Application fixed with:
---
.../main/java/gateway/WorkerController.java | 3 ++
.../src/test/java/gateway/GatewayTest.java | 30 +++++++++++++++++--
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/gateway/src/main/java/gateway/WorkerController.java b/gateway/src/main/java/gateway/WorkerController.java
index d5cc164..b578643 100644
--- a/gateway/src/main/java/gateway/WorkerController.java
+++ b/gateway/src/main/java/gateway/WorkerController.java
@@ -2,12 +2,15 @@ package gateway;
import io.micronaut.http.annotation.*;
import io.micronaut.http.HttpResponse;
+import io.micronaut.scheduling.TaskExecutors;
+import io.micronaut.scheduling.annotation.ExecuteOn;
import java.util.Map;
import javax.inject.Inject;
@Controller("/worker")
+@ExecuteOn(TaskExecutors.IO)
public class WorkerController {
--
@graemerocher I think it's not a good idea to switch to the manual thread selection in version 2, developers are going to run into this kind of issues. Maybe it would be better to introduce some kind of smart auto selector which would measure the call time and decide if it should run on EL or IO executor.
We are considering adding blocked thread detection for the event loop.
Users can still go back to 1.x behavior by setting the thread selection strategy
Switching thread selection to AUTO solves the problem, some kind of detection of blocking the event loop would be nice.
Where do we set thread selection to AUTO?
I am intermittently getting this issue when upgraded from1.3 to 2.2.2
tried with thread selection as AUTO/IO
increased timeout of http client to 20s
each controller is annotated with @ExecuteOn(TaskExecutors.IO)
I have blocking rest api calls in my service but from controller returning Single>
Any suggestion or workaround is appreciated.
@nirvana124 provide an example that reproduces the issue
This has just reappeared for us going from 2.03 -> 2.2.3 . I鈥檓 working on an example app but it seems to be:
Just as an update (incase others are as stupid as myself) This was an issue on our side. That being said it was clearly impacted by some change in 2.03 -> 2.2.3.
In our case we had a JWT validator on some of our endpoints. On our authentication microservice the JWKS url was set to itself so authenticated calls would in a servlet filter call out to the same service. This only resulted in the thread lock in a k8s environment i assume as we had strict request/limits set which resulted in a single netty event loop thread.
Hope this helps others who may be doing something similar.
Most helpful comment
Switching thread selection to AUTO solves the problem, some kind of detection of blocking the event loop would be nice.