here is how it go endless loop:
in file RetryAndFollowUpInterceptor.java
public Response intercept(Chain chain) throws IOException {
Request request = chain.request();
streamAllocation = new StreamAllocation(
client.connectionPool(), createAddress(request.url()));
int followUpCount = 0;
Response priorResponse = null;
while (true) {
if (canceled) {
streamAllocation.release();
throw new IOException("Canceled");
}
Response response = null;
boolean releaseConnection = true;
try {
response = ((RealInterceptorChain) chain).proceed(request, streamAllocation, null, null);
releaseConnection = false;
} catch (RouteException e) {
// The attempt to connect via a route failed. The request will not have been sent.
if (!recover(e.getLastConnectException(), true, request)) throw e.getLastConnectException();
releaseConnection = false;
continue;
} catch (IOException e) {
// An attempt to communicate with a server failed. The request may have been sent.
if (!recover(e, false, request)) throw e;
releaseConnection = false;
continue;
} finally {
// We're throwing an unchecked exception. Release any resources.
if (releaseConnection) {
streamAllocation.streamFailed(null);
streamAllocation.release();
}
}
when there is a IOException("shutdown") when execute
response = ((RealInterceptorChain) chain).proceed(request, streamAllocation, null, null);
it will go into catch block,and judge if can recover,recover(e, false, request) unfortunately recover() return true,so it will continue and go endless loop
Can you provide a test case to demonstrate?
I had the same issue today too. The loop ran endlessly for minutes until I killed the app. Maybe put an upper bound to the number of executions or at least slow down after 10 iterations?
We gotta make sure shutdown connections get removed from the pool.
@swankjesse were you planning to pick this one up? If not I can look into it.
Please do!
I have a similar issue using http2, I tried to debug the code and I have found this TODO in method isHealthy of RealConnection class:
if (framedConnection != null) {
return true; // TODO: check framedConnection.shutdown.
}
The problem seems to be related to this TODO.
@fabioCollini yup, although that snippet is an older version. I'm working on a fix.
Is there any update on this issue? We're running into the same issue in our project. After some investigation we've discovered that the following code might not work as it's expected to work StreamAllocation.java#L135-L139:
synchronized (connectionPool) {
if (candidate.successCount == 0) {
return candidate;
}
}
candidate.successCount == 0 returns true, but candidate.isHealthy(doExtensiveHealthChecks) returns false for the same connection instance.
I don't have a test case at the moment, but I'll try to find and provide it ASAP.
Yea, sorry I fell behind on this. The test case is straightforward, send a GOAWAY frame without closing the connection. I'll try to wrap this up in the next day or so.
We're blocked on releasing due to this issue, so I'm trying to help out where I can... I'm not sure if this will help at this point, but here's a demonstration of the problem: https://github.com/dlew/okhttp/commit/5092a7b1cb99740e83d90364c1b3a97e5a6c19e3
bit by this too...hope it gets released soon :-)
What version was this bug introduced in? I.e., can we switch to an older version of okhttp to work around it?
Looks like the answer is no. I went back to 3.2.0 (couldn't go further w/o hitting conflicts w/ other libraries) and it's still there. It's easy to reproduce in an app by using CharlesProxy and throttling w/ a 50% reliability network.
I’m hoping to cherrypick #2955 and release it as 3.4.2.
Released as 3.4.2.
Thanks for the quick release @swankjesse...that fixed a weird issue w/ our app where a bad network caused a timeout and then the network stack fell over and the user had to restart the app to fix it :-)
From more testing, I found that if the network gets crappy enough (if you leave CharlesSSL at 50% and keep trying to get your app to load stuff over the network when Charles gives you an unstable network), the network stack gets stuck w/ these still:
Caused by: java.io.IOException: shutdown
at okhttp3.internal.framed.FramedConnection.newStream(FramedConnection.java:238)
Retrofit (1.9.0 if it matters) then gets stuck as before...takes longer now so initial testing didn't show this...the fix definitely helped.
@kenyee yeah, I know and I’m sad about that. We’ve got a couple of failing tests where our ‘perseverence’ goals aren’t being met. We’ll fix those before closing this issue.
@dave-r12 anything left here?
Nothing I can think of.
Sweet!
Most helpful comment
Released as 3.4.2.