okhttp 3.4.1will go endless loop,when there is a IOException("shutdown")

Created on 28 Jul 2016  Â·  20Comments  Â·  Source: square/okhttp

here is how it go endless loop:
in file RetryAndFollowUpInterceptor.java

public Response intercept(Chain chain) throws IOException {

Request request = chain.request();
streamAllocation = new StreamAllocation(
    client.connectionPool(), createAddress(request.url()));
int followUpCount = 0;
Response priorResponse = null;
while (true) {
  if (canceled) {
    streamAllocation.release();
    throw new IOException("Canceled");
  }

  Response response = null;
  boolean releaseConnection = true;
  try {
    response = ((RealInterceptorChain) chain).proceed(request, streamAllocation, null, null);
    releaseConnection = false;
  } catch (RouteException e) {
    // The attempt to connect via a route failed. The request will not have been sent.
    if (!recover(e.getLastConnectException(), true, request)) throw e.getLastConnectException();
    releaseConnection = false;
    continue;
  } catch (IOException e) {
    // An attempt to communicate with a server failed. The request may have been sent.
    if (!recover(e, false, request)) throw e;
    releaseConnection = false;
    continue;
  } finally {
    // We're throwing an unchecked exception. Release any resources.
    if (releaseConnection) {
      streamAllocation.streamFailed(null);
      streamAllocation.release();
    }
  }

when there is a IOException("shutdown") when execute
response = ((RealInterceptorChain) chain).proceed(request, streamAllocation, null, null);
it will go into catch block,and judge if can recover,recover(e, false, request) unfortunately recover() return true,so it will continue and go endless loop

bug needs info

Most helpful comment

Released as 3.4.2.

All 20 comments

Can you provide a test case to demonstrate?

I had the same issue today too. The loop ran endlessly for minutes until I killed the app. Maybe put an upper bound to the number of executions or at least slow down after 10 iterations?

We gotta make sure shutdown connections get removed from the pool.

@swankjesse were you planning to pick this one up? If not I can look into it.

Please do!

I have a similar issue using http2, I tried to debug the code and I have found this TODO in method isHealthy of RealConnection class:

if (framedConnection != null) {
  return true; // TODO: check framedConnection.shutdown.
}

The problem seems to be related to this TODO.

@fabioCollini yup, although that snippet is an older version. I'm working on a fix.

Is there any update on this issue? We're running into the same issue in our project. After some investigation we've discovered that the following code might not work as it's expected to work StreamAllocation.java#L135-L139:

synchronized (connectionPool) {
  if (candidate.successCount == 0) {
    return candidate;
  }
}

candidate.successCount == 0 returns true, but candidate.isHealthy(doExtensiveHealthChecks) returns false for the same connection instance.

I don't have a test case at the moment, but I'll try to find and provide it ASAP.

Yea, sorry I fell behind on this. The test case is straightforward, send a GOAWAY frame without closing the connection. I'll try to wrap this up in the next day or so.

We're blocked on releasing due to this issue, so I'm trying to help out where I can... I'm not sure if this will help at this point, but here's a demonstration of the problem: https://github.com/dlew/okhttp/commit/5092a7b1cb99740e83d90364c1b3a97e5a6c19e3

bit by this too...hope it gets released soon :-)

What version was this bug introduced in? I.e., can we switch to an older version of okhttp to work around it?

Looks like the answer is no. I went back to 3.2.0 (couldn't go further w/o hitting conflicts w/ other libraries) and it's still there. It's easy to reproduce in an app by using CharlesProxy and throttling w/ a 50% reliability network.

I’m hoping to cherrypick #2955 and release it as 3.4.2.

https://github.com/square/okhttp/pull/2955

Released as 3.4.2.

Thanks for the quick release @swankjesse...that fixed a weird issue w/ our app where a bad network caused a timeout and then the network stack fell over and the user had to restart the app to fix it :-)

From more testing, I found that if the network gets crappy enough (if you leave CharlesSSL at 50% and keep trying to get your app to load stuff over the network when Charles gives you an unstable network), the network stack gets stuck w/ these still:
Caused by: java.io.IOException: shutdown
at okhttp3.internal.framed.FramedConnection.newStream(FramedConnection.java:238)
Retrofit (1.9.0 if it matters) then gets stuck as before...takes longer now so initial testing didn't show this...the fix definitely helped.

@kenyee yeah, I know and I’m sad about that. We’ve got a couple of failing tests where our ‘perseverence’ goals aren’t being met. We’ll fix those before closing this issue.

@dave-r12 anything left here?

Nothing I can think of.

Sweet!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

SandroMachado picture SandroMachado  Â·  3Comments

dannyZhou picture dannyZhou  Â·  3Comments

vanshg picture vanshg  Â·  3Comments

GuiForget picture GuiForget  Â·  3Comments

rfc2822 picture rfc2822  Â·  3Comments