Okhttp: HTTP/2 Write Timeouts Aren’t Immediate

Created on 18 Dec 2018  ·  13Comments  ·  Source: square/okhttp

  • OkHttp 3.12.0
  • Android asynctask that sends a set of little file (about 1.3mb each file) via HTTP2 POST to a cloud service, one by one
  • Android device with Android 5.1.1. (API LEVEL 22)
  • HttpClient build as following:
val client = OkHttpClient.Builder()
                .addInterceptor(logging)
                .connectTimeout(120, TimeUnit.SECONDS)
                .readTimeout(5000, TimeUnit.MILLISECONDS)
                .writeTimeout(5000, TimeUnit.MILLISECONDS)
                .callTimeout(10000, TimeUnit.MILLISECONDS)
                .build()
  • this is the async task code
fun uploadFile(fileName: String, fileInputStream: InputStream, totalNumber: Int, index: Int) {
            val client = OkHttpClient.Builder()
                .connectTimeout(120, TimeUnit.SECONDS)
                .readTimeout(5000, TimeUnit.MILLISECONDS)
                .writeTimeout(5000, TimeUnit.MILLISECONDS)
                .callTimeout(10000, TimeUnit.MILLISECONDS)
                .build()

            val data = fileInputStream.readBytes()

            Timber.i("Start uploading $fileName (${data.size} bytes)")

            val digest = md5(data)

            val body = RequestBody.create(null, data)

            val request = Request.Builder()
                .addHeader(HEADER_CHECKSUM, digest)
                .url(url)
                .post(body)
                .build()

            try {
                val call = client.newCall(request)

                call.execute().use { response ->
                   /* do nothing */
                }
            } catch (e : Exception) {
                Timber.i(e)
            } 
}
  • At my offiche I have a very bad network condition with a lot of TCP retransmission

EXPECTED:

  • The maximum expected time for each file (callTimeout) should be 10 seconds
    ACTUAL:
  • these are for example the timing for 10 file:
  • 85622ms - UPLOAD ERROR
  • 8539ms - UPLOAD ERROR
  • 18381ms - UPLOAD ERROR
  • 8564ms - UPLOAD ERROR
  • 10100ms - UPLOAD ERROR
  • 6810ms
  • 61457ms - UPLOAD ERROR
  • 9088ms
  • 8040ms
  • 22351ms - UPLOAD ERROR

Seven uploads exceeded the callTimeout and goes in error.
I can reproduce this behaviour easily but this happens only here at my office, so an example code could be useless

bug

Most helpful comment

I think the core problem is that our HTTP/2 write timeouts don’t have teeth. On HTTP/1 we close the underlying socket which takes effect immediately. On HTTP/2 we cancel the stream, but if it’s currently blocked on a write it won’t cancel until that write returns.

All 13 comments

Executable test case?

@swankjesse no, because this happens only at my office. If I try the same application on the same device at home, the upload is super fast without any timeout or delay (about 500ms each file)

Another info: the same code with a post to a standard HTTP 1.1 server works as expected

Example code at https://gist.github.com/sarbyn/2940cb5a5f4c6c144410f3dd7569fdfe
You need to use an HTTP2 server and a large file (for example 40mb)
Tie call timeout is set to 10 seconds. In the header comment you can find an output

@sarbyn, can you set this on OkHttpClient.Builder() and provide the output from a problematic run?

.eventListenerFactory(new LoggingEventListener.Factory()).build();

Also, does this reproduce with http://www.nghttp2.org/httpbin/post?

This is the output with LoggingEventListener enabled

2018-12-27 10:12:05.159 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [0 ms] callStart: Request{method=POST, url=https://MY_SERVER_NAME:8082/upload, tags={}}
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: --> POST https://MY_SERVER_NAME:8082/upload
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: Content-Length: 1332109
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: MY_CUSTOM_HEADER_1: cecf15f862a82c9bcebfb2f19661fdf2
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: MY_CUSTOM_HEADER_2: 5
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: MY_CUSTOM_HEADER_3: 4
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: MY_CUSTOM_HEADER_4: 7edb2819-36ab-4bb2-b80e-85050f23734f
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: MY_CUSTOM_HEADER_5: 2
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: --> END POST
2018-12-27 10:12:05.160 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [0 ms] dnsStart: MY_SERVER_NAME
2018-12-27 10:12:05.194 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [34 ms] dnsEnd: [MY_SERVER_NAME/MY_SERVER_IP]
2018-12-27 10:12:05.194 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [35 ms] connectStart: MY_SERVER_NAME/MY_SERVER_IP:8082 DIRECT
2018-12-27 10:12:05.216 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [56 ms] secureConnectStart
2018-12-27 10:12:05.246 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [86 ms] secureConnectEnd
2018-12-27 10:12:05.248 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [88 ms] connectEnd: h2
2018-12-27 10:12:05.248 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [88 ms] connectionAcquired: Connection{MY_SERVER_NAME:8082, proxy=DIRECT hostAddress=MY_SERVER_NAME/MY_SERVER_IP:8082 cipherSuite=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 protocol=h2}
2018-12-27 10:12:05.248 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [88 ms] requestHeadersStart
2018-12-27 10:12:05.249 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [89 ms] requestHeadersEnd
2018-12-27 10:12:05.249 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [89 ms] requestBodyStart
2018-12-27 10:12:17.714 15856-16590/com.sarbyn.androidsessionupload W/zygote: Long monitor contention with owner AsyncTask #8 (16569) at void com.android.org.conscrypt.NativeCrypto.SSL_write(long, java.io.FileDescriptor, com.android.org.conscrypt.NativeCrypto$SSLHandshakeCallbacks, byte[], int, int, int)(NativeCrypto.java:-2) waiters=0 in void okhttp3.internal.http2.Http2Writer.rstStream(int, okhttp3.internal.http2.ErrorCode) for 7.440s
2018-12-27 10:12:17.714 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [12554 ms] connectionReleased
2018-12-27 10:12:17.714 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [12554 ms] callEnd
2018-12-27 10:12:17.714 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: <-- HTTP FAILED: java.net.SocketTimeoutException: timeout
2018-12-27 10:12:17.715 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [12555 ms] callFailed: java.net.SocketTimeoutException: timeout
2018-12-27 10:12:17.717 15856-16569/com.sarbyn.androidsessionupload I/Uploader$Companion: java.net.SocketTimeoutException: timeout
        at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:656)
        at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:664)
        at okhttp3.internal.http2.Http2Stream$FramingSink.emitFrame(Http2Stream.java:572)
        at okhttp3.internal.http2.Http2Stream$FramingSink.write(Http2Stream.java:543)
        at okio.ForwardingSink.write(ForwardingSink.java:35)
        at okhttp3.internal.http.CallServerInterceptor$CountingSink.write(CallServerInterceptor.java:149)
        at okio.RealBufferedSink.emitCompleteSegments(RealBufferedSink.java:179)
        at okio.RealBufferedSink.write(RealBufferedSink.java:92)
        at okhttp3.RequestBody$2.writeTo(RequestBody.java:98)
        at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:72)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:225)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
        at okhttp3.RealCall.execute(RealCall.java:92)
        at com.sarbyn.androidsessionupload.Uploader$Companion.uploadFile(Uploader.kt:184)
        at com.sarbyn.androidsessionupload.UploaderAsyncTask.doInBackground(UploaderAsyncTask.kt:23)
        at com.sarbyn.androidsessionupload.UploaderAsyncTask.doInBackground(UploaderAsyncTask.kt:10)
        at android.os.AsyncTask$2.call(AsyncTask.java:333)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:245)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
        at java.lang.Thread.run(Thread.java:764)
2018-12-27 10:12:17.717 15856-16569/com.sarbyn.androidsessionupload I/Uploader$Companion: TimeElapsed 12558
2018-12-27 10:12:05.249 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [89 ms] requestBodyStart
2018-12-27 10:12:17.714 15856-16590/com.sarbyn.androidsessionupload W/zygote: Long monitor contention with owner AsyncTask #8 (16569) at void com.android.org.conscrypt.NativeCrypto.SSL_write(long, java.io.FileDescriptor, com.android.org.conscrypt.NativeCrypto$SSLHandshakeCallbacks, byte[], int, int, int)(NativeCrypto.java:-2) waiters=0 in void okhttp3.internal.http2.Http2Writer.rstStream(int, okhttp3.internal.http2.ErrorCode) for 7.440s
2018-12-27 10:12:17.714 15856-16569/com.sarbyn.androidsessionupload D/OkHttp: [12554 ms] connectionReleased

looks suspicious. The fact that at 10:12:17.714, the monitor was held by SSL_write() for 7.440s, means that it was taken at 10:12:10.274, which is 5.025s after starting to write the request body. Both this and the stack trace of the SocketTimeoutException indicate that the 5s write timeout kicked in and blocked on writing the RST_STREAM frame. Even if there was no write timeout, it would have been the call timeout blocked on writing the RST_STREAM frame, preventing the call from ending after exactly 10 seconds as you expect.

This can also explain why it happens only with HTTP/2, as with HTTP/1.1 the socket is simply closed when the timeout fires and there is no attempt to write anything, which may block.

I'm not sure if this is an actual issue. I expect that if you remove the write timeout and add some logging, you'll see that the call timeout in fact kicks in after 10 seconds, but just takes a few extra seconds to completely end the call.

The last log is from an android application with AsyncTask but happens also with the Gist provided in the link above.

For example look at the log below (Read/write timeout 5 seconds - callTimeout 10 seconds). The socket timeout is fired afted 17 seconds:

dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [0 ms] callStart: Request{method=POST, url=https://blablabla.sarbyn.com:8443/upload, tags={}}
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: --> POST https://blablabla.sarbyn.com:8443/upload
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: Content-Length: 1930048
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: HEADER-1: bbcf35da1f80fd90483b990df7dacf52
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: HEADER-2: 10
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: HEADER-3: 3
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: HEADER-4: cf2d6535-cd3a-4f2b-bbd7-ae6d03d70e00
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: HEADER-5: 2
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: --> END POST
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [2 ms] dnsStart: blablabla.sarbyn.com
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [2 ms] dnsEnd: [blablabla.sarbyn.com/MY_IP_ADDRESS]
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [2 ms] connectStart: blablabla.sarbyn.com/MY_IP_ADDRESS:8443 DIRECT
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [14 ms] secureConnectStart
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [96 ms] secureConnectEnd
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [97 ms] connectEnd: h2
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [98 ms] connectionAcquired: Connection{blablabla.sarbyn.com:8443, proxy=DIRECT hostAddress=blablabla.sarbyn.com/MY_IP_ADDRESS:8443 cipherSuite=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 protocol=h2}
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [98 ms] requestHeadersStart
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [99 ms] requestHeadersEnd
dic 28, 2018 9:11:44 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [99 ms] requestBodyStart
dic 28, 2018 9:12:01 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [17336 ms] connectionReleased
dic 28, 2018 9:12:01 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [17336 ms] callEnd
dic 28, 2018 9:12:01 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: <-- HTTP FAILED: java.net.SocketTimeoutException: timeout
dic 28, 2018 9:12:01 AM okhttp3.internal.platform.Platform log
INFORMAZIONI: [17337 ms] callFailed: java.net.SocketTimeoutException: timeout

Another strange things: this is the upload result, (same laptop, same network, same server, same list of file), of a python script:

fileName,   time
1,  11938
2,  3273
3,  2723
4,  6597
5,  4929
6,  514
7,  3511
8,  2682
9,  6398
10, 3106
Done, total time is 45671 ms. goodbye

This is the result with OkHTTP (Read/write timeout 5 seconds - callTimeout 10 seconds)

fileName,time,success
1,      14187,      Timeout
2,      2462,       1
3,      20640,      Timeout
4,      4871,       1
5,      19881,      Timeout
6,      8096,       Timeout
7,      18926,      Timeout
8,      16175,      Timeout
9,      4174,       1
10,     7609,       Timeout
Total time is 117021ms. Goodbye

45 seconds vs 117 😞

I think the core problem is that our HTTP/2 write timeouts don’t have teeth. On HTTP/1 we close the underlying socket which takes effect immediately. On HTTP/2 we cancel the stream, but if it’s currently blocked on a write it won’t cancel until that write returns.

Any news on this?

@gabrielpopa sadly none!

I think the core problem is that our HTTP/2 write timeouts don’t have teeth. On HTTP/1 we close the underlying socket which takes effect immediately. On HTTP/2 we cancel the stream, but if it’s currently blocked on a write it won’t cancel until that write returns.

Related to https://github.com/square/okhttp/issues/1902, should we be interrupting ourselves?

Perhaps!

Was this page helpful?
0 / 5 - 0 ratings