Httpx: HTTP/2 higher number of timeouts compared with 1.1

Created on 2 Mar 2020  路  25Comments  路  Source: encode/httpx

I recently observed that the number of timeouts with http2 enabled on Async client are higher than with it disabled. Made a gist with an example:

https://gist.github.com/victoraugustolls/01002ae218366b453e962446c9c8a274

bug htt2

All 25 comments

Thanks, the results in the gist are very enlightening as to the magnitude of this issue... As you noted in #832, there might be an outstanding race condition or bug we fixed for HTTP/1.1 in the past but not HTTP/2. I guess we鈥檒l have to dig into the HTTP/2 implementation to find out... If you鈥檇 like you can turn on TRACE logs and see if there鈥檚 anything odd there:

HTTPX_LOG_LEVEL=trace python program.py

@florimondmanca updated gist with results from running with httpxprof of yours! Will enable trace and post it there.

It seems to get stuck here:
TRACE: release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))

Traces are too long and I can't upload files to gist, so I will upload them here:

With HTTP/2 enabled -> trace.txt

With HTTP/2 disabled -> trace_2.txt

But I think the interesting bits are before and after the exceptions:

TRACE:    receive_event stream_id=1 event=<StreamEnded stream_id:1>
TRACE [2020-03-02 13:13:09] httpx.dispatch.http2 - receive_event stream_id=0 event=<PingReceived ping_data:0000000000000000>
TRACE:    receive_event stream_id=0 event=<PingReceived ping_data:0000000000000000>
TRACE [2020-03-02 13:13:09] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE [2020-03-02 13:13:09] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:09] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE [2020-03-02 13:13:09] httpx.dispatch.connection - close_connection
TRACE:    close_connection
[2020-03-02 13:13:13] <class 'httpx.exceptions.ReadTimeout'>:
[2020-03-02 13:13:18] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:23] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:28] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:33] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:38] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:43] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 13:13:48] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
Count: 2
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection_pool - acquire_connection origin=Origin(scheme='https' host='google.com' port=443)
TRACE:    acquire_connection origin=Origin(scheme='https' host='google.com' port=443)
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection_pool - acquire_connection origin=Origin(scheme='https' host='google.com' port=443)
TRACE:    acquire_connection origin=Origin(scheme='https' host='google.com' port=443)
TRACE [2020-03-02 13:13:48] httpx.dispatch.connection_pool - acquire_connection origin=Origin(scheme='https' host='google.com' port=443)
TRACE:    acquire_connection origin=Origin(scheme='https' host='google.com' port=443)

I tried to find something hanging, but no luck so far...

I notice the KeyError鈥檚 are exactly 5s apart from one another. This is both the default connect timeout and the keepalive period for connections. So are they caused by our keepalive checking mechanism? Or is it a retry mechanism? Or is it what should have been a ConnectTimeout whose handling resulted in another error?

I think the KeyError's are due to #817 and should have been timeouts? Don't know for sure. But looks like all the connections actually raised an error, but due to #817 KeyError was raised

Wait. 5 is also the number of requests fired off in a batch. I notice the snippet fires them off perfectly in parallel via asyncio.gather(). I remembered this has caused issues in the past for HTTP/2.

Can you:

  • Add a random jitter delay to each request and see if that has any effect?
  • Change the batch size and connect timeout to distinct values that are not 5? Eg 3 and 4. That way we can better see what error is caused by what mechanism (parallel Requests, host connection, keepalive expiry, etc.)

As for #817, does https://github.com/encode/httpx/issues/817#issuecomment-592443931 suppress the KeyError for you as well? If so, what do TRACE logs look like with this patch?

Tried with asyncio.gather() with 10 requests in parallel and this was the output:

DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
TRACE [2020-03-02 17:03:54] httpx.dispatch.http2 - receive_event stream_id=5 event=<DataReceived stream_id:5, flow_controlled_length:5454, data:1f8b08000000000002ffc55b5b73dbc6927ef7af>
TRACE:    receive_event stream_id=5 event=<DataReceived stream_id:5, flow_controlled_length:5454, data:1f8b08000000000002ffc55b5b73dbc6927ef7af>
TRACE [2020-03-02 17:03:54] httpx.dispatch.http2 - receive_event stream_id=5 event=<DataReceived stream_id:5, flow_controlled_length:96, data:>
TRACE:    receive_event stream_id=5 event=<DataReceived stream_id:5, flow_controlled_length:96, data:>
TRACE [2020-03-02 17:03:54] httpx.dispatch.http2 - receive_event stream_id=5 event=<StreamEnded stream_id:5>
TRACE:    receive_event stream_id=5 event=<StreamEnded stream_id:5>
TRACE [2020-03-02 17:03:54] httpx.dispatch.http2 - receive_event stream_id=0 event=<PingReceived ping_data:0000000000000004>
TRACE:    receive_event stream_id=0 event=<PingReceived ping_data:0000000000000004>
TRACE [2020-03-02 17:03:54] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
Client returned
[2020-03-02 17:03:59] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:04:04] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:04:09] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:04:14] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:04:19] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:04:24] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:04:29] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:04:34] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
Count: 2
TRACE [2020-03-02 17:04:34] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:04:34] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:04:34] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:04:34] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:04:34] httpx.dispatch.connection - close_connection

With 3 requests in parallel:

DEBUG [2020-03-02 17:06:15] httpx.client - HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
TRACE [2020-03-02 17:06:15] httpx.dispatch.http2 - receive_event stream_id=3 event=<DataReceived stream_id:3, flow_controlled_length:5496, data:1f8b08000000000002ffc55b5973dbc6967ef7af>
TRACE:    receive_event stream_id=3 event=<DataReceived stream_id:3, flow_controlled_length:5496, data:1f8b08000000000002ffc55b5973dbc6967ef7af>
TRACE [2020-03-02 17:06:15] httpx.dispatch.http2 - receive_event stream_id=3 event=<DataReceived stream_id:3, flow_controlled_length:126, data:>
TRACE:    receive_event stream_id=3 event=<DataReceived stream_id:3, flow_controlled_length:126, data:>
TRACE [2020-03-02 17:06:15] httpx.dispatch.http2 - receive_event stream_id=3 event=<StreamEnded stream_id:3>
TRACE:    receive_event stream_id=3 event=<StreamEnded stream_id:3>
TRACE [2020-03-02 17:06:15] httpx.dispatch.http2 - receive_event stream_id=0 event=<PingReceived ping_data:0000000000000002>
TRACE:    receive_event stream_id=0 event=<PingReceived ping_data:0000000000000002>
TRACE [2020-03-02 17:06:15] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
Client returned
[2020-03-02 17:06:20] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:06:25] <class 'httpx.exceptions.ReadTimeout'>: 
Count: 2
TRACE [2020-03-02 17:06:25] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:06:25] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:06:25] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:06:25] httpx.dispatch.connection - close_connection

Add a random jitter delay to each request and see if that has any effect?

added await asyncio.sleep(random.random()) before the client.get() and the result (for 10 parallel requests) was:

TRACE [2020-03-02 17:11:30] httpx.dispatch.http2 - receive_event stream_id=31 event=<DataReceived stream_id:31, flow_controlled_length:5527, data:1f8b08000000000002ffc55b5b73db46967ef7af>
TRACE:    receive_event stream_id=31 event=<DataReceived stream_id:31, flow_controlled_length:5527, data:1f8b08000000000002ffc55b5b73db46967ef7af>
TRACE [2020-03-02 17:11:30] httpx.dispatch.http2 - receive_event stream_id=31 event=<DataReceived stream_id:31, flow_controlled_length:189, data:>
TRACE:    receive_event stream_id=31 event=<DataReceived stream_id:31, flow_controlled_length:189, data:>
TRACE [2020-03-02 17:11:30] httpx.dispatch.http2 - receive_event stream_id=31 event=<StreamEnded stream_id:31>
TRACE:    receive_event stream_id=31 event=<StreamEnded stream_id:31>
DEBUG [2020-03-02 17:11:30] httpx.client - HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
TRACE [2020-03-02 17:11:30] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
Client returned
[2020-03-02 17:11:34] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:11:39] <class 'httpx.exceptions.ReadTimeout'>: 
Count: 2
TRACE [2020-03-02 17:11:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:11:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:11:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:11:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:11:40] httpx.dispatch.connection - close_connection

Will try #817 fix next

If I did this right:

async def request_multiple(client: httpx.AsyncClient):
    count = 0

    while count <= 2:
        print("Count:", count)
        todo = [
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client)),
            asyncio.create_task(do_request(client=client))
        ]
        try:
            await asyncio.gather(*todo)
        finally:
            for t in todo:
                if not t.done():
                    t.cancel()
        count += 1
    print("Finished")

The output was (without jitter):

DEBUG [2020-03-02 17:16:05] httpx.client - HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
Client returned
TRACE [2020-03-02 17:16:05] httpx.dispatch.http2 - receive_event stream_id=7 event=<DataReceived stream_id:7, flow_controlled_length:5578, data:1f8b08000000000002ffc55b5b779bc8967ecfaf>
TRACE:    receive_event stream_id=7 event=<DataReceived stream_id:7, flow_controlled_length:5578, data:1f8b08000000000002ffc55b5b779bc8967ecfaf>
TRACE [2020-03-02 17:16:05] httpx.dispatch.http2 - receive_event stream_id=7 event=<DataReceived stream_id:7, flow_controlled_length:227, data:>
TRACE:    receive_event stream_id=7 event=<DataReceived stream_id:7, flow_controlled_length:227, data:>
TRACE [2020-03-02 17:16:05] httpx.dispatch.http2 - receive_event stream_id=7 event=<StreamEnded stream_id:7>
TRACE:    receive_event stream_id=7 event=<StreamEnded stream_id:7>
TRACE [2020-03-02 17:16:05] httpx.dispatch.http2 - receive_event stream_id=0 event=<PingReceived ping_data:0000000000000004>
TRACE:    receive_event stream_id=0 event=<PingReceived ping_data:0000000000000004>
DEBUG [2020-03-02 17:16:05] httpx.client - HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
TRACE [2020-03-02 17:16:05] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
Client returned
[2020-03-02 17:16:10] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:16:15] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:16:20] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:16:25] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:16:30] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:16:35] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:16:40] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
Count: 2
TRACE [2020-03-02 17:16:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:16:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:16:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:16:40] httpx.dispatch.connection - close_connection
TRACE:    close_connection

With a timeout value of 10.0:

TRACE [2020-03-02 17:21:43] httpx.dispatch.http2 - receive_event stream_id=11 event=<DataReceived stream_id:11, flow_controlled_length:5331, data:1f8b08000000000002ffc55b5973dbc6967ef7af>
TRACE:    receive_event stream_id=11 event=<DataReceived stream_id:11, flow_controlled_length:5331, data:1f8b08000000000002ffc55b5973dbc6967ef7af>
TRACE [2020-03-02 17:21:43] httpx.dispatch.http2 - receive_event stream_id=11 event=<DataReceived stream_id:11, flow_controlled_length:184, data:>
TRACE:    receive_event stream_id=11 event=<DataReceived stream_id:11, flow_controlled_length:184, data:>
TRACE [2020-03-02 17:21:43] httpx.dispatch.http2 - receive_event stream_id=11 event=<StreamEnded stream_id:11>
TRACE:    receive_event stream_id=11 event=<StreamEnded stream_id:11>
DEBUG [2020-03-02 17:21:43] httpx.client - HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
DEBUG:    HTTP Request: GET https://www.google.com/ "HTTP/2 200 OK"
TRACE [2020-03-02 17:21:43] httpx.dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
TRACE:    release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.google.com' port=443))
Client returned
[2020-03-02 17:21:53] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:22:03] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:22:13] <class 'httpx.exceptions.ReadTimeout'>: 
[2020-03-02 17:22:23] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
[2020-03-02 17:22:33] <class 'KeyError'>: HTTPConnection(origin=Origin(scheme='https' host='google.com' port=443))
Count: 2
TRACE [2020-03-02 17:22:33] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:22:33] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:22:33] httpx.dispatch.connection - close_connection
TRACE:    close_connection
TRACE [2020-03-02 17:22:33] httpx.dispatch.connection - close_connection
TRACE:    close_connection

It appears that the time between errors is indeed the timeout time (read timeout)

@florimondmanca @tomchristie excellent news here! I had 0 timeouts after retrying the tests here multiple times with the httpcore-interface branch!

Fantastic! 馃憤

Just deployed a version of the service using the branch with httpcore interface. So far so good, 0 errors. Will post and update in the end of the day.

Update: now with errors :/ both connect timeout and <class 'httpcore._exceptions.ProtocolError'>: can't handle event type ConnectionClosed when role=SERVER and state=SEND_RESPONSE

And now: <class 'httpcore._exceptions.WriteError'>: [Errno 104] Connection reset by peer to a central microservice that everyone connects, and only the one with the httpcore interface receive this error.

Also: ProtocolError: Invalid input ConnectionInputs.RECV_HEADERS in state ConnectionState.CLOSED

Also: ProtocolError: <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:1999, additional_data:None>

@victoraugustolls Okay, so let's take these one at a time, in a bit more detail. They're generally looking roughly like valid behaviors that just need some neater error handling.

I'd suggest starting by opening an issue on httpcore with one of these, complete with a full traceback, plus a bit of information about what server you're using and other potentiatially useful infor for replication, so that we can work through the case comprehensively.

Once we've done that let's have open issues for each of these seperately on httpcore, and work through them fully one at a time.

Closing as issues are reflected in httpcore https://github.com/encode/httpcore/issues/45 and https://github.com/encode/httpcore/issues/46.

For the record, I just tested the original gist against httpx==0.13.0.dev2 and got the same result for http1.1 and http2.

Nice! Will update on my end (and my application) to see the result!

Was this page helpful?
0 / 5 - 0 ratings