A server may emit a response and close the socket before the httpx client has finished sending the request body. If I am not mistaken, in this situation httpx raises a NetworkError due to send failure, and won't try to receive the response even though the socket is still readable.
Would it be possible to try to read the response before raising an exception?
This is of interest mainly with Payload Too Large errors, where we don't want to receive the entire body, but wish to inform the client with a 413 response.
Hi, thanks!
Do you have a simple reproduction example for this, maybe uploading a large file to a local uvicorn server? Would be helpful to pinpoint the exact issue. :)
Test server
import trio

response = b"HTTP/1.0 413 PayloadTooLarge\r\n\r\n"

async def handler(stream):
    async with stream:
        print("Client connected")
        buf = b""
        # Read only until the end of the request headers
        while b"\r\n\r\n" not in buf:
            data = await stream.receive_some()
            print(data)
            if not data:
                print("Unexpected end of request")
                return
            buf += data
        await stream.send_all(response)
        print(response)
    print("Server closed socket")

trio.run(trio.serve_tcp, handler, 8000)
Test client
import httpx, trio

@trio.run
async def main():
    async with httpx.AsyncClient() as client:
        r = await client.post("http://localhost:8000/", data=100_000_000 * b"x")
        print(r)
If no data is sent with the POST, httpx gets a 413 response like it should. With a request body (100 MB of data), it raises httpx.exceptions.NetworkError.
File "testclient.py", line 6, in main
r = await client.post("http://localhost:8000/", data=100_000_000 * b"x")
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1316, in post
timeout=timeout,
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1097, in request
request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1118, in send
request, auth=auth, timeout=timeout, allow_redirects=allow_redirects,
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1148, in send_handling_redirects
request, auth=auth, timeout=timeout, history=history
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1184, in send_handling_auth
response = await self.send_single_request(request, timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/client.py", line 1208, in send_single_request
response = await dispatcher.send(request, timeout=timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/connection_pool.py", line 157, in send
raise exc
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/connection_pool.py", line 153, in send
response = await connection.send(request, timeout=timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/connection.py", line 44, in send
return await self.connection.send(request, timeout=timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 51, in send
await self._send_request_body(request, timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 101, in _send_request_body
await self._send_event(event, timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 118, in _send_event
await self.socket.write(bytes_to_send, timeout)
File "/usr/local/lib/python3.7/site-packages/httpx/backends/trio.py", line 65, in write
return await self.stream.send_all(data)
File "/usr/local/Cellar/python/3.7.7/Frameworks/Python.framework/Versions/3.7/lib/python3.7/contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.7/site-packages/httpx/utils.py", line 368, in as_network_error
raise NetworkError(exc) from exc
httpx.exceptions.NetworkError: socket connection broken: [Errno 41] Protocol wrong type for socket
Server log:
Client connected
b'POST / HTTP/1.1\r\nhost: localhost:8000\r\nuser-agent: python-httpx/0.11.1\r\naccept: */*\r\naccept-encoding: gzip, deflate\r\nconnection: keep-alive\r\ncontent-length: 100000000\r\n\r\n'
b'HTTP/1.0 413 PayloadTooLarge\r\n\r\n'
Server closed socket
@Tronic Perfect! I was able to reproduce. The error seems to randomly switch between the [Errno 41] Protocol wrong type for socket and [Errno 32] Broken pipe.
I was curious and tried on a uvicorn server instead of a raw TCP server:
from typing import Callable

import uvicorn
from starlette.requests import Request
from starlette.responses import PlainTextResponse

async def app(scope: dict, receive: Callable, send: Callable) -> None:
    request = Request(scope, receive=receive)
    size = 0
    async for chunk in request.stream():
        size += len(chunk)
        if size > 1000:
            # Stop reading the body and reject the request early
            response = PlainTextResponse("Too large", status_code=413)
            await response(scope, receive, send)
            break

uvicorn.run(app, log_level="trace")
The client is able to receive the response (even though the request body was not read in full).
But as can be seen from the TRACE logs, the client still sends the entire request body before reading the response, even though the body is sent in chunks and the server stops reading early.
Streaming client:
import httpx
import trio

@trio.run
async def main():
    async with httpx.AsyncClient() as client:

        async def data():
            for _ in range(10):
                yield 10_000_000 * b"x"

        r = await client.post("http://localhost:8000/", data=data())
        print(r)
Client logs:
$ HTTPX_LOG_LEVEL=trace python debug/client.py
TRACE [2020-03-27 09:59:26] httpx._config - load_ssl_context verify=True cert=None trust_env=True http2=False
TRACE [2020-03-27 09:59:26] httpx._config - load_verify_locations cafile=/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/certifi/cacert.pem
TRACE [2020-03-27 09:59:26] httpx._dispatch.connection_pool - acquire_connection origin=Origin(scheme='http' host='localhost' port=8000)
TRACE [2020-03-27 09:59:26] httpx._dispatch.connection_pool - new_connection connection=HTTPConnection(origin=Origin(scheme='http' host='localhost' port=8000))
TRACE [2020-03-27 09:59:26] httpx._dispatch.connection - start_connect tcp host='localhost' port=8000 timeout=Timeout(timeout=5.0)
TRACE [2020-03-27 09:59:26] httpx._dispatch.connection - connected http_version='HTTP/1.1'
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_headers method='POST' target='/' headers=Headers({'host': 'localhost:8000', 'user-agent': 'python-httpx/0.12.1', 'accept': '*/*', 'accept-encoding': 'gzip, deflate, br', 'connection': 'keep-alive', 'transfer-encoding': 'chunked'})
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:26] httpx._dispatch.http11 - send_data data=Data(<10000000 bytes>)
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - receive_event event=NEED_DATA
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - receive_event event=Response(status_code=413, headers=[(b'date', b'Fri, 27 Mar 2020 08:59:25 GMT'), (b'server', b'uvicorn'), (b'content-length', b'9'), (b'content-type', b'text/plain; charset=utf-8')], http_version=b'1.1', reason=b'Request Entity Too Large')
DEBUG [2020-03-27 09:59:27] httpx._client - HTTP Request: POST http://localhost:8000/ "HTTP/1.1 413 Request Entity Too Large"
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - receive_event event=Data(<9 bytes>)
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - receive_event event=EndOfMessage(headers=[])
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - response_closed our_state=DONE their_state=DONE
TRACE [2020-03-27 09:59:27] httpx._dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='http' host='localhost' port=8000))
<Response [413 Request Entity Too Large]>
TRACE [2020-03-27 09:59:27] httpx._dispatch.connection - close_connection
TRACE [2020-03-27 09:59:27] httpx._dispatch.http11 - send_event event=ConnectionClosed()
Server logs:
TRACE: 127.0.0.1:52386 - Connection made
TRACE: 127.0.0.1:52386 - ASGI [2] Started scope={'type': 'http', 'http_version': '1.1', 'server': ('127.0.0.1', 8000), 'client': ('127.0.0.1', 52386), 'scheme': 'http', 'method': 'POST', 'root_path': '', 'path': '/', 'raw_path': b'/', 'query_string': b'', 'headers': '<...>'}
TRACE: 127.0.0.1:52386 - ASGI [2] Receive {'type': 'http.request', 'body': '<255992 bytes>', 'more_body': True}
TRACE: 127.0.0.1:52386 - ASGI [2] Send {'type': 'http.response.start', 'status': 413, 'headers': '<...>'}
INFO: 127.0.0.1:52386 - "POST / HTTP/1.1" 413 Request Entity Too Large
TRACE: 127.0.0.1:52386 - ASGI [2] Send {'type': 'http.response.body', 'body': '<9 bytes>'}
TRACE: 127.0.0.1:52386 - ASGI [2] Completed
TRACE: 127.0.0.1:52386 - Connection lost
I'm assuming there's no crash here because Uvicorn is a bit smarter: it seems like internally it has a reader task that puts stuff in a queue for receive() to consume, and it will keep the reader running (and the socket open) — even after the response has been sent — until the client has finished sending the request.
(I _think_ this is what happens because if I add a trio.sleep(1) between chunks on the client side, and Ctrl+C the server after the response was sent, the client fails to send the next chunks with the same [Errno 41] Protocol wrong type for socket error.)
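To make the guess above concrete, here is a rough, self-contained sketch of the "reader task plus queue" pattern. It is not Uvicorn's actual code: it uses plain asyncio, replaces the socket with an in-memory list of chunks, and the names `reader` and `serve` are made up for illustration.

```python
import asyncio


async def reader(incoming, queue):
    """Keep draining the client's body into a queue until it ends."""
    count = 0
    for chunk in incoming:
        await queue.put(chunk)
        count += 1
        await asyncio.sleep(0)  # yield to other tasks between chunks
    await queue.put(None)  # end-of-body marker
    return count


async def serve(incoming, limit=1000):
    """Handle one request; return a log of what happened, in order."""
    queue = asyncio.Queue()
    reader_task = asyncio.ensure_future(reader(incoming, queue))
    events = []
    size = 0
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        size += len(chunk)
        if size > limit:
            events.append("response sent: 413")
            break
    # The handler has stopped consuming, but the reader task keeps
    # draining the body, so the "socket" is only closed once the
    # client has finished sending.
    drained = await reader_task
    events.append(f"socket closed after {drained} chunks")
    return events
```

Calling `asyncio.run(serve([b"x" * 600] * 5))` should show the 413 being sent after the second chunk, while the connection is only closed once all five chunks have been drained.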
I guess the real topic here is the same as in #877 — whether HTTPX should read responses concurrently to sending the request body.
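For illustration, reading the response concurrently with sending the body could look roughly like the sketch below. This is only an assumption about the general shape, written with plain asyncio; `exchange` and `demo` are hypothetical names, and nothing here is HTTPX code. The fake sender and receiver stand in for real socket I/O.

```python
import asyncio


async def exchange(send_body, receive_response):
    """Send and receive concurrently; stop sending on an early response."""
    send_task = asyncio.ensure_future(send_body())
    recv_task = asyncio.ensure_future(receive_response())
    done, _ = await asyncio.wait(
        {send_task, recv_task}, return_when=asyncio.FIRST_COMPLETED
    )
    if recv_task in done:
        send_task.cancel()  # no-op if sending has already finished
    response = await recv_task
    # Collect the sender's outcome. A cancelled or failed upload is
    # tolerated in this sketch because a full response is in hand.
    await asyncio.gather(send_task, return_exceptions=True)
    return response


async def demo(total_chunks=100, reply_after=3):
    sent = []
    replied = asyncio.Event()

    async def send_body():
        for i in range(total_chunks):
            sent.append(i)
            if len(sent) == reply_after:
                replied.set()  # the fake server has seen enough
            await asyncio.sleep(0)  # yield, as a real socket write would

    async def receive_response():
        await replied.wait()
        return "413 Payload Too Large"

    response = await exchange(send_body, receive_response)
    return response, len(sent)
```

With `asyncio.run(demo())`, the response comes back after only a handful of the 100 chunks have been "sent", which is the behaviour the raw TCP test server would need from the client.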
Anyway, as for the NetworkError failure: IMO it is expected.
The raw TCP server seems to be hard-shutting down the socket (it doesn't wait for the client to finish whatever it's doing, i.e. sending the request body), so it effectively hangs up and the client is left with an unusable socket.
So, closing, but we can continue discussion on #877…
A client using raw streams manages to read the response completely.
import trio

@trio.run
async def main():
    async with await trio.open_tcp_stream("localhost", 8000) as stream:
        await stream.send_all(
            b"POST / HTTP/1.0\r\ntransfer-encoding: chunked\r\ncontent-type: text/plain\r\n\r\n"
        )
        try:
            # Infinite chunks until server disconnects
            while True:
                await stream.send_all(b"1000\r\n" + 0x1000 * b"X" + b"\r\n")
        except trio.BrokenResourceError as e:
            print(f"Send error: {e}")
        response = b""
        # Receive response
        with trio.move_on_after(1):
            try:
                async for data in stream:
                    response += data
            except trio.BrokenResourceError as e:
                print(f"Recv error: {e}")
        print(response.decode())
Notice that sending chunks fails with Errno 32 when the server closes the socket, but receiving still succeeds until all data is consumed and Errno 54 is received:
Send error: socket connection broken: [Errno 32] Broken pipe
Recv error: socket connection broken: [Errno 54] Connection reset by peer
HTTP/1.0 413 PayloadTooLarge
I believe that httpx could handle this similarly, proceeding to receive the response after sending fails, even if it didn't support concurrent streaming.
Yup, sounds sensible to me.
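As a minimal sketch of that idea (not how httpx is implemented): on a write failure, remember the error, try to read whatever the server managed to send, and only re-raise if nothing came back. `SendFailed`, `FakeStream` and `send_then_receive` are hypothetical names, and the fake stream stands in for a real socket so the example runs without a server.

```python
import asyncio


class SendFailed(Exception):
    """Hypothetical stand-in for a transport-level write error."""


class FakeStream:
    """In-memory stand-in for a socket: accepts `limit` bytes, then breaks."""

    def __init__(self, limit, pending_response):
        self.limit = limit
        self.sent = 0
        self._pending = pending_response

    async def write(self, data):
        if self.sent + len(data) > self.limit:
            raise SendFailed("broken pipe")
        self.sent += len(data)

    async def read(self):
        # Hand back what the peer wrote before closing, then b"" (EOF).
        data, self._pending = self._pending, b""
        return data


async def send_then_receive(stream, chunks):
    """Send body chunks; if a write fails, still try to read a response."""
    send_error = None
    try:
        for chunk in chunks:
            await stream.write(chunk)
    except SendFailed as exc:
        send_error = exc  # remember it, but do not raise yet

    response = b""
    while True:
        data = await stream.read()
        if not data:
            break
        response += data

    if send_error is not None and not response:
        # Nothing readable either, so surface the original failure.
        raise send_error
    return response
```

With a `FakeStream(limit=1000, pending_response=b"HTTP/1.0 413 PayloadTooLarge\r\n\r\n")` and five 600-byte chunks, the second write fails but the 413 is still returned to the caller. A stream with no pending response re-raises the send error, matching the current behaviour.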