Httpx: 0.14.x's sync client errors with ReadError w/ nginx 'chunked_transfer_encoding off'

Created on 12 Aug 2020  Â·  18Comments  Â·  Source: encode/httpx

Checklist

  • [x] The bug is reproducible against the latest release and/or master.
  • [x] There are no similar issues or pull requests to fix it yet.

Describe the bug

Since the update to 0.14, I'm getting the following error when trying to communicate with an nginx server that has set chunked_transfer_encoding off:

ReadError                                 Traceback (most recent call last)
<ipython-input-3-0399eff110e8> in <module>
----> 1 resp = httpx.get(
      2     f"https://pypi.XXX/root/vrmd/{ name }",
      3     headers={"Accept": "application/json"},
      4 )

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_api.py in get(url, params, headers, cookies, auth, allow_redirects, cert, verify, timeout, trust_env)
    157     this function, as `GET` requests should not include a request body.
    158     """
--> 159     return request(
    160         "GET",
    161         url,

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_api.py in request(method, url, params, data, files, json, headers, cookies, auth, timeout, allow_redirects, verify, cert, trust_env)
     84         cert=cert, verify=verify, timeout=timeout, trust_env=trust_env,
     85     ) as client:
---> 86         return client.request(
     87             method=method,
     88             url=url,

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_client.py in request(self, method, url, data, files, json, params, headers, cookies, auth, allow_redirects, timeout)
    638             cookies=cookies,
    639         )
--> 640         return self.send(
    641             request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,
    642         )

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_client.py in send(self, request, stream, auth, allow_redirects, timeout)
    674         if not stream:
    675             try:
--> 676                 response.read()
    677             finally:
    678                 response.close()

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_models.py in read(self)
    885         """
    886         if not hasattr(self, "_content"):
--> 887             self._content = b"".join(self.iter_bytes())
    888         return self._content
    889

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_models.py in iter_bytes(self)
    896             yield self._content
    897         else:
--> 898             for chunk in self.iter_raw():
    899                 yield self.decoder.decode(chunk)
    900             yield self.decoder.flush()

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpx/_models.py in iter_raw(self)
    929
    930         self.is_stream_consumed = True
--> 931         for part in self._raw_stream:
    932             yield part
    933         self.close()

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpcore/_sync/connection_pool.py in __iter__(self)
     48
     49     def __iter__(self) -> Iterator[bytes]:
---> 50         for chunk in self.stream:
     51             yield chunk
     52

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpcore/_bytestreams.py in __iter__(self)
     42
     43     def __iter__(self) -> Iterator[bytes]:
---> 44         for chunk in self._iterator:
     45             yield chunk
     46

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpcore/_sync/http11.py in _receive_response_data(self, timeout)
    141         """
    142         while True:
--> 143             event = self._receive_event(timeout)
    144             if isinstance(event, h11.Data):
    145                 logger.trace("receive_event=Data(<%d bytes>)", len(event.data))

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpcore/_sync/http11.py in _receive_event(self, timeout)
    158
    159             if event is h11.NEED_DATA:
--> 160                 data = self.socket.read(self.READ_NUM_BYTES, timeout)
    161                 self.h11_state.receive_data(data)
    162             else:

~/.local/venvs/tempenv-78259958181c/lib/python3.8/site-packages/httpcore/_backends/sync.py in read(self, n, timeout)
     62                 data = self.sock.recv(n)
     63                 if data == b"":
---> 64                     raise ReadError("Server disconnected while attempting read")
     65                 return data
     66

ReadError: Server disconnected while attempting read

Downgrading to 0.13.3 fixes it. All on Python 3.8.5 on macOS and Linux.

To reproduce

I can only reproduce it with nginx:

nginx version: nginx/1.10.3 (Ubuntu)
built with OpenSSL 1.0.2g  1 Mar 2016
TLS SNI support enabled
configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_geoip_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_v2_module --with-http_sub_module --with-http_xslt_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads --add-module=/build/nginx-41auzt/nginx-1.10.3/debian/modules/nginx-auth-pam --add-module=/build/nginx-41auzt/nginx-1.10.3/debian/modules/nginx-dav-ext-module --add-module=/build/nginx-41auzt/nginx-1.10.3/debian/modules/nginx-echo --add-module=/build/nginx-41auzt/nginx-1.10.3/debian/modules/nginx-upstream-fair --add-module=/build/nginx-41auzt/nginx-1.10.3/debian/modules/ngx_http_substitutions_filter_module
c-0100:~#

Config for the server:

/etc/nginx/nginx.conf:

worker_processes 4;

user nobody nogroup;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    accept_mutex off;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format proxy '$http_host - $remote_addr - $remote_user [$time_local]  '
        '"$request" $status $body_bytes_sent '
        '"$http_referer" "$http_user_agent"';
    access_log /var/log/nginx/proxy/access.log proxy;
    error_log /var/log/nginx/proxy/error.log;
    sendfile on;
    types_hash_max_size 2048;

    gzip off;

    server_tokens off;

    ssl_prefer_server_ciphers On;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS;
    ssl_session_cache shared:SSL:10m;
    ssl_dhparam /etc/nginx/dhparams.pem;

    add_header Strict-Transport-Security max-age=63072000;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection '1; mode=block';
    add_header X-Frame-Options SAMEORIGIN;

    include /etc/nginx/conf.d/*.conf;
}

and in conf.d:

server {
    server_name pypi.XXX deferred;
    listen      REDACTED-IPv4-ADDRESS:443 ssl http2;

    gzip             on;
    gzip_min_length  2000;
    gzip_proxied     any;
    gzip_types       text/html application/json;

    ssl on;
    ssl_certificate      /etc/ssl/certs/*.XXX_all.crt;
    ssl_certificate_key  /etc/ssl/private/*.XXX.key;

    access_log /var/log/nginx/proxy/pypi/access.log;
    error_log /var/log/nginx/proxy/pypi/error.log;

    root /vrmd/pypi/.devpi/server;
    client_max_body_size 20M;

    # Chunked transfers break pip 8/9 interop. Commenting this out, fixes it.
    chunked_transfer_encoding off;

    # try serving static files directly
    location ~ /\+f/ {
        # workaround to pass non-GET/HEAD requests through to the named location below
        error_page 418 = @proxy_to_app;
        if ($request_method !~ (GET)|(HEAD)) {
            return 418;
        }

        expires max;
        try_files /+files$uri @proxy_to_app;
    }
    # try serving docs directly
    location ~ /\+doc/ {
        try_files $uri @proxy_to_app;
    }
    location / {
        # workaround to pass all requests to / through to the named location below
        error_page 418 = @proxy_to_app;
        return 418;
    }
    location @proxy_to_app {
        proxy_pass http://127.0.0.1:3141;
        proxy_set_header X-outside-url $scheme://$host:$server_port;
        proxy_set_header X-Real-IP $remote_addr;
        expires -1;  # no-cache
    }
}

Expected behavior

Keep working.

Actual behavior

Doesn't work.

Debugging material

Traceback see above.

Log:

TRACE [2020-08-12 15:18:05] httpx._config - load_ssl_context verify=True cert=None trust_env=True http2=False
TRACE [2020-08-12 15:18:05] httpx._config - load_verify_locations cafile=/usr/local/etc/openssl/cert.pem
TRACE [2020-08-12 15:18:05] httpcore._sync.connection_pool - get_connection_from_pool=(b'https', b'pypi.XXX', 443)
TRACE [2020-08-12 15:18:05] httpcore._sync.connection_pool - created connection=<SyncHTTPConnection http_version=UNKNOWN state=0>
TRACE [2020-08-12 15:18:05] httpcore._sync.connection_pool - adding connection to pool=<SyncHTTPConnection http_version=UNKNOWN state=0>
TRACE [2020-08-12 15:18:05] httpcore._sync.connection - open_socket origin=(b'https', b'pypi.XXX', 443) timeout={'connect': 5.0, 'read': 5.0, 'write': 5.0, 'pool': 5.0}
TRACE [2020-08-12 15:18:05] httpcore._sync.connection - create_connection socket=<httpcore._backends.sync.SyncSocketStream object at 0x10e682a30> http_version='HTTP/1.1'
TRACE [2020-08-12 15:18:05] httpcore._sync.connection - connection.request method=b'GET' url=(b'https', b'pypi.XXX', None, b'/') headers=[(b'host', b'pypi.XXX'), (b'user-agent', b'python-httpx/0.14.1'), (b'accept', b'*/*'), (b'accept-encoding', b'gzip, deflate'), (b'connection', b'keep-alive')]
TRACE [2020-08-12 15:18:05] httpcore._sync.http11 - send_request method=b'GET' url=(b'https', b'pypi.XXX', None, b'/') headers=[(b'host', b'pypi.XXX'), (b'user-agent', b'python-httpx/0.14.1'), (b'accept', b'*/*'), (b'accept-encoding', b'gzip, deflate'), (b'connection', b'keep-alive')]
TRACE [2020-08-12 15:18:05] httpcore._sync.http11 - send_data=Data(<0 bytes>)
DEBUG [2020-08-12 15:18:05] httpx._client - HTTP Request: GET https://pypi.XXX "HTTP/1.1 200 OK"
TRACE [2020-08-12 15:18:05] httpcore._sync.http11 - receive_event=Data(<2063 bytes>)
TRACE [2020-08-12 15:18:05] httpcore._sync.http11 - response_closed our_state=MUST_CLOSE their_state=SEND_BODY
TRACE [2020-08-12 15:18:05] httpcore._sync.connection_pool - removing connection from pool=<SyncHTTPConnection http_version=HTTP/1.1 state=5>

Environment

  • OS: Linux & macOS
  • Python version: 3.8.5
  • HTTPX version: 0.14.0 & 0.14.1
  • Async environment: none
  • HTTP proxy: no
  • Custom certificates: yes but likely unrelated

Additional context

n/a

bug

Most helpful comment

I'm happy to confirm that it indeed is! Thanks everyone!

All 18 comments

Just to be clear, have you checked that setting chunked_transfer_encoding on in nginx solves the issue, or do you suspect that's the culprit?

I have:

  • run into the problem and started reporting the bug without a clue what it is
  • while pasting the config my gut told me that could be the problem and I commented it out and reloaded
  • → problem went away
  • re-added and reloaded
  • → problem came back

Thanks for clarifying, I was asking because from a quick test, I was not able to reproduce it locally.

Nginx config (the default from docker, only added chunked_transfer_encoding off;):

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
    chunked_transfer_encoding off;
}

Docker container:

$ docker run --name my-custom-nginx-container -v /Users/josepcugat/workspace/httpx/nginx.conf:/etc/nginx/nginx.conf:ro -p 8080:80 nginx:1.10.3

Script to test:

import httpx

resp = httpx.get("http://127.0.0.1:8080")
print(resp.status_code)

I have added my /etc/nginx/nginx.conf to the original report. Shouldn't be anything important but just in case.

Hey,

Are you able to edit your deps code locally? If so, I'd be curious to see what happens if the code returns that b"" (i.e. what it did before https://github.com/encode/httpcore/pull/112) instead of raising the exception you're seeing.

I'm assuming we'll get an exception from h11, and it might help gather some more information, perhaps about a legit case of "sock.recv() returned empty bytes" that does not mean "the server has disconnected".

No dice replicating it, even with the provided nginx.conf and enabling SSL with a self-signed certificate. It might be triggered by the upstream server used by the proxy_pass directive.

I can confirm that httpx.get on a file that is served directly from location ~ /\+f/ works, but everything that goes thru proxy_pass fails.

Hey,

Are you able to edit your deps code locally? If so, I'd be curious to see what happens if the code returns that b"" (i.e. what it did before encode/httpcore#112) instead of raising the exception you're seeing.

I'm assuming we'll get an exception from h11, and it might help gather some more information, perhaps about a legit case of "sock.recv() returned empty bytes" that does not mean "the server has disconnected".

Removing the

                if data == b"":
                    raise ReadError("Server disconnected while attempting read")

block makes it work!

Hmm. So the rationale for this block was that a socket.recv() returning after handling 0 bytes means that the other end has disconnected, as per the Socket HOW-TO: https://docs.python.org/3.8/howto/sockets.html

Now, could it be that in your case we closed the connection ourselves already, perhaps concurrently (?), thus telling the other end to shut down while we were trying to read from it?

Put differently, could we need an additional check on the connection state before deciding that the disconnection is unexpected...?

Also, is this over HTTP/1.1 or HTTP/2? (I suppose we ought to add this to the issue template since it's a pretty important piece of information in general.)

Edit: I see listen REDACTED-IPv4-ADDRESS:443 ssl http2; in your nginx.conf so yes, I assume requests are made over HTTP/2. Does this replicate if you turn off HTTP/2 from Nginx and request it using HTTP/1.1?

From the stacktrace and trace logs it seems to be HTTP/1.1 (although the server has HTTP2 enabled),

Okay, it looks like I am able to replicate.

Nginx config:

worker_processes 4;

events {
    worker_connections 1024;
}

http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;

  upstream app {
    server host.docker.internal:8000;
  }

  server {
    server_name _;
    listen      80;

    chunked_transfer_encoding off;

    location /app/ {
        proxy_pass http://app;
        proxy_set_header X-outside-url $scheme://$host:$server_port;
        proxy_set_header X-Real-IP $remote_addr;
    }
  }
}

Nginx command (Docker - probably only works on macOS due to usage of host.docker.internal above):

docker run --rm -it -p 8001:80 -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro nginx:1.13

proxy_pass'ed app:

import uvicorn
from starlette.responses import PlainTextResponse, StreamingResponse
from starlette.applications import Starlette
from starlette.routing import Route


async def stream(_):
    async def body():
        for chunk in (b"Hello, ", b"World!"):
            yield chunk

    return StreamingResponse(body())


app = Starlette(
    routes=[
        Route("/plain", PlainTextResponse("Hello, World!")),
        Route("/stream", stream),
    ]
)

uvicorn.run(app)

Crucially, requesting the /plain endpoint works fine:

$ python -c "import httpx; print(httpx.get('http://localhost:8001/plain'))"
<Response [200 OK]>

But requesting /stream fails with the error from OP:

$ HTTPX_LOG_LEVEL=trace python -c "import httpx; print(httpx.get('http://localhost:8001/stream'))"
TRACE [2020-08-12 21:49:06] httpx._config - load_ssl_context verify=True cert=None trust_env=True http2=False
TRACE [2020-08-12 21:49:06] httpx._config - load_verify_locations cafile=/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/certifi/cacert.pem
TRACE [2020-08-12 21:49:06] httpcore._sync.connection_pool - get_connection_from_pool=(b'http', b'localhost', 8001)
TRACE [2020-08-12 21:49:06] httpcore._sync.connection_pool - created connection=<SyncHTTPConnection http_version=UNKNOWN state=0>
TRACE [2020-08-12 21:49:06] httpcore._sync.connection_pool - adding connection to pool=<SyncHTTPConnection http_version=UNKNOWN state=0>
TRACE [2020-08-12 21:49:06] httpcore._sync.connection - open_socket origin=(b'http', b'localhost', 8001) timeout={'connect': 5.0, 'read': 5.0, 'write': 5.0, 'pool': 5.0}
TRACE [2020-08-12 21:49:06] httpcore._sync.connection - create_connection socket=<httpcore._backends.sync.SyncSocketStream object at 0x10c610190> http_version='HTTP/1.1'
TRACE [2020-08-12 21:49:06] httpcore._sync.connection - connection.request method=b'GET' url=(b'http', b'localhost', 8001, b'/stream') headers=[(b'host', b'localhost:8001'), (b'user-agent', b'python-httpx/0.14.0'), (b'accept', b'*/*'), (b'accept-encoding', b'gzip, deflate, br'), (b'connection', b'keep-alive')]
TRACE [2020-08-12 21:49:06] httpcore._sync.http11 - send_request method=b'GET' url=(b'http', b'localhost', 8001, b'/stream') headers=[(b'host', b'localhost:8001'), (b'user-agent', b'python-httpx/0.14.0'), (b'accept', b'*/*'), (b'accept-encoding', b'gzip, deflate, br'), (b'connection', b'keep-alive')]
TRACE [2020-08-12 21:49:06] httpcore._sync.http11 - send_data=Data(<0 bytes>)
DEBUG [2020-08-12 21:49:06] httpx._client - HTTP Request: GET http://localhost:8001/stream "HTTP/1.1 200 OK"
TRACE [2020-08-12 21:49:06] httpcore._sync.http11 - receive_event=Data(<13 bytes>)
TRACE [2020-08-12 21:49:06] httpcore._sync.http11 - response_closed our_state=MUST_CLOSE their_state=SEND_BODY
TRACE [2020-08-12 21:49:06] httpcore._sync.connection_pool - removing connection from pool=<SyncHTTPConnection http_version=HTTP/1.1 state=5>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_api.py", line 159, in get
    return request(
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_api.py", line 86, in request
    return client.request(
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_client.py", line 609, in request
    return self.send(
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_client.py", line 632, in send
    response.read()
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_models.py", line 881, in read
    self._content = b"".join(self.iter_bytes())
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_models.py", line 892, in iter_bytes
    for chunk in self.iter_raw():
  File "/Users/florimond/Developer/python-projects/httpx/httpx/_models.py", line 925, in iter_raw
    for part in self._raw_stream:
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/httpcore/_sync/connection_pool.py", line 50, in __iter__
    for chunk in self.stream:
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/httpcore/_bytestreams.py", line 44, in __iter__
    for chunk in self._iterator:
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 143, in _receive_response_data
    event = self._receive_event(timeout)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/httpcore/_sync/http11.py", line 160, in _receive_event
    data = self.socket.read(self.READ_NUM_BYTES, timeout)
  File "/Users/florimond/Developer/python-projects/httpx/venv/lib/python3.8/site-packages/httpcore/_backends/sync.py", line 64, in read
    raise ReadError("Server disconnected while attempting read")
httpcore._exceptions.ReadError: Server disconnected while attempting read

Uvicorn shows that the response was sent correctly in both cases:

INFO:     127.0.0.1:52955 - "GET /plain HTTP/1.0" 200 OK
INFO:     127.0.0.1:52958 - "GET /stream HTTP/1.0" 200 OK

And likewise for Nginx:

172.17.0.1 - - [12/Aug/2020:19:50:08 +0000] "GET /plain HTTP/1.1" 200 13 "-" "python-httpx/0.14.0"
172.17.0.1 - - [12/Aug/2020:19:50:15 +0000] "GET /stream HTTP/1.1" 200 13 "-" "python-httpx/0.14.0"

Filed https://github.com/encode/httpcore/pull/159 which should fix this. It's indeed a regression introduced in 0.14.0. https://github.com/encode/httpcore/pull/112 wasn't quite the right fix…

Not sure exactly what headers nginx sends to indicate that there is a response body, but it doesn't know the content-length in advance, and isn't using transfer-encoding: chunked, which can delimit the end of the response, but it seems like a case where the server properly terminates the connection to indicate the end of the message body.

Relevant: https://stackoverflow.com/questions/12738519/http-response-headers-valid-with-no-transfer-encoding-and-content-length/12747103#12747103

Yes, in point no 5:

By the server closing the connection. (Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.)

So we ought to acknowledge that and make sure our HTTP parsers (h11, h2) handle it correctly, and ensure that we don't try to read on the socket anymore.

Note that for some reason h11 is able to distinguish between "connection closed after full response was sent" (this case, which doesn't raise an exception, as expected), and "connection closed but we needed response data" (encode/httpcore#110, which raises an exception, as expected). I don't know how exactly but TL;DR h11 does the right thing AFAICS. :-)

However h2 doesn't seem to handle "got empty bytes from server" at all, and that's what my PR linked above is all about.

Right yeah, which makes sense, because in HTTP/1.1, closing the connection is a valid way to end the response, but the equivalent in HTTP/2 is ending the data frames for a single stream. (Ending the connection would close all streams)

We've got a fix for this now in https://github.com/encode/httpcore/pull/164.
Then an upcoming httpcore release in https://github.com/encode/httpcore/pull/165 which should resolve the issue.

Should now be resolved in httpcore 0.10.2.

I'm happy to confirm that it indeed is! Thanks everyone!

Was this page helpful?
0 / 5 - 0 ratings