Httpx: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6> (asyncio only)

Created on 26 Feb 2020  ·  16Comments  ·  Source: encode/httpx

References: https://github.com/encode/httpx/issues/550

On 0.11.1 and current HEAD 50d337e807839c21e796fd8b01c67d8a672a9721 I'm getting:

$ ipython
Python 3.8.0 (default, Dec 12 2019, 09:56:28) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.12.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import httpx                                                                                                                                           

In [2]: async with httpx.AsyncClient() as client: 
   ...:     r = await client.get('https://www.example.org/')                                                                                                   

In [3]: r                                                                                                                                                      
Out[3]: <Response [200 OK]>

In [4]:                                                                                                                                                        
Do you really want to exit ([y]/n)? 
An open stream object is being garbage collected; call "stream.close()" explicitly.
source_traceback: Object created at (most recent call last):
[...]
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 595, in run_until_complete
    self.run_forever()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 563, in run_forever
    self._run_once()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", line 1836, in _run_once
    handle._run()
  File "/home/mdk/.local/lib/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/home/mdk/.local/lib/python3.8/asyncio/streams.py", line 50, in open_connection
    reader = StreamReader(limit=limit, loop=loop)
$ 
bug concurrencasyncio external

Most helpful comment

All 16 comments

Tried with HTTPX_LOG_LEVE=trace because tracemalloc don't tells me much, and Python 3.8.2:

$ cat reproducer.py 
import asyncio

import httpx

async def mdk():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://www.example.org/')
    print(r.text)

asyncio.run(mdk())

$ python reproducer.py 
TRACE [2020-02-26 09:37:29] httpx._config - load_ssl_context verify=True cert=None trust_env=True http2=False
TRACE [2020-02-26 09:37:29] httpx._config - load_verify_locations cafile=/home/mdk/clones/python/docsbuild-scripts/.venv/lib/python3.8/site-packages/certifi/cacert.pem
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - acquire_connection origin=Origin(scheme='https' host='www.example.org' port=443)
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - new_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.example.org' port=443))
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - start_connect tcp host='www.example.org' port=443 timeout=Timeout(timeout=5.0)
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - connected http_version='HTTP/1.1'
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_headers method='GET' target='/' headers=Headers({'host': 'www.example.org', 'user-agent': 'python-httpx/0.11.1', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive'})
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_data data=Data(<0 bytes>)
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=NEED_DATA
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=Response(status_code=200, headers=[(b'content-encoding', b'gzip'), (b'accept-ranges', b'bytes'), (b'age', b'499617'), (b'cache-control', b'max-age=604800'), (b'content-type', b'text/html; charset=UTF-8'), (b'date', b'Wed, 26 Feb 2020 08:37:29 GMT'), (b'etag', b'"3147526947"'), (b'expires', b'Wed, 04 Mar 2020 08:37:29 GMT'), (b'last-modified', b'Thu, 17 Oct 2019 07:18:26 GMT'), (b'server', b'ECS (bsa/EB17)'), (b'vary', b'Accept-Encoding'), (b'x-cache', b'HIT'), (b'content-length', b'648')], http_version=b'1.1', reason=b'OK')
DEBUG [2020-02-26 09:37:29] httpx._client - HTTP Request: GET https://www.example.org/ "HTTP/1.1 200 OK"
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=Data(<648 bytes>)
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - receive_event event=EndOfMessage(headers=[])
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - response_closed our_state=DONE their_state=DONE
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection_pool - release_connection connection=HTTPConnection(origin=Origin(scheme='https' host='www.example.org' port=443))
TRACE [2020-02-26 09:37:29] httpx._dispatch.connection - close_connection
TRACE [2020-02-26 09:37:29] httpx._dispatch.http11 - send_event event=ConnectionClosed()
<!doctype html>
[... redacted (useless html) ...]

/home/mdk/.local/lib/python3.8/asyncio/selector_events.py:694: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=6>
  _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
Object allocated at (most recent call last):
  File "reproducer.py", lineno 10
    asyncio.run(mdk())
  File "/home/mdk/.local/lib/python3.8/asyncio/runners.py", lineno 43
    return loop.run_until_complete(main)
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 603
    self.run_forever()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 570
    self._run_once()
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1859
    handle._run()
  File "/home/mdk/.local/lib/python3.8/asyncio/events.py", lineno 81
    self._context.run(self._callback, *self._args)
  File "/home/mdk/.local/lib/python3.8/asyncio/streams.py", lineno 52
    transport, _ = await loop.create_connection(
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1050
    transport, protocol = await self._create_connection_transport(
  File "/home/mdk/.local/lib/python3.8/asyncio/base_events.py", lineno 1072
    transport = self._make_ssl_transport(
  File "/home/mdk/.local/lib/python3.8/asyncio/selector_events.py", lineno 78
    _SelectorSocketTransport(self, rawsock, ssl_protocol,

So this is related to https://github.com/encode/httpx/issues/634, _backends/asyncio.py, SocketStream's close method:

    async def close(self) -> None:
        # NOTE: StreamWriter instances expose a '.wait_closed()' coroutine function,
        # but using it has caused compatibility issues with certain sites in
        # the past (see https://github.com/encode/httpx/issues/634), which is
        # why we don't call it here.
        # This is fine, though, because '.close()' schedules the actual closing of the
        # stream, meaning that at best it will happen during the next event loop
        # iteration, and at worst asyncio will take care of it on program exit.
        async with self.write_lock:
            with as_network_error(OSError):
                self.stream_writer.close()

Obviously adding a await self.stream_writer.wait_closed() fixes #825 probably while reopening #634

So I'm going to read #634 now.

Hi @JulienPalard,

This is very strange to me. It's a straight-on request/response flow, no redirects, no special cases that surface in the TRACE logs…

Also, I am not able to reproduce on Python 3.8.1/HTTPX 0.11.1.

Huhh… I'm sick with issues we've been having with asyncio's .close() vs .wait_closed()

You can also look at the end of #714. Our conclusion back then was to close off as "this is a Python bug". @tomchristie also noted:

Turns out it only replicates on Python 3.8.0. Does not replicate on 3.7.1 or 3.8.1.

which matches my experience on 3.8.1. The bug may have come back on 3.8.2…? 🤷‍♂

@JulienPalard I'm curious — what happens if you await asyncio.sleep(0) at the end of main()? Does this give asyncio the chance to execute the callback that closes the stream?

import asyncio

import httpx

async def mdk():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://www.example.org/')
    print(r.text)
    await asyncio.sleep(0)  # <--

asyncio.run(mdk())

Feasible that it could be a duplicate of https://github.com/encode/httpx/issues/824 if there happens to be proxies setup in the environment.

Few questions here @JulienPalard:

  • Does the issue still replicate for you when using async with httpx.AsyncClient(trust_env=False)?
  • Does the issue still replicate if you perform the same request in an example.py script, and run with asyncio.run(main()), rather than using a REPL?
  • Does the issue still replicate if you perform the same request in an example.py script, and run with trio.run(main), rather than using a REPL?

@florimondmanca No, adding asyncio.sleep(0) don't help but asyncio.sleep(.1) fixes it (probably randomly).

Checks done using httpx on current HEAD (50d337e807839c21e796fd8b01c67d8a672a9721), ipython==7.12.0, on Debian bullseye, trio==0.13.0:

Sorry I don't know how to design 3d tables in markdown. So, trust_env has no effect, (I tried True and False for each tests) cpython 3.8.1 fixed an issue that did not reappeared in 3.8.2.

Python 3.8.2

  • ipython, trust_env=*: No issue
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

Python 3.8.1

  • ipython, trust_env=*: No issue
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

Python 3.8.0

  • ipython, trust_env=*: An open stream object is being garbage collected
  • script asyncio, trust_env=*: ResourceWarning
  • script trio, trust_env=*: No issue

[1]:

import asyncio # or import trio

import httpx

async def mdk():
    async with httpx.AsyncClient(trust_env=False) as client:
        r = await client.get('https://www.example.org/')
    print(r.text)

asyncio.run(mdk()) # or trio.run(mdk)

I'm changing the title appropriately, as the "An open stream ..." has been fixed in Python 3.8.1, this discussion is about the ResourceWarning.

Gotcha, thanks @JulienPalard.

We'd be able to resolve that if we were doing what we're supposed to and calling .wait_closed(), but we can't because it appears to have an issue where it can hang https://github.com/encode/httpx/issues/634#issuecomment-567466727 which trio is particularly careful to work around https://github.com/python-trio/trio/blob/master/trio/_ssl.py#L782-L785

I guess things we ought to figure out are:

  • Can we narrow down the asyncio .wait_closed() issue to a simple enough reproducible case that we can constructively raise it as a Python bug?
  • Can we monkeypatch the issue ourselves, adding trio's careful behavior?

Also worth investigating - how are the aiohttp team dealing with this, and is there an equivelent issue on their tracker?

Here's a minimal reproducing case...

import asyncio
import ssl
import certifi

hostname = 'login.microsoftonline.com'
context = ssl.create_default_context()
context.load_verify_locations(cafile=certifi.where())

async def main():
    reader, writer = await asyncio.open_connection(hostname, 443, ssl=context)
    print('opened')
    writer.close()
    print('close started')
    await writer.wait_closed()
    print('close completed')

asyncio.run(main())

Haven't looked into how awkward it is to do, but really I think we'd want to monkeypatch asyncio's socket unwrapping behaviour, so that we can reliably call .wait_closed(...) without it potentially hanging.

Okay, so reviewed this one today, and looks like we can close it off.

Our asyncio backend has this comment...

        # NOTE: StreamWriter instances expose a '.wait_closed()' coroutine function,
        # but using it has caused compatibility issues with certain sites in
        # the past (see https://github.com/encode/httpx/issues/634), which is
        # why we don't call it here.
        # This is fine, though, because '.aclose()' schedules the actual closing of the
        # stream, meaning that at best it will happen during the next event loop
        # iteration, and at worst asyncio will take care of it on program exit.
        async with self.write_lock:
            with map_exceptions({OSError: CloseError}):
                self.stream_writer.close()

Also, this does not reproduce any warnings...

import httpx
import asyncio

async def main():
    async with httpx.AsyncClient() as client:
        r = await client.get("https://login.microsoftonline.com/")
        print(r)

asyncio.run(main(), debug=True)

Oh right, ResourceWarning is ignored by default, but this replicates when using eg. python -Wall

I think we'll need to look into a fix that allows us to call .wait_closed that doesn't ever hang if it can't unwrap the SSL connection cleanly.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

florimondmanca picture florimondmanca  ·  4Comments

FlorianREGAZ picture FlorianREGAZ  ·  4Comments

tomchristie picture tomchristie  ·  3Comments

florimondmanca picture florimondmanca  ·  3Comments

florimondmanca picture florimondmanca  ·  4Comments