Httpx: PoolTimeout when num tasks in asyncio.gather() exceeds client max_connections

Created on 11 Aug 2020  ·  5 Comments  ·  Source: encode/httpx

Checklist

  • Reproducible on 0.13.3
  • This issue seems similar but it's closed and was supposedly fixed

Describe the bug

If the number of tasks executed via asyncio.gather(...) is greater than max_connections, I get a PoolTimeout. It looks as though tasks that have completed aren't releasing their connections when they finish.

I'm new to asyncio so it's possible I'm doing something wrong, but I haven't been able to find any documentation or issues that cover this case definitively.

To reproduce

import asyncio
import httpx

async def main() -> None:
    url = "https://www.example.com"
    max_connections = 2
    timeout = httpx.Timeout(5.0, pool=2.0)
    limits = httpx.Limits(max_connections=max_connections)
    client = httpx.AsyncClient(timeout=timeout, pool_limits=limits)

    async with client:
        tasks = []
        for _ in range(max_connections + 1):
            tasks.append(client.get(url))
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        loop.close()

Expected behavior

I would expect all tasks to complete, rather than getting a PoolTimeout on the nth task, where n = max_connections + 1.

Actual behavior

Getting a PoolTimeout on the nth task, where n = max_connections + 1.

Debugging material

Traceback (most recent call last):
  File "test_async.py", line 21, in <module>
    loop.run_until_complete(main())
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "test_async.py", line 16, in main
    await asyncio.gather(*tasks)
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1416, in get
    timeout=timeout,
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1242, in request
    request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1273, in send
    request, auth=auth, timeout=timeout, allow_redirects=allow_redirects,
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1302, in _send_handling_redirects
    request, auth=auth, timeout=timeout, history=history
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1338, in _send_handling_auth
    response = await self._send_single_request(request, timeout)
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_client.py", line 1374, in _send_single_request
    timeout=timeout.as_dict(),
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Users/redacted/.pyenv/versions/3.6.9/lib/python3.6/site-packages/httpx/_exceptions.py", line 359, in map_exceptions
    raise mapped_exc(message, **kwargs) from None  # type: ignore
httpx._exceptions.PoolTimeout

Environment

  • OS: macOS 10.14.6
  • Python version: 3.6.9
  • HTTPX version: 0.13.3
  • Async environment: asyncio
  • HTTP proxy: no
  • Custom certificates: no

Additional context

I commented on this issue, but it's closed, so I figured it would be better to create a new one.

bug concurrency pooling

All 5 comments

Yup, there's def. an issue here to be dealt with.

To get a bit more info, I tried this...

import asyncio
import httpx


async def get_url(client, url):
    print("GET", url)
    print(await client._transport.get_connection_info())
    print(await client.get(url))
    print(await client._transport.get_connection_info())


async def main() -> None:
    url = "https://www.example.com"
    max_connections = 2
    timeout = httpx.Timeout(5.0, pool=5.0)
    limits = httpx.Limits(max_connections=max_connections, max_keepalive_connections=0)
    client = httpx.AsyncClient(timeout=timeout, limits=limits)

    async with client:
        tasks = []
        for _ in range(max_connections + 1):
            tasks.append(get_url(client, url))
        await asyncio.gather(*tasks)

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        loop.close()

Which results in...

GET https://www.example.com
{}
GET https://www.example.com
{}
GET https://www.example.com
{}
<Response [200 OK]>
{'https://www.example.com': ['HTTP/1.1, IDLE', 'HTTP/1.1, ACTIVE']}
<Response [200 OK]>
{'https://www.example.com': ['HTTP/1.1, IDLE', 'HTTP/1.1, IDLE']}

We can see the connections returning from ACTIVE to IDLE, but the keep-alive connections are not being used by the pending request.

The issue here is that the pending request is blocked on the connection semaphore, waiting for permission to start a new connection. That semaphore is not released when a keep-alive connection becomes available, so the pending request never picks up the idle connection.
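To make that failure mode concrete, here is a toy model of it. This is only a sketch and not httpx's or httpcore's actual code: ToyPool, request and demo are hypothetical names, and the model assumes that reuse of an idle connection is checked once, before the task starts waiting on the semaphore.

```python
import asyncio


class ToyPool:
    """Hypothetical stand-in for a connection pool (not the real httpx code)."""

    def __init__(self, max_connections):
        # One semaphore slot per connection that may exist, active or idle.
        self._slots = asyncio.Semaphore(max_connections)
        self.idle = []      # connections kept alive after a request finishes
        self._created = 0

    async def acquire(self, timeout):
        if self.idle:
            # Reuse is only considered here, before any waiting starts.
            return self.idle.pop()
        # No idle connection right now: wait for a slot to open a new one.
        await asyncio.wait_for(self._slots.acquire(), timeout)
        self._created += 1
        return "conn-{}".format(self._created)

    def release(self, conn):
        # The connection becomes idle, but its semaphore slot is kept,
        # so a task already waiting in acquire() is never woken up.
        self.idle.append(conn)


async def request(pool, pool_timeout):
    conn = await pool.acquire(pool_timeout)
    await asyncio.sleep(0.01)   # pretend to do network I/O
    pool.release(conn)
    return conn


async def demo(num_tasks=3, max_connections=2, pool_timeout=0.2):
    pool = ToyPool(max_connections)
    results = await asyncio.gather(
        *(request(pool, pool_timeout) for _ in range(num_tasks)),
        return_exceptions=True,
    )
    return results, list(pool.idle)
```

Running demo() with three tasks and two slots lets the first two requests finish and leave two idle connections behind, while the third times out waiting on the semaphore, which mirrors the PoolTimeout in the report.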

Will need a bit of careful thinking about, but clearly needs resolving yup - thanks for raising this.

I was going to mention the same. I also tested reading the whole response body (which should release the connection) and also closing the response manually but the issue persists either way.

Ah, good call checking the connection state! Is there existing logic that _intends_ to have pending requests make use of existing idle connections, and it's just not working as expected? Or does the code as written only intend for pending requests to create new connections? Curious where that logic is, if you can link me @tomchristie.

Hello everyone,

I just want to make sure that this is what I'm looking for. The server I want to send requests to allows only a limited number of connections. Currently I limit the number of async tasks by using a Semaphore, but the pool_limits parameter for AsyncClient looks like it is intended for my use case. Am I right here? If so, any idea when this issue will be resolved?

Thanks a lot!

fin swimmer
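For reference, the Semaphore approach mentioned above could look roughly like the sketch below. bounded_gather, run_one, fake_request and demo are hypothetical names made up for illustration, and fake_request stands in for a real call such as client.get(url) so the example runs without a network.

```python
import asyncio


async def bounded_gather(coro_funcs, limit):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    gate = asyncio.Semaphore(limit)

    async def run_one(func):
        async with gate:    # tasks queue here rather than inside the pool
            return await func()

    return await asyncio.gather(*(run_one(f) for f in coro_funcs))


async def demo(num_tasks=5, limit=2):
    in_flight = 0
    peak = 0

    async def fake_request():
        # Hypothetical stand-in for `await client.get(url)`.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return "ok"

    results = await bounded_gather([fake_request] * num_tasks, limit)
    return results, peak
```

With a real client, each entry would be something like lambda: client.get(url), with limit set no higher than max_connections. That should keep requests from queuing inside the pool, although nothing in this thread confirms it avoids the PoolTimeout in every case.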

I'm planning on getting stuck into this one pretty soon, yup.
It's a bit of an involved one, but I know what we need to do to resolve it.
