Httpx: Issuing requests in close concurrency can unneccessarily result in multiple HTTP/2 connections.

Created on 7 Nov 2019  路  10Comments  路  Source: encode/httpx

When I do several requests with the AsyncClient it seems that it opens multiple connections to the server. My code looks like this:

async def _send_request(self, client, image, url):
    headers = {
        'Content-Type': 'image/jpeg',
        'Content-Length': str(len(image))
    }
    return await client.post(
        url,
        headers=headers,
        data=image
    )

async with httpx.AsyncClient(timeout=timeout, base_url=self.server_url) as client:
    requests = []
    for image in images:
        requests.append(self._send_request(client, image, url))
    responses = await asyncio.gather(*requests)

As far as I understand it should use a connection pooling, store the connection and reuse it. In my case I get the following debug message for every image- post request I send with a loop:

DEBUG:httpx.dispatch.connection:start_connect host='api.garaza.io' port=443 timeout=TimeoutConfig(connect_timeout=None, read_timeout=60, write_timeout=5)

Am I doing anything wrong or is it a bug?

bug htt2

Most helpful comment

I noticed this happening on a similar situation as well and it also confused me. I fully expected connections to only be created once per origin, but it seems the current behaviour is by design according to this test?

https://github.com/encode/httpx/blob/master/tests/dispatch/test_connection_pools.py#L69-L80

@PrimozGodec, max_connections is a BoundedSemaphore which HTTPX uses to limit the number of connections of the pool. The default is 100.

All 10 comments

The issue seems to appear here:

https://github.com/encode/httpx/blob/1a32cf036a825f6eb35395af5388a3b23180a82e/httpx/dispatch/connection_pool.py#L132-L150

First self.pop_connection(origin) is called then there is await call in the line 135 and at the end there is self.active_connections.add(connection). Since there is await in between, first pops for all requests are called (there is no connection in the pool at that time) and the connection is added to the pool at the end when all pop already happens. It would be required to get rid of await in between but I do not know how since I do not understand what exactly await self.max_connections.acquire() does.

I noticed this happening on a similar situation as well and it also confused me. I fully expected connections to only be created once per origin, but it seems the current behaviour is by design according to this test?

https://github.com/encode/httpx/blob/master/tests/dispatch/test_connection_pools.py#L69-L80

@PrimozGodec, max_connections is a BoundedSemaphore which HTTPX uses to limit the number of connections of the pool. The default is 100.

@yeraydiazdiaz thank you for the response. So it looks OK then. Can establishing more connections instead of one at the same time decrease the speed of getting the responses?

Can establishing more connections instead of one at the same time decrease the speed of getting the responses?

Yes.

The behaviour you want here is...

  • If the connections are HTTP/1.1 then you want to have multiple concurrent connections.
  • If the connections are HTTP/2 then you don't want to have multiple concurrent connections, since you can have multiple concurrent streams on a single connection.

That test case happens to in the context of an HTTP/1.1 case, although it doesn't make that clear.

But, we don't know ahead of time if a connection will be HTTP/1.1 or HTTP/2 until it's been established.

It's also worth noting that you don't actually know ahead of time if a connection will end up being HTTP/1.1 or HTTP/2 until it's been established, so it is possible that making several requests to an HTTP/2 server could end up initially opening multiple connections. There's potentially some finessing to be done there, to make sure that we always only start by issuing a single connection, until we've determined if that connection is HTTP/1.1 or HTTP/2.

@PrimozGodec What is the response.http_version in your case here?

There's potentially some finessing to be done there, to make sure that we always only start by issuing a single connection, until we've determined if that connection is HTTP/1.1 or HTTP/2.

I think it is actually a good solution.

What is the response.http_version in your case here?

It is HTTP/2

It is HTTP/2

Right, then we've got some slightly complicated work to do.

If you'd issued the connections at very slightly different time intervals, you'd (presumably) find that they'd have ended up on the same single connection. As it is, each has started establishing a connection independently, which all turned out to be HTTP/2 connections, that could have just been a single shared connection.

Also, I see another potential issue. pop_connection is called first and in the case that there is no connection it will call await self.max_connections.acquire() which will prevent making more than 100 connections. Then the issue is if I want to make 1000 requests all will pass pop_connection first (when no connections in the pool) and will wait at await self.max_connections.acquire() for connection pool to get released even some of them could reuse the old connection. It means that after 100 requests everything gets blocked since connections are not released and all waiting requests will want to have a new connection. @tomchristie suggestion about making one connection to the same server and then wait would solve the issue.

But, we don't know ahead of time if a connection will be HTTP/1.1 or HTTP/2 until it's been established.

is there a way around this issue if we know beforehand that it's http2 (as in, when talking to a server that is http2 only, such as APNS)?

Believe this to now be resolved in the 0.13.dev0 pre-release.

Was this page helpful?
0 / 5 - 0 ratings