Aiohttp: session.get(...) timeout overriding ClientSession timeout and cancelling all requests

Created on 19 Aug 2018  ·  9 comments  ·  Source: aio-libs/aiohttp

Long story short

Setting a short timeout on session.get(...) overrides the ClientSession timeout and causes all requests in the session to time out within the session.get(...) timeframe.

Expected behaviour

The session.get(...) timeout should apply per request, while the timeout set on ClientSession should be cumulative for the whole session. I should be able to set an infinite ClientSession timeout with a limited per-request session.get(...) timeout and have the session run until all tasks are completed.

Actual behaviour

Once a short session.get(...) timeout is set, the ClientSession timeout is no longer respected. This makes long lists of sites impossible to work through: either the timeout is hit and everything gets cancelled, or you choose not to use a timeout and all connections get stuck waiting indefinitely on non-responding sites.

Steps to reproduce

Create something such as:

import asyncio
import aiohttp

timeout = aiohttp.ClientTimeout(total=600)
connector = aiohttp.TCPConnector(limit=40)
dummy_jar = aiohttp.DummyCookieJar()
tasks = []
async with aiohttp.ClientSession(connector=connector, timeout=timeout, cookie_jar=dummy_jar) as session:
    for site in sites:
        task = asyncio.ensure_future(make_request(session, site))
        tasks.append(task)
    await asyncio.wait(tasks)

Then for the make_request you can have something such as:

async def make_request(session, site):
    async with session.get(site, timeout=15) as response:
        return await response.read()

Throw in a large list of different sites (some unreachable) for sites, and notice that after 15 seconds you will start to get a bunch of timeouts. That seems to happen because the 15 seconds defined in session.get(...) acts as an aggregate for all requests instead of applying to the current request.

The expected result in the above case is to allow 600 seconds for all requests in the session, but limit each request to only 15 seconds.
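The expected semantics can be sketched with plain asyncio (a sketch, not aiohttp's actual behavior: asyncio.sleep stands in for session.get, and the fake_fetch helper, durations, and timeouts are all made up): a short cap is enforced per task with asyncio.wait_for, inside a separate session-wide deadline, so one slow request times out individually without cancelling the rest.

```python
import asyncio

PER_REQUEST_TIMEOUT = 0.1  # stands in for the 15 s per-request cap
TOTAL_TIMEOUT = 2.0        # stands in for the 600 s session-wide cap

async def fake_fetch(delay):
    # Stand-in for session.get(); just sleeps for `delay` seconds.
    await asyncio.sleep(delay)
    return "ok"

async def fetch_with_cap(delay):
    # Per-request timeout: only this request is cancelled on expiry.
    try:
        return await asyncio.wait_for(fake_fetch(delay), PER_REQUEST_TIMEOUT)
    except asyncio.TimeoutError:
        return "timed out"

async def main():
    delays = [0.01, 0.5, 0.01]  # the middle "site" is too slow
    # Session-wide timeout wraps the whole batch.
    return await asyncio.wait_for(
        asyncio.gather(*(fetch_with_cap(d) for d in delays)),
        TOTAL_TIMEOUT,
    )

results = asyncio.run(main())
print(results)  # the slow request times out on its own; the others succeed
```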

Your environment

Windows 7 x64
Python 3.7.0
aiohttp 3.3.2

Most helpful comment

The timeout is per-request already, but it counts the time spent waiting for a connection from the pool.
You can configure the pool to an unlimited size, or you can use ClientTimeout(total=None, sock_connect=5, sock_read=5).

I find the existing timeout behavior confusing and want to fix it in newer aiohttp versions. For now, you can use the workarounds provided above.

All 9 comments

I actually migrated a scraper to node.js because of this, and because no one seemed to talk about it.
But I might just try using a session per request instead.

I might just try using a session per request instead.

@popsail that's exactly what I ended up doing. Since you can't use the TCPConnector limit with a session per request, I ended up using a semaphore for throttling. It's unfortunate that nobody else has posted about this, and it hasn't even been acknowledged by the developers.
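The semaphore-throttling idea can be sketched with plain asyncio (a sketch, not the commenter's actual code: asyncio.sleep stands in for a fresh-session HTTP request, and the limit of 3, the state dict, and fetch_one are all hypothetical):

```python
import asyncio

LIMIT = 3  # hypothetical concurrency cap, analogous to TCPConnector(limit=40)

state = {"active": 0, "peak": 0}  # track how many fetches run at once

async def fetch_one(sem, site):
    # Each "request" must acquire a slot; at most LIMIT run concurrently.
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for a session-per-request fetch
        state["active"] -= 1
        return site

async def main(sites):
    sem = asyncio.Semaphore(LIMIT)
    # gather preserves input order in its result list
    return await asyncio.gather(*(fetch_one(sem, s) for s in sites))

sites = [f"site-{i}" for i in range(10)]
results = asyncio.run(main(sites))
print(state["peak"], results[:3])
```

Because each simulated request holds a semaphore slot for its whole duration, the peak concurrency never exceeds the limit, which is the throttling effect a TCPConnector limit would otherwise provide.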

You need to learn about client timeouts in aiohttp.
Until somebody (or I) finds the time to write a comprehensive timeout documentation page, I hope this helps.
Please be patient: the library is developed by volunteers in their spare time.

So this problem is still not solved?

I came across this issue in the search results when Googling the problem, and I'm not sure it's overly clear that the changes that introduced ClientTimeout have actually _resolved_ it.

The code below, attaching a _new_ ClientTimeout instance to the request, resolves this issue for me. It's not overly clear from the docs, and I can see this being a common use case, but it does work as expected (I have a shared session too, not a session per request):

async with session.get(url, trace_request_ctx=context, timeout=ClientTimeout(total=self.timeout)) as response:

The total timeout includes the time spent waiting for a new connection from the pool, not only communication time.
By default, the TCPConnector has a limit of 100 concurrent connections. If you issue 101 fetches, 100 will be executed immediately, but the last one will wait for at least one previous request to finish. The clock keeps ticking, so the total timeout can expire.
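That interaction can be illustrated with plain asyncio (a sketch: a semaphore stands in for the connector's connection limit, asyncio.sleep for the request itself, and all durations are made up). The total-timeout clock starts before the "connection" is acquired, so a request queued behind the pool limit can expire even though the request itself is fast enough:

```python
import asyncio

POOL_LIMIT = 1        # stand-in for TCPConnector(limit=...)
TOTAL_TIMEOUT = 0.3   # stand-in for ClientTimeout(total=...)

async def fetch(pool, duration):
    async def acquire_and_run():
        # Waiting for a pool slot happens inside the timed span.
        async with pool:
            await asyncio.sleep(duration)  # the actual "request"
            return "ok"
    try:
        return await asyncio.wait_for(acquire_and_run(), TOTAL_TIMEOUT)
    except asyncio.TimeoutError:
        return "timed out"

async def main():
    pool = asyncio.Semaphore(POOL_LIMIT)
    # Each request is fast on its own (0.2 s < 0.3 s), but the second one
    # must first wait 0.2 s for the pool slot, so its total exceeds the cap.
    return await asyncio.gather(fetch(pool, 0.2), fetch(pool, 0.2))

results = asyncio.run(main())
print(results)  # the queued request hits the total timeout
```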

Aha, OK, I see. It was working well for a handful of test requests, but not for over 100.

Is there no way to make an HTTP request with an individual timeout? The use case is uptime monitoring: bulk requests to multiple unknown host names, so a 5-second timeout per request would be ideal.

The timeout is per-request already, but it counts the time spent waiting for a connection from the pool.
You can configure the pool to an unlimited size, or you can use ClientTimeout(total=None, sock_connect=5, sock_read=5).

I find the existing timeout behavior confusing and want to fix it in newer aiohttp versions. For now, you can use the workarounds provided above.
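The suggested configuration might look like the following sketch (assumptions: aiohttp 3.x, and check_site is a hypothetical uptime probe, not code from this thread). With total=None there is no aggregate deadline, the 5-second socket timeouts make each request fail fast, and limit=0 removes the pool cap so no request times out while queued:

```python
import asyncio
import aiohttp

# No total cap; per-phase socket timeouts of 5 s each.
timeout = aiohttp.ClientTimeout(total=None, sock_connect=5, sock_read=5)
connector = aiohttp.TCPConnector(limit=0)  # 0 means no connection limit

async def check_site(session, url):
    # Hypothetical probe: True if the site answers within the socket timeouts.
    try:
        async with session.get(url) as response:
            return response.status < 500
    except (aiohttp.ClientError, asyncio.TimeoutError):
        return False
```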

Gotcha, thank you - I think that's a reasonable workaround!

