Aiohttp: 100% CPU usage with idle websocket

Created on 5 Jun 2017  路  14Comments  路  Source: aio-libs/aiohttp

Long story short

Fairly standard websocket server/client setup, but with session, results in 100% CPU usage.

  1. Generic websocket server
  2. Generic websocket client, but with a session that does a get request (to the same endpoint) before a new request to start a websocket
  3. In a few minutes, CPU usage increases to 100%

Note that removing the get request seems to avoid the problem.

Expected behaviour

Low CPU usage when idle

Actual behaviour

Worker process suddenly goes to 100% CPU usage, confirmed with htop.

Steps to reproduce

Server

import asyncio
import aiohttp
from aiohttp import web

async def worker_websocket(request):
    ws = aiohttp.web.WebSocketResponse()
    await ws.prepare(request)

    async for msg in ws:
        pass

    return ws

app = aiohttp.web.Application()
app.router.add_get('/worker-websocket',worker_websocket)
web.run_app( app, port=int(8080))

worker

import aiohttp
from aiohttp import web
import asyncio

async def websocket_client(loop):
    url = 'http://localhost:8080/worker-websocket'

    session = aiohttp.ClientSession(loop=loop)

    # removing this prevents the problem!
    async with session.get(url) as resp:
        pass

    async with session.ws_connect(url=url) as ws:
        async for msg in ws:
            pass

loop = asyncio.get_event_loop()
client = websocket_client(loop)
loop.run_until_complete(client)

Run server + worker and wait a few minutes. Eventually the server process will eat one core. Remove the session.get and it does not happen.

Your environment

I've reproduced this, so far, on:

  • MacOS 10.12 (Sierra) + Python 3.6.1 with aiohttp 2.1.0
  • Ubuntu server 16.04 LTS + Python 3.5.2 with aiohttp 2.1.0

Using strace, all I can see is highly frequent epoll_wait calls returning 0. My attempts with pyrasite and gdb-python3 yielded nothing useful.

I'm still yet to try ipdb and try to look through the source.

The only thing I can find that is even remotely similar is https://groups.google.com/forum/#!topic/libuv/opFk74lww7k .

Otherwise my experience with aiohttp has been fantastic. Thanks for a great library.

bug outdated

Most helpful comment

It seems like a problem related to keep-alive. When I tested the code above, web_protocol._process_keepalive is called infinitely after 75 seconds later (which is default value of keepalive_timeout) from calling session.get.

    def _process_keepalive(self):
        if self._force_close:
            return

        # 1. "self._keepalive_time" is not updated during waiting "async for msg in ws:"
        # therefore the value of "next" is unchanged.
        next = self._keepalive_time + self._keepalive_timeout

        if len(self._request_handlers) == len(self._waiters):  # <-- 2. no waiters during "async for msg in ws:"
            now = self._time_service.loop_time
            if now + 1.0 > next:
                self.force_close(send_last_heartbeat=True)
                return

        # 3. Therefore, the value of "next" is current time, which calls "self._process_keepalive" immediately.
        self._keepalive_handle = self._loop.call_at(
            next, self._process_keepalive)

All 14 comments

Confirmed on Windows too.

Windows 7 Enterprise 64bit + Python 3.5.1 amd64 + aiohttp 2.1.0

Usage is show as 25% as it's running on a quad-core machine:
image

Oddly, usage spikes suddenly?

image

@naggie @frillip could you test master branch

I can confirm it's still an issue on master, commit 6b85bcd82985c132c84acbc2b565997d7b512b45

It seems like a problem related to keep-alive. When I tested the code above, web_protocol._process_keepalive is called infinitely after 75 seconds later (which is default value of keepalive_timeout) from calling session.get.

    def _process_keepalive(self):
        if self._force_close:
            return

        # 1. "self._keepalive_time" is not updated during waiting "async for msg in ws:"
        # therefore the value of "next" is unchanged.
        next = self._keepalive_time + self._keepalive_timeout

        if len(self._request_handlers) == len(self._waiters):  # <-- 2. no waiters during "async for msg in ws:"
            now = self._time_service.loop_time
            if now + 1.0 > next:
                self.force_close(send_last_heartbeat=True)
                return

        # 3. Therefore, the value of "next" is current time, which calls "self._process_keepalive" immediately.
        self._keepalive_handle = self._loop.call_at(
            next, self._process_keepalive)

should be fixed in master

Fantastic! Thank you all for your time!

I have found that this still results in a 100% cpu usage situation after the default timeout of 75 seconds, using it in combination with python-socketio. If needed, I can provide new evidence.

aiohttp version 2.3.1

Same issue happened to me, any hint on how to debug this issue?

@ericdevries @zTrix could you provide reproducible example?

@zTrix do you use python-socketio as well?

@fafhrd91 sure, I will create one tomorrow and attach it to this ticket

@fafhrd91 No I didn't use python-socketio.

In my case I just started an server with websocket, and diving into _process_keepalive function shows that next actually is less than _loop.time(), but len(self._request_handlers) == len(self._waiters) is False, so it's calling itself just like an infinite loop.

Wild idea: disable HTTP keepalives after websocket connection establishment. They are useless, isn't it?

Also looks like RequestHandler._process_keepalive misses self._keepalive_time = self._loop.time() line.

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].

Was this page helpful?
0 / 5 - 0 ratings