Aiohttp: on_cleanup / on_shutdown are called after active tasks on the event loop are canceled

Created on 3 Feb 2019  路  11Comments  路  Source: aio-libs/aiohttp

I am running Python 3.7.2 with aiohttp 3.5.4.

web.run_app() is called. Then I press control-c.

  1. web.py executes _cancel_all_tasks() which sends cancel() to all incomplete tasks on the event loop
  2. web.py then executes "await runner.cleanup()" (eventually) which then executes on_shutdown and on_cleanup handlers

This causes my asyncio tasks (started via on_startup) to get cancelled before on_cleanup can gracefully stop them using a method other than .cancel(). This is new behavior from the previous version of aiohttp I was using.

Handling CancelledError is problematic given that it can be thrown anywhere 'await' and similar are used (I guess I could use asyncio.shield in literally hundreds of places).

pr-merged

Most helpful comment

@asvetlov per your comment:

on_shutdown / on_cleanup are called before tasks canceling.

Using python 3.7.2 and aiohttp version 3.6.2, after locally adding a print debug statement to _cancel_all_tasks() in web.py and BaseRunner.cleanup() in web_runner.py, it looks like _cancel_all_tasks() is always called before BaseRunner.cleanup() once a SIGINT is sent to the app. This suggests on_shutdown / on_cleanup is called after tasks are cancelled. I was wondering if this is expected behavior?

All 11 comments

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are https://github.com/aio-libs/aiohttp/issues/1207 (Handle all concurrent.futures._base.TimeoutError and throw a aiohttp.errors.TimeoutError instead.), https://github.com/aio-libs/aiohttp/issues/2168 (server often throw "concurrent.futures._base.CancelledError"), https://github.com/aio-libs/aiohttp/issues/2374 (cancellederror still exist.), https://github.com/aio-libs/aiohttp/issues/930 (aiohttp.Timeout causes CancelledError), and https://github.com/aio-libs/aiohttp/issues/3549 (Concurrent GET requests lead to ClientConnectorError(8, 'nodename nor servname provided, or not known')).

This is intended change in aiohttp.web.run_app behavior.
Earlier the server left unfinished tasks after the loop closing, which did lead to unpredictable behavior.

Now all unfinished tasks are canceled, run_app works pretty close to standard asyncio.run() function.
You still handle the cancellation error by try/except asyncio.CancelledError or try/finally to graceful shutdown, isn't it?

on_shutdown / on_cleanup are called before tasks canceling.
The cancellation is the last step before the loop stopping and destroying

I did more research in the code (including on the master branch). The description has been updated. The cancellation does not appear to be the last step.

My workaround is as follows. Because asyncio signal handlers must be synchronous, there are some challenges.

I do not propose altering the behavior of on_shutdown and on_cleanup. I would instead like an additional handler that would be called before canceling all tasks on the event loop. "some_stop_function" could be passed via, say a new method on Application such as on_signal().

loop = asyncio.get_event_loop()
for sig in (signal.SIGTERM, signal.SIGINT):
  loop.add_signal_handler(sig, lambda: asyncio.ensure_future(stop_handler1()))

web.run_app( handle_signals=False, ... )
...

async def stop_handler1():
  # This function runs asynchronously. While this code
  # runs, asyncio is running as if SIGTERM/SIGINT were
  # never caught
  await some_stop_function() #  Do shutdown stuff

  # Now send TERM to asyncio and handle it in stop_handler2
  loop = asyncio.get_event_loop()
  loop.remove_signal_handler(signal.SIGINT)
  sig = signal.SIGTERM
  loop.remove_signal_handler(sig)
  loop.add_signal_handler(sig, stop_handler2)
  os.kill(os.getpid(), sig)
...

def stop_handler2():
  # This function must run synchronously
  loop = asyncio.get_event_loop()
  loop.remove_signal_handler(signal.SIGTERM)
  raise web.GracefulExit()

on_shutdown / on_cleanup are called before tasks canceling.
The cancellation is the last step before the loop stopping and destroying

@asvetlov - Not sure if I'm doing something wrong, but I seem to be running into the same problem where long running tasks are cancelled prior to the on_shutdown / on_cleanup signals firing... sample app and results below:

import asyncio
from aiohttp import web

my_long_task = None

async def long_task():
    print("starting long task")
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        print("long_task cancelled")

async def shutdown(app):
    print("on_shutdown")
    print(f"my_long_task.done() = {my_long_task.done()}")

async def cleanup(app):
    print("on_cleanup")

async def startup(app):
    global my_long_task
    loop = asyncio.get_event_loop()
    my_long_task = loop.create_task(long_task())

app = web.Application()

app.on_startup.append(startup)
app.on_shutdown.append(shutdown)
app.on_cleanup.append(cleanup)

web.run_app(app)

Running this app 10 times - 50/50 split - half the time the task is cancelled first, the other half on_shutdown / on_cleanup are executed first:

for i in $(seq 1 10); do echo "************"; pipenv run python3 test_app.py; done
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled

I'm running:

aiohttp 3.5.4
Python 3.7.1

I have written a small code snippet to mimic current (aiohttp 3.5.4) run_app behavior,

import asyncio
import signal


class GracefulExit(SystemExit):
    code = 1


def _raise_graceful_exit():
    print('_raise_graceful_exit')
    raise GracefulExit()


async def _runner_setup():
    loop = asyncio.get_event_loop()

    loop.add_signal_handler(signal.SIGINT, _raise_graceful_exit)
    loop.add_signal_handler(signal.SIGTERM, _raise_graceful_exit)


async def _run_app():
    try:
        await _runner_setup()
        while True:
            print('running....')
            await asyncio.sleep(3600)
    finally:
        print('call runner.cleanup()')


def run_app():
    loop = asyncio.get_event_loop()

    try:
        loop.run_until_complete(_run_app())
    except (GracefulExit, KeyboardInterrupt) as ex:
        print('GracefulExit?', type(ex))
    finally:
        print('Cancel all tasks')


run_app()

The result shows that finally clause in _run_app has not been reached.

pjknkda@anne:~/test/asyncio_signal$ python3.7 main.py
running....
^C_raise_graceful_exit
GracefulExit? <class '__main__.GracefulExit'>
Cancel all tasks

This behavior is somewhat unexpected to me and I think this may cause the described issue.

async def stop_handler1():
# This function runs asynchronously. While this code
# runs, asyncio is running as if SIGTERM/SIGINT were
# never caught
await some_stop_function() # Do shutdown stuff

# Now send TERM to asyncio and handle it in stop_handler2
loop = asyncio.get_event_loop()
loop.remove_signal_handler(signal.SIGINT)
sig = signal.SIGTERM
loop.remove_signal_handler(sig)
loop.add_signal_handler(sig, stop_handler2)
os.kill(os.getpid(), sig)
...

def stop_handler2():
# This function must run synchronously
loop = asyncio.get_event_loop()
loop.remove_signal_handler(signal.SIGTERM)
raise web.GracefulExit()

this worked :)

I had a similar issue where a cleanup hook would depend on some background task running. The task would get cancelled before the cleanup hook ran, leading to some unexpected behavior.

Cancelling the run_app task first makes the cleanup hooks run first, thus hopefully fixing the bug.

@asvetlov per your comment:

on_shutdown / on_cleanup are called before tasks canceling.

Using python 3.7.2 and aiohttp version 3.6.2, after locally adding a print debug statement to _cancel_all_tasks() in web.py and BaseRunner.cleanup() in web_runner.py, it looks like _cancel_all_tasks() is always called before BaseRunner.cleanup() once a SIGINT is sent to the app. This suggests on_shutdown / on_cleanup is called after tasks are cancelled. I was wondering if this is expected behavior?

Just got bitten by this, too. Could aiohttp 3.7 containing the fix for this bug be released, please?

Was this page helpful?
0 / 5 - 0 ratings