Ticked all checks, then first commitment choice
Hi there, first of all many thanks for the work on FastAPI - this is now my go-to framework for building Python-based REST APIs :)
My question is about adding a global timeout to _any potential request_ served by the server. My use case includes occasionally long loading times when I have to load a new model for a given user request. Instead of blocking for 30-50s (which would often time out on the user side due to default connection timeouts), I would like to return a temporary error whenever any endpoint takes longer than a given delay to complete.
Today the only way I found to implement a timeout on every request is to wrap every endpoint method within a context manager like this one:
import signal
from contextlib import contextmanager


@contextmanager
def timeout_after(seconds: int):
    # Register a function to raise a TimeoutError on the signal.
    signal.signal(signal.SIGALRM, raise_timeout)
    # Schedule the signal to be sent after `seconds`.
    signal.alarm(seconds)
    try:
        yield
    finally:
        # Cancel any pending alarm, then unregister the handler so it
        # won't be triggered if the timeout is not reached.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, signal.SIG_IGN)


def raise_timeout(_, frame):
    raise TimeoutError


# Used as such:
@timeout_after(5)
@app.get("/1/version", tags=["Meta"],
         description="Check if the server is alive, returning the version it runs.",
         response_model=Version,
         response_description="the version of the API currently running.")
async def version() -> Version:
    return current_version
This is however quite cumbersome to add on every single function decorated as an endpoint.
Besides, it feels hacky: isn't there a better way to define app-level timeouts broadly, with a common handler, maybe akin to how ValidationErrors can be managed in a single global handler?
I looked into Starlette's timeout support to see if that was handled at a lower level, but to no avail.
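For comparison, here is a minimal sketch of a per-endpoint timeout that does not rely on signals. `SIGALRM` is only available on Unix and its handler can only be installed from the main thread, so for `async def` endpoints `asyncio.wait_for` may be a better fit. The decorator name `with_timeout` and the two example coroutines are hypothetical and only serve as illustration:

```python
import asyncio
import functools


def with_timeout(seconds: float):
    """Hypothetical decorator: cancel the wrapped coroutine after `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            # wait_for cancels the inner coroutine on timeout and raises
            # asyncio.TimeoutError to the caller.
            return await asyncio.wait_for(func(*args, **kwargs), timeout=seconds)
        return wrapper
    return decorator


@with_timeout(0.05)
async def slow_endpoint() -> str:
    # Stand-in for an endpoint that awaits something slow.
    await asyncio.sleep(1)
    return "done"


@with_timeout(0.05)
async def fast_endpoint() -> str:
    return "done"
```

This only interrupts code that actually awaits: a blocking call inside the endpoint will not be cancelled. It also still has to be applied to each endpoint, so it does not remove the boilerplate described above.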
Hi @PLNech
I am developing my own API using FastAPI and ran into the same "problem" as I am trying to add a global timeout to all my requests.
I am still new to FastAPI, but from what I understand the "FastAPI way" to do so would be to use a middleware, since middlewares are designed to run on every request. While searching for how to do so, I found this Gitter community thread and thought it might help you.
I am going to implement both your solution and the middleware-based one and see which one I prefer and which works best. Also note that there seems to be a problem with Starlette 0.13.3 and higher, so keep that in mind.
Also, if you have found a workaround by now, I am more than interested.
Hope this helps a bit.
Hi @ZionStage, thanks for your message! I haven't found a workaround for now. Looking forward to continuing this conversation with you as we move forward on this topic :)
Hey @PLNech
I have implemented and tested the middleware and it seems to be working fine for me. Here is my code:
import asyncio
import random
import time

import pytest
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from httpx import AsyncClient
from starlette.status import HTTP_504_GATEWAY_TIMEOUT

REQUEST_TIMEOUT_ERROR = 1  # Threshold, in seconds

app = FastAPI()  # Fake app


# Creating a test path
@app.get("/test_path")
async def route_for_test(sleep_time: float) -> None:
    await asyncio.sleep(sleep_time)


# Adding a middleware returning a 504 error if the request processing time is above a certain threshold
@app.middleware("http")
async def timeout_middleware(request: Request, call_next):
    # Start the clock before the try block so it is always defined in the handler.
    start_time = time.time()
    try:
        return await asyncio.wait_for(call_next(request), timeout=REQUEST_TIMEOUT_ERROR)
    except asyncio.TimeoutError:
        process_time = time.time() - start_time
        return JSONResponse({'detail': 'Request processing time exceeded limit',
                             'processing_time': process_time},
                            status_code=HTTP_504_GATEWAY_TIMEOUT)


# Testing whether or not the middleware triggers
@pytest.mark.asyncio
async def test_504_error_triggers():
    # Creating an asynchronous client to test our asynchronous function
    async with AsyncClient(app=app, base_url="http://test") as ac:
        response = await ac.get("/test_path?sleep_time=3")
    # Parse the JSON body rather than calling eval() on the raw content
    content = response.json()
    assert response.status_code == HTTP_504_GATEWAY_TIMEOUT
    assert content['processing_time'] < 1.1


# Testing middleware's consistency for requests having a processing time close to the threshold
@pytest.mark.asyncio
async def test_504_error_consistency():
    async with AsyncClient(app=app, base_url="http://test") as ac:
        errors = 0
        sleep_time = REQUEST_TIMEOUT_ERROR * 0.9
        for i in range(100):
            response = await ac.get("/test_path?sleep_time={}".format(sleep_time))
            if response.status_code == HTTP_504_GATEWAY_TIMEOUT:
                errors += 1
        assert errors == 0


# Testing middleware's precision
# ie : Testing if it triggers when it should not and vice versa
@pytest.mark.asyncio
async def test_504_error_precision():
    async with AsyncClient(app=app, base_url="http://test") as ac:
        should_trigger = []
        should_pass = []
        have_triggered = []
        have_passed = []
        for i in range(200):
            sleep_time = 2 * REQUEST_TIMEOUT_ERROR * random.random()
            # Note: the cut-off here is 1.1 s while the middleware times out at
            # REQUEST_TIMEOUT_ERROR (1 s), so sleep times between 1 and 1.1 s
            # are counted as should_pass.
            if sleep_time < 1.1:
                should_pass.append(i)
            else:
                should_trigger.append(i)
            response = await ac.get("/test_path?sleep_time={}".format(sleep_time))
            if response.status_code == HTTP_504_GATEWAY_TIMEOUT:
                have_triggered.append(i)
            else:
                have_passed.append(i)
        assert should_trigger == have_triggered

I created three tests. The first one is designed to see whether or not the middleware actually does its job.
The second one is just there to check if there is any consistency problem with a single request.
The third one is here to check whether I run into the same issue raised in the thread I mentioned.
As far as I am concerned, the first two tests passed without a problem.
However, the third one failed. There are requests that triggered when they should not have:
E AssertionError: assert [3, 7, 10, 11, 12, 14, ...] == [3, 7, 8, 10, 11, 12, ...]
E At index 2 diff: 10 != 8
E Right contains 11 more items, first extra item: 165
This is the issue mentioned in the thread. I'll downgrade to Starlette 0.13.2 and see if the test passes.
I might have made some mistakes or overlooked some things, so if you ever have the chance to run some tests on your end, let me know.
Cheers!
_Note :_
I wrote assert content['processing_time'] < 1.1 and not assert content['processing_time'] < 1 because the time I am monitoring is not exactly the time Python takes to execute the function: I guess it also includes the time to run _asyncio.wait_for_ and to catch the exception. I do not know the convention in this case.
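The same idea can also be written as a plain ASGI middleware, which wraps the application directly instead of going through `@app.middleware("http")`. What follows is only a sketch under assumptions: the class name `TimeoutMiddleware` is hypothetical, the message dictionaries follow the ASGI HTTP spec, and the 504 body mirrors the one used in the code above.

```python
import asyncio
import json


class TimeoutMiddleware:
    """Hypothetical pure-ASGI middleware: answer 504 if the app is too slow."""

    def __init__(self, app, timeout: float):
        self.app = app
        self.timeout = timeout

    async def __call__(self, scope, receive, send):
        # Only HTTP requests are timed; other scopes pass through untouched.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        response_started = False

        async def tracking_send(message):
            # Remember whether the wrapped app already sent its status line.
            nonlocal response_started
            if message["type"] == "http.response.start":
                response_started = True
            await send(message)

        try:
            await asyncio.wait_for(
                self.app(scope, receive, tracking_send), timeout=self.timeout
            )
        except asyncio.TimeoutError:
            if response_started:
                # Too late to change the status code, so re-raise.
                raise
            body = json.dumps(
                {"detail": "Request processing time exceeded limit"}
            ).encode()
            await send({
                "type": "http.response.start",
                "status": 504,
                "headers": [
                    (b"content-type", b"application/json"),
                    (b"content-length", str(len(body)).encode()),
                ],
            })
            await send({"type": "http.response.body", "body": body})
```

With FastAPI this would presumably be registered with `app.add_middleware(TimeoutMiddleware, timeout=REQUEST_TIMEOUT_ERROR)`. I have not checked whether it behaves differently from the decorator-based middleware on the Starlette versions discussed here.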
@PLNech have you tried changing the timeout settings for Gunicorn? By default it times out after 30 seconds, I believe, but you can override the setting.
https://docs.gunicorn.org/en/latest/settings.html#timeout
https://github.com/tiangolo/fastapi/issues/551
@ZionStage: thanks for sharing your implementation, this looks promising! I'll make some room in our backlog to give it a try in our next sprint and will let you know how it goes :)
@thomas-maschler: thanks for the advice. Unfortunately I've tried using Gunicorn's timeout, but it triggers a full restart of the app, disrupting other users of the service (e.g. by unloading their models from memory). What I'm trying to achieve is rather to enforce a timeout on individual requests, without affecting any other work handled by this worker.
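To illustrate the per-request behaviour described here, below is a minimal sketch assuming the slow part is a blocking model load. The load runs in the default thread pool, and only the waiting request gives up after the timeout. `load_model_blocking` and `load_with_timeout` are hypothetical names, and `time.sleep` stands in for the real loading code:

```python
import asyncio
import time
from typing import Optional


def load_model_blocking(delay: float) -> str:
    # Hypothetical stand-in for a slow, blocking model load.
    time.sleep(delay)
    return "model"


async def load_with_timeout(delay: float, timeout: float) -> Optional[str]:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop
    # stays free to serve other requests.
    future = loop.run_in_executor(None, load_model_blocking, delay)
    try:
        # shield() keeps the future alive if the timeout fires, so the load
        # can finish in the background.
        return await asyncio.wait_for(asyncio.shield(future), timeout=timeout)
    except asyncio.TimeoutError:
        # The caller can turn this into a temporary error response.
        return None
```

The thread itself is not interrupted by the timeout: the load keeps running until it finishes, which may be desirable if a later request can reuse the loaded model.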