I built a network service with FastAPI and started it with uvicorn, using this command: `uvicorn main:app --host 0.0.0.0 --port 8000`. After startup, I sent two concurrent requests to two different endpoints. The first request took 10 seconds, but the second request waited for the first time-consuming request to complete before it started executing. Why is this?
There must be something blocking your event loop. If you defined that endpoint as a coroutine (`async def`), make it a normal function instead; FastAPI will then process it with `run_in_executor(executor, func, *args)`.
Also, use multiple workers. It will increase scalability:

```
uvicorn main:app --workers 2
```
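Each worker is a separate process with its own event loop, so one blocked worker doesn't stall the others. Combined with the host and port options from the question (assuming the app lives in `main.py`), the full command would presumably be:

```
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 2
```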
What @ycd said is perfectly right, but I'm going to explain it in more detail.
Assume we have the following application:
```python
from fastapi import FastAPI
from time import sleep

app = FastAPI()

@app.get('/oma')
async def illblockyou():
    sleep(10)

@app.get('/omg')
async def imblocked():
    sleep(10)
```
We have two endpoints, `/omg` and `/oma`, and both of them make a synchronous call to `sleep` that takes 10 seconds.
Using `async def` means you're creating coroutines, and those run directly in the event loop. Plain functions, on the other hand, run in an executor. To be more precise, Starlette (FastAPI's core dependency) calls `run_in_executor`, which makes use of a `ThreadPoolExecutor`.
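To make that concrete, here's a rough sketch of the idea. This is illustrative only: the real logic lives in Starlette's `run_in_threadpool` helper, and `call_sync_endpoint` is a made-up name:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor()

async def call_sync_endpoint(func, *args):
    # Roughly what happens to a plain `def` endpoint: the blocking call
    # is handed off to a worker thread, so the event loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, func, *args)
```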
That may be confusing, but let's see it in practice. I have a small script to make it easy:
```python
import asyncio
import aiohttp
import ssl
from datetime import datetime

urls = ["http://localhost:8000/oma", "http://localhost:8000/omg"]
# urls = ["http://localhost:8000/omg", "http://localhost:8000/oma"]

async def fetch(session, url):
    begin = datetime.now()
    print(f"Started at: {begin}!")
    async with session.get(url, ssl=ssl.SSLContext()) as response:
        res = await response.json()
    end = datetime.now()
    print(f"URL {url} took: {end - begin} seconds!")
    return res

async def fetch_all(urls, loop):
    async with aiohttp.ClientSession(loop=loop) as session:
        await asyncio.gather(*[fetch(session, url) for url in urls], return_exceptions=True)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(fetch_all(urls, loop))
```
In the script above, I'm getting the event loop and running the `fetch_all` coroutine. The trick here is `asyncio.gather`, which runs the `fetch` coroutines concurrently.
Running the script, you get:

```
Started at: 2020-12-07 21:10:23.475472!
Started at: 2020-12-07 21:10:23.476403!
URL http://localhost:8000/oma took: 0:00:10.014925 seconds!
URL http://localhost:8000/omg took: 0:00:20.021950 seconds!
```
Which is exactly the behavior we're talking about: both requests started at the same moment, but the server only began processing the second one after the first had finished, so it took 20 seconds in total.
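You can reproduce this effect without FastAPI at all. Here's a minimal standalone sketch of my own (not from the original answer) showing `time.sleep` freezing the event loop:

```python
import asyncio
import time

async def blocker(name):
    time.sleep(2)  # blocking call: nothing else on the loop can run meanwhile
    print(f"{name} done")

async def main():
    # Both tasks are scheduled concurrently, but each time.sleep blocks
    # the loop, so the total is ~4 s instead of ~2 s.
    await asyncio.gather(blocker("a"), blocker("b"))

asyncio.run(main())
```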
Now let's change two lines of code in our application:
```python
from fastapi import FastAPI
from time import sleep

app = FastAPI()

@app.get('/oma')
def illblockyou():
    sleep(10)

@app.get('/omg')
def imblocked():
    sleep(10)
```
I've removed the `async` keywords! That means those endpoints are now plain functions!
Let's run the script again:
```
Started at: 2020-12-07 21:13:55.977826!
Started at: 2020-12-07 21:13:55.978717!
URL http://localhost:8000/omg took: 0:00:10.011898 seconds!
URL http://localhost:8000/oma took: 0:00:10.013075 seconds!
```
See what happened here? This time both requests took about 10 seconds, because those functions ran via the executor, meaning each one ran in a separate thread.
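As an aside, if you want to keep an endpoint as a coroutine but it has to call blocking code, you can offload that call to the executor yourself. A minimal sketch, assuming a hypothetical `/offload` route (this pattern is mine, not part of the original answer):

```python
import asyncio
from fastapi import FastAPI
from time import sleep

app = FastAPI()

@app.get('/offload')  # hypothetical route for illustration
async def offload():
    # Hand the blocking sleep to the default thread pool,
    # so the event loop stays free for other requests.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, sleep, 10)
```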
Now let's make another change, but instead of working with plain functions, we'll keep the coroutines.
```python
import asyncio
from fastapi import FastAPI

app = FastAPI()

@app.get('/oma')
async def illblockyou():
    await asyncio.sleep(10)

@app.get('/omg')
async def imblocked():
    await asyncio.sleep(10)
```
Look carefully! Now we're using `asyncio.sleep`, which _always suspends the current task, allowing other tasks to run_.
Let's run the script one more time:
```
Started at: 2020-12-07 21:16:53.530794!
Started at: 2020-12-07 21:16:53.531618!
URL http://localhost:8000/omg took: 0:00:10.004118 seconds!
URL http://localhost:8000/oma took: 0:00:10.005099 seconds!
```
Oh! Cool! We get the same results as with plain functions, right? Well... not exactly. One of the major benefits of coroutines is that they don't use as much memory as threads do. And if you compare the outputs of the two "fastest" solutions, you'll notice a time difference of more than 0.005 seconds in favor of the coroutines.
I've left the alternative `urls` line commented out on purpose, in case you want to swap the order and confirm my words. :sunglasses:
Thank you so much. It's really helpful.
Thanks so so much. I've got the point. It's very kind of you!
@xshone You're most welcome! :tada:
If there's nothing else we can do to help you, do you mind closing the issue? :sunglasses: :+1: