In the lecture of July 6th, Sebastián was talking about FastAPI and ML models, and he said that it is better to use sync notation instead of async when serving ML models. I didn't understand the reason, could someone explain it?
Thank you!
He recommended using sync rather than async functions when serving ML models because most ML operations are CPU intensive, and the program benefits from being able to do more computations in parallel. This means that in most cases the time will be spent actually doing this work rather than waiting around, rendering the async notation less useful in speeding up the program.
If I understood correctly, I think he suggested running several processes in parallel for CPU-intensive ML models, which would allow the server to generate several predictions at the same time for different requests.
Makes sense, thank you @rkbeatss
You can close the issue if you don't have other questions.
Thanks for the help here @rkbeatss and @phy25 ! :clap: :bow:
Thanks for reporting back and closing the issue @Kludex :+1:
It's mainly because:
Blocking (CPU-bound) code should not be called directly. For example, if a function performs a CPU-intensive calculation for 1 second, all concurrent asyncio Tasks and IO operations would be delayed by 1 second.
Ref: https://docs.python.org/3/library/asyncio-dev.html#running-blocking-code
By using normal `def` functions, FastAPI runs them in a threadpool with `loop.run_in_executor()`.
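A rough sketch of that mechanism with plain asyncio (the function names here are hypothetical stand-ins, not FastAPI internals):

```python
import asyncio
import time

def blocking_inference():
    # Stand-in for a CPU-heavy (blocking) model call (hypothetical).
    time.sleep(0.2)
    return "prediction"

async def main():
    loop = asyncio.get_running_loop()
    # Roughly what FastAPI does for a plain `def` endpoint: the blocking
    # call is handed to the default thread pool via run_in_executor(),
    # so the event loop stays free to serve other requests meanwhile.
    return await loop.run_in_executor(None, blocking_inference)

result = asyncio.run(main())
print(result)  # prediction
```

Had `blocking_inference()` been called directly inside an `async def` endpoint, it would have stalled the event loop for its whole duration, which is exactly the pitfall the asyncio docs warn about.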