I'm using these dependencies:
starlette==0.13.6
uvicorn==0.11.8
fastapi==0.61.1
According to every benchmark I've seen, a FastAPI app is supposed to handle more requests per second than a Flask app, but I'm seeing the opposite behavior.
The other params (besides the worker class) are the same for both apps.
What can I check to improve this, or is this the expected behavior? If so, where else could I look to improve the performance of the FastAPI app?
How are you benchmarking them?
It's kind of hard to tell without the code and how you're benchmarking them (run options, etc.)
Also, I did some benchmarking myself that you might want to check out.
I am benchmarking them using locust (https://locust.io/) with 100 concurrent users and a hatch rate of 5.
The application receives an Avro schema and runs some calculations in both cases. The only difference besides the framework itself (FastAPI vs Flask) is that the FastAPI implementation needs to await request.body().
If you need a little bit more detail just let me know. Thanks in advance!
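For anyone trying to reproduce numbers like these without installing locust, the same kind of concurrent-users test can be approximated with a small stdlib load generator. This is just a sketch for sanity-checking throughput figures; the function and parameter names are mine, not part of the original setup:

```python
import threading
import time
import urllib.request

def load_test(url, users, requests_per_user):
    """Fire concurrent GET requests and collect per-request latencies in seconds."""
    latencies = []
    lock = threading.Lock()

    def worker():
        for _ in range(requests_per_user):
            start = time.perf_counter()
            with urllib.request.urlopen(url) as resp:
                resp.read()
            with lock:
                latencies.append(time.perf_counter() - start)

    threads = [threading.Thread(target=worker) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies
```

A real locust test additionally ramps users up gradually (the hatch rate) and reports percentiles, but for a quick A/B comparison of two local apps the shape is the same: N workers each hammering the endpoint in a loop.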
Yes, please add these:

```
uvicorn your:app --workers <int>
waitress-serve --call "your:app"
```
```
from fastapi import FastAPI, Request

app = FastAPI()

@app.get("/etc")
async def etc(request: Request):
    return await request.body()
```

```
from flask import Flask
...
```
etc.
If you don't use uvicorn, it won't give you good performance.
Another option is to run the Flask app under Tornado:

```
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from tornado.wsgi import WSGIContainer

http_server = HTTPServer(WSGIContainer(app))
http_server.listen(80)
IOLoop.instance().start()
```
Here are the results using ab to benchmark it too, and FastAPI is still slower than Flask:
`gunicorn -k uvicorn.workers.UvicornWorker api:production_app()`:

This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 200 requests
Completed 400 requests
Completed 600 requests
Completed 800 requests
Completed 1000 requests
Completed 1200 requests
Completed 1400 requests
Completed 1600 requests
Completed 1800 requests
Completed 2000 requests
Finished 2000 requests
Server Software: uvicorn
Server Hostname: localhost
Server Port: 8080
Document Path: /invocations
Document Length: 283 bytes
Concurrency Level: 10
Time taken for tests: 99.046 seconds
Complete requests: 2000
Failed requests: 0
Total transferred: 808000 bytes
Total body sent: 39252000
HTML transferred: 566000 bytes
Requests per second: 20.19 [#/sec] (mean)
Time per request: 495.232 [ms] (mean)
Time per request: 49.523 [ms] (mean, across all concurrent requests)
Transfer rate: 7.97 [Kbytes/sec] received
387.01 kb/s sent
394.98 kb/s total
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.3 0 6
Processing: 137 491 349.1 397 3087
Waiting: 137 490 349.0 397 3087
Total: 137 491 349.1 398 3087
Percentage of the requests served within a certain time (ms)
50% 398
66% 520
75% 590
80% 639
90% 912
95% 1136
98% 1549
99% 1843
100% 3087 (longest request)
`gunicorn api:production_app()`:

This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 200 requests
Completed 400 requests
Completed 600 requests
Completed 800 requests
Completed 1000 requests
Completed 1200 requests
Completed 1400 requests
Completed 1600 requests
Completed 1800 requests
Completed 2000 requests
Finished 2000 requests
Server Software: gunicorn/20.0.4
Server Hostname: localhost
Server Port: 8080
Document Path: /invocations
Document Length: 283 bytes
Concurrency Level: 10
Time taken for tests: 86.898 seconds
Complete requests: 2000
Failed requests: 0
Total transferred: 862000 bytes
Total body sent: 39252000
HTML transferred: 566000 bytes
Requests per second: 23.02 [#/sec] (mean)
Time per request: 434.491 [ms] (mean)
Time per request: 43.449 [ms] (mean, across all concurrent requests)
Transfer rate: 9.69 [Kbytes/sec] received
441.11 kb/s sent
450.80 kb/s total
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 2
Processing: 285 432 144.4 392 1661
Waiting: 285 431 144.3 392 1659
Total: 286 432 144.4 393 1661
Percentage of the requests served within a certain time (ms)
50% 393
66% 418
75% 439
80% 457
90% 550
95% 778
98% 925
99% 1030
100% 1661 (longest request)
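As a quick sanity check, both ab runs are internally consistent: ab's "Requests per second" figure is simply the concurrency level divided by the mean time per request, so the gap between the two apps really is per-request latency, not a measurement artifact. A minimal check of the reported numbers:

```python
def approx_rps(concurrency, mean_time_per_request_ms):
    # ab's "Requests per second" equals concurrency / mean time per request
    return concurrency / (mean_time_per_request_ms / 1000.0)

print(round(approx_rps(10, 495.232), 2))  # FastAPI run -> 20.19
print(round(approx_rps(10, 434.491), 2))  # Flask run   -> 23.02
```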
@Miguelme I'm not sure how these benchmarks alone are useful for helping you.
I posted the benchmarks and the way I run each app with gunicorn, hoping that would help.
I'll paste the code difference here too. If there's anything else you need, feel free to ask and I'll see if I can provide it.
Flask Code:
```
from flask import Flask, make_response, request

flask_app = Flask(__name__)

@flask_app.route("/invocations", methods=['POST'])
def invocations():
    # Flask has no request.body(); the raw bytes come from request.get_data()
    input = list(serialize_input(request.get_data()))
    output = ...  # runs_some_calculations
    response_data = serialize_output_to_avro(output)
    response = make_response(response_data)
    response.headers['Content-Type'] = 'avro/binary'
    return response
```
FastApi Code:
```
from fastapi import FastAPI, Request, Response

app = FastAPI()

@app.post("/invocations")
async def invocations(request: Request):
    input = list(serialize_input(await request.body()))
    output = ...  # runs_some_calculations
    response_data = serialize_output_to_avro(output)
    response = Response(response_data, headers={'Content-Type': 'avro/binary'})
    return response
```
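One likely culprit: because the endpoint is declared `async def`, the synchronous Avro serialization and calculations run directly on uvicorn's event loop and stall every other in-flight request, whereas each sync Flask/gunicorn worker only blocks itself. A minimal stdlib sketch of the difference (the helper names are mine, standing in for the real serialization/calculation code):

```python
import asyncio

def cpu_work(n):
    # stand-in for the synchronous Avro parsing / calculations
    return sum(i * i for i in range(n))

async def blocking_handler(n):
    # runs the CPU work directly on the event loop: other coroutines must wait
    return cpu_work(n)

async def offloaded_handler(n):
    # hands the CPU work to a worker thread, so the event loop stays responsive
    return await asyncio.to_thread(cpu_work, n)
```

In FastAPI the equivalent moves are either offloading the heavy part with `asyncio.to_thread` (Python 3.9+) / starlette's `run_in_threadpool`, or declaring the endpoint with a plain `def`, in which case FastAPI runs it in a threadpool automatically.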