FastAPI: Slower behavior using Gunicorn with uvicorn.workers.UvicornWorker than with Gunicorn's default worker (same Gunicorn config)

Created on 10 Oct 2020 · 8 comments · Source: tiangolo/fastapi

I'm using these dependencies:

`starlette==0.13.6`, `uvicorn==0.11.8`, `fastapi==0.61.1`

According to every benchmark I've seen, FastAPI is supposed to handle more requests per second than a Flask app, but I'm finding different behavior.

  • Using Flask + Gunicorn, I'm getting twice the number of req/s compared to FastAPI + Gunicorn (uvicorn.workers.UvicornWorker)

The other params (besides the worker class) are the same for both apps.

What can I check to improve this, or is this the expected behavior? If so, where else could I look to improve the performance of FastAPI?

question

Most helpful comment

It's kind of hard to tell without the code and how you're benchmarking them (running options, etc.)

Also, I did some benchmarking; you might want to check it out.

All 8 comments

How are you benchmarking them?

It's kind of hard to tell without the code and how you're benchmarking them (running options, etc.)

Also, I did some benchmarking; you might want to check it out.

I am benchmarking them using locust (https://locust.io/) with 100 concurrent users and a hatch rate of 5.

The application receives an Avro schema and runs some calculations in both cases. The only difference besides the actual API (FastAPI vs Flask) is that the FastAPI implementation needs to read request.body() asynchronously.

If you need a little bit more detail just let me know. Thanks in advance!
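For reference, the kind of load test described above can be sketched with only the standard library (a hypothetical stand-in for the locust setup; the URL, payload, and request counts below are made up, not the thread's actual configuration):

```python
# Minimal ab/locust-style load generator using only the stdlib.
# The URL, payload, and numbers are placeholders, not the thread's setup.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def hammer(url: str, requests: int = 100, concurrency: int = 10,
           payload: bytes = b"\x00" * 64) -> float:
    """POST `payload` to `url` `requests` times with `concurrency` workers;
    return the measured requests per second."""
    def one(_):
        req = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "avro/binary"})
        with urllib.request.urlopen(req, timeout=10) as resp:
            resp.read()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map raises if any single request failed
        list(pool.map(one, range(requests)))
    return requests / (time.perf_counter() - start)
```

Against a running server this would be used as, e.g., `print(hammer("http://localhost:8080/invocations"))`.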

Yes, please add these:

How do you run FastAPI, and with what options?

  • Example: `uvicorn your:app --workers <int>`

How do you run Flask, and with what options?

  • Example: `waitress-serve --call "your:app"`

The code that you use for testing FastAPI:

    from fastapi import FastAPI, Request

    app = FastAPI()

    @app.get("/etc")
    async def etc(request: Request):
        return await request.body()

The code that you use for testing Flask

from flask import Flask

...

etc.

If you don't use uvicorn, you won't get good performance.
Another option may be running the Flask app under Tornado:

    from tornado.httpserver import HTTPServer
    from tornado.ioloop import IOLoop
    from tornado.wsgi import WSGIContainer

    http_server = HTTPServer(WSGIContainer(app))
    http_server.listen(80)
    IOLoop.instance().start()

Here are the results benchmarking with ab as well; FastAPI is still slower than Flask:

  • Uvicorn + FastAPI, executed with this command: `gunicorn -k uvicorn.workers.UvicornWorker api:production_app()`

    This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking localhost (be patient)
    Completed 200 requests
    Completed 400 requests
    Completed 600 requests
    Completed 800 requests
    Completed 1000 requests
    Completed 1200 requests
    Completed 1400 requests
    Completed 1600 requests
    Completed 1800 requests
    Completed 2000 requests
    Finished 2000 requests

    Server Software:        uvicorn
    Server Hostname:        localhost
    Server Port:            8080

    Document Path:          /invocations
    Document Length:        283 bytes

    Concurrency Level:      10
    Time taken for tests:   99.046 seconds
    Complete requests:      2000
    Failed requests:        0
    Total transferred:      808000 bytes
    Total body sent:        39252000
    HTML transferred:       566000 bytes
    Requests per second:    20.19 [#/sec] (mean)
    Time per request:       495.232 [ms] (mean)
    Time per request:       49.523 [ms] (mean, across all concurrent requests)
    Transfer rate:          7.97 [Kbytes/sec] received
                            387.01 kb/s sent
                            394.98 kb/s total

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.3      0       6
    Processing:   137  491 349.1    397    3087
    Waiting:      137  490 349.0    397    3087
    Total:        137  491 349.1    398    3087

    Percentage of the requests served within a certain time (ms)
      50%    398
      66%    520
      75%    590
      80%    639
      90%    912
      95%   1136
      98%   1549
      99%   1843
     100%   3087 (longest request)

  • Gunicorn + Flask, executed with this command: `gunicorn api:production_app()`

    This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Licensed to The Apache Software Foundation, http://www.apache.org/

    Benchmarking localhost (be patient)
    Completed 200 requests
    Completed 400 requests
    Completed 600 requests
    Completed 800 requests
    Completed 1000 requests
    Completed 1200 requests
    Completed 1400 requests
    Completed 1600 requests
    Completed 1800 requests
    Completed 2000 requests
    Finished 2000 requests

    Server Software:        gunicorn/20.0.4
    Server Hostname:        localhost
    Server Port:            8080

    Document Path:          /invocations
    Document Length:        283 bytes

    Concurrency Level:      10
    Time taken for tests:   86.898 seconds
    Complete requests:      2000
    Failed requests:        0
    Total transferred:      862000 bytes
    Total body sent:        39252000
    HTML transferred:       566000 bytes
    Requests per second:    23.02 [#/sec] (mean)
    Time per request:       434.491 [ms] (mean)
    Time per request:       43.449 [ms] (mean, across all concurrent requests)
    Transfer rate:          9.69 [Kbytes/sec] received
                            441.11 kb/s sent
                            450.80 kb/s total

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0    0   0.1      0       2
    Processing:   285  432 144.4    392    1661
    Waiting:      285  431 144.3    392    1659
    Total:        286  432 144.4    393    1661

    Percentage of the requests served within a certain time (ms)
      50%    393
      66%    418
      75%    439
      80%    457
      90%    550
      95%    778
      98%    925
      99%   1030
     100%   1661 (longest request)

@Miguelme I don't know how these benchmarks are useful for us to help you.

I posted the benchmarks and the way I execute each of the apps with gunicorn, hoping that would help.

I'll also paste the code difference here. If there's anything else you need in order to help, feel free to ask and I'll see if I can provide it.

Flask code:

    from flask import Flask, make_response, request

    flask_app = Flask(__name__)

    @flask_app.route("/invocations", methods=['POST'])
    def invocations():
        # Flask's request has no .body(); get_data() reads the raw body
        input = list(serialize_input(request.get_data()))
        output = ...  # runs_some_calculations
        response_data = serialize_output_to_avro(output)
        response = make_response(response_data)
        response.headers['Content-Type'] = 'avro/binary'
        return response

FastAPI code:

```
from fastapi import FastAPI, Request, Response

app = FastAPI()

@app.post("/invocations")
async def invocations(request: Request):
    input = list(serialize_input(await request.body()))
    output = ...  # runs_some_calculations
    response_data = serialize_output_to_avro(output)
    response = Response(response_data, headers={'Content-Type': 'avro/binary'})
    return response
```
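One thing worth noting about the FastAPI snippet above: the handler is declared `async def` but does CPU-bound work, which runs directly on the event loop and blocks other in-flight requests; a plain `def` handler is run in a threadpool instead. A stdlib-only sketch of that difference (hypothetical demo code, not the thread's application; `cpu_work` stands in for the Avro calculations):

```python
# Hypothetical sketch: CPU-bound work inside an "async def"-style coroutine
# stalls the event loop, while pushing it to a thread (like a plain "def"
# FastAPI endpoint) leaves the loop free to serve other tasks.
import asyncio


def cpu_work(n: int = 1_000_000) -> int:
    # Stand-in for the Avro deserialization + calculations in the thread
    return sum(i * i for i in range(n))


async def blocking_handler() -> int:
    return cpu_work()  # runs on the event loop and blocks it


async def threadpool_handler() -> int:
    return await asyncio.to_thread(cpu_work)  # loop stays responsive


async def ticks_during(handler, requests: int = 5) -> int:
    """Count how often a 10 ms heartbeat task gets to run while
    `requests` concurrent handler calls execute."""
    ticks = 0
    done = asyncio.Event()

    async def heartbeat():
        nonlocal ticks
        while not done.is_set():
            ticks += 1
            await asyncio.sleep(0.01)

    hb = asyncio.create_task(heartbeat())
    await asyncio.gather(*(handler() for _ in range(requests)))
    done.set()
    await hb
    return ticks


blocking_ticks = asyncio.run(ticks_during(blocking_handler))
threadpool_ticks = asyncio.run(ticks_during(threadpool_handler))
print(f"heartbeats while blocking: {blocking_ticks}, "
      f"while threadpooled: {threadpool_ticks}")
```

The heartbeat fires far more often in the threadpool variant, which is the same reason concurrent requests queue up behind a CPU-heavy `async def` endpoint.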
