Gunicorn: Sync vs async workers

Created on 21 Mar 2017 · 8 Comments · Source: benoitc/gunicorn

This is one for the forum. I have read the docs and have some further questions about the difference between sync and async workers in a setup with Nginx acting as a proxy. The docs state that async workers are the way to go if you are doing long-running operations, but can you explain the difference using an example?

Suppose I have 2 sync workers and that the endpoint the users are calling does a service call itself which takes around 20s.

Are the following suppositions correct:

  • if I have 2 concurrent users, each will have one of the workers to process their request, all other requests will be on hold until those 2 requests are processed, which takes around 20s given the above
  • if the backend service call takes longer than 30s the workers are restarted (default timeout)

And then the million-dollar question: how does this work in a setup with 2 gevent workers? Given the same 2 concurrent users, does Gunicorn handle other requests, or does it also block until work on one of the 2 is finished? And what happens if one of the service calls takes longer than 30s to complete? Does it also restart the worker?


Most helpful comment

I am not a maintainer, but:

Your assumptions are correct. Each sync worker can handle 1 request at a time, so 2 workers support concurrent handling of 2 requests. The rest (if there are more requests coming in) are queued in nginx (for slow clients) and in gunicorn (once nginx is done buffering the request and sends it forward). See the backlog option for the queue size in gunicorn.
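To make the queueing concrete, here is a rough model of it in plain Python. It is not Gunicorn code: the function name `completion_times` and the assumption that all requests arrive at t=0 and take the same time are mine, purely for illustration.

```python
import heapq


def completion_times(n_requests, workers, service_time):
    """Finish time of each request, in arrival order.

    Simplified model: every request arrives at t=0, each worker handles
    one request at a time, and waiting requests sit in a FIFO queue.
    """
    # Min-heap of the times at which each worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)

    finished = []
    for _ in range(n_requests):
        start = heapq.heappop(free_at)   # earliest available worker
        end = start + service_time
        finished.append(end)
        heapq.heappush(free_at, end)     # worker is busy until `end`
    return finished


# 3 requests, 2 sync workers, 20 s per request:
# the third request only finishes at t=40.
print(completion_times(3, 2, 20))
```

With 2 workers and a 20s call, the first two requests finish at 20s and the third at 40s, which is the behaviour described in the question.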

With gevent, worker_connections defines how many concurrent requests are allowed per worker. It defaults to 1000, so with 2 gevent workers you can concurrently serve 2000 requests. The docs are correct: in practice you need async workers if you have slow processing time (due to I/O wait).

And yeah, the default timeout is 30 seconds, so workers are forcefully killed and restarted if processing doesn't complete in that time.

Of course, you probably can't just switch to async workers. You first have to ensure that other components in the system (such as the DB) support the potential 2k concurrent connections. And if you hit the timeout with an async worker, it will kill not just the processing of the problematic request but also all the other requests being handled by the same worker process... although I am not sure what this piece in the timeout docs means:

For the non sync workers it just means that the worker process is still communicating and is not tied to the length of time required to handle a single request
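For reference, the settings mentioned in this comment could be collected in a Gunicorn config file roughly like this. Treat it as a sketch: the setting names are real Gunicorn settings and the values are the defaults quoted above (plus the default backlog of 2048), but the file name `gunicorn.conf.py` and the app path `myapp:app` are just placeholders.

```python
# gunicorn.conf.py (hypothetical file name)
# Start with something like: gunicorn -c gunicorn.conf.py myapp:app
# where myapp:app is a placeholder for your own WSGI application.

workers = 2                  # number of worker processes
worker_class = "gevent"      # use "sync" for the default sync worker
worker_connections = 1000    # max simultaneous clients per async worker (default)
timeout = 30                 # seconds before a silent worker is killed (default)
backlog = 2048               # max pending connections in the listen queue (default)

# Upper bound on concurrent requests with these values:
# workers * worker_connections = 2 * 1000 = 2000
```

The gevent worker needs the gevent package installed. Everything behind the app (DB pool, upstream services) has to cope with that upper bound too.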

All 8 comments

@tuukkamustonen is right. To complete the picture: a sync worker can process one request at a time, but many connections can be waiting, either in the socket queue (backlog), where they are load balanced by the system between the workers, or buffered by a proxy. The threaded worker works identically, except that it spawns a number of threads to process more requests concurrently. The connections are then maintained in an event loop that depends on the platform (kqueue, epoll, select, ...). This design will be improved soon.

That is what I presumed about the gevent worker as well. However, something in our stack isn't playing nice with gevent: the worker times out constantly, and something seems to be blocking execution. Could it have something to do with the way gevent monkey patches everything? We've switched to gthread with 25 threads per worker, which gives us 50 concurrent calls. That is nowhere near the 2000 of gevent, but there are no more problems with worker timeouts and benchmarks show a nice improvement in speed.
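As an illustration of how one blocking call can stall a cooperative worker, here is a small sketch. It uses asyncio from the standard library as a stand-in for gevent (an assumption made so the example has no dependencies); the handler names are made up, and it says nothing about what actually blocked in the stack described above.

```python
import asyncio
import time


async def handler(name, log, blocking):
    log.append(f"{name} start")
    if blocking:
        # A call that never yields: the whole event loop is stuck
        # until it returns, so no other handler can make progress.
        time.sleep(0.05)
    else:
        # A cooperative wait: control goes back to the loop,
        # which can start or resume other handlers meanwhile.
        await asyncio.sleep(0.05)
    log.append(f"{name} end")


def run(blocking):
    """Run two handlers on one event loop and return the event order."""
    async def main():
        log = []
        await asyncio.gather(
            handler("a", log, blocking),
            handler("b", log, blocking),
        )
        return log

    return asyncio.run(main())


print(run(blocking=False))  # both handlers start before either ends
print(run(blocking=True))   # "b" cannot start until "a" has finished
```

In the blocking case the two handlers run strictly one after the other, which from the outside looks just like a sync worker.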

Reading https://github.com/benoitc/gunicorn/issues/1045 I have a few more questions:

  • It is stated there that gthread is not really an async worker. What would the behavior of gthread be when started with, for example, 10 threads, if one of the threads does a long blocking call, say using the requests library? Is work still handled by the 9 other threads, or does everything block until the long call has ended?
  • It is stated here that to handle streaming one of the async workers should be used. Does gthread qualify here?

The threaded worker is not really an "async" worker in two senses:

  1. The threaded worker does not inherit from AsyncWorker.
  2. The threaded worker does not use an event loop built around a select or poll syscall API, so I/O operations are performed synchronously _on their calling thread_. One thread/request can perform a long blocking call and it won't affect the others, but it is still synchronous in terms of control flow in userland.

You could use the threaded worker for streaming. The most important thing about long requests is that the worker itself can still notify the arbiter that it is not dead. The threaded worker can do this.
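A rough way to picture this, using only the standard library: one thread is stuck in a long blocking call while the other threads keep serving and the main thread stays free to signal that it is alive. This is not Gunicorn's gthread implementation; `handle`, `serve`, and the heartbeat counter are hypothetical names, and the counter only stands in for whatever liveness signal the real worker sends to the arbiter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(name, seconds, log):
    # time.sleep stands in for a blocking call such as an HTTP request.
    time.sleep(seconds)
    log.append(name)   # list.append is safe to call from several threads
    return name


def serve(threads=3):
    log = []
    heartbeats = 0
    with ThreadPoolExecutor(max_workers=threads) as pool:
        slow = pool.submit(handle, "slow", 0.3, log)
        fast = [pool.submit(handle, f"fast-{i}", 0.01, log) for i in range(2)]

        # The main thread is not blocked by the slow request, so it can
        # keep reporting liveness while the request is still running.
        while not slow.done():
            heartbeats += 1
            time.sleep(0.05)

        for future in fast:
            future.result()
    return log, heartbeats


log, heartbeats = serve()
print(log)         # the fast requests finish before the slow one
print(heartbeats)  # the main thread kept ticking in the meantime
```

The slow call ties up its own thread only; the cost is that concurrency is capped at the number of threads.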

Closing the ticket as it seems it has been answered.

I ran tests on this question with Django apps, and my conclusion is that only the "eventlet" worker type works as expected, handling requests asynchronously. The test is simple:
I know of a slow request in my app (approx. 20 seconds) and I run gunicorn with two processes.

In sync mode, I make two concurrent requests, and the third request is not handled until 20 seconds later because the workers are busy with the first 2 requests. OK, this is the expected behaviour.

With the "eventlet" mode, I can make 2, 10 or 50 concurrent requests; all of them are handled by gunicorn and all receive their responses after 20 seconds. Great, this is the desired result.

But if I set up "greenlet" workers, the behaviour is the same as with sync workers; I don't see any difference between greenlet and sync workers. Is this normal? I thought the greenlet mode was better and more robust than eventlet.

P.S.: It doesn't matter why this request takes 20 seconds. Maybe it depends on another service, waits on the DB or on file I/O, or executes a simple sleep. This kind of scenario (slow requests) is very common in my apps, and all my tests give me the same results.

@alex-left the gevent worker is known to work very much like the eventlet worker. If you can share a repo or a sample application that reproduces the issue you see, please open a new issue for it.
