I suspect either:
(I was trying to post this question to the Gunicorn mailing list, but I could not find clear instructions for how to do that directly. So hopefully creating a GitHub issue is the correct route.)
It's a good place for it :) You can already find some information here: http://docs.gunicorn.org/en/stable/design.html
Anyway, it depends. With Gunicorn behind NGINX, even the sync worker can serve many connections. Gunicorn handles N requests concurrently (where N is the number of workers), while many more can wait in the NGINX buffer. In most cases this is enough to run a very large website with more than 10K simultaneous connections.
For other usages you have the async workers. NGINX, or any proxy able to buffer connections, will continue to help, but these workers allow Gunicorn itself to handle more concurrent connections or to keep some connections open for a long time.
Hope that answers your question.
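To make the sync vs. async choice concrete, here is a minimal sketch of a Gunicorn config file. The bind address and the specific numbers are assumptions for illustration, not recommendations from this thread; the setting names (`bind`, `workers`, `worker_class`, `worker_connections`) are standard Gunicorn settings.

```python
# gunicorn.conf.py -- illustrative values only, tune for your own workload
import multiprocessing

# Assumed local address that NGINX would proxy to.
bind = "127.0.0.1:8000"

# Common starting point: (2 x cores) + 1 worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# Sync workers: each worker handles one request at a time,
# so at most `workers` requests are in flight inside Gunicorn.
worker_class = "sync"

# For long-lived or many concurrent connections, an async worker could be
# used instead (this assumes the gevent package is installed):
# worker_class = "gevent"
# worker_connections = 1000  # max simultaneous clients per worker
```

This would typically be loaded with something like `gunicorn -c gunicorn.conf.py myapp:app`, where `myapp:app` is a placeholder for your own WSGI application.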
To answer about Gunicorn only:
The backlog setting specifies how deep the OS listen backlog is. If the backlog is full, the request will be rejected by the OS and the gateway will get an upstream connection failure. If the backlog is not full, the connection will be opened by the OS, but will not be handled by Gunicorn until a Gunicorn worker is ready to accept the request. The gateway may return an upstream timeout if Gunicorn does not handle the request quickly.
If you find that you have a lot of upstream timeouts and you're running with a reverse proxy in front of Gunicorn, you may want to lower the backlog so that connections beyond what the workers can absorb are refused quickly and your proxy can fail over more efficiently.
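As a rough sketch of that suggestion, the backlog can be lowered in a Gunicorn config file. The value 64 is an arbitrary assumption for illustration; Gunicorn's documented default is 2048, and the operating system may cap the effective value (on Linux, via `net.core.somaxconn`).

```python
# gunicorn.conf.py -- illustrative values only

# Assumed local address that the reverse proxy forwards to.
bind = "127.0.0.1:8000"

# Depth of the OS listen queue for pending connections.
# A smaller queue means excess connections are refused sooner instead of
# waiting for a free worker, so the proxy can try another upstream.
backlog = 64
```

The same setting is available on the command line as `--backlog 64`.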
Thanks folks. These answers help.