Flask-SocketIO: scaling up for high availability with Flask-SocketIO, gevent-websocket and gunicorn

Created on 10 Aug 2019 · 28 Comments · Source: miguelgrinberg/Flask-SocketIO

Hello, @miguelgrinberg

I'm following your Flask-SocketIO guide.
I'm working with a combination of Flask-SocketIO, gevent-websocket and gunicorn.
Thanks to your guide and help, it works quite well when gunicorn has 1 worker.

I also need high availability support.
What is the best architecture for running multiple gunicorn servers, each with a single worker, on a single machine?

I already have an AWS ALB, so I can set up several routes from the ALB to each server, port by port.
But I cannot run many gunicorn servers from one app while giving each of them its own port number.

I don't know the proper way to run multiple gunicorn servers, even though I'm trying to use nginx as a proxy in front of them.

cf.

After reading this, I prefer gunicorn to uwsgi.
You said it is slow for Flask-SocketIO to save its state in a database like redis. https://github.com/miguelgrinberg/Flask-SocketIO/issues/694
I hope that is still better than using slow and heavy uwsgi.
I'm using redis for the socketio message queue.

Thank you, as always.

question

BRIDGE-AI

Most helpful comment

This would seem to indicate that, from one ip address, all requests would be hitting the same worker process.

Correct.

This is a significant variation from my past gunicorn experience, where a single client would benefit from being able to use multiple worker processes (for example loading two ajax requests at once)

If a client sends multiple requests at the same time under an ip_hash configuration, then all of those requests will run concurrently on the same worker. But nginx is pretty powerful, you can configure the socket.io endpoint under ip_hash, and any other endpoints with the normal routing. This is really up to you.

I think it should also be mentioned in the documentation that to use this strategy we need to switch the load balancer like this.

The need for sticky sessions is mentioned in the documentation already. There is also a starter nginx configuration example.

It might also be helpful to have the multiprocess config for supervisor linked, for people who are used to using one gunicorn process now having to switch to many.

I don't really want my documentation to become an nginx manual. Configuring nginx is extremely complex and different people have different needs, I'm not sure I want to go beyond the simple example that people can start from.

miguelgrinberg on 17 Aug 2019

👍4

All 28 comments

The ALB documentation states that you can register the same EC2 instance multiple times with different ports (link).

But I cannot run many gunicorn servers at an app giving individual port numbers to them.

Why is this a problem? You can run each gunicorn individually on its own port.

miguelgrinberg on 11 Aug 2019

@miguelgrinberg

Yes, you're correct about ALB.
I'm going to do that.

And I could run each gunicorn individually on its own port.
But I couldn't run many gunicorns using only one deploy package.
Deployment will be painful if I have to manage and deploy many packages for individual gunicorn apps.
I'm trying to find an architecture that runs each gunicorn individually on its own port using only one deployment (one app).

Do you have any idea?

BRIDGE-AI on 11 Aug 2019

I don't understand. What do you mean by "one deploy package"? What is that?

miguelgrinberg on 11 Aug 2019

I mean a single source code package.

BRIDGE-AI on 11 Aug 2019

I found that gunicorn kills another running gunicorn that has the same command line.
So I have to run each gunicorn with a different command line, like below.

$ ps aux | grep api
ec2-user 29540  8.5  1.3 498100 105812 pts/0   Sl   07:58   0:01 python ./api dev-local-8011
ec2-user 29584  0.0  1.2 500440 97968 pts/0    Sl   07:58   0:00 python ./api dev-local-8011
ec2-user 29754 36.0  1.3 498104 105816 pts/0   Sl   07:58   0:01 python ./api dev-local-8010
ec2-user 29797  0.0  1.2 500444 97888 pts/0    Sl   07:58   0:00 python ./api dev-local-8010

This may make managing and deploying the package painful for me.

BRIDGE-AI on 11 Aug 2019

I wanted to add that I was also a little confused on this, maybe I will start a new thread. I read the documentation for deployment but I could not understand exactly how multiple workers were supposed to work. In my regular flask app I would make use of a number of workers with gunicorn for just basic http requests. It was useful for performance of the non-websocket part of the app. Now looking to add websocket support, I am confused about how this should be accomplished. Should we run many gunicorn processes at many hardcoded ports, each with one worker? Should then we run another gunicorn process with multiple workers, just for the non-websocket parts of the app? How to avoid errors when doing this (using multiple gunicorn workers)? Basically it seems that it is missing in the documentation how to move from a standard multi worker flask app (which many of us likely started with) to a socketio-enabled app, without losing prior performance, if that makes any sense. It might be very simple, I was just not able to understand it from the socketio deployment docs.

sona1111 on 16 Aug 2019

👀3

Should we run many gunicorn processes at many hardcoded ports, each with one worker?

Yes. Unfortunately this is the only way because gunicorn has very primitive load balancing. You then need to use nginx or other load balancer that supports sticky sessions to distribute the load across these single-instance workers.

Should then we run another gunicorn process with multiple workers, just for the non-websocket parts of the app?

No. The configuration that you are using for Socket.IO also works for regular Flask requests. You don't need to do anything else.

miguelgrinberg on 16 Aug 2019
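
The "many gunicorn processes, one worker each, one port each" layout described above can be driven from a single source package. The following is only a minimal sketch: the app path `api:app`, the port list and the loopback host are assumptions made for illustration, not values taken from this thread. It builds one gunicorn command line per port, using the gevent-websocket worker class, and prints them so they can be handed to a process manager.

```python
import shlex

# Assumed values for illustration; none of these come from the thread.
APP = "api:app"                # hypothetical "module:variable" path of the Flask app
PORTS = [8010, 8011, 8012]     # one port per single-worker gunicorn process
WORKER_CLASS = "geventwebsocket.gunicorn.workers.GeventWebSocketWorker"


def gunicorn_command(port, app=APP, host="127.0.0.1"):
    """Build the argv for one gunicorn process that runs exactly one worker."""
    return [
        "gunicorn",
        "--worker-class", WORKER_CLASS,
        "--workers", "1",            # Socket.IO needs one worker per process
        "--bind", f"{host}:{port}",  # each process listens on its own port
        app,
    ]


def all_commands(ports=PORTS):
    """Return one command per port, all pointing at the same application."""
    return [gunicorn_command(port) for port in ports]


if __name__ == "__main__":
    # Print the commands; a supervisor-style process manager would run them.
    for command in all_commands():
        print(shlex.join(command))
```

Because every process starts from the same code and differs only in its `--bind` value, there is still one package to deploy. The load balancer in front (nginx or ALB) then needs one upstream entry per port, with sticky sessions enabled.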

Hi Miguel, appreciate the reply.

No. The configuration that you are using for Socket.IO also works for regular Flask requests. You don't need to do anything else.

So by this, it seems that the load balancer is effectively changed from gunicorn to nginx? This is slightly confusing to me especially because of the recommended "ip_hash" nginx function mentioned in the documentation. This would seem to indicate that, from one ip address, all requests would be hitting the same worker process. This is a significant variation from my past gunicorn experience, where a single client would benefit from being able to use multiple worker processes (for example loading two ajax requests at once)

I don't have enough experience with how nginx will load balance the upstream socketio_nodes section, it might be as good as gunicorn, but I think it should also be mentioned in the documentation that to use this strategy we need to switch the load balancer like this. It might also be helpful to have the multiprocess config for supervisor linked, for people who are used to using one gunicorn process now having to switch to many. I will see if I can write a short example soon if you think that would be helpful.

sona1111 on 16 Aug 2019

This would seem to indicate that, from one ip address, all requests would be hitting the same worker process.

Correct.

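This is a significant variation from my past gunicorn experience, where a single client would benefit from being able to use multiple worker processes (for example loading two ajax requests at once)

If a client sends multiple requests at the same time under an ip_hash configuration, then all of those requests will run concurrently on the same worker. But nginx is pretty powerful, you can configure the socket.io endpoint under ip_hash, and any other endpoints with the normal routing. This is really up to you.

I think it should also be mentioned in the documentation that to use this strategy we need to switch the load balancer like this.

The need for sticky sessions is mentioned in the documentation already. There is also a starter nginx configuration example.

It might also be helpful to have the multiprocess config for supervisor linked, for people who are used to using one gunicorn process now having to switch to many.

I don't really want my documentation to become an nginx manual. Configuring nginx is extremely complex and different people have different needs, I'm not sure I want to go beyond the simple example that people can start from.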

miguelgrinberg on 17 Aug 2019

👍4
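
To make the sticky-session behaviour discussed above concrete, here is a toy illustration of address-based routing. It is not nginx's actual algorithm: the backend list and the CRC32 hash are assumptions chosen for the sketch. What it does borrow from the nginx documentation for `ip_hash` is the choice of key, which is the first three octets of an IPv4 address or the whole IPv6 address.

```python
import ipaddress
import zlib

# Hypothetical upstream list: one single-worker gunicorn process per port.
BACKENDS = ["127.0.0.1:8010", "127.0.0.1:8011", "127.0.0.1:8012"]


def sticky_backend(client_ip, backends=BACKENDS):
    """Deterministically map a client address to one backend."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        key = addr.packed[:3]   # first three octets, so a whole /24 sticks together
    else:
        key = addr.packed       # the entire IPv6 address
    # CRC32 is only a stand-in hash for this sketch.
    return backends[zlib.crc32(key) % len(backends)]


if __name__ == "__main__":
    for ip in ["203.0.113.7", "203.0.113.200", "198.51.100.4"]:
        print(ip, "->", sticky_backend(ip))
```

Every request from the same address lands on the same process, which is what the Socket.IO handshake needs. It is also why two simultaneous ajax requests from one client run concurrently inside one worker instead of being spread across several.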

@miguelgrinberg

I have a question about private messaging (not broadcasting).
I heard that I can implement private messages using the 'rooms' mechanism.

And I've seen the example below.

https://stackoverflow.com/a/39431811/4736859

Doesn't this cause problems in a scaled-up environment?
These kinds of examples use variables like 'clients', which doesn't look good to me for a highly available architecture.

And if someone wants to send a message to another user, should I save the username and sid in a database?
I'd like to ask what the most efficient and best implementation is for private messages between the system (server) and a user (client), or from user to user relayed through the system.

Thank you, as always.

BRIDGE-AI on 20 Aug 2019

@BRIDGE-AI if you want to send a private event to a client, you can use the client room. This is a room that is automatically allocated to each client when it joins. The room name is the sid assigned to that client. So you will do room=sid when you want to target a specific user.

In the connect event when you do authentication you can determine a mapping between sid values and user_ids in your application. Storing this mapping is something that you need to implement in your application. You can store the sid values in your database, in redis, etc.

miguelgrinberg on 20 Aug 2019

👍2
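
The sid-to-user mapping described above could be kept in a Redis hash so that every gunicorn process sees the same data. This is a hedged sketch, not part of Flask-SocketIO: `SidRegistry`, the key names and `InMemoryHashStore` are all hypothetical. The registry only relies on `hset`, `hget` and `hdel`, which a redis-py client also provides; the in-memory class is a stand-in so the example runs without a Redis server, and it would not be shared between processes.

```python
class InMemoryHashStore:
    """Stand-in exposing the hset/hget/hdel subset of a Redis client."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        self._hashes.setdefault(name, {})[key] = value

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def hdel(self, name, *keys):
        for key in keys:
            self._hashes.get(name, {}).pop(key, None)


class SidRegistry:
    """Two-way mapping between application user ids and Socket.IO sids."""

    USER_TO_SID = "socketio:user_to_sid"   # hypothetical key names
    SID_TO_USER = "socketio:sid_to_user"

    def __init__(self, store):
        self.store = store

    def remember(self, user_id, sid):
        """Record the mapping; intended to be called from the connect handler."""
        self.store.hset(self.USER_TO_SID, user_id, sid)
        self.store.hset(self.SID_TO_USER, sid, user_id)

    def forget(self, sid):
        """Drop the mapping; intended to be called from the disconnect handler."""
        user_id = self.store.hget(self.SID_TO_USER, sid)
        if user_id is None:
            return
        # Keep the forward mapping if the user has already reconnected
        # under a newer sid.
        if self.store.hget(self.USER_TO_SID, user_id) == sid:
            self.store.hdel(self.USER_TO_SID, user_id)
        self.store.hdel(self.SID_TO_USER, sid)

    def sid_for(self, user_id):
        """Return the sid to use as the room name, or None if not connected."""
        return self.store.hget(self.USER_TO_SID, user_id)
```

With a real redis-py client, creating it with `decode_responses=True` makes `hget` return strings instead of bytes, so the equality check in `forget` behaves the same way. This sketch tracks only one sid per user; a user with several open tabs would need a set of sids instead.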

@miguelgrinberg

you are always the best.

BRIDGE-AI on 20 Aug 2019

@miguelgrinberg

It seems that I cannot access request.sid outside of a socketio event handler.
Is there any other option?
And what is the difference between inside the handler and outside of it?

BRIDGE-AI on 20 Aug 2019

I don't understand. Outside of an event handler what do you expect request.sid to be? If there is no client, request has no value, you can't use it.

miguelgrinberg on 20 Aug 2019

Sorry, I do have a client.
I've laid out the error and the code below.

  File "/usr/lib64/python2.7/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./api", line 1380, in socketio_ping
    obj = {'data': 42, 'count': n, 'private':private, 'sid': flask.request.sid}
  File "/usr/lib/python2.7/site-packages/werkzeug/local.py", line 348, in __getattr__
    return getattr(self._get_current_object(), name)
AttributeError: 'Request' object has no attribute 'sid'

1374 # ping-pong test
1375 @app.route('/ping/<int:n>')
1376 def socketio_ping(n = 0):
1377     data = get_data()
1378     private = data.get("private", False)
1379
1380     obj = {'data': 42, 'count': n, 'private':private, 'sid': flask.request.sid}
1381     app.socketio.emit('ping-event', obj, room=flask.request.sid)
1382
1383     return 'okay sid:' + flask.request.sid, 200
1384
1385 @app.socketio.on('pong-event')
1386 def socketio_pong(data):
1387     data["count"] = data.get("count", 0) + 1
1388     data["sid"] = flask.request.sid
1389     app.socketio.emit('pong-answered', data, room=flask.request.sid)
1390     app.socketio.send(data)
1391 # ping-pong test end

I have no problem in socketio_pong().
But line 1380 raises an error.

BRIDGE-AI on 20 Aug 2019

request.sid is only available when the client is a Socket.IO client. You are trying to get a sid in an HTTP connection, which knows nothing about the Socket.IO side.

miguelgrinberg on 20 Aug 2019

Okay, I understand.
Then how can I start a private ping-pong sequence from the server by calling the '/ping' API?
Should I save the sid to redis in socketio.on('connect') and then use that saved sid value as the room?

BRIDGE-AI on 20 Aug 2019

Sure, that would work.

miguelgrinberg on 20 Aug 2019

👍1
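
The approach agreed on above, looking up a previously stored sid inside the HTTP route instead of reading `flask.request.sid`, might look roughly like the sketch below. The helper name `send_private_ping` and its parameters are hypothetical. `emit` is expected to behave like the `app.socketio.emit` call in the code above, and `lookup_sid` is any callable that returns the stored sid for a user (for example a Redis lookup) or `None`. How the route identifies the user is also an assumption left to the application.

```python
def send_private_ping(emit, lookup_sid, user_id, count, private=False):
    """Emit 'ping-event' to one client's personal room.

    Returns True if the event was sent, False if no sid is stored for the user.
    """
    sid = lookup_sid(user_id)
    if sid is None:
        # The user has no registered Socket.IO connection, so there is
        # no room to target.
        return False
    payload = {"data": 42, "count": count, "private": private, "sid": sid}
    # The room name is the client's own sid, as described earlier in the thread.
    emit("ping-event", payload, room=sid)
    return True


if __name__ == "__main__":
    # Stand-ins so the sketch runs without Flask-SocketIO or Redis.
    stored_sids = {"alice": "abc123"}

    def print_emit(event, data, room=None):
        print("emit", event, data, "room =", room)

    send_private_ping(print_emit, stored_sids.get, "alice", 1)
```

Inside the `/ping/<int:n>` route this would replace the lines that read `flask.request.sid`, and the route could return an error status when the helper returns `False`.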

Hello, @miguelgrinberg

Sorry for asking more questions at the end of this thread.
I'd like to ask your opinion about HA for websockets.

Do I really need a web server like gunicorn if I already have a load balancer with a sticky session feature in front of it?

I use AWS ALB in front of each EC2 instance, without any nginx setup.
I think AWS ALB is a good load balancer that supports sticky sessions.
And, as discussed in this thread, I cannot use multiple workers in one gunicorn web server, so I have to run multiple gunicorns, each on its own port number, on a single EC2 machine.

So,
I think I can run multiple flask apps, each on its own port number, without gunicorn at all.
What do you think about this?
I'll wait for your opinion.

P.S.
I don't want to add anything to this architecture, and I want to keep the setup as simple as possible.
So I want to depend on AWS ALB without nginx, and it's working well now.

BRIDGE-AI on 7 Jan 2020

@BRIDGE-AI You can run any production web server. Gunicorn is one option, but the WSGI servers provided by eventlet and gevent are also good options that work as reliably as Gunicorn.

miguelgrinberg on 7 Jan 2020

Sorry, @miguelgrinberg

Can I run only the flask app, without any web server?

BRIDGE-AI on 7 Jan 2020

@BRIDGE-AI I don't understand. How are you going to listen for connections from clients without a web server?

miguelgrinberg on 7 Jan 2020

@miguelgrinberg

Isn't just $ flask run enough?
I can run multiple flask apps on individual ports.

BRIDGE-AI on 7 Jan 2020

flask run is actually using a web server, the Flask development web server. This is not a production web server, I recommend against using it on a production site. Also consider that WebSocket isn't supported on this web server, so your Socket.IO app will not perform as well as with the others.

miguelgrinberg on 7 Jan 2020

👍1

okay, I understand.
Thank you!

BRIDGE-AI on 7 Jan 2020

I confirm this is a good HA infrastructure: ALB + (NGINX) + Gunicorn + Eventlet + Flask.
=> Gunicorn with 1 worker

I would really avoid the Flask dev server: no real WebSocket (just HTTP) and poor performance.
Thanks Miguel for this information.

UPDATE: NGINX is not needed, as @BRIDGE-AI confirmed that ALB handles sticky sessions.

sebastienmascha on 2 Jun 2020

@sebastienmascha I don't understand why NGINX is required. Can't we use ALB + Gunicorn + Flask? What extra purpose does nginx serve?

mohit-sentieo on 2 Jul 2020

@mohit-sentieo

In my experience, NGINX is not essential.
You can use ALB, which has sticky sessions.
I think they mentioned NGINX because it is the typical, well-known way to get sticky sessions.
I've used ALB and its sticky sessions very well without NGINX at all.

BRIDGE-AI on 6 Jul 2020

👍1
