I'm just opening up this issue because this is something I've had to implement myself in my own Ratchet Websocket server implementation.
I see that you're using ChannelManager to persist connection state. This is fine for most situations, but I suggest thinking about the flexibility of it early on because I'm sure some users will hit walls with limits on vertical scaling of their WS server.
This is how I did it:
- Use clue/redis-react to have an async Redis client.
- For each connection, subscribe to a Redis channel named connections:$connId or whatever. This is to allow other WS server instances to trigger actions on a connection they don't own themselves, e.g. a connection limit per user, so it triggers an event to kill off the oldest connection by its ID.
- On disconnect, publish to the connections:$connId channel to tell any other instances that this one's disconnecting... then unsubscribe from that channel as well.

Here's a bunch of implementation details on how I did it (no full source because it was for a closed-source project): https://github.com/simonhamp/laravel-echo-ratchet-server/issues/2
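The per-connection channel idea above can be sketched roughly as follows. This is only an illustration in Python, not the PHP code from the closed-source project: `FakePubSub` is a hypothetical in-memory stand-in for Redis pub/sub, and `ServerInstance`, the `"close"` and `"disconnected"` message strings, and all method names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class FakePubSub:
    """Hypothetical in-memory stand-in for Redis pub/sub."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def unsubscribe(self, channel: str) -> None:
        self._subscribers.pop(channel, None)

    def publish(self, channel: str, message: str) -> int:
        # Copy the handler list so handlers may unsubscribe while we iterate.
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)  # number of receivers, like Redis PUBLISH


class ServerInstance:
    """One websocket server process; it only owns its own connections."""

    def __init__(self, name: str, bus: FakePubSub) -> None:
        self.name = name
        self.bus = bus
        self.connections: Set[str] = set()

    def on_open(self, conn_id: str) -> None:
        self.connections.add(conn_id)
        # Listen on connections:<id> so other instances can reach this connection.
        self.bus.subscribe(
            f"connections:{conn_id}",
            lambda message: self._on_control(conn_id, message),
        )

    def on_close(self, conn_id: str) -> None:
        if conn_id not in self.connections:
            return
        self.connections.discard(conn_id)
        # Announce the disconnect, then stop listening on that channel.
        self.bus.publish(f"connections:{conn_id}", "disconnected")
        self.bus.unsubscribe(f"connections:{conn_id}")

    def _on_control(self, conn_id: str, message: str) -> None:
        # Another instance asked us to drop a connection that we own.
        if message == "close":
            self.on_close(conn_id)


bus = FakePubSub()
server_a = ServerInstance("a", bus)
server_b = ServerInstance("b", bus)
server_a.on_open("conn-1")
server_b.on_open("conn-2")

# Server B enforces a per-user limit by closing a connection owned by server A.
bus.publish("connections:conn-1", "close")
```

With a real Redis client the `publish` and `subscribe` calls would be asynchronous, but the ownership rule stays the same: only the instance holding the socket acts on the message.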
Obviously this would lose some compatibility with the Pusher server-side stuff due to using Redis as the broadcast driver, but all the client-side Pusher stuff should still work as-is, I think. You just need to load balance your websocket server instances (your docs mention HAProxy or Nginx to proxy; Caddy works just fine as well).
Edit: I just realized you don't necessarily need to switch to Redis for broadcasting. TriggerEventController could just publish to Redis itself. Pusher compatibility would be kept that way. You could just send those events to any of the echo server instances and they can propagate it out.
AFAIK the server I'm using now (slanger) has most Pusher functions implemented on top of Redis, making it horizontally scalable (running 3 nodes currently). Maybe this also helps with finding ideas and/or implementations to apply here.
While working on this, @freekmurze and I thought about being able to store the channel information on Redis to allow horizontal scaling, but we didn't implement it right away.
I also think that we could still use Pusher as the broadcaster - but would only use it internally for the WebSocket server to manage the connections.
Maybe we can have different ChannelManager implementations. A MemoryChannelManager or ArrayChannelManager that simply stores the channel information in memory (what we have now) and a RedisChannelManager with the proposed changes.
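A rough sketch of what swappable channel managers could look like, written in Python for illustration rather than in the package's PHP. The method names (`subscribe`, `unsubscribe`, `connections`), the `make_channel_manager` factory and the driver strings are all hypothetical and are not taken from the package.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, Set


class ChannelManager(ABC):
    """Hypothetical contract that every storage backend would satisfy."""

    @abstractmethod
    def subscribe(self, channel: str, conn_id: str) -> None: ...

    @abstractmethod
    def unsubscribe(self, channel: str, conn_id: str) -> None: ...

    @abstractmethod
    def connections(self, channel: str) -> Set[str]: ...


class ArrayChannelManager(ChannelManager):
    """Keeps channel membership in process memory (single server only)."""

    def __init__(self) -> None:
        self._channels: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, channel: str, conn_id: str) -> None:
        self._channels[channel].add(conn_id)

    def unsubscribe(self, channel: str, conn_id: str) -> None:
        self._channels[channel].discard(conn_id)

    def connections(self, channel: str) -> Set[str]:
        # Return a copy so callers cannot mutate internal state by accident.
        return set(self._channels.get(channel, set()))


def make_channel_manager(driver: str) -> ChannelManager:
    """Pick an implementation from a config value (assumed driver names)."""
    if driver == "array":
        return ArrayChannelManager()
    if driver == "redis":
        # A Redis-backed manager would implement the same contract.
        raise NotImplementedError("RedisChannelManager is not sketched here")
    raise ValueError(f"unknown channel manager driver: {driver}")


manager = make_channel_manager("array")
manager.subscribe("chat", "conn-1")
```

The point of the shared contract is that the rest of the server only talks to `ChannelManager`, so the backend can be switched through configuration.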
Since we're running inside of a Laravel app, we could re-use the existing redis configuration values.
This is definitely a bigger task and will require a bit of time :)
@mpociot I would be interested in this myself. Too bad my time is very limited, as I would love to help build this part, after all the hard work you guys already did! Congrats!
Let me share with you my use case and our live numbers. We use Pusher for live chat in large rooms. When I say large, I mean easily 2k to 15k people. Now, a single server CAN indeed sustain the connections for 15k people, no problem. However, the real problem is the messages between those people. We had months where we sent 500,000,000 messages. Imagine this case: there are 10k people in the room, and the moderator asks: where are you guys from? And now you have a few thousand people replying: US, UK, etc. That's what one server could not handle (CPU-wise). And that was with Pusher's default message size, but now imagine those messages are over 10KB each!
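To put rough numbers on that burst, here is a small back-of-the-envelope calculation in Python. The figure of 3,000 replies is an assumption standing in for "a few thousand", 10 KB is taken as 10,000 bytes, and every reply is assumed to be delivered to every member of the room.

```python
from typing import Tuple


def fan_out(room_size: int, replies: int, message_bytes: int) -> Tuple[int, int]:
    """Return (deliveries, total_bytes) when each reply goes to the whole room."""
    deliveries = replies * room_size
    total_bytes = deliveries * message_bytes
    return deliveries, total_bytes


# Assumed inputs: 10k people in the room, 3,000 replies, 10 KB per message.
deliveries, total_bytes = fan_out(room_size=10_000, replies=3_000, message_bytes=10_000)
print(f"{deliveries:,} deliveries, {total_bytes / 1e9:.0f} GB pushed to clients")
```

Under those assumptions a single question produces 30,000,000 outgoing messages and about 300 GB of outbound traffic, which is why the fan-out, and not the connection count, is the bottleneck.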
The slanger implementation had many issues; we tried that too. We had issues with presence channels. In a cluster of 8 servers, refreshing a few tabs a few times would result in duplicate subscription_id's. That was a mess :)
One way people tackled this issue in the past was with a pub/sub server model for nginx via nchan. I did not look in depth at how the users are currently stored in memory, but imagine there are 15k users: getting them, looping over them, and sending each one a message might not be the fastest and most performant way. Which is why pub/sub worked great. I understand this might be totally out of scope for this project though. I'm thinking that a RedisChannelManager (assuming one is running that in a cluster) would probably be enough for 99% of people using this package, while giving them the option to scale horizontally.
Anyhow, good job on this package, and I hope this feature comes sooner rather than later. It will make this package soooo powerful. Again, great great job from you two! Keep it up!
Wanted to leave a quick update and get some feedback on how to best proceed.
I’ve swapped out the ChannelManager for a generic CacheChannelManager. This works with any Laravel cache driver. It might end up being slightly more efficient to use the Redis driver, but I thought the added flexibility of using any cache driver was worth the trade-off to start out with.
I’ve run into a problem though. Essentially, the channel manager is responsible for storing the Channels. But the channels are never updated within the ChannelManager, specifically the subscribed connections and the users. As you can see, they are also arrays:
https://github.com/beyondcode/laravel-websockets/blob/master/src/WebSockets/Channels/Channel.php#L16
By implementing a cache driver, we are never updating these values.
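One way to picture the problem, sketched in Python rather than the package's PHP: a cache hands back a copy of what was stored, so changing the arrays on that copy never reaches the cache unless it is written back. `FakeCache` and the `Channel` fields here are hypothetical stand-ins, and `copy.deepcopy` simulates the serialization a real cache store performs.

```python
import copy
from typing import Any, Dict, Optional


class FakeCache:
    """Hypothetical cache store; copies values in and out like a serializer."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._store[key] = copy.deepcopy(value)

    def get(self, key: str) -> Optional[Any]:
        return copy.deepcopy(self._store.get(key))


class Channel:
    def __init__(self, name: str) -> None:
        self.name = name
        self.subscribed_connections: Dict[str, str] = {}


cache = FakeCache()
cache.put("channel:chat", Channel("chat"))

# Subscribing mutates only the local copy returned by the cache.
channel = cache.get("channel:chat")
channel.subscribed_connections["conn-1"] = "socket handle"

stale = cache.get("channel:chat")
print(len(stale.subscribed_connections))  # the cached value was not updated

# The change becomes visible only after an explicit write-back.
cache.put("channel:chat", channel)
fresh = cache.get("channel:chat")
print(len(fresh.subscribed_connections))
```

A write-back after every change would fix the staleness in this toy case, but live socket handles generally cannot be serialized into a cache at all, which is part of why the subscribed connections are awkward to store this way.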
We have a couple options for proceeding, but wanted to get @freekmurze and @mpociot’s feedback.
Unrelated, but if anyone could point me in the direction of how to best allow the configuration of a cache driver for a third-party package, that would be great. For now I’m using the default cache connection, which is going to end up being less than ideal. I’m thinking the easiest way is for them to just pass in which cache driver they want to use (if any), and pull it from their Laravel config?
I’d like to have this up and running on Heroku this weekend.
@snellingio You're doing god's work. I was planning to get to this at some point but I'm not going to be able to dive in until after the new year. I look forward to checking out your Heroku implementation.
This will be of no use unless there is some messaging between server instances; a pull/poll method will not work with the realtime-ness of websockets.
Every instance of the websockets server needs to be notified of messages coming in through the API (since only 1 instance will receive the API call) and broadcast them to the clients connected to its own process. Changes in presence channels also need to be broadcast, so that all instances update their internal state and push those changes out to their clients.
So in theory a RedisChannelManager will hurt performance right now without allowing for horizontal scalability, unless the ChannelManager receives events from (for instance) a Redis pub/sub subscription that allows the server to broadcast those events (messages broadcast in a channel or a new user joining a presence channel). So allowing the channel manager to be switched out is a good start, but more needs to be added so that the server instances are constantly updated about events on the other instances.
Above is just to clarify the issue for people that might not see the "problems" with running multiple websocket servers that all need to have the same state.
This is of course exactly what @snellingio also remarks:
"By implementing a cache driver, we are never updating these values."
IMHO the channel manager can remain (/go back to as per 1.) an in-memory interface, but something needs to be built around some kind of inter-process/server messaging like Redis Pub/Sub (React PHP implementation) to synchronize the in-memory changes to every running instance, and that way internally broadcast all events so each node can broadcast them to its connected clients.
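The idea of keeping state in memory per node and relaying events between nodes can be sketched like this. It is a Python illustration only, not the package's PHP: `Bus` is a hypothetical in-memory stand-in for Redis Pub/Sub, and `Node`, `api_trigger` and the event dictionary shape are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


class Bus:
    """Hypothetical stand-in for Redis Pub/Sub shared by all nodes."""

    def __init__(self) -> None:
        self._nodes: List["Node"] = []

    def attach(self, node: "Node") -> None:
        self._nodes.append(node)

    def publish(self, event: Dict[str, str]) -> None:
        # Every node, including the publisher, receives every event.
        for node in list(self._nodes):
            node.on_event(event)


class Node:
    """One server instance with purely in-memory channel state."""

    def __init__(self, name: str, bus: Bus) -> None:
        self.name = name
        self.bus = bus
        # channel name -> {connection id -> messages delivered to that client}
        self.local_clients: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
        bus.attach(self)

    def subscribe(self, channel: str, conn_id: str) -> None:
        self.local_clients[channel][conn_id] = []

    def api_trigger(self, channel: str, payload: str) -> None:
        # Only this node got the API call, so it relays the event to all nodes.
        self.bus.publish({"channel": channel, "payload": payload})

    def on_event(self, event: Dict[str, str]) -> None:
        # Each node delivers only to the clients connected to itself.
        for inbox in self.local_clients.get(event["channel"], {}).values():
            inbox.append(event["payload"])


bus = Bus()
node_a = Node("a", bus)
node_b = Node("b", bus)
node_a.subscribe("chat", "conn-1")
node_b.subscribe("chat", "conn-2")

# The API request lands on node B, yet the client on node A also gets it.
node_b.api_trigger("chat", "hello")
```

Presence changes (a user joining or leaving) could travel over the same bus as another event type, so that each node updates its own in-memory view.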
@stayallive I believe I understand what you are saying... each server be responsible for broadcasting to its own connected clients. I was thinking about this incorrectly, but it makes sense.
Thanks @stayallive, yes that's correct, and is why I suggested doing it with Redis pub/sub above in the OP. This isn't something that will work generally with all cache drivers, because they don't all notify subscribers of changes.
So I have a proof of concept up and working around duplicating messages.
It's pretty messy and rough, but it looks like I should have a PR close to the end of the week. I'll push my branch tomorrow most likely, and will start thinking about how we can write some tests around it.
Big big big thank you to @francislavoie, as I would've never figured out how to include an async redis client within the running webserver without his first comment.
I am 99% sure my current implementation will break statistics, so I might need some direction on where to take that. Maybe as a first release we can omit that as a requirement.
Awesome! Glad that code went to good use finally :)
@snellingio do you want to open a WIP pull request so that we can look at it together?
Just opened a PR. Sorry for the delay!
We'll continue this conversation here: #61
Thanks for all your work on this 👍