If you're interested please comment here and come join our "Contributors" community channel on our daily build server, where you can discuss questions with community members and the Mattermost core team. For technical advice or questions, please join our "Developers" community channel.
New contributors please see our Developer's Guide, specifically for machine setup and for developer workflow.
Notes: Jira ticket
Using the current Go WebSocket client, there's currently no built-in way to check if the connection is still alive.
Add a ping/pong and automatic reconnection handler.
https://github.com/mattermost/mattermost-server/blob/master/model/websocket_client.go
The server makes pings here: https://github.com/mattermost/mattermost-server/blob/master/app/web_conn.go#L224
As discussed with @jwilander on the Contributors Chat I'll start working on this issue.
Currently, the websocket client has no test suite at all, and implementing a test suite for the websocket client in general feels out of scope of this issue. I'd imagine for the current initial plan only the implementation itself should be done and another issue opened for the client testing? Alternative is to postpone it until client tests are implemented.
Regarding the implementation, the way forward for me would be to add a ping handler to wsc.Conn here https://github.com/mattermost/mattermost-server/blob/master/model/websocket_client.go#L101 using SetPingHandler that will automatically reexecute the ConnectWithDialer method in case a ping message was not received in the last 65s (60s -> ping time as configured right now + 5 buffer). The issue there is that in the current stage we don't save a customer dialer if passed, so we don't have all req. information for the reconnect. I'd propose we save the dialer in ConnectWithDialer to fix that.
What do you think @jwilander, any concerns or sounds good?
If it's easy, it'd be nice to add testing just for the parts you're adding. If it's not easy, I think it's fine and we can create another ticket for testing
Implementation proposal sounds good to me 馃憤
I guess that will be really tricky, since it integrates quite tight into the already existing code base and only adds another ticket which calls a function. I guess we'd need to create a websocket mock server to simulate connection loss / certain events to test the client in optimal fashion, but I guess it'd be the best to move that to a dedicated ticket.
Perfect, I'll open a PR then once it's ready. Thank you.
Once the PR will be merged this can be closed, the programmer has to implement the reconnect himself but can now subscribe to a signal to react on timeouts himself, which might be necessary anyway depending on the library usage.
Most helpful comment
As discussed with @jwilander on the Contributors Chat I'll start working on this issue.