I've implemented Web Socket reconnection and it works fine for a while (i. e. if I disable wi-fi for 5 minutes or hibernate my laptop) but if I leave my mac book for an hour or so (it goes into sleeping mode I guess), SignalR doesn't recover: it manages to connect but then gets a handshake error and then this cycle repeats again and again.
I reproduced this behaviour on Chrome on Mac and my colleague reproduced it on Chrome on Windows though he was not able to reproduce it on Edge. So it might be a browser-specific problem
Here's a console output
[2019-01-06T11:29:15.846Z] Information: WebSocket connected to ws://api.gateio-demo.cryptolp.net/Liquidity?id=pylLwZAF0045nJxY-MJcTg.
HubConnection.js:552 Uncaught Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
Utils.js:358 [2019-01-06T11:29:31.696Z] Error: Connection disconnected with error 'Error: Error parsing handshake response: Error: Expected a handshake response from the server.'.
t.log @ Utils.js:358
t.stopConnection @ HttpConnection.js:714
transport.onclose @ HttpConnection.js:435
t.close @ WebSocketTransport.js:288
t.stop @ WebSocketTransport.js:274
(anonymous) @ HttpConnection.js:274
(anonymous) @ HttpConnection.js:125
(anonymous) @ HttpConnection.js:55
a @ HttpConnection.js:7
Promise.then (async)
l @ HttpConnection.js:24
(anonymous) @ HttpConnection.js:27
j @ HttpConnection.js:4
t.stop @ HttpConnection.js:234
t.processHandshakeResponse @ HubConnection.js:555
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
HubConnection.js:552 Uncaught (in promise) Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
Could you explain this:
I've implemented Web Socket reconnection
Also, are there any server logs when you try to reconnect?
Currently not. Should I enable server logging for SignalR?
That would be helpful.
The logs show that there is a 15 second gap between the websocket being connected and the handshake timing out. It would be nice to see if the server ever got the handshake request. Could you also capture the HAR trace?
Is there any documentation on how to activate log on .net core?
@BrennanConroy, me and @nielheo are having trouble enabling the log you asked for. Could you please hint us?
Take a look at https://github.com/aspnet/SignalR/wiki/Diagnostics-Guide
server-side.txt
client-side.txt
After my PC idle for long time, it will try to re-establish signalr connection but got handshake error on client side. latest occurrence at [2019-01-19T01:17:18.430Z]
Few things:
@nielheo could you provide the cleaner logs with disabled Influx? @BrennanConroy here's the reconnection code we're using:
class LiquidityContainer extends Container {
constructor(props) {
super(props)
this.state = {
...
}
this.initiateConnection();
}
openConnection = () => {
let that = this
let interval = setInterval(async () => {
await startConnection()
}, 2000);
async function startConnection () {
if (connection.state === 0) {
try {
await connection.start()
await that.setState({ isConnected: true })
clearInterval(interval)
} catch {
}
}
}
}
initiateConnection = () => {
connection = new HubConnectionBuilder()
.withUrl(config.baseHubURL + 'Liquidity')
.build();
connection.on("UpdateOrdersState", (message) => {
console.log('WebSocket update');
});
connection.onclose(async () => {
this.setState({ isConnected: false, subscribed: '' })
setTimeout(this.openConnection, 1000)
})
this.openConnection()
}
Can you explain the wall of "starting connection" messages you guys have in your logs?
Each one of those means you're starting a new connection and you have 100s of them over the course of a few seconds. I would guess that if you fix that issue you wont see any more handshake response errors.
Utils.js:371 [2019-01-19T01:17:18.663Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.664Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.670Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.671Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.713Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.714Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.717Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.719Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.766Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.766Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.771Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.771Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.940Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.942Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.956Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.959Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.008Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.010Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.041Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.041Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.060Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.061Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.749Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.750Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.751Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.751Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.753Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.753Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.755Z] Debug: Starting HubConnection.
I suppose there's something wrong in our reconnecting logic that I posted above. After looking through the code the only problem I've noticed is that we set await that.setState({ isConnected: true }) right in startConnection instead of doing it in onConnected event but it's our internal variable, so it should not be a problem. Otherwise we're trying to reconnect every 2 seconds, so I don't see how we can trigger so many connection attempts within one second...
Is it possible the LiquidityContainer is being created many times?
Also, your reconnect logic tries to start a connection every 2 seconds even if the previous start is in progress. Which now that I think about it might be causing the issue.
Try something like https://docs.microsoft.com/en-us/aspnet/core/signalr/javascript-client?view=aspnetcore-2.2#reconnect-clients where you do the next connection attempt in the catch statement.
If HubConnection.start is called multiple times in a row (while the original connect is still in progress) it can reset the handshakeResolver and handshakeRejecter values which would cause the previous calls to never finish. We should prevent this from happening.
@BrennanConroy We've switched to a pattern in the docs (reconnect in catch), it got way better but sometimes this sequence of events happens:
WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
LiquidityContainer.js:29 WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
HubConnection.js:552 Uncaught Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
Utils.js:358 [2019-01-31T20:17:21.370Z] Error: Connection disconnected with error 'Error: Error parsing handshake response: Error: Expected a handshake response from the server.'.
t.log @ Utils.js:358
t.stopConnection @ HttpConnection.js:714
transport.onclose @ HttpConnection.js:435
t.close @ WebSocketTransport.js:288
t.stop @ WebSocketTransport.js:274
(anonymous) @ HttpConnection.js:274
(anonymous) @ HttpConnection.js:125
(anonymous) @ HttpConnection.js:55
a @ HttpConnection.js:7
Promise.then (async)
l @ HttpConnection.js:24
(anonymous) @ HttpConnection.js:27
j @ HttpConnection.js:4
t.stop @ HttpConnection.js:234
t.processHandshakeResponse @ HubConnection.js:555
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
LiquidityContainer.js:29 WebSocket connecting
HubConnection.js:552 Uncaught (in promise) Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
Utils.js:366 [2019-01-31T20:17:22.196Z] Information: WebSocket connected to ws://api.karma-stage.cryptolp.net/Liquidity?id=eYN50svoDbzpXBxFMS8hDA.
LiquidityContainer.js:32 WebSocket connected
LiquidityContainer.js:29 WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
LiquidityContainer.js:29 WebSocket connecting
It looks strange: it already connected, then somehow starts to connect again and somehow it causes a reconnect
Do you mind sharing your new LiquidityContainer?
Sure
class LiquidityContainer extends Container {
constructor(props) {
this.initiateConnection();
}
openConnection = async () => {
try {
console.log('WebSocket connecting');
await connection.start();
await this.setState({ isConnected: true });
console.log('WebSocket connected');
} catch(e) {
console.log('WebSocket failed to connect:\n' + e.message);
setTimeout(async () => await this.openConnection(), 5000);
}
};
initiateConnection = async () => {
connection = new HubConnectionBuilder()
.withUrl(config.baseHubURL + 'Liquidity')
.build();
connection.on("UpdateOrdersState", (message) => {
console.log('WebSocket update');
this.updateOrderState(message);
});
connection.onclose(async () => {
await this.setState({ isConnected: false, subscribed: '' });
await this.openConnection();
});
await this.openConnection();
};
Looks like there is a bug in how we recommend doing the reconnect. The onclose callback is being called and at the same time connection.start() throws, so you get 2 reconnect attempts. The workaround for now would be to set a flag and check it in the catch and the onclose callback.
Interesting, I somehow assumed onclose is not called if we did not manage to connect. Wouldn't keeping reconnect only in onclose be enough?
No, depending on when the failure is internally the onclose may not be called but start will throw. We'll look at fixing this in a future release.
Ok, I understood. I will try to play with the field and get back to you.
Overall though it's confusing how SignalR after like 10 years of development still not only doesn't have built-in reconnection logic but has bugs in pretty basic connectivity features. Just imagine how much time is being wasted by .net developers facing this kind of basic problems all over the planet. I'm not trying to bash SignalR, I'm just sincerely confused how it is even possible, it's not like I'm doing something extraordinary with SignalR
A field did not seem to help. I've made a stress test now: connected, disabled wi-fi, waited till disconnection and enabled wi-fi and SignalR can't connect, being in this loop:
WebSocket connecting
VM1167:1 OPTIONS http://localhost:53938/Liquidity/negotiate net::ERR_CONNECTION_REFUSED
Utils.js:347 Warning: Error from HTTP request. 0:
index.js:1452 Error: Failed to complete negotiation with the server: Error
index.js:1452 Error: Failed to start the connection: Error
LiquidityContainer.js:34 WebSocket failed to connect:
update:
It was exactly like what @BrennanConroy said. After using flag, everything works smoothly. Thanks!
--
I have the same thing here as well. I think we have the same issue like @BrennanConroy 's comment https://github.com/aspnet/AspNetCore/issues/6434#issuecomment-459897933 here.
And I do think @siberianguy got a pretty good point here. Obviously everyone wants to have a way to make signalR reconnect automatically. I feel like this should be handle in the library as well.
Btw, it is also annoying that SignalR is always disconnect for some reasons after a hour or two with an error like "Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006". So it is kind of necessary to make reconnect feature work as expected.
We are working on reconnect, the issue is being tracked at https://github.com/aspnet/AspNetCore/issues/5282
We're covering this as part of automatic reconnect https://github.com/aspnet/AspNetCore/issues/8797 . But we'll try to take a look at this scenario before 3.0, since the manual reconnect process we document should also be able to handle this and we may need to fix something.
Fixed by #8566
Has this fix been released in any stable build?
No, not yet. This will be first become available in 3.0.0-preview4 and will be stable with the 3.0 release (expected by the end of this year).