Aspnetcore: Error parsing handshake response during reconnect

Created on 7 Jan 2019  路  27Comments  路  Source: dotnet/aspnetcore

I've implemented Web Socket reconnection and it works fine for a while (i. e. if I disable wi-fi for 5 minutes or hibernate my laptop) but if I leave my mac book for an hour or so (it goes into sleeping mode I guess), SignalR doesn't recover: it manages to connect but then gets a handshake error and then this cycle repeats again and again.

I reproduced this behaviour on Chrome on Mac and my colleague reproduced it on Chrome on Windows though he was not able to reproduce it on Edge. So it might be a browser-specific problem

Here's a console output

[2019-01-06T11:29:15.846Z] Information: WebSocket connected to ws://api.gateio-demo.cryptolp.net/Liquidity?id=pylLwZAF0045nJxY-MJcTg.
HubConnection.js:552 Uncaught Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238

Utils.js:358 [2019-01-06T11:29:31.696Z] Error: Connection disconnected with error 'Error: Error parsing handshake response: Error: Expected a handshake response from the server.'.
t.log @ Utils.js:358
t.stopConnection @ HttpConnection.js:714
transport.onclose @ HttpConnection.js:435
t.close @ WebSocketTransport.js:288
t.stop @ WebSocketTransport.js:274
(anonymous) @ HttpConnection.js:274
(anonymous) @ HttpConnection.js:125
(anonymous) @ HttpConnection.js:55
a @ HttpConnection.js:7
Promise.then (async)
l @ HttpConnection.js:24
(anonymous) @ HttpConnection.js:27
j @ HttpConnection.js:4
t.stop @ HttpConnection.js:234
t.processHandshakeResponse @ HubConnection.js:555
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238

HubConnection.js:552 Uncaught (in promise) Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
Done accepted area-signalr bug signalr-client-javascript

All 27 comments

Could you explain this:

I've implemented Web Socket reconnection

Also, are there any server logs when you try to reconnect?

Currently not. Should I enable server logging for SignalR?

That would be helpful.

The logs show that there is a 15 second gap between the websocket being connected and the handshake timing out. It would be nice to see if the server ever got the handshake request. Could you also capture the HAR trace?

Is there any documentation on how to activate log on .net core?

@BrennanConroy, me and @nielheo are having trouble enabling the log you asked for. Could you please hint us?

  • server-side : NuGet v 1.1.0
  • client-side: NPM "@aspnet/signalr": "^1.1.0", "react": "^16.7.0",
  • IIS Express
  • OS: Windows 10 (client & Server)
  • Browser: Chrome / Edge

server-side.txt
client-side.txt

After my PC idle for long time, it will try to re-establish signalr connection but got handshake error on client side. latest occurrence at [2019-01-19T01:17:18.430Z]

Few things:

  • Why are you spamming the server with hundreds of connections?
  • There are a ton of unrelated issues in your server side logs which make it very difficult to parse what's going on, could you remove all the InfluxDB stuff from the server so there is a clean environment to see the issue in? Or maybe this could be causing the issue.
  • Could you show the code you're using for reconnect?

@nielheo could you provide the cleaner logs with disabled Influx? @BrennanConroy here's the reconnection code we're using:

class LiquidityContainer extends Container {
  constructor(props) {
    super(props)
    this.state = {
     ...
    }

    this.initiateConnection();
  }

  openConnection = () => {
    let that = this

    let interval = setInterval(async () => {
      await startConnection()
    }, 2000); 

    async function startConnection () {
      if (connection.state === 0) {
        try {
          await connection.start()
          await that.setState({ isConnected: true }) 
          clearInterval(interval)
        } catch {

        }
      }
    }
  }

  initiateConnection = () => {
    connection = new HubConnectionBuilder()
      .withUrl(config.baseHubURL + 'Liquidity')
      .build();

    connection.on("UpdateOrdersState", (message) => {
      console.log('WebSocket update');
    });

    connection.onclose(async () => { 
      this.setState({ isConnected: false, subscribed: '' }) 
      setTimeout(this.openConnection, 1000)
    })

    this.openConnection()
  } 

Can you explain the wall of "starting connection" messages you guys have in your logs?
Each one of those means you're starting a new connection and you have 100s of them over the course of a few seconds. I would guess that if you fix that issue you wont see any more handshake response errors.

Utils.js:371 [2019-01-19T01:17:18.663Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.664Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.670Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.671Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.713Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.714Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.717Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.719Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.766Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.766Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.771Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.771Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.940Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.942Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:18.956Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:18.959Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.008Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.010Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.041Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.041Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.060Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.061Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.749Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.750Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.751Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.751Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.753Z] Debug: Starting HubConnection.
Utils.js:371 [2019-01-19T01:17:19.753Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:371 [2019-01-19T01:17:19.755Z] Debug: Starting HubConnection.

I suppose there's something wrong in our reconnecting logic that I posted above. After looking through the code the only problem I've noticed is that we set await that.setState({ isConnected: true }) right in startConnection instead of doing it in onConnected event but it's our internal variable, so it should not be a problem. Otherwise we're trying to reconnect every 2 seconds, so I don't see how we can trigger so many connection attempts within one second...

Is it possible the LiquidityContainer is being created many times?

Also, your reconnect logic tries to start a connection every 2 seconds even if the previous start is in progress. Which now that I think about it might be causing the issue.
Try something like https://docs.microsoft.com/en-us/aspnet/core/signalr/javascript-client?view=aspnetcore-2.2#reconnect-clients where you do the next connection attempt in the catch statement.

If HubConnection.start is called multiple times in a row (while the original connect is still in progress) it can reset the handshakeResolver and handshakeRejecter values which would cause the previous calls to never finish. We should prevent this from happening.

@BrennanConroy We've switched to a pattern in the docs (reconnect in catch), it got way better but sometimes this sequence of events happens:

WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
LiquidityContainer.js:29 WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
HubConnection.js:552 Uncaught Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
Utils.js:358 [2019-01-31T20:17:21.370Z] Error: Connection disconnected with error 'Error: Error parsing handshake response: Error: Expected a handshake response from the server.'.
t.log @ Utils.js:358
t.stopConnection @ HttpConnection.js:714
transport.onclose @ HttpConnection.js:435
t.close @ WebSocketTransport.js:288
t.stop @ WebSocketTransport.js:274
(anonymous) @ HttpConnection.js:274
(anonymous) @ HttpConnection.js:125
(anonymous) @ HttpConnection.js:55
a @ HttpConnection.js:7
Promise.then (async)
l @ HttpConnection.js:24
(anonymous) @ HttpConnection.js:27
j @ HttpConnection.js:4
t.stop @ HttpConnection.js:234
t.processHandshakeResponse @ HubConnection.js:555
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
LiquidityContainer.js:29 WebSocket connecting
HubConnection.js:552 Uncaught (in promise) Error: Error parsing handshake response: Error: Expected a handshake response from the server.
at t.processHandshakeResponse (HubConnection.js:552)
at t.processIncomingData (HubConnection.js:489)
at t.connection.onreceive (HubConnection.js:179)
at WebSocket.i.onmessage (WebSocketTransport.js:238)
t.processHandshakeResponse @ HubConnection.js:552
t.processIncomingData @ HubConnection.js:489
connection.onreceive @ HubConnection.js:179
i.onmessage @ WebSocketTransport.js:238
Utils.js:366 [2019-01-31T20:17:22.196Z] Information: WebSocket connected to ws://api.karma-stage.cryptolp.net/Liquidity?id=eYN50svoDbzpXBxFMS8hDA.
LiquidityContainer.js:32 WebSocket connected
LiquidityContainer.js:29 WebSocket connecting
LiquidityContainer.js:34 WebSocket failed to connect:
Cannot start a connection that is not in the 'Disconnected' state.
LiquidityContainer.js:29 WebSocket connecting

It looks strange: it already connected, then somehow starts to connect again and somehow it causes a reconnect

Do you mind sharing your new LiquidityContainer?

Sure

class LiquidityContainer extends Container {
  constructor(props) {
     this.initiateConnection();
  }

  openConnection = async () => {
      try {
        console.log('WebSocket connecting');
        await connection.start();
        await this.setState({ isConnected: true });
        console.log('WebSocket connected');
      } catch(e) {
          console.log('WebSocket failed to connect:\n' + e.message);

          setTimeout(async () => await this.openConnection(), 5000);
      }
  };

  initiateConnection = async () => {
    connection = new HubConnectionBuilder()
      .withUrl(config.baseHubURL + 'Liquidity')
      .build();

    connection.on("UpdateOrdersState", (message) => {
      console.log('WebSocket update');
      this.updateOrderState(message);
    });

    connection.onclose(async () => { 
      await this.setState({ isConnected: false, subscribed: '' });
      await this.openConnection();
    });

    await this.openConnection();
  };

Looks like there is a bug in how we recommend doing the reconnect. The onclose callback is being called and at the same time connection.start() throws, so you get 2 reconnect attempts. The workaround for now would be to set a flag and check it in the catch and the onclose callback.

Interesting, I somehow assumed onclose is not called if we did not manage to connect. Wouldn't keeping reconnect only in onclose be enough?

No, depending on when the failure is internally the onclose may not be called but start will throw. We'll look at fixing this in a future release.

Ok, I understood. I will try to play with the field and get back to you.

Overall though it's confusing how SignalR after like 10 years of development still not only doesn't have built-in reconnection logic but has bugs in pretty basic connectivity features. Just imagine how much time is being wasted by .net developers facing this kind of basic problems all over the planet. I'm not trying to bash SignalR, I'm just sincerely confused how it is even possible, it's not like I'm doing something extraordinary with SignalR

A field did not seem to help. I've made a stress test now: connected, disabled wi-fi, waited till disconnection and enabled wi-fi and SignalR can't connect, being in this loop:

WebSocket connecting
VM1167:1 OPTIONS http://localhost:53938/Liquidity/negotiate net::ERR_CONNECTION_REFUSED
Utils.js:347 Warning: Error from HTTP request. 0: 
index.js:1452 Error: Failed to complete negotiation with the server: Error
index.js:1452 Error: Failed to start the connection: Error
LiquidityContainer.js:34 WebSocket failed to connect:

update:
It was exactly like what @BrennanConroy said. After using flag, everything works smoothly. Thanks!

--

I have the same thing here as well. I think we have the same issue like @BrennanConroy 's comment https://github.com/aspnet/AspNetCore/issues/6434#issuecomment-459897933 here.

And I do think @siberianguy got a pretty good point here. Obviously everyone wants to have a way to make signalR reconnect automatically. I feel like this should be handle in the library as well.

Btw, it is also annoying that SignalR is always disconnect for some reasons after a hour or two with an error like "Error: Connection disconnected with error 'Error: WebSocket closed with status code: 1006". So it is kind of necessary to make reconnect feature work as expected.

We are working on reconnect, the issue is being tracked at https://github.com/aspnet/AspNetCore/issues/5282

We're covering this as part of automatic reconnect https://github.com/aspnet/AspNetCore/issues/8797 . But we'll try to take a look at this scenario before 3.0, since the manual reconnect process we document should also be able to handle this and we may need to fix something.

Fixed by #8566

Has this fix been released in any stable build?

No, not yet. This will be first become available in 3.0.0-preview4 and will be stable with the 3.0 release (expected by the end of this year).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

aurokk picture aurokk  路  3Comments

BrennanConroy picture BrennanConroy  路  3Comments

fayezmm picture fayezmm  路  3Comments

guardrex picture guardrex  路  3Comments

Kevenvz picture Kevenvz  路  3Comments