Aspnetcore: Hub handshake failed with error 'WebSocket is not in the OPEN state' SignalR WebSockets Angular client with AutoReconnect()

Created on 5 Dec 2019  路  28Comments  路  Source: dotnet/aspnetcore

My application was first built on dotnetcore 2.2, then it was upgraded to dotnetcore 3.0 and it was working properly. But WithAutoReconnect() from Angular web app and all the default settings, gives following log:

[2019-12-05T12:10:14.538Z] Information: Reconnect attempt number 3 will start in 10000 ms.
zone.js:3372 OPTIONS http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b net::ERR_CONNECTION_REFUSED
Utils.js:208 [2019-12-05T12:10:24.540Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:208 [2019-12-05T12:10:24.540Z] Debug: Sending negotiation request: http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b.
Utils.js:208 [2019-12-05T12:10:24.683Z] Debug: Selecting transport 'WebSockets'.
Utils.js:208 [2019-12-05T12:10:24.683Z] Trace: (WebSockets transport) Connecting.
Utils.js:204 [2019-12-05T12:10:24.723Z] Information: WebSocket connected to ws://127.0.0.1:9938/host?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b&id=KOrhcPrFdlHaeTzcJxZjlA.
Utils.js:208 [2019-12-05T12:10:24.724Z] Debug: The HttpConnection connected successfully.
Utils.js:208 [2019-12-05T12:10:24.724Z] Debug: Sending handshake request.
Utils.js:208 [2019-12-05T12:10:24.743Z] Debug: Hub handshake failed with error 'WebSocket is not in the OPEN state' during start(). Stopping HubConnection.
Utils.js:208 [2019-12-05T12:10:24.744Z] Trace: (WebSockets transport) socket closed.
Utils.js:208 [2019-12-05T12:10:24.744Z] Debug: HttpConnection.stopConnection(undefined) called while in state Disconnecting.
Utils.js:198 [2019-12-05T12:10:24.744Z] Error: Connection disconnected with error 'WebSocket is not in the OPEN state'.
Utils.js:208 [2019-12-05T12:10:24.745Z] Debug: HubConnection.connectionClosed(WebSocket is not in the OPEN state) called while in state Reconnecting.
Utils.js:204 [2019-12-05T12:10:24.745Z] Information: Reconnect attempt failed because of error 'WebSocket is not in the OPEN state'.
Utils.js:204 [2019-12-05T12:10:24.745Z] Information: Reconnect attempt number 4 will start in 30000 ms.
Utils.js:208 [2019-12-05T12:10:54.747Z] Debug: Starting connection with transfer format 'Text'.
Utils.js:208 [2019-12-05T12:10:54.748Z] Debug: Sending negotiation request: http://127.0.0.1:9938/host/negotiate?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b.
Utils.js:208 [2019-12-05T12:10:54.773Z] Debug: Selecting transport 'WebSockets'.
Utils.js:208 [2019-12-05T12:10:54.773Z] Trace: (WebSockets transport) Connecting.
Utils.js:204 [2019-12-05T12:10:54.780Z] Information: WebSocket connected to ws://127.0.0.1:9938/host?clientIdentifier=b462ed0d-4df0-4f2d-82b1-4ad1ee72b09b&id=JEu_f9AcM9fx4OE3SMWEVg.
Utils.js:208 [2019-12-05T12:10:54.780Z] Debug: The HttpConnection connected successfully.
Utils.js:208 [2019-12-05T12:10:54.781Z] Debug: Sending handshake request.
Utils.js:208 [2019-12-05T12:10:54.781Z] Trace: (WebSockets transport) sending data. String data of length 32.
Utils.js:204 [2019-12-05T12:10:54.782Z] Information: Using HubProtocol 'json'.
Utils.js:208 [2019-12-05T12:10:54.829Z] Trace: (WebSockets transport) data received. String data of length 3.
Utils.js:208 [2019-12-05T12:10:54.830Z] Debug: Server handshake complete.
Utils.js:204 [2019-12-05T12:10:54.831Z] Information: HubConnection reconnected successfully.

The server is disconnected and client tries to reconnect. When server goes up and connected; then the following steps occurred:

  1. 3rd retry attempted and though "The HttpConnection connected successfully." but "Hub handshake failed with error 'WebSocket is not in the OPEN state' during start(). Stopping HubConnection" and (WebSockets transport) socket closed. HttpConnection.stopConnection(undefined) called while in state Disconnecting.
    And as result "Error: Connection disconnected with error 'WebSocket is not in the OPEN state'." is logged in the console.
  2. After 30sec 4th retry attempted and It will automatically reconnected.
    These steps are consistency reproducible.

Please help me to get rid of this error in between 3rd and 4th retry attempt.

area-signalr bug

All 28 comments

I think we need better logging in the WebSocketTransport.ts's WebSocket.onclose handler to see exactly what happened here.

I don't see any log of the WebSocket CloseEvent details, though the logs that exist certainly imply the WebSocket closed shortly after it was open during the hub handshake. Given that the websocket closed during the hub handshake, it's not surprising that it took one more reconnect attempt to successfully reestablish the connection, though it would be nice to know why the WebSocket closed to begin whit.

@paramitamandal can you reliably reproduce the handshake error? If so, is there any trick to make the repro reliable? If you could, putting the following log statement at the top of WebSocketTransport.ts's WebSocket.onclose handler might help diagnose the issue.

https://github.com/aspnet/AspNetCore/blob/f5492be35bc5831e3cd7f394e252690c43bd1c7f/src/SignalR/clients/ts/signalr/src/WebSocketTransport.ts#L110

this.logger.log(LogLevel.Trace, `WebSocket closed with status code: ${event.code} (${event.reason}).`);

ok. thanks @halter73. This error occurred each and every time when client tries to reconnect with the Server after the server goes down. The first attempt always failed due to the handshaking error, and then in 2nd attempt connection established. I will put a log in webSocket.onclose() and let you know the feedback.

We've seen some weird behavior with Angular's use of zone.js here, since it replaces the WebSocket class in JS with a different version. We should do an end-to-end angular test to see what we can find.

@bradygaster to prove out the end-to-end

I have already put some log as suggested by @halter73 but that did not add any other error log other than the previous one.
We are getting this error logged everytime as the first attempt of reconnection. @anurse May be due to angular as you said. I will update you if I find anything other than this log for non angular app during reconnection.

@anurse there is no luck in non angular project also. same error everywhere.

Can you tell us more about your hosting environment?

  • What OS are you on?
  • What server are you using (IIS/Kestrel/HttpSysServer)?
  • Do you have any proxies in between the client and the server (Nginx/Apache/HAproxy/etc.)?

Also, can you capture a network trace? We have documentation on how to do this.

It looks like something is causing the socket to be closed. Unfortunately the browser is somewhat unhelpful in these cases and doesn't give us a detailed error message, just "WebSocket is not in the OPEN state".

We are using Win 10, 1809 x64. aprart from this we are using Win7, win 8.1, and server2016 and also other versions of Windows 10.
We are using "Kestrel" and No proxies have been implemented.

The network trace: IEnwtrace1 - Copy.txt

Hrm, looks like the WebSockets are very short-lived, but there's no real indication why it's being terminated in that trace (Browser dev tools traces have very limited WebSockets support unfortunately). Are you able to try using Fiddler or a tool like Wireshark? (If you're using Wireshark, it's best if you can do the trace over HTTP, and not HTTPS since decrypting the traffic with Wireshark is very difficult).

Can you also collect server-side trace logs? Looking at the client-side logs and the trace you provided, it appears there isn't anything interrupting the connection there.

If you're using HTTPS you need to configure Fiddler to decrypt HTTPS traffic. All I see in that trace is encrypte TLS traffic and I can't decrypt it :(.

Let me know if you have any more questions on how to assemble the necessary information.

Here is the traffic captured by fiddler. Please let me know if this helps to investigate further or not.
errorReconnectingIE.txt

How about server-side logs? Have you been able to collect those?

I'm not able to find anything in the traces. Is it possible for you to share a runnable sample application that reproduces the problem?

I took a look at the fiddler trace too. The following request seems to be the reconnect attempt that failed. The trace shows the request only lasted a second and transferred a few bytes. The last WebSocket request happened 30 seconds later and seems to have stayed open until the end of the capture.

GET https://127.0.0.1:9926/host?id=MWpPeHaWrlIvInzIuzFb4w HTTP/1.1
Origin: http://172.26.95.122:8080
Sec-WebSocket-Key: bUO3pf60pngV4wFz//IFiA==
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: 127.0.0.1:9926
Cache-Control: no-cache


HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Date: Wed, 18 Dec 2019 09:57:37 GMT
Server: Kestrel
Upgrade: websocket
Vary: Origin
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: http://172.26.95.122:8080
Sec-WebSocket-Accept: S82skX5AsL8JE/y2xJrswfNf3FE=
EndTime: 15:27:38.009
ReceivedBytes: 4
SentBytes: 6

I agree with Andrew. It would be super helpful to get a runnable sample that demonstrates the issue. Server-side trace logs might also be helpful.

Just wanted to pinch in: We're experiencing the same behaviour in a react app using @microsoft/signalr 3.0.1. Can share the code later on!

@halter73 @anurse Can you please let me know the specific requirement about the runnable sample application? We just created a connection between dotnetcore web app and an angular web app with @microsoft/signalr 3.0.1 and experiencing the error everytime while reconnecting.

@paramitamandal Ideally a git repo we can clone and run (with some steps you provide) to reproduce the issue. We can certainly try to reproduce this ourselves, but there may be a subtle difference that doesn't become clear until we're comparing code. If it's easy for you to set up a repro project then we don't lose a bunch of time to getting the correct scenario identified.

will give u the repo link soon.

@halter73 @anurse You can use my repo to reproduce the issue
https://github.com/dlimkin/AspNetCore-issues-17613

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. If it is closed, feel free to comment when you are able to provide the additional information and we will re-investigate.

See our Issue Management Policies for more information.

@halter73 @anurse did u able to look into code which @dlimkin provided?

@halter73 @anurse Any luck??

@paramitamandal We've been a bit delayed due to the holidays but will take a look soon.

@anurse no issue..

I have the same issue as described above, the first retry fails immediately with the message 'WebSocket is not in the OPEN state'.
I changed my code to the following:

    // build connection with retry policy
    const connection = new HubConnectionBuilder()
      .withUrl(url, options)
      .withAutomaticReconnect(new InlineRetryPolicy(context => {
        // the first retry fails with this message, but it is unclear why, if I retry immediatly after this, the connection succeeds
        if (context.retryReason.message === 'WebSocket is not in the OPEN state') {
          return 0;
        } else {
          observer.next({ state: HubConnectionState.Disconnected });
          // when callback is called, start emitting state changes with the remaining seconds untill retry
          RETRY_STREAM.pipe(map(x => ({ state: HubConnectionState.Reconnecting, seconds: RETRY_IN_SECONDS - x - 1 }))).subscribe(x => observer.next(x));
          // return the amount of milliseconds the signalr library has to wait
          return RETRY_IN_SECONDS * 1000;
        }
      }))
      .build();

So when that is the fail message, I reconnect directly, otherwise I do my normal retry logic, I don't like hacks/fixes like this, but I don't really know where the issue lies

Thanks, we can reproduce the issue and will look into the problem.

Was this page helpful?
0 / 5 - 0 ratings