Node-slack-sdk: StateMachine: unhandled event 'websocket close' in state 'connecting'

Created on 12 Apr 2018  路  9Comments  路  Source: slackapi/node-slack-sdk

Description

Have an always running bot. Had a random crash on 4.1.0 today. Trace follows, which hopefully will help. Not really reproducible. Hopefully just a state transition missing in your state machine.

/opt/local/slack-bot/node_modules/finity/lib/core/StateMachine.js:76

throw new Error('Unhandled event \'' + event + '\' in state \'' + this.currentState + '\'.');
^
Error: Unhandled event 'websocket close' in state 'connecting'.
    at StateMachine.handleUnhandledEvent (/opt/local/slack-bot/node_modules/finity/lib/core/StateMachine.js:76:13)
    at /opt/local/slack-bot/node_modules/finity/lib/core/HierarchicalStateMachine.js:79:33
    at TaskScheduler.execute (/opt/local/slack-bot/node_modules/finity/lib/core/TaskScheduler.js:29:7)
    at TaskScheduler.enqueue (/opt/local/slack-bot/node_modules/finity/lib/core/TaskScheduler.js:19:12)
    at HierarchicalStateMachine.handle (/opt/local/slack-bot/node_modules/finity/lib/core/HierarchicalStateMachine.js:72:24)
    at WebSocket.websocket.addEventListener.event (/opt/local/slack-bot/node_modules/@slack/client/dist/RTMClient.js:375:77)
    at WebSocket.onClose (/opt/local/slack-bot/node_modules/ws/lib/event-target.js:124:16)
    at emitTwo (events.js:126:13)
    at WebSocket.emit (events.js:214:7)
    at _receiver.cleanup (/opt/local/slack-bot/node_modules/ws/lib/websocket.js:220:12)
    at Receiver.cleanup (/opt/local/slack-bot/node_modules/ws/lib/receiver.js:557:13)
    at WebSocket.finalize (/opt/local/slack-bot/node_modules/ws/lib/websocket.js:206:20)
    at emitNone (events.js:111:20)
    at TLSSocket.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1064:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)

What type of issue is this?

  • [x] bug
  • [ ] enhancement (feature request)
  • [ ] question
  • [ ] documentation related
  • [ ] testing related
  • [ ] discussion

Requirements

  • [x] I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • [x] I've read and agree to the Code of Conduct.
  • [x] I've searched for any related issues and avoided creating a duplicate issue.

Bug Report

Filling out the following details about bugs will help us solve your issue sooner.

Reproducible in:

@slack/client version: 4.1.0

node version: v8.10.0

OS version(s): Amazon Linux (4.9.76-38.79.amzn2.x86_64)

Steps to reproduce:

  1. Single occurrence while running an always-on bot, possibly related to network changes

Expected result:

From the error, it sounds like the websocket should just attempt to restart itself if it gets a close while it is connecting.

Actual result:

Node crashed and systemd restarted it, resulting in very small downtime

Attachments:

None

bug

Most helpful comment

Yeah, I did upgrade. I have no seen any further crashes. I do still have debug logging on. If I see anything I will let you guys know!

All 9 comments

thanks for reporting @malonecm

this sounds like its potentially related to #527, since they both have to do with the websocket being in a closing state unexpectedly. we're a little low on data around how this would happen, but that stack trace is definitely useful, thanks for that.

are you able to increase the logLevel in the RTMClient instance and share those logs in case you see this happen again? i think it would be nice to compare to #527 and also help me trace the steps in order to reproduce.

I've enabled debug logLevel now. If I get another StateMachine issue with websocket close I will post more logs. Thank you!

bot.log

Different crash this time. Debug log doesn't seem to help much to me. The "goodbye" thing I had never noticed before.

[DEBUG] @slack/client:RTMClient transitioning to state: reconnecting
[DEBUG] @slack/client:RTMClient transitioning to state: connecting
[DEBUG] @slack/client:RTMClient transitioning to state: connecting:authenticating
[DEBUG] @slack/client:RTMClient transitioning to state: connecting:authenticated
[DEBUG] @slack/client:RTMClient transitioning to state: connecting:handshaking
[DEBUG] @slack/client:RTMClient received message on websocket: {"type": "hello"}
[DEBUG] @slack/client:RTMClient transitioning to state: connected
[DEBUG] @slack/client:RTMClient transitioning to state: connected:resuming
/opt/local/slack-bot/node_modules/@slack/client/dist/KeepAlive.js:90
throw errors_1.errorWithCode(new Error('no client found'), _1.ErrorCode.KeepAliveInconsistentState);
^
Error: no client found
   at KeepAlive.attemptAcknowledgePong (/opt/local/slack-bot/node_modules/@slack/client/dist/KeepAlive.js:90:54)
   at RTMClient.emit (/opt/local/slack-bot/node_modules/eventemitter3/index.js:182:35)
   at RTMClient.onWebsocketMessage (/opt/local/slack-bot/node_modules/@slack/client/dist/RTMClient.js:430:14)
   at WebSocket.onMessage (/opt/local/slack-bot/node_modules/ws/lib/event-target.js:120:16)
   at emitOne (events.js:116:13)
   at WebSocket.emit (events.js:211:7)
   at Receiver._receiver.onmessage (/opt/local/slack-bot/node_modules/ws/lib/websocket.js:137:47)
   at Receiver.dataMessage (/opt/local/slack-bot/node_modules/ws/lib/receiver.js:409:14)
   at Receiver.getData (/opt/local/slack-bot/node_modules/ws/lib/receiver.js:347:12)
   at Receiver.startLoop (/opt/local/slack-bot/node_modules/ws/lib/receiver.js:143:16)
   at Receiver.add (/opt/local/slack-bot/node_modules/ws/lib/receiver.js:117:10)
   at emitOne (events.js:116:13)
   at TLSSocket.emit (events.js:211:7)
   at addChunk (_stream_readable.js:263:12)
   at readableAddChunk (_stream_readable.js:250:11)
   at TLSSocket.Readable.push (_stream_readable.js:208:10)
   at TLSWrap.onread (net.js:607:20)

And still leaving debug on looking for the first one. If adding this onto this is inappropriate I can open a different issue. Just let me know. Thank you!

@malonecm looks like you found another issue to me. the new one seems a little more insidious, too.

for this second one, it would be useful to see the log statements from the KeepAlive class. did you filter those out or turn them off specifically? if you're able to get in there and tweak things, mind turning them on too for the next occurrence?

it looks like the OP in #527 also had a similar issue with KeepAlive as demonstrated in the last log and stacktrace. i'd like to combine these two issues since they seem related.

we think v4.2.0 fixes this issue. please update and give it a try! if not' feel free to comment and we'll reopen.

@malonecm have you tried v4.2.0? Any better than the previous version? At the very least, we should have more meaningful logs now.

Yeah, I did upgrade. I have no seen any further crashes. I do still have debug logging on. If I see anything I will let you guys know!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

bobrik picture bobrik  路  25Comments

aoberoi picture aoberoi  路  11Comments

jayjanssen picture jayjanssen  路  13Comments

aoberoi picture aoberoi  路  10Comments

kurisubrooks picture kurisubrooks  路  36Comments