Botkit: Long running issue.

Created on 25 Jan 2016 · 13 Comments · Source: howdyai/botkit

If I run Botkit for 2-3 hours, it stops working and my bot logs out of Slack. Any ideas?

All 13 comments

Hi there, I experienced the same issue this morning. First time for me: node was running and all seemed fine, but my bot was disconnected from the only Slack account it was connected to. Dunno when it disconnected though.

Same here. The bot disconnects from Slack, but the process is still running. Does anyone have any leads on what the problem might be?

I ran into the same/similar issue although my bot was timing out due to issues with the proxy that it is behind. I found a solution and can submit a pull request for it shortly. If you look at the Slack documentation on the real time messaging, they suggest sending a ping "every few seconds".

Here is the relevant portion of the docs...

Ping and Pong

Clients should try to quickly detect disconnections, even in idle periods, so that users can easily tell the difference between being disconnected and everyone being quiet. Not all web browsers support the WebSocket ping spec, so the RTM protocol also supports ping/pong messages. When there is no other activity clients should send a ping every few seconds. To send a ping, send the following JSON:

{
    "id": 1234, // ID, see "sending messages" above
    "type": "ping",
    …
}

No idea what could cause that, but I've implemented a rudimentary ping solution to detect a disconnection and then reboot the bot:

  // Pings Slack periodically and counts unanswered pings to test whether the bot is still connected
  keepAlive: function (team, bot) {
    var that = this;

    if (!this._pongs[team.token]) {
      this._pongs[team.token] = [];

      (function (team, bot) {
        bot.rtm.on('pong', function (resp) {
          that._pongs[team.token].pop();
        });
      })(team, bot);
    }

    setTimeout(function () {
      if (that._pongs[team.token].length >= (that.config.pings_unack_tresshold || 3)) {
        that.controller.logger.error('ping_not_responding', team);

        return that.restart(team);
      }

      bot.rtm.ping();
      that._pongs[team.token].push(true);

      that.keepAlive(team, bot);
    }, this.config.pings_interval_ms || (30 * 1000));
  },

Hope that helps, dunno if I could submit a PR to integrate that natively into botkit. WDYT?

Best

Ah, we posted at the same moment, @petemichel77; I did not see your answer before posting mine.

I tried the "Slack" way first, but I never managed to get a pong answer :( That is why the code above uses the native websocket ping/pong implementation instead.

@guillaumepotier Mine is pretty basic so far and it is probably better to use the native ws implementation.

I was simply using bot.rtm.send() to send a message of type "ping". The pong comes back in the on message handler.

I'll test yours out with my proxy and see how it works. Thanks for this
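For readers who want to try the JSON ping described above, here is a minimal sketch. It assumes bot.rtm behaves like a WebSocket-like object with send(string) and on('message', handler), and that Slack answers with a message of type "pong" whose reply_to field echoes the ping id. The helper name createSlackPinger is made up for illustration and is not part of Botkit.

```javascript
// Sketch only: `socket` stands in for bot.rtm, assumed to expose
// send(string) and on('message', handler). createSlackPinger is a
// hypothetical helper, not a Botkit API.
function createSlackPinger(socket) {
  let nextId = 1;
  const pending = new Set(); // ids of pings that have not been answered yet

  socket.on('message', function (raw) {
    let msg;
    try {
      msg = JSON.parse(raw);
    } catch (err) {
      return; // ignore frames that are not valid JSON
    }
    // Assumption: a pong carries the id of the ping it answers in reply_to.
    if (msg.type === 'pong') {
      pending.delete(msg.reply_to);
    }
  });

  return {
    // Sends one ping frame and remembers its id until the pong arrives.
    ping: function () {
      const id = nextId++;
      pending.add(id);
      socket.send(JSON.stringify({ id: id, type: 'ping' }));
      return id;
    },
    // Number of pings still waiting for a pong.
    unanswered: function () {
      return pending.size;
    }
  };
}
```

Calling ping() on a timer and checking unanswered() before each call gives a rough health signal: if the count keeps growing, the connection is probably gone.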

@guillaumepotier using the native ws ping method seems to work for my case of being behind a proxy. I didn't use your code verbatim, but the ping method works pretty well.
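The core of the native ping approach can be reduced to something like the sketch below. It is not the code from the earlier comment: it resets the counter on any pong instead of popping one entry per pong, and it assumes the socket offers ping() and a 'pong' event the way the ws library does. createKeepAlive, onDead, and the option names are hypothetical.

```javascript
// Sketch only: `socket` is assumed to expose ping() and emit 'pong',
// as a ws-style WebSocket does. All names here are illustrative.
function createKeepAlive(socket, onDead, opts) {
  const threshold = (opts && opts.threshold) || 3;            // unanswered pings tolerated
  const intervalMs = (opts && opts.intervalMs) || 30 * 1000;  // delay between pings
  let unacked = 0;
  let timer = null;

  // Any pong proves the connection is alive, so the counter is reset.
  socket.on('pong', function () {
    unacked = 0;
  });

  function stop() {
    if (timer) {
      clearInterval(timer);
      timer = null;
    }
  }

  // One keep-alive step: give up if too many pings went unanswered,
  // otherwise send another ping.
  function tick() {
    if (unacked >= threshold) {
      stop();
      onDead();
      return;
    }
    unacked += 1;
    socket.ping();
  }

  function start() {
    if (!timer) {
      timer = setInterval(tick, intervalMs);
    }
  }

  return { start: start, stop: stop, tick: tick };
}
```

With Botkit this would presumably be wired up as createKeepAlive(bot.rtm, restartBot).start(), where restartBot is whatever reconnect or exit routine fits your deployment.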

I had this issue and thought that updating to the latest version would correct it, but I still found that it was happening. After some debugging I realized that it was disconnecting silently after a pong response took too long, and that it was not attempting to reconnect because connection retries are disabled by default.

I suggest adding some logging to make it clear when and why a disconnection has happened. Perhaps connection retries should also be enabled by default, although even a log line would go a long way towards making people realize that retries need to be turned on.
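To make "turning retries on" concrete, here is a small sketch that builds the options object for controller.spawn(). It assumes the option is called retry and accepts a positive integer or Infinity, which is my reading of the Botkit Slack docs; please check the docs for the version you run. The helper name spawnOptionsWithRetry is made up.

```javascript
// Sketch only: builds the options that would be handed to controller.spawn().
// Assumption: Botkit reads a `retry` key (positive integer or Infinity).
// spawnOptionsWithRetry is a hypothetical helper, not a Botkit API.
function spawnOptionsWithRetry(token, maxRetries) {
  const valid = maxRetries === Infinity ||
    (Number.isInteger(maxRetries) && maxRetries > 0);
  if (!valid) {
    throw new RangeError('maxRetries must be a positive integer or Infinity');
  }
  return { token: token, retry: maxRetries };
}

// Intended use (not run here, it needs botkit and a real token):
// controller.spawn(spawnOptionsWithRetry(process.env.token, Infinity)).startRTM();
```

Validating the value up front is just a convenience, so that a typo in the retry count fails loudly at startup rather than leaving retries silently off.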

Yes, it seems like retries should be on by default. We'll consider this for an upcoming release!

For others having this issue, note that you can also register a handler for the rtm_close event. So, for instance, if enabling retry isn't what you want to do, you might do something like:

controller.on('rtm_close', function() {
  //Do something - eg log it, or maybe process.exit();
});

In my case, I have a process manager that will restart the process if it detects that it died, but your use case may differ. Just wanted to highlight that you have some options for closer control here.

@anyonecancode can you provide more detail when you say "I have a process manager that will restart the process if it detects it died"?

I'm using PM2 in the most basic way to keep our Node app up on Heroku. Are you saying I could also use PM2 to re-connect a dropped RTM session? Ideally if you have some example code, that would be really helpful.

Sure -- in my use case, I actually am using Spotify Helios to deploy my script inside of a docker container.

Helios will restart your container if it dies, and docker containers die if the main process inside them stops. So the flow ends up looking like this:

Detect rtm_close, call process.exit() -> Causes docker container to stop -> Causes helios to restart container (and hence my node script).

I haven't used PM2 so I can't speak to that, but generically I'm forcing my app to stop running if it detects a bad state (eg lack of connection), and relying on my deployment tool to restart my app when it detects that it has stopped running.
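The "exit and let the supervisor restart it" flow described above might look like the sketch below. It only assumes that the controller exposes on(event, handler), as in the rtm_close example earlier in the thread. exitOnRtmClose is a made-up name, and the exit and log parameters exist only so the behaviour can be tried without killing a real process.

```javascript
// Sketch only: `controller` is assumed to expose on(event, handler).
// exitOnRtmClose, `exit` and `log` are hypothetical names for illustration.
function exitOnRtmClose(controller, exit, log) {
  exit = exit || process.exit;
  log = log || console.error;

  controller.on('rtm_close', function () {
    // Log first, so there is a record of when the connection dropped.
    log('RTM connection closed at ' + new Date().toISOString() +
      ', exiting so the supervisor can restart the process');
    // A non-zero code marks the exit as abnormal for supervisors that
    // distinguish clean exits from failures.
    exit(1);
  });
}
```

Whether the restart then happens depends entirely on the deployment tool; the script itself only guarantees that the process stops instead of lingering in a disconnected state.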

Thanks! I see now what you mean. 
