Discord.js: Master Version Causing WebSocket Timeout

Created on 4 Feb 2019 · 13 Comments · Source: discordjs/discord.js

Hello,

Until two weeks ago I was running a slightly outdated version of the discord.js master branch. My bot, with 8 shards, was working perfectly fine, booting and rebooting with no problems. When I performed updates a few days ago, I updated discord.js to the latest master, and the shards suddenly kept throwing Error [WS_CONNECTION_TIMEOUT]: The connection to the gateway timed out.

I tried extending the time between booting shards to 15 seconds, without waiting for ready. I tried manually extending the waitForReady timeout (required for me to use wait for ready) and it still timed out. I tried increasing the shard count to 12 and it still timed out; only when I bumped my shard count all the way up to 16 did the shards finally boot.

I tried turning off all my events as well and this did not stop it from hitting a timeout.

On a lark I downgraded from master to the stable branch and turned off my bot (I simply flipped an if statement that prevents it from acknowledging the 'ready' state), and all 8 shards booted in pretty much the same fashion as they usually do, with no problems whatsoever.

I do not understand how this could be an issue, nor can I really give any idea of where it might be caused, but I think I can firmly say that this is not a problem on my end. If you disagree, please let me know; I'm willing to try any solution you give me.

Full Error:

{ Error [WS_CONNECTION_TIMEOUT]: The connection to the gateway timed out.
0|shardManager | at Timeout.setTimeout (/home/holo/gaius/node_modules/discord.js/src/client/Client.js:261:16)
0|shardManager | at ontimeout (timers.js:436:11)
0|shardManager | at tryOnTimeout (timers.js:300:5)
0|shardManager | at unrefdHandle (timers.js:520:7)
0|shardManager | at Timer.processTimers (timers.js:222:12) [Symbol(code)]: 'WS_CONNECTION_TIMEOUT' }

Note: This error happened on 3 of 8 shards, and only on the shards I know had the most users (around 280k).

Further details:

discord.js version: Master
Node.js version: 10.15.0
Operating system: Ubuntu
Priority this issue should have – please be realistic and elaborate if possible: I'm not good at assigning priorities, so just an acknowledgement that the issue exists is fine, I suppose

[ ] I have also tested the issue on latest master, commit hash:

gateway bug

Holo-Buckshot

👍2

All 13 comments

I'm also having this issue.
Tested on:

discord.js version: Master branch (ae72690)
Node.js version: v11.9.0, npm v6.7.0
OS: Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-42-generic x86_64)
Priority: Breaks usability entirely, but it's personally on a feature branch. Overall, critical.

TobiTenno on 6 Feb 2019

👍2

Update: I got a link from a generous person over in the support server to an earlier commit that does not have this issue:
discordjs/discord.js#01476de58250761ffe5eedc9ee6c9782576ca043

When using this to install master, the problem is not present.

Holo-Buckshot on 6 Feb 2019

👍1

@Holo-Buckshot This leads me to believe the "internal sharding" feature broke it, or at least causes issues.

TobiTenno on 6 Feb 2019

@TobiTenno Yes, that is also the thinking of the person who gave me that link.

Holo-Buckshot on 6 Feb 2019

I have a copy working and able to log in with a single shard at 8230255c68b94d68a4e8ffc559a98d08d1a08a7c

TobiTenno on 6 Feb 2019

I've had the same issue (#3021) and even the latest fix is still causing errors. I tried out the hash you said was working, @TobiTenno, and the same issue occurred.

pizzafox on 1 Mar 2019

@pizzafox What I ended up noticing was that half of the shards would fail and time out, so I reconfigured my shard manager to kill those, but not the cluster, and just retry those shards until they worked. Obviously this is sub-optimal, but it worked on the most recent commits I was on at the time; I haven't updated since then.

TobiTenno on 1 Mar 2019
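As a rough illustration of the retry workaround TobiTenno describes (not their actual code), the idea could be sketched as below. `spawnShard` is a hypothetical caller-supplied async function that resolves once a shard is ready and rejects on a timeout; `spawnWithRetry`, `spawnAll`, `maxAttempts` and `delayMs` are likewise illustrative names, not discord.js APIs.

```javascript
// Sketch only: retry a single failing shard without touching the others.
// `spawnShard(shardId)` is a hypothetical function provided by the caller.
async function spawnWithRetry(shardId, spawnShard, { maxAttempts = 5, delayMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await spawnShard(shardId);
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < maxAttempts) {
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Boot shards one after another; shards that already booted are left alone.
async function spawnAll(shardIds, spawnShard, options) {
  const results = [];
  for (const id of shardIds) {
    results.push(await spawnWithRetry(id, spawnShard, options));
  }
  return results;
}
```

Capping the number of attempts keeps a shard that can never connect from retrying forever; whether 5 attempts and a 5-second pause are sensible values depends on the bot.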

To those running into the issue: I experienced this as well and opened an issue for the culprit at https://github.com/discordjs/discord.js/issues/3028 (you can see in the line above the linked file that WS_CONNECTION_TIMEOUT is thrown from there). The problem is just as the title says: this 25-second timeout was introduced on the grounds that "it _shouldn't_ take longer", but my bot does take longer (up to a whole minute and a half per shard) and runs healthily afterward. This occurs on BOTH an AWS m5.xlarge and a mid-tier Digital Ocean instance, so it's not just my internet connection. The temporary fixes I've applied are as follows:

discord.js/src/client/Client.js:
increase the "25e3" timeout number; OR
comment out/delete the "timeout" lines (Ctrl+F for "timeout" and delete the resulting lines/functions)

discord.js/src/sharding/Shard.js:
Line 136: 30000 -> 180e3 (30 seconds to 3 minutes)

Obligatory YMMV, and I'm sure some here would be against modifying the library, arguing that it may cause unexpected behavior. Nobody has responded to my issue with reasons why these timeouts are necessary or the rationale behind them, so I have no reason to believe they are strictly necessary, at least not at the values they are set to. What I'm sharing is simply that, in my experience, lengthening (or removing) them allows my bot to run healthily.

PS: I even spent quite a while digging and added the specific commit that these lines came from to my issue in case anybody is wondering.

mr-tech on 11 Mar 2019
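To make the discussion of the timeout concrete, here is a minimal sketch of how a connection timeout of this kind is commonly enforced in plain JavaScript. It is not the discord.js source: `withTimeout` and its parameters are hypothetical names chosen for illustration, and the error code is simply copied from the error reported above. It shows why a fixed 25-second limit rejects a shard that would have become ready after 90 seconds, and how making the limit a parameter avoids that.

```javascript
// Hypothetical helper, not part of discord.js: rejects if `promise` has not
// settled within `ms` milliseconds.
function withTimeout(promise, ms, code = 'WS_CONNECTION_TIMEOUT') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`Timed out after ${ms} ms`);
      err.code = code;
      reject(err);
    }, ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards so
  // it cannot fire late or keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

With this shape, a caller could write something like `withTimeout(connect(), 180e3)`, where `connect` stands in for whatever starts the connection, instead of relying on a hard-coded limit.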

> To those running into the issue, I had experienced this as well and opened an issue for the culprit at #3028 (you can see in the line above the linked commit that WS_CONNECTION_TIMEOUT is thrown from there). The problem is just as the title says: this 25-second timeout was introduced on the grounds that "it _shouldn't_ take longer", but my bot does take longer (up to a whole minute and a half per shard) and runs healthily afterward. This occurs on BOTH an AWS m5.xlarge and a mid-tier Digital Ocean instance, so it's not just my internet connection. The temporary fixes I've applied are as follows:
>
> discord.js/src/client/Client.js:
> comment out/delete the "timeout" lines (Ctrl+F for "timeout" and delete the resulting lines/functions)
>
> discord.js/src/sharding/Shard.js:
> Line 136: 30000 -> 180e3 (30 seconds to 3 minutes)
>
> Obligatory YMMV, and I'm sure some here would be against modifying the library, arguing that it may cause unexpected behavior. Nobody has responded to my issue stating why these timeouts are necessary or the rationale behind them other than opinion, so what I'm sharing is that, in my experience, lengthening (or removing) them allows my bot to run healthily.

This is great and all, except that waitForReady satisfies the second portion, and the first portion, once I looked at the code, turns out like this:

shardCount * 25e3 = 9 * 25000 = 225000 ms, or 3 minutes and 45 seconds, plenty of time for a single shard to boot

Holo-Buckshot on 11 Mar 2019
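The arithmetic in the comment above can be checked with a few lines of JavaScript. This is only a worked example under the assumption stated in that comment, namely that the overall limit is the shard count multiplied by 25e3 ms; the function and constant names are illustrative and do not come from discord.js.

```javascript
// Assumption from the comment above: total timeout = shardCount * 25e3 ms.
const PER_SHARD_TIMEOUT_MS = 25e3;

function totalTimeoutMs(shardCount) {
  return shardCount * PER_SHARD_TIMEOUT_MS;
}

// Convert milliseconds to a "M min S s" string.
function formatDuration(ms) {
  const totalSeconds = Math.round(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes} min ${seconds} s`;
}

console.log(totalTimeoutMs(9));                 // 225000
console.log(formatDuration(totalTimeoutMs(9))); // 3 min 45 s
```

So 9 shards give 225000 ms, which is 225 seconds, or 3 minutes 45 seconds.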

> This is great and all, except that waitForReady satisfies the second portion, and the first portion, once I looked at the code, turns out like this:
>
> shardCount * 25e3 = 9 * 25000 = 225000 ms, or 3 minutes and 45 seconds, plenty of time for a single shard to boot

Ah yes, I was kind of heavy-handed in my approach of deleting the lines entirely. I edited my comment to include the much simpler yet just as effective option of increasing the value. I hope this eases your pain until a fix is officially implemented! Cheers.

mr-tech on 11 Mar 2019

Hey.

Can you guys test out PR #3140 and let me know if this issue is still present? The arbitrary timeout was removed in that PR, so you shouldn't have an issue with it anymore 😄