There seems to be an issue with the websockets library which is causing this error. Even though this is a problem in a different library, I feel like discord.py should gracefully handle it and either retry the request or something, as it's completely crashing my bot.
Basically, my bot runs on clusters which are Docker containers. My bot uses about 10 clusters each responsible for 3 shards. On 1 cluster out of the 10, this problem constantly happens. It happens about once every 30 minutes, which takes down the process, and then the cluster respawns.
Something interesting to note: I edited the number of clusters and shards to spawn so it would force the distribution to change, and every time I change it, this problem happens on the cluster with the shard that is responsible for the support server of my bot. A few days ago, I paid the $25 "game developer fee" for my bot and picked my support server as the target server to receive the perks, so maybe this is related, but I'm not sure. I just know that whenever I change the cluster distribution, the one affected cluster is the one hosting the server with the game developer perks.
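To check which cluster a given server lands on after a redistribution, something like the following sketch may help. It uses Discord's documented sharding formula, (guild_id >> 22) % shard_count. The guild ID is a made-up placeholder, and the contiguous shard-to-cluster layout (shards 0-2 on cluster 0, 3-5 on cluster 1, and so on) is an assumption about how the clusters are set up, not something stated in this report.

```python
def shard_for_guild(guild_id: int, shard_count: int) -> int:
    """Discord's documented formula for the shard that receives a guild's events."""
    return (guild_id >> 22) % shard_count


def cluster_for_shard(shard_id: int, shards_per_cluster: int) -> int:
    """Assumes each cluster owns a contiguous range of shard IDs."""
    return shard_id // shards_per_cluster


# Hypothetical guild ID, for illustration only.
SUPPORT_GUILD_ID = (7 << 22) | 123

# 10 clusters x 3 shards each = 30 shards in total.
shard = shard_for_guild(SUPPORT_GUILD_ID, shard_count=30)
cluster = cluster_for_shard(shard, shards_per_cluster=3)
print(f"guild {SUPPORT_GUILD_ID} -> shard {shard} -> cluster {cluster}")
```

Running this with the real support server ID before and after changing the distribution would show whether the crashing cluster really follows that one server.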
Explained above.
No error
Stacktrace:
master | Traceback (most recent call last):
master |   File "bot.py", line 48, in
master |     client.run(token)
master |   File "/usr/local/lib/python3.6/site-packages/discord/client.py", line 571, in run
master |     return task.result()
master |   File "/usr/local/lib/python3.6/site-packages/discord/client.py", line 479, in start
master |     await self.connect(reconnect=reconnect)
master |   File "/usr/local/lib/python3.6/site-packages/discord/client.py", line 402, in connect
master |     await self._connect()
master |   File "/usr/local/lib/python3.6/site-packages/discord/shard.py", line 271, in _connect
master |     f.result()
master |   File "/usr/local/lib/python3.6/site-packages/discord/shard.py", line 77, in poll
master |     await self.ws.poll_event()
master |   File "/usr/local/lib/python3.6/site-packages/discord/gateway.py", line 464, in poll_event
master |     msg = await self.recv()
master |   File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 441, in recv
master |     self._put_message_waiter.set_result(None)
master | asyncio.base_futures.InvalidStateError: invalid state
Hello. My entire bot, with 33k+ servers, has been offline for days because of this issue: 1 cluster out of 10 keeps crashing, so my token gets reset daily. I can't run the bot like this and I don't know how to fix it. I tried editing the d.py source locally to suppress the error, but that led to more problems, so I reverted it. I can supply whatever information Danny needs to debug this; for now I'm asking for help and opinions on what I should do, as this is pretty urgent for my bot.
Do you have anything that may be changing how your event loop operates, like uvloop, in use?
I don't do anything to the event loop
Can you use pip show websockets to show me which version of websockets you're using?
Name: websockets
Version: 7.0
Summary: An implementation of the WebSocket Protocol (RFC 6455 & 7692)
Home-page: https://github.com/aaugustin/websockets
Author: Aymeric Augustin
Author-email: aymeric.[email protected]
License: BSD
Location: /usr/local/lib/python3.6/site-packages
Requires:
Required-by: discord.py
Can confirm that this is also breaking on my bot with 14,800 servers.
Had to disable a lot of events just to let it run.
This exception seems to be caused by an issue rooted in websockets and/or asyncio itself. The best suggestion I can give without further investigation is to patch your websockets/protocol.py at line 441 to look like this:
    if self._put_message_waiter is not None:
        if not self._put_message_waiter.done():
            self._put_message_waiter.set_result(None)
        self._put_message_waiter = None
An issue should be raised on the websockets repo since I don't think there's anything we can do about this fundamentally.
Judging by the summary though, I'm guessing this is some bad data edge case that occurs from Discord sending some incomplete or broken data down the websocket for servers that have this "game developer perks" feature enabled.
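For anyone wondering why a done() check avoids the InvalidStateError in the traceback: an asyncio future that is already finished or cancelled rejects a second set_result. The standalone sketch below reproduces that with plain asyncio and no websockets code at all; notify is a hypothetical helper name, and the script assumes Python 3.7+ for asyncio.run. It only shows the mechanism of the guard and says nothing about why the future ended up in that state.

```python
import asyncio


def notify(waiter: asyncio.Future) -> bool:
    """Set the result only if the future is still pending.

    Returns True if the result was set, False if the future was
    already done (cancelled futures count as done).
    """
    if waiter.done():
        return False
    waiter.set_result(None)
    return True


async def demo() -> list:
    loop = asyncio.get_running_loop()
    outcomes = []

    # A cancelled future is done; calling set_result on it raises.
    cancelled = loop.create_future()
    cancelled.cancel()
    try:
        cancelled.set_result(None)
    except asyncio.InvalidStateError:
        outcomes.append("unguarded: InvalidStateError")

    # The guarded version skips the call instead of raising.
    outcomes.append(f"guarded on cancelled: {notify(cancelled)}")

    # A pending future is completed normally.
    pending = loop.create_future()
    outcomes.append(f"guarded on pending: {notify(pending)}")
    return outcomes


results = asyncio.run(demo())
for line in results:
    print(line)
```

A guard like this only avoids the exception; it does not repair whatever left the future in an unexpected state.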
That code snippet prevents the process from dying, but the websocket continuously closes, making the cluster responsible for those shards unresponsive.
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
I'm not sure how I'd word the issue on the websockets repo, as I don't know much about how the websocket stuff works myself; would it be possible for someone else to make the issue?
It seems to me that the solution right now is to downgrade websockets to 6.0, i.e. pip install -U 'websockets<7.0', and then file an issue with the websockets repository.
Getting the same websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason errors on my bot leading to clusters/the entire bot reconnecting several times per day. 32k servers if it's relevant.
I'm no longer having any issues with my bot. @CarlGroth I had the same issue and finally managed to look into it today; the cause was that one bot feature was making too many Discord API requests, which made Discord keep kicking my bot off.
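If a single feature is firing requests too quickly, one option is to space its calls out. The sketch below is a minimal asyncio throttle; Throttle, call_api and the 0.05 second interval are all hypothetical names and values chosen for illustration, not anything from discord.py or from the bot discussed here. It assumes Python 3.7+.

```python
import asyncio
import time


class Throttle:
    """Allow at most one call per `interval` seconds (hypothetical helper)."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._lock = asyncio.Lock()
        self._next_allowed = 0.0

    async def wait(self) -> None:
        # The lock serialises callers so each one gets its own time slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_allowed - now
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_allowed = max(now, self._next_allowed) + self._interval


async def demo() -> float:
    throttle = Throttle(interval=0.05)
    start = time.monotonic()

    async def call_api(i: int) -> None:
        await throttle.wait()
        # A real bot would make its Discord API request here.

    await asyncio.gather(*(call_api(i) for i in range(4)))
    return time.monotonic() - start


elapsed = asyncio.run(demo())
print(f"4 throttled calls took {elapsed:.2f}s")
```

The first call goes through immediately and each later call waits for its slot, so four calls take roughly three intervals.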
Closing since the main issue was resolved by dropping the websocket version.
Thanks Danny for finding the solution to the issue (by dropping the websocket version).
Note, https://github.com/aaugustin/websockets/issues/551 is probably the relevant websockets issue.
It seems to me that the solution right now is to downgrade websockets to 6.0, i.e. pip install -U 'websockets<7.0', and then file an issue with the websockets repository.
can't thank you enough for the solution
I'm pretty sure it's a bug in websockets, but not the one suggested in an earlier comment: https://github.com/aaugustin/websockets/issues/634
@aaugustin Is this still an issue with websockets==8.0.1? I've been using latest without any problems so far. Perhaps the constraint in setup.py can be lifted?
Yes, I believe it's fixed in websockets ≥ 8.0. It was a cancellation bug. The only known version with this bug is 7.0.
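Since 7.0 is the only release reported as affected, a bot could check for it at startup. This is a rough sketch rather than anything discord.py does itself: is_affected and installed_websockets_version are hypothetical helper names, and importlib.metadata needs Python 3.8+, which is newer than the 3.6 interpreter in the traceback above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def is_affected(websockets_version: str) -> bool:
    """True for 7.0, the only release this thread reports as affected."""
    return websockets_version.split(".")[:2] == ["7", "0"]


def installed_websockets_version() -> Optional[str]:
    """Return the installed websockets version, or None if it is missing."""
    try:
        return version("websockets")
    except PackageNotFoundError:
        return None


found = installed_websockets_version()
if found is None:
    print("websockets is not installed")
elif is_affected(found):
    print(f"websockets {found} is the affected release; "
          "consider pinning 'websockets<7.0' or moving to 8.0+")
else:
    print(f"websockets {found} is not the affected release")
```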
@discosultan The dependency can't be bumped since it conflicts with the current version requirements in the library (ws >= 8.0 requires python >= 3.6, we're on python >= 3.5).