Python-slack-sdk: RTM "socket.error: [Errno 32] Broken pipe" while receiving user status bulk updates

Created on 24 Aug 2017 · 14 Comments · Source: slackapi/python-slack-sdk

Description

When using the RTM feature in slackclient, the script crashes while Slack is sending bulk member status updates (type: user_change). The crashes occur on practically every cycle in which these "user_change" updates arrive. We have about 230 active Slack members...

socket.error: [Errno 32] Broken pipe

What type of issue is this? (place an x in one of the [ ])

  • [x] bug
  • [ ] enhancement (feature request)
  • [ ] question
  • [ ] documentation related
  • [ ] testing related
  • [ ] discussion

Requirements (place an x in each of the [ ])

  • [ ] I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • [ ] I've read and agree to the Code of Conduct.
  • [ ] I've searched for any related issues and avoided creating a duplicate issue.

Bug Report

```
  File "adm_contact_tasks.py", line 155, in <module>
    RtmHandler()
  File "adm_contact_tasks.py", line 128, in RtmHandler
    messages = Slack_Client.rtm_read()
  File "/usr/local/lib/python2.7/dist-packages/slackclient/client.py", line 126, in rtm_read
    json_data = self.server.websocket_safe_read()
  File "/usr/local/lib/python2.7/dist-packages/slackclient/server.py", line 155, in websocket_safe_read
    data += "{0}\n".format(self.websocket.recv())
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 293, in recv
    opcode, data = self.recv_data()
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 310, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 341, in recv_data_frame
    self.pong(frame.data)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 285, in pong
    self.send(payload, ABNF.OPCODE_PONG)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 234, in send
    return self.send_frame(frame)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 259, in send_frame
    l = self._send(data)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_core.py", line 423, in _send
    return send(self.sock, data)
  File "/usr/local/lib/python2.7/dist-packages/websocket/_socket.py", line 116, in send
    return sock.send(data)
  File "/usr/lib/python2.7/ssl.py", line 709, in send
    v = self._sslobj.write(data)
socket.error: [Errno 32] Broken pipe

```

Reproducible in:

This is the central part of the code, which crashes reproducibly.
The code closely follows the typical examples for Slack RTM client scripts...

```
# Python 2 excerpt: Slack_Client, PrintMessage and MesssageReceiver
# are defined elsewhere in the script.
import sys
import thread
import time

def RtmHandler():
    PrintMessage("Connect to RTM-Service...")
    if Slack_Client.rtm_connect():
        PrintMessage("Connected!")
        while True:
            messages = Slack_Client.rtm_read()
            print messages
            # Hand each batch of events to a new thread.
            thread.start_new_thread(MesssageReceiver, (messages,))
            time.sleep(1)
    else:
        PrintMessage("Could not connect to RTM-Service!")

if __name__ == '__main__':
    PrintMessage("Perform initial tests...")

    PrintMessage("Performing api.test...")
    apiTest = Slack_Client.api_call("api.test")
    if apiTest['ok'] is True:
        PrintMessage("OK!")
    else:
        PrintMessage(apiTest['error'])
        sys.exit(1)

    PrintMessage("Performing auth.test...")
    authTest = Slack_Client.api_call("auth.test")
    if authTest['ok'] is True:
        PrintMessage("OK!")
    else:
        PrintMessage(authTest['error'])
        sys.exit(1)

    RtmHandler()

```

slackclient version:
slackclient-1.0.7

python version:
2.7

OS version(s):
Ubuntu-based OS,
and also in a container based on python:2-onbuild

Steps to reproduce:

  1. Run the code above
  2. Wait for the bulk JSON "user_change" messages from Slack (printed on stdout)
  3. The socket.error: [Errno 32] Broken pipe shown above is expected on the first or second batch of "user_change" messages

All 14 comments

I opened this issue 6 weeks ago. Is nobody picking it up?

In the meantime I upgraded to the latest 1.0.9 and am still experiencing the same socket.errors.

root@036a897016f3:/usr/src/app# pip show slackclient
Name: slackclient
Version: 1.0.9
Summary: Slack API clients for Web API and RTM API
Home-page: https://github.com/slackapi/python-slackclient
Author: Slack Technologies, Inc.
Author-email: [email protected]
License: MIT
Location: /usr/local/lib/python2.7/site-packages
Requires: six, requests, websocket-client

I have the same problem. @nohir0 any workaround you found?

@nohir0 would you mind posting or sending me the rest of your methods? I had a bit of trouble reproducing this because I have to reconstruct your MesssageReceiver class, etc.

Sure @Roach no problem at all!

Here is the complete example code for the sample above. I only removed some private content, so it should still be good to run:
example.txt

In general, we see this identical issue with every slackclient RTM script we are running,
and in the meantime we have a couple of them in our environment.
Tested and reproduced with slackclient up to 1.1.0; I have just begun testing with 1.1.3 ...

Just let me know if you need anything else from me to reproduce / debug this.

@knesenko
I currently live with a messy workaround: I catch the exception and simply call the RTM method again.
Your bot is back in the game quickly, but of course it could miss something during the seconds it takes to restart ...

Example:

def RtmHandler():

    PrintMessage("Connect to RTM-Service...")
    try: 
        if SC_APPLUSBOT.rtm_connect():
            PrintMessage("Connected!")
            while True:
                messages = SC_APPLUSBOT.rtm_read()
                for event in messages:
                    print event
                time.sleep(1)
        else:
            PrintMessage("Could not connect to RTM-Service!")

    except Exception as e:
        PrintMessage("Unexpected error: " + str(e))
        PrintMessage("Start dirty self-healing recovery...")
        RtmHandler()
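
The workaround above calls RtmHandler() recursively, so every recovery adds a stack frame and a long-running bot could eventually hit Python's recursion limit. A minimal sketch of the same idea as a loop with exponential backoff follows. The names run_rtm_forever, handle_event, max_backoff, max_failures and sleep are hypothetical and not part of slackclient; the only client methods used are rtm_connect() and rtm_read() as they appear in this thread. The sketch is written for Python 3.

```python
import time


def run_rtm_forever(client, handle_event, max_backoff=60, max_failures=None,
                    sleep=time.sleep):
    """Read RTM events in a loop and reconnect after errors, without recursion.

    All parameter names are illustrative. `client` only needs rtm_connect()
    and rtm_read(), the two calls used in the workaround above.
    """
    backoff = 1
    failures = 0
    while True:
        try:
            if not client.rtm_connect():
                raise RuntimeError("could not connect to RTM service")
            while True:
                events = client.rtm_read()
                # A successful read means the connection works again.
                backoff, failures = 1, 0
                for event in events:
                    handle_event(event)
                sleep(1)
        except Exception as exc:  # same broad catch as the workaround
            failures += 1
            if max_failures is not None and failures >= max_failures:
                raise
            print("RTM error: {0}; reconnecting in {1}s".format(exc, backoff))
            sleep(backoff)
            backoff = min(backoff * 2, max_backoff)
```

As with the original workaround, events sent while the bot is reconnecting can still be missed, and an exception raised inside handle_event also triggers a reconnect.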

I'm currently working on adding some reconnect logic to the client. Here's a WiP PR: https://github.com/slackapi/python-slackclient/pull/297

There are a few issues I'm trying to work around to make this not rely on catching an exception as an indicator that the socket's been disconnected...

There is news!
I have been running one of my RTM bots for the last 24 hours with the latest slackclient 1.1.3,
and it is not crashing any more, both when running natively and inside Docker containers.

@Roach I see you have not yet released your code update, and I assume no other fixes have been made in that direction on the slackclient side. What I noticed is that the pip upgrade also upgraded websocket-client from 0.46.0 to 0.47.0.
So it is likely that they fixed something on their side...?

I will now go on and upgrade my other bots to see if they run stably too.
@knesenko It would be great if you could confirm that it also works on your side with the latest code...
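
When comparing behaviour before and after an upgrade like this, it can help to log the installed versions of both packages at bot start-up. A small sketch, assuming Python 3.8 or newer (importlib.metadata is not available on the Python 2.7 setup described earlier); the function name installed_versions is made up for this example.

```python
from importlib import metadata


def installed_versions(packages=("slackclient", "websocket-client")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print("{0}: {1}".format(name, version or "not installed"))
```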

No, unfortunately that was a false alarm, @Roach.
It is still crashing at random intervals, sometimes once a day, sometimes several times per day ...

It is just not happening as often, I think because there was a change on the Slack side for the frequent "user_change" events: they now arrive in smaller batches, which are easier for the RTM client to process.

08/03/18 15:04:28 | Connected!
Traceback (most recent call last):
  File "adm_contact_tasks.py", line 156, in <module>
    RtmHandler()
  File "adm_contact_tasks.py", line 129, in RtmHandler
    messages = Slack_Client.rtm_read()
  File "/usr/local/lib/python2.7/site-packages/slackclient/client.py", line 135, in rtm_read
    json_data = self.server.websocket_safe_read()
  File "/usr/local/lib/python2.7/site-packages/slackclient/server.py", line 194, in websocket_safe_read
    data += "{0}\n".format(self.websocket.recv())
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 300, in recv
    opcode, data = self.recv_data()
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 317, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 348, in recv_data_frame
    self.pong(frame.data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 291, in pong
    self.send(payload, ABNF.OPCODE_PONG)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 240, in send
    return self.send_frame(frame)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 265, in send_frame
    l = self._send(data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 430, in _send
    return send(self.sock, data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_socket.py", line 117, in send
    return sock.send(data)
  File "/usr/local/lib/python2.7/ssl.py", line 709, in send
    v = self._sslobj.write(data)
socket.error: [Errno 32] Broken pipe
09/03/18 19:04:19 | Perform initial tests...
09/03/18 19:04:19 | Performing api.test...
09/03/18 19:04:19 | OK!
09/03/18 19:04:19 | Performing auth.test...
09/03/18 19:04:19 | OK!
09/03/18 19:04:19 | Connect to RTM-Service...
09/03/18 19:04:20 | Connected!
Traceback (most recent call last):
  File "adm_contact_tasks.py", line 156, in <module>
    RtmHandler()
  File "adm_contact_tasks.py", line 129, in RtmHandler
    messages = Slack_Client.rtm_read()
  File "/usr/local/lib/python2.7/site-packages/slackclient/client.py", line 135, in rtm_read
    json_data = self.server.websocket_safe_read()
  File "/usr/local/lib/python2.7/site-packages/slackclient/server.py", line 194, in websocket_safe_read
    data += "{0}\n".format(self.websocket.recv())
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 300, in recv
    opcode, data = self.recv_data()
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 317, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 348, in recv_data_frame
    self.pong(frame.data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 291, in pong
    self.send(payload, ABNF.OPCODE_PONG)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 240, in send
    return self.send_frame(frame)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 265, in send_frame
    l = self._send(data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_core.py", line 430, in _send
    return send(self.sock, data)
  File "/usr/local/lib/python2.7/site-packages/websocket/_socket.py", line 117, in send
    return sock.send(data)
  File "/usr/local/lib/python2.7/ssl.py", line 709, in send
    v = self._sslobj.write(data)
socket.error: [Errno 32] Broken pipe

I am also seeing this bug (same trace as above). I will try adding some additional reconnect logic, but I thought this would be handled by auto_reconnect=True.

Traceback (most recent call last):
[...]
  File "./scripts/slack-server", line 69, in run_bot
    for event in slack_client.rtm_read():
  File "/opt/conda/lib/python3.6/site-packages/slackclient/client.py", line 235, in rtm_read
    json_data = self.server.websocket_safe_read()
  File "/opt/conda/lib/python3.6/site-packages/slackclient/server.py", line 278, in websocket_safe_read
    data += "{0}\n".format(self.websocket.recv())
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 310, in recv
    opcode, data = self.recv_data()
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 327, in recv_data
    opcode, frame = self.recv_data_frame(control_frame)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 358, in recv_data_frame
    self.pong(frame.data)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 301, in pong
    self.send(payload, ABNF.OPCODE_PONG)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 250, in send
    return self.send_frame(frame)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 275, in send_frame
    l = self._send(data)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_core.py", line 445, in _send
    return send(self.sock, data)
  File "/opt/conda/lib/python3.6/site-packages/websocket/_socket.py", line 117, in send
    return sock.send(data)
  File "/opt/conda/lib/python3.6/ssl.py", line 941, in send
    return self._sslobj.write(data)
  File "/opt/conda/lib/python3.6/ssl.py", line 642, in write
    return self._sslobj.write(data)
BrokenPipeError: [Errno 32] Broken pipe

I am just running a simple message-response bot:

def reply_in_thread(client, event, message):
    reply_args = {"channel": event["channel"], "text": message}
    if "thread_ts" in event:
        reply_args["thread_ts"] = event["thread_ts"]
    else:
        reply_args["thread_ts"] = event["ts"]
    return client.api_call("chat.postMessage", **reply_args)

slack_client = SlackClient(bot_key)
if not slack_client.rtm_connect(with_team_state=False, auto_reconnect=True):
    raise Exception("Bot failed to connect.")

while True:
    for event in slack_client.rtm_read():
        if not (event["type"] == "message" and "subtype" not in event):
            continue

        my_reply = do_some_work(event) # Can take a minute or so
        response = reply_in_thread(slack_client, event, my_reply)
    time.sleep(1)

The exception occurs in rtm_read.
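
In the traceback, the error is raised while rtm_read answers a ping, so one possible explanation (an assumption, not confirmed in this thread) is that the connection was dropped while the loop was blocked in do_some_work for a minute or more. Under that assumption, a sketch of one mitigation is to hand slow work to a worker thread so the read loop keeps polling. The names start_worker, is_plain_message and read_loop are hypothetical; only rtm_read() comes from the code above.

```python
import queue
import threading
import time


def start_worker(handle_event):
    """Start a daemon thread that processes queued events one at a time."""
    jobs = queue.Queue()

    def worker():
        while True:
            event = jobs.get()
            try:
                handle_event(event)
            except Exception as exc:
                # Keep the worker alive if a single event fails.
                print("handler failed: {0}".format(exc))
            finally:
                jobs.task_done()

    threading.Thread(target=worker, daemon=True).start()
    return jobs


def is_plain_message(event):
    """Same filter as the loop above: a message event without a subtype."""
    return event.get("type") == "message" and "subtype" not in event


def read_loop(client, jobs, poll_interval=1):
    """Poll rtm_read() frequently and queue slow work instead of running it."""
    while True:
        for event in client.rtm_read():
            if is_plain_message(event):
                jobs.put(event)
        time.sleep(poll_interval)
```

With the bot above, this could be wired up as `jobs = start_worker(lambda e: reply_in_thread(slack_client, e, do_some_work(e)))` followed by `read_loop(slack_client, jobs)`. Whether a SlackClient instance can safely be shared between threads is not established here and would need checking.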

I am also experiencing this issue, but only on the deployment EC2 instance.
During development on Mac this issue does not happen.

This is my program flow:

  • Connect to Slack, and start main program loop with:
    if sc.rtm_connect(with_team_state=False, auto_reconnect=True):
  • Immediately afterwards, rtm_read is called for the first time, with success.
  • After that, queries are run that take 10 to 20 minutes.
  • Results are published to slack, with success.
  • rtm_read is called again (a second time), which now crashes
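
In the flow above, the connection is left unread for 10 to 20 minutes between the two rtm_read calls. One defensive option, sketched here under assumptions, is to reconnect before reading whenever the previous read was too long ago. The class name IdleAwareReader and the max_idle threshold are hypothetical, and the right threshold is not known from this thread; the rtm_connect keyword arguments are the ones used in the flow above.

```python
import time


class IdleAwareReader:
    """Reconnect before reading if the last read was longer than max_idle ago.

    Illustrative helper, not part of slackclient. `client` needs
    rtm_connect(...) and rtm_read().
    """

    def __init__(self, client, max_idle=60, clock=time.monotonic):
        self.client = client
        self.max_idle = max_idle
        self.clock = clock
        self.last_read = None  # time of the previous successful read

    def read(self):
        now = self.clock()
        idle_too_long = (self.last_read is not None
                         and now - self.last_read > self.max_idle)
        if idle_too_long:
            # Assume the old socket is stale and open a fresh connection.
            connected = self.client.rtm_connect(with_team_state=False,
                                                auto_reconnect=True)
            if not connected:
                raise RuntimeError("could not reconnect to RTM service")
        events = self.client.rtm_read()
        self.last_read = self.clock()
        return events
```

Events that Slack sent during the idle period are not replayed after such a reconnect, so this only avoids the crash and does not recover missed messages.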

Is anyone facing an issue similar to this with v2.x (I mean not with v1.x)?

As this project hasn't been actively supporting v1 for a while, please allow us to close this issue now.

The current latest stable version is v2.9.3. Here is the migration guide to v2 series. Also, we're going to release v3 soon: https://slack.dev/python-slack-sdk/

Both newer major versions should provide a more stable implementation for this use case. It would be appreciated if you could try them. We hope the newer versions resolve your issue.

My team uses errbot v5.2 and slackclient v1.3.1. Our Slack workspace is huge, and rtm.start is naturally more difficult to use with Enterprise Grid and other large workspaces.

We often got the rtm_read error. After troubleshooting, we found that the bottleneck was CPU: the old instance type was t3a.medium, which often crashed when errbot restarted and needed about 6 minutes to pull information from Slack.

After changing to c5.large, I tested 3 restarts with no crashes, and pulling information from Slack was faster, taking only 4 minutes.
