Flask-SocketIO: Server doesn't receive disconnect event

Created on 11 Apr 2017  ·  28 Comments  ·  Source: miguelgrinberg/Flask-SocketIO

The server-side disconnect event is never triggered when the Android app loses WiFi connectivity, for instance if I shut down WiFi manually or turn off the Android phone.

I've adjusted the Server side params and now the client detects the disconnection, but not the server:

from flask_socketio import SocketIO

# Shortened ping settings, in seconds
params = {
    'ping_timeout': 10,
    'ping_interval': 5
}
socketio = SocketIO(**params)

It's not a good idea to use such short ping timeout and interval numbers. That is going to generate a lot of traffic in your system just for the ping packets. I recommend you go back to the defaults and troubleshoot the disconnect problem instead.

Was your phone connecting over WebSocket or long-polling? With long-polling it takes longer to detect that a client went offline; it may be a minute or two. Any chance you did not wait long enough?

The server I'm working on is for a local home network, so small number of clients and low network latency is normally expected.
There can be several fallback servers on the same network, so I need high speed connection/disconnection events to change communication to a possible fallback server, even if a few false positives are expected.

I'm still working on a magic number and on a few live scenario test cases.
I managed to configure how the client times out with the server, but the on_disconnect event is not triggered on the server when the client abruptly closes the connection.

Also, how can I debug to see which connection method is being used? It's possible that long-polling is being used.

Do you see constant HTTP requests flying by? If you do, then you are using long polling.

Ok, so I activated some logging properties, and now I see the actual requests.
Answering your question, WebSocket is being used.

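from flask_socketio import SocketIO

params = {
    'ping_timeout': 10,
    'ping_interval': 5
}

# Enable Socket.IO and Engine.IO logging to see the individual requests
socketio = SocketIO(logger=True, engineio_logger=True, **params)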

The connection starts with HTTP and then a websocket upgrade request is issued.

Also, I still do not see a client disconnect event on the server, even after reestablishing the WiFi after a few minutes.

Is there a faster way for the server to know the client disconnected, or is this behaviour expected?

For WebSocket, the disconnect occurs when the socket is closed. I don't know how your Android device connects; if it is using a proxy server, for example, the phone losing its connection may not cause the actual connection to the server to be lost.

As I said above, the ping numbers are not meant for this. You seem to have ignored my previous statement, so I'm telling you again: if you think reducing those numbers will make disconnections faster, you are wrong. For long-polling transports it may achieve that (at a pretty high cost), but it does not help for WebSocket connections, so I suggest you go back to the defaults.

Ok, I understand what you're saying but I'm not sure what is really happening.
I've adjusted to the default values as instructed and here are my logs to better explain:

(2306) wsgi starting up on http://0.0.0.0:5000
(2306) accepted ('187.33.230.94', 52450)
18777ef24c764a53838e862019cb17e0: Sending packet OPEN data {'sid': '18777ef24c764a53838e862019cb17e0', 'upgrades': ['websocket'], 'pingTimeout': 60000, 'pingInterval': 25000}
DEBUG 2017-04-12 11:04:01,222 MainThread                on_connect      (22):    Client connected
18777ef24c764a53838e862019cb17e0: Sending packet MESSAGE data 0
187.33.230.94 - - [12/Apr/2017 11:04:01] "GET /socket.io/?EIO=3&transport=polling HTTP/1.1" 200 381 0.005528
18777ef24c764a53838e862019cb17e0: Received packet MESSAGE data 20["join",{"room":"room_name"}]
received event "join" from 18777ef24c764a53838e862019cb17e0 [/]
DEBUG 2017-04-12 11:04:01,566 MainThread                on_join         (32):    user has joined the room 'room_name'
18777ef24c764a53838e862019cb17e0 is entering room room_name [/]
18777ef24c764a53838e862019cb17e0: Sending packet MESSAGE data 30[{"hello":"world"},{}]
187.33.230.94 - - [12/Apr/2017 11:04:01] "POST /socket.io/?EIO=3&sid=18777ef24c764a53838e862019cb17e0&transport=polling HTTP/1.1" 200 199 0.038673
(2306) accepted ('187.33.230.94', 52451)
(2306) accepted ('187.33.230.94', 52452)
187.33.230.94 - - [12/Apr/2017 11:04:01] "GET /socket.io/?EIO=3&sid=18777ef24c764a53838e862019cb17e0&transport=polling HTTP/1.1" 200 627 0.001076
18777ef24c764a53838e862019cb17e0: Received request to upgrade to websocket
18777ef24c764a53838e862019cb17e0: Sending packet NOOP data None
187.33.230.94 - - [12/Apr/2017 11:04:01] "GET /socket.io/?EIO=3&sid=18777ef24c764a53838e862019cb17e0&transport=polling HTTP/1.1" 200 215 0.000931
18777ef24c764a53838e862019cb17e0: Upgrade to websocket successful
18777ef24c764a53838e862019cb17e0: Received packet PING data None
18777ef24c764a53838e862019cb17e0: Sending packet PONG data None
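
A log like this one can also be checked mechanically to see which transport ended up in use. The following is only a rough sketch: detect_transport is a hypothetical helper, not part of Flask-SocketIO, and it assumes the log wording shown in this thread (logger=True and engineio_logger=True) for a single client.

```python
def detect_transport(log_lines):
    """Return 'websocket', 'polling', or 'unknown' for a single-client log.

    Assumes the message wording seen in this thread's log excerpt.
    """
    saw_polling = False
    for line in log_lines:
        # Engine.IO logs this line once the upgrade has completed.
        if "Upgrade to websocket successful" in line:
            return "websocket"
        # Access-log entries for long-polling requests carry this query string.
        if "transport=polling" in line:
            saw_polling = True
    return "polling" if saw_polling else "unknown"
```

If only transport=polling requests keep appearing and the upgrade message never shows up, the client is most likely still on long polling.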

After I turn off the Phone's WiFi, nothing else is logged and on the phone the CONNECTION_ERROR event only occurs after the ping timeout (30 seconds or so).

I'm using the SocketIO Java Client Library in my Android app, and there are no proxies or firewalls between the phone and the server, just a normal local LAN router.

Can you verify that 187.33.230.94 is the IP address of your phone?

Also, do you have data from your cellular provider? Any chance the connection is not going through Wi-Fi but through your cellular provider's network?

Yes, that is my phone's IP.
In the particular case of the logs I sent, I was connecting remotely to the server, which is in my house.
I've tested using my Office's WiFi, using my Phone Cellular Network and also using WiFi on the same local network as the server (tested last night while I was at home).

In all those cases I turn on Airplane Mode to trigger an abrupt disconnection.
The symptoms are always the same.

Actually, that is my Office's external IP.

Are disconnects detected when you connect to this application from a web browser using the same wi-fi network you were using for the phone?

Yes, I'm facing the same problem too!
When my client disconnects abruptly, the server side does not trigger the 'disconnect' event, which is really annoying and can cause bugs.

By abrupt disconnection I mean something like Wi-Fi turning off, the battery running out, etc., after which the client SDK has no chance to send the 'disconnect' event through its own logic.
I thought there might be some mechanism for the server to detect a disconnected client, but after waiting for about 5 minutes, I guess there is not.

I've been playing with a Chrome client on Android and found the following:

  • If the client calls the disconnect function, the server receives the disconnect event.
  • If you close a tab that has a connection to the server, the server still receives a disconnect event.
  • If you switch the phone to airplane mode, or turn it off while there is a connection to the server, the socket is not properly disconnected, so the server does not receive a disconnect event.

This behavior can be seen when using eventlet or gevent, and is also present with other websocket servers.

To detect those disconnections, the server will need to keep track of PING packets sent by clients, like it does for long-polling. Any client that hasn't sent a PING packet for longer than the configured ping_timeout will be disconnected. Unfortunately this means that a client that loses its connection abruptly will be declared disconnected after 60 seconds in the worst case, when using the default ping timeout. The disconnect may be detected earlier if the server needs to send an event to that client.
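
The bookkeeping described here can be pictured with a small sketch. This is not python-engineio's actual code: PingWatchdog, max_silence and the injectable clock are hypothetical names, and the 60 second default is simply the worst case mentioned in this thread.

```python
import time


class PingWatchdog:
    """Tracks the last PING time per client and reports clients that went silent."""

    def __init__(self, max_silence=60.0, clock=time.monotonic):
        self.max_silence = max_silence  # seconds without a PING that are tolerated
        self.clock = clock              # injectable so the logic can be exercised
        self.last_ping = {}             # sid -> time of the most recent PING

    def on_connect(self, sid):
        self.last_ping[sid] = self.clock()

    def on_ping(self, sid):
        self.last_ping[sid] = self.clock()

    def expire(self):
        """Drop and return the sids that have been silent for too long."""
        now = self.clock()
        gone = [sid for sid, seen in self.last_ping.items()
                if now - seen > self.max_silence]
        for sid in gone:
            del self.last_ping[sid]
        return gone
```

A server loop would call expire() periodically and run its disconnect handler for every sid returned, which is why detection can lag the actual loss of connectivity by up to max_silence seconds.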

Exactly. So does this mean it’s a problem with Android’s networking libraries?
I saw this problem in an Android app using the socketio java client library and you reported the same problem in a Chrome page, so I assume it’s not an implementation issue but rather a problem (feature?) in how Android handles abrupt disconnections.

Is lowering the ping interval an option?

It's just the way TCP works. It would be nice if Android cleanly disconnected all sockets before going into airplane mode, but still, that would not solve the issue of an abrupt loss of signal.

Lowering the ping interval is not a good idea, as that increases the ping/pong traffic, degrading performance as the number of connected clients grows.
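
To get a feel for the cost, here is a back-of-the-envelope estimate. It is a simplification that assumes every client sends one PING per interval and the server answers each with one PONG, as in the log earlier in this thread; ping_packets_per_second is a hypothetical helper name.

```python
def ping_packets_per_second(clients, ping_interval):
    """Estimate PING + PONG packets per second across all connected clients.

    Assumes one PING and one PONG per client per interval (in seconds).
    """
    if ping_interval <= 0:
        raise ValueError("ping_interval must be positive")
    return 2 * clients / ping_interval


# 1000 clients with the default 25 second interval, then with 5 seconds
print(ping_packets_per_second(1000, 25))  # 80.0
print(ping_packets_per_second(1000, 5))   # 400.0
```

Under these assumptions the traffic scales inversely with the interval, so dividing the interval by five multiplies the ping/pong traffic by five.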

@ffleandro @AZLisme If you upgrade to the master branch of python-engineio, I think you will have your lost connections from Android time out and call the disconnect handler. This will take some time as I explained before, around one minute if you are using the default timeouts.

I still need to code the asyncio version. Once I have everything done I'll release a new version. Let me know how this change works for you.

@miguelgrinberg I don't think it's working.

I created a script here to test whether it detects a lost client connection.
What I do is:

  1. Create a virtual environment and install the requirements, including your python-engineio's latest master branch.
  2. Run the script
  3. Visit its hosted page / with my phone, which runs a little JavaScript to send 10 messages to the server, one by one at an interval of 1 second. Each message is echoed back by the server.
  4. Turn on Airplane Mode on my phone during the process.

It turns out that:

  1. The client detects the lost connection immediately, and prints out the event.
  2. The server didn't trigger any event after the connection was lost; I waited for about 5 minutes.

Let me know if I'm doing anything wrong, or if you get a different result.
Thank you, I appreciate your work! :)

@AZLisme And you are sure you installed the engineio package directly from GitHub? I have not released this fix to PyPI yet.

Looking at your script, messing with the timeouts is almost never a good idea. In particular, reducing the ping interval means an increase of ping/pong traffic. In your case it'll go up by 6x. Maybe okay for a test, but not good for production. Also setting both values to the same number can be problematic and cause weird bugs in the client, since in high loads it may cause the ping/pong pairs to cross, meaning that the client may need to send the next ping before it gets a response to the outstanding one. I suggest you keep the 25/60 ratio used by default, so maybe go 10/24.
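
The 10/24 suggestion follows from keeping the default proportion between the two values. A tiny sketch of that arithmetic, where scaled_ping_timeout and the two constants are illustrative names rather than anything provided by the library:

```python
DEFAULT_PING_INTERVAL = 25  # seconds, default discussed in this thread
DEFAULT_PING_TIMEOUT = 60   # seconds, default discussed in this thread


def scaled_ping_timeout(ping_interval):
    """Return a ping timeout that keeps the default 25/60 ratio to the interval."""
    return ping_interval * DEFAULT_PING_TIMEOUT / DEFAULT_PING_INTERVAL


print(scaled_ping_timeout(10))  # 24.0
```

Keeping the timeout well above the interval leaves room for a PONG to arrive before the next PING is due.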

All my tests for this fix used the default ping interval and timeout. I'll try your modified numbers to see if that has an effect.

I also noticed some interesting behavior when exiting the program with Ctrl-C, as this issue pointed out.

If every client connection is disconnected correctly, Ctrl-C works fine.
If any abrupt disconnection happens, the exit process freezes; it feels like the server is still waiting for that connection to be closed.

That was what I got with my script and some experiments.

@miguelgrinberg Yes, I'm using the GitHub version of python-engineio. As I wrote in the top lines of the script, the commands are:

pip install flask flask-socketio eventlet
pip uninstall -y python-engineio
pip install git+https://github.com/miguelgrinberg/python-engineio.git

@AZLisme what's your pip freeze output?

@miguelgrinberg my pip freeze outputs:

appdirs==1.4.3
click==6.7
enum-compat==0.0.2
eventlet==0.21.0
Flask==0.12.1
Flask-SocketIO==2.8.6
greenlet==0.4.12
itsdangerous==0.24
Jinja2==2.9.6
MarkupSafe==1.0
packaging==16.8
pyparsing==2.2.0
python-engineio==1.4.0
python-socketio==1.7.4
six==1.10.0
Werkzeug==0.12.1

And I checked engineio's source code located in venv/lib/python3.6/site-packages/engineio, it does match your latest commit 9a9689, in which 3 files are modified: async_eventlet.py, asyncio_socket.py and socket.py.

By the way I'm using Python 3.6.1, if that matters.

@AZLisme okay, I can reproduce with your script, thanks. The difference between my test and yours is that in my test the server is initiating events on its own. Your script doesn't do any of that: once the phone goes into airplane mode, the server never needs to send anything, it just waits for the other side to send. I need to investigate why the code that waits for the client to send is not timing out.
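
The general problem of a receiver waiting on a silent peer can be shown with plain sockets. This is only an illustration with the standard library, not how python-engineio or eventlet implement the wait; wait_for_data is a hypothetical helper, and a local socket pair stands in for the server and the phone.

```python
import socket


def wait_for_data(sock, timeout):
    """Wait up to `timeout` seconds for data; return None if the peer stays silent."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


# A connected pair of sockets stands in for server and client.
server_side, client_side = socket.socketpair()
try:
    # The "client" never sends anything and never closes its socket,
    # like a phone that dropped off the network abruptly.
    result = wait_for_data(server_side, timeout=0.2)
    print("timed out" if result is None else result)
finally:
    server_side.close()
    client_side.close()
```

Without a timeout on the receive, the same call would block indefinitely, which matches the symptom of a server that never notices the client is gone.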

@AZLisme could you refresh your python-engineio from master once again and repeat your test? I'm getting the disconnects now, using your test script. Thanks!

@miguelgrinberg I'm getting disconnects too! First time being so happy about "disconnection" :)
👍 Good job, and thank you a lot

python-engineio 1.5.0 with this fix is now on pypi. Thanks for all your help!
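
To check whether an environment already has a release containing the fix, something along these lines may help. It is a sketch that assumes Python 3.8+ (for importlib.metadata) and plain dotted numeric version strings; parse_version and has_disconnect_fix are hypothetical helper names.

```python
from importlib import metadata


def parse_version(text):
    """Turn a plain dotted version such as '1.5.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def has_disconnect_fix(package="python-engineio", minimum="1.5.0"):
    """Return True or False, or None when the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


print(has_disconnect_fix())
```

Comparing tuples of integers avoids the pitfall of comparing version strings character by character, where '1.10.0' would sort before '1.5.0'.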

Hi @miguelgrinberg
Just to see if I got this right: in case of an abrupt disconnection from an Android client (airplane mode or WiFi disconnect), the disconnect event will be triggered on the server after ping_timeout seconds (default is 60 seconds). Also, I'm using flask-socketio, not python-engineio. Am I right?

@NamanJain2050 the disconnection will occur in at most 60 seconds; it could happen sooner. You are actually using python-engineio because Flask-SocketIO uses it.
