Signal-android: Network connection seems to get stuck (without Google services)

Created on 5 Apr 2018 · 63 Comments · Source: signalapp/Signal-Android

Re-filing #7420 due to #7598.



Bug description

This is a revival of #7420 and may be the same issue as #6447, #6644 and #6880. I am using the official Signal APK from https://signal.org/android/apk/, so not Noise or anything like that.

For months now, Signal on Android’s network connection seems to “freeze” sporadically (once or twice per day), leaving it unable to send or receive any messages (even things like read confirmations) for long periods of time. During this time, messages can be sent and received from the Signal desktop client normally and without any delay, but none of these messages show up in the Android client. It does not make a difference whether the Android client is open or any conversations are opened – nothing happens.

It seems that Signal can be made “un-stuck” by resetting the phone’s network connection (e.g. disabling WiFi so that the phone switches to cellular data) or using “force stop” to kill and restart the Signal Android client completely. Upon one of these events, all of the messages from the intervening time period will appear in the Android client all at once.

Steps to reproduce

Unfortunately, since this seems to happen erratically, concrete steps to reproduce are a bit difficult to determine.

  • Use the phone normally (not sure if longer periods of inactivity are required).
  • Have someone else send you a message and/or send messages yourself from the Signal desktop client.

Actual result: No new messages (sent or received) are displayed in the Signal Android app, even if it is open. This occurs when the phone and the desktop are on the same WiFi network, so the issue is not due to poor connectivity.
Expected result: Messages should appear immediately in the Signal Android client.

Device info

Device: OnePlus One
Android version: 7.1.2 (LineageOS 14.1). I have never had Google Play services installed and Android Doze is disabled (“Battery optimization: Not optimized”) for Signal.
Signal version: 4.17.5 (but I’ve been having this issue for months now, so includes many earlier versions)

Link to debug log

https://gist.github.com/anonymous/b1f29ef755a08091528fefff299057d9

At 15:25, my phone switched to cellular data (because I went out of WiFi range). At that point, all the messages from the last hour or so appeared all at once.

This debug log was taken with Signal version 4.15.5, but the issue still occurs with the latest version (4.17.5).

help wanted

All 63 comments

Thanks for the well detailed issue report. I don't have a gapps-free device, so I can't reproduce this. My preference is to let the free software folks handle the free software development, so I've added a "help wanted" tag to the issue.

"I don't have a gapps-free device, so I can't reproduce this."

No problem if you don't have one. Just tell me where to ship you a Pixel 2 with copperheados.

Thanks for the offer @Kamkata, but I already have too many phones and not enough time. I'd prefer it if folks who believe in free software took on some development instead.

This (cleaned-up) issue is connected to this and has some relevant information and more dumps: #6732

Also I would like to point you to this pull request regarding the issues: signalapp/Signal-Android#7388

Would be awesome if someone can finish it :tada:

It happens 10 times a day, every time the phone (LineageOS without Google Services) switches from LTE to WiFi. To cure it, I kill Signal by pressing and holding the back button on the Samsung phone. After the next start it works fine and all missing messages arrive.
But I can observe the same symptom on some non-rooted Android phones with legitimate GApps. The cure in that case is to bring the process list to the front and close all running apps.

A reboot or cold restart doesn't help, because LTE always comes up first and is switched to WiFi a few seconds later, which generates an LTE->WiFi event, and Signal is dead.

I can confirm this Signal behavior; I have been seeing it for a long time now. It happens to me about once a day.

My current workaround: kill the Signal process with a long press on the back button and restart it. After that everything works fine again.

Device info

Device: Fairphone 2 (FP2)
Android version: 7.1.2 (LineageOS 14.1 without google stuff)
Signal (APK update) version: 4.17.5

It also happens on iPhones and on new Android devices which have not been modified. Not as often, of course, but it is what it is.
It seems they don't care.
The Signal process must be killed in some way to bring it back to life.
I don't know, but I use all the available communicators and all of them handle network changes without a problem... They work perfectly without Google Services...

@maedoredyti Can you please add a debug log from one of the Android devices running GCM?

Although the attached debug log doesn't show anything out of the ordinary, I believe the problems discussed in this thread are probably caused by two separate issues. One is the failure to keep the websocket to the Signal server alive, discussed in #6644, #6732 and maybe elsewhere.

A separate issue, also discussed in #6732 (see this for example) is that on some devices the connection to the Signal client doesn't survive a change of network type.

Can people with more technical insight into the issues perhaps tell us what information is needed to pin them down further? Since the issue persists on my device, I can provide debug logs and information on my device/software. The thing is that I don't know when best to capture the logs. After starting the phone? After I notice the issue? After "soft rebooting" so the connection works again?

Unless this is an entirely different problem from the ones mentioned in the various other tickets, referred to in this thread, I don't think there's much more to investigate for the time being.

As I've mentioned, there are probably two separate issues involved in this. One of these (#6644) is more basic in nature, since it renders Signal completely unusable for many, if not all devices without GCM, and several PRs have been submitted for it. One of them (#7723, signalapp/libsignal-service-java#53) is now in the review phase, and looks like it might have a shot at being accepted.

I've been testing a patch, which addresses the other one, the one involving changes of network type (between WiFi, mobile, etc.) for a week now and it seems to be working pretty well. It attacks the problem in two separate ways by recycling the connection both when a network change is detected and also when a keep-alive request is never responded to (which is based on/inspired by signalapp/libsignal-service-java#49, which has an excellent basic mode of operation, but would likely cause more freezes, at least on my device, due to the way it's implemented).

Having both methods is not absolutely necessary, but the first is more rapid and should leave the device without a connection for only a few seconds when it detects a network change, while the other is more universal but would take longer to act (somewhere between 1-2 minutes). I have a feeling the PR will have a hard time getting accepted, but in any case I can't submit it until the first one is through, as it concerns the same functionality of Signal and needs to be based on it.
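
For readers who want a concrete picture, the two mechanisms described above could be sketched roughly as follows. This is not the code from the actual patch, which is not reproduced in this thread; the class `ConnectionWatchdog`, its method names and the 60-second timeout (picked from within the "1-2 minutes" mentioned above) are all hypothetical. The caller is assumed to feed in network-change events and keep-alive events, and to tear down and reopen the websocket whenever either check says so.

```java
import java.util.Objects;

/**
 * Hypothetical sketch, not part of Signal or libsignal-service-java.
 * Decides when a long-lived connection should be recycled, using
 * (1) a fast path on network type changes and
 * (2) a slower, more universal path on unanswered keep-alives.
 */
final class ConnectionWatchdog {
    private final long keepAliveTimeoutMillis; // assumed value, chosen by the caller
    private String lastNetworkType;            // e.g. "WIFI", "MOBILE"; null means no network
    private long pendingKeepAliveSentAt = -1;  // -1 means no unanswered keep-alive

    ConnectionWatchdog(long keepAliveTimeoutMillis, String initialNetworkType) {
        this.keepAliveTimeoutMillis = keepAliveTimeoutMillis;
        this.lastNetworkType = initialNetworkType;
    }

    /** Fast path: returns true if the connection should be recycled right away. */
    boolean onNetworkChanged(String newNetworkType) {
        boolean changed = !Objects.equals(lastNetworkType, newNetworkType);
        lastNetworkType = newNetworkType;
        if (changed) {
            // Any keep-alive sent over the old network will never be answered.
            pendingKeepAliveSentAt = -1;
        }
        // With no network at all there is nothing to reconnect to yet.
        return changed && newNetworkType != null;
    }

    /** Record that a keep-alive request went out; the oldest unanswered one is tracked. */
    void onKeepAliveSent(long nowMillis) {
        if (pendingKeepAliveSentAt < 0) {
            pendingKeepAliveSentAt = nowMillis;
        }
    }

    /** Record that the server answered; the connection is demonstrably alive. */
    void onKeepAliveResponse() {
        pendingKeepAliveSentAt = -1;
    }

    /** Slow path: true if a keep-alive has gone unanswered for too long. */
    boolean isConnectionStale(long nowMillis) {
        return pendingKeepAliveSentAt >= 0
                && nowMillis - pendingKeepAliveSentAt >= keepAliveTimeoutMillis;
    }

    public static void main(String[] args) {
        ConnectionWatchdog watchdog = new ConnectionWatchdog(60_000L, "WIFI");
        System.out.println(watchdog.onNetworkChanged("MOBILE")); // true
        watchdog.onKeepAliveSent(0L);
        System.out.println(watchdog.isConnectionStale(30_000L)); // false
        System.out.println(watchdog.isConnectionStale(90_000L)); // true
    }
}
```

On Android, the network-change events would presumably come from the platform's connectivity callbacks, and the stale check would run alongside the existing keep-alive timer; both of those wiring details are left out here on purpose.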

I have the same problem, I believe. I'm also running LineageOS without Google Services and have disabled battery optimisation. To be slightly more specific than above: what I normally notice is that instead of the first tick coming up immediately, there is a delay of 30 seconds or so, and then I get no second tick to say it's been delivered. It is often linked to changing network, i.e. walking away from WiFi.

Happy to submit bug reports, but it's hard to know whether I should wait until the problem is presenting itself (i.e. I can't send), or capture just before and then just after, or try to hit submit as I leave a network. Any suggestions and I'll drop in some logs. This problem being solved means a lot to me, so please let me know how I can help.

They simply don't want to repair it because they prefer to be tracked by Google Play services. Snowden, the main advertiser, should be happy that his location is tracked :-)

Was kinda confused by moxie saying:

"I'd prefer it if folks who believe in free software took on some development instead."

Does he not believe in free software?

Naaaaa. This is not the case. We can't really imagine what it is like to be part of such a project with just a few developers. But I must say, I know a lot of people who won't contribute anything because nobody takes care of the PRs. If they got merged faster, a lot of people would contribute. The problem here, again, is the time moxie has. He is working more than 60 hours a week.

"Does he not believe in free software?"
This is not a matter of belief, this is arrogance ...
The idea of such private communication is to protect people. How can it protect if it collects Google data? How can it protect if it is tied to your phone number and you can't change it? It is nonsense.

There is no problem in registering signal with an anonymous number. Those who really need this know how to do it.

I don't think it is arrogance; it's more that they share their time with us for free. We as the community demand way too much from 4 people working on a service like this.

Don't get me wrong. I run copperheados and don't have any google shit on my phone. The developers are free to do with their time what ever they want

Earlier versions of Signal were bundled with CyanogenMod some time ago. I never used it because it wanted registration, Google services and a Gmail account. I have always stayed away from creating accounts, so I never used it. A good approach to secure communication was VoIP or XMPP with ZRTP, but for ordinary folk it was too complicated. I don't remember such problems with missed calls. Seamless boot to the Signal network (or WhatsApp) is a very good idea, but there should be a way to re-register with an anonymous account, not tied to any serial number inside the phone, and use it without Google spy services.

Yes, true. I don't get why random alphanumeric usernames aren't allowed. Like, you click "random user" and get something like: h1167ghUy2

This would kill it

Because I smell a rat... I suspect that someone is guilty of betrayal, deception, or causing a situation to go wrong. As should be said: "Let the people who believe in free software take on some development and implement such functionality".
Big corporations will betray users, their customers, so keeping Google services is suicide for activists and big movements like the Arab Spring, Ukraine, China and other oppressive places.

Could you please continue your discussion in the forum. I would like this issue to remain open for further actual work on it. I am as frustrated as you are that this annoying bug still exists and GApps free Signal development seems to be lowest priority right now, but you are only making it worse.

Try checking the code if you're fixing it manually, or get help with the code.

@dpapavas is it possible for you to point me to the branches where both of your fixes are implemented (does this exist?)? I've been using the 'fix' by theBoatman mentioned in the last comment here:

https://github.com/signalapp/Signal-Android/pull/7388

But now Signal is nagging me to update, and it's not clear what the best solution out there is, but it sounds like it's what you're working on (at least this is the one being pursued).

The repositories in question are here:

https://github.com/dpapavas/libsignal-service-java
https://github.com/dpapavas/Signal-Android

You can either check out the 6644-sleep-fix-3 branch in each case, which only contains the fix for the missed keep-alives leading to a dead connection, soon after turning off the screen, or the netchangefix branch, which adds an additional patch to detect dead connections due to network handovers or other issues.

The former has been submitted as a PR and might some day be merged into Signal. The latter has not been submitted yet, as it depends on the former.

Let me know if you notice anything out of the ordinary. Here are the links for each branch:

https://github.com/dpapavas/libsignal-service-java/tree/netchangefix
https://github.com/dpapavas/Signal-Android/tree/netchangefix

https://github.com/dpapavas/libsignal-service-java/tree/6644-sleep-fix-3
https://github.com/dpapavas/Signal-Android/tree/6644-sleep-fix-3

Thanks very much! I've installed the netchangefix version on my phone (well, with the latest changes to master, just because...), and it seems to be working well. I'll pop back in if I have any issues.

And thanks for chasing the PR for this - much appreciated. Would be great to have usable websocket support in Signal.

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


__This issue now has a funding of 1.0 ETH (441.51 USD @ $441.51/ETH) attached to it.__

@jessevill99 : the exact problem has been described by @Socob when opening this issue - thanks.

Hello there. I'm going to let you know that I have an Android Galaxy S5 and my phone has the same problem. The issue is with the provider, which in my case is Cricket. The fix is simple: on my Galaxy S5 the prompt menu has a reboot option, which usually works, but often will not. So if you know your phone provider, contact them, then Google Play, because it might just be that the service is down. Lots of Android users have been having this issue lately, and it may just be a technical issue with the company. I don't want money, I'm just informing you that the network has been down for cellular data and the provider company needs tickets to be sent. If the phone is rooted, it cannot be helped, but most phone providers can access your phone from a PC and fix the problem.

I've said most of this in the past, but I'll reiterate, in the hope of minimizing confusion if the funding results in more people working on this. As far as I can tell the problem under discussion here is solved, in that the causes are known and a patch exists (see here and here for the details).

As explained in the linked posts, the main problem is not finding out what causes the issue and how to fix it, but rather getting the patches into Signal. There's a long history to this, the details of which I won't go into (although the trail of issues and PRs shouldn't be hard to follow for anyone interested), but the gist of it is that there's no PR for this issue yet, mainly because it depends on another PR (signalapp/libsignal-service-java#53) which is currently open and has been for a while.

Past PRs with essentially the same functionality were closed relatively swiftly (on the order of a day to a week or so) on code design grounds, and the time it took for the review of this one, which resulted in change requests, was similar. It has been almost a month and a half since these changes were implemented though, with no progress. Everyone can draw their own conclusions; there's little use in discussing this further.

In the meantime, and to get to the original reason for this post, I've rebased the branch I made for @wryun to update it to the latest 4.23.3 version, because the old version it was based on started to behave erratically. It seems to be alright now, and since I anticipate being stuck with this for a while, I added a patch from my original private branch which disables the nagging update notification when there's a new version (since I neither need nor can be bothered to rebuild every time there's a new upstream release).

The result can be found in this release, where I've also added a built APK for those who want to try it but can't build it themselves. Whether you want to install a custom binary for a secure communication app is up to you. If you do choose to try it, note that you can't install it on top of the official Signal build, which you'll have to uninstall first. This means backing up any messages, etc. and re-importing them once the custom build is installed, which I think is possible, but have never bothered to try. (I believe a friend has tried it though, without any issues.) The same procedure will be necessary if and when it becomes possible to return to the official build.

If anyone does try this out, please let us know if any problems still persist (or if it introduces new issues).

Issue Status: 1. Open 2. Cancelled


__The funding of 1.0 ETH (437.92 USD @ $437.92/ETH) attached to this issue has been cancelled by the bounty submitter__

A little update on this: Now that the first stage of the fix for non-GCM devices is in place, in version 4.26.0, I have created a couple of pull requests that should address problems related to switching between networks (e.g. from WiFi to cellular data). I've been using these patches for several months now without any missed messages or receipts whatsoever, so if and when these are merged, non-GCM devices should (hopefully) function without any problems (well, connectivity-related problems at least).

For anyone interested, the PRs in question are signalapp/libsignal-service-java#62 and #8230, but please note that discussion related to the problem in general should be kept here. Messages posted on the PR threads should be restricted to discussion on the PRs themselves.

Note that anyone without GCM should still upgrade to 4.26.0 as soon as possible, as on some devices this seems to be sufficient for mostly normal operation, and on all devices it does ensure normal operation as long as the network isn't switched, so it makes Signal at least partly usable without GCM. When the network is switched, Signal can be forced to reset the connection via force stop or by switching to airplane mode and back.

@dpapavas - given the response on the new PR, would it be possible for you to build an updated release of https://github.com/dpapavas/Signal-Android/releases/tag/v4.23.3-nogcmfix ?

(this had the additional fix, right? It's been working perfectly for me)

I decided to migrate to that version rather than the one I was building, so it's moderately annoying for me to do this myself (different key, etc.).

I've made an updated release which can be found here. This includes the patch that has been merged in 4.26, as well as the network change fix and the patch to remove the update nag screen. It should make Signal work on non-GCM devices for the time being (at least for messages).

Ultimately though, perhaps this is as good a chance as any to take a hint and realize that expecting Signal to work reliably without GCM is overly optimistic, bordering on naive. Personal disappointments and grievances aside, certain facts are plain to see and have been so for a long time: One is that there is no interest in support for non-GCM devices from the side of the project. Any issues that concern such devices, such as this one, are simply marked "help wanted". The first response to this thread (as well as other similar responses elsewhere) is very clear about the fact that non-GCM functionality can't even be tested, much less investigated or fixed, and all these are left to "free software folks". An implicit corollary of this is that any changes made in the course of the "normal" development of Signal inevitably can't be tested with regard to their effects on the functionality on non-GCM devices.

Another fact is that when "free software folk" respond to the request for help and submit PRs, the review process typically takes a very long time and is very difficult in general. PRs are expected to be submitted fully baked and tested, a process which takes considerable time and effort, but are cursorily dismissed, with little feedback and guidance on what is expected. This has led three separate people to write, test and submit several functionally identical patches (#7100, #7388, #7666, #7723, signalapp/libsignal-service-java#45, signalapp/libsignal-service-java#51, signalapp/libsignal-service-java#52, signalapp/libsignal-service-java#53, signalapp/libsignal-service-java#54) on the same problem, all of which were closed for practically the same reason and with identical feedback. My own initial patch (#7100) was submitted almost a year ago and was functionally the same as the one that was finally merged a few weeks ago.

Together, these mean that any upstream changes, such as the ones that will be made now to support API level 26, are likely to break Signal on non-GCM devices, because nobody will test them with non-GCM devices, to ensure that they won't. If that happens, issues will be opened, marked "help wanted" and a new uphill struggle to fix them will begin, leaving people who can't compile Signal for themselves with a broken application for many months.

And this is the best-case scenario, assuming someone to be available and willing to investigate the issue, write, test and track multiple PRs and wait weeks or months between submission of changes and feedback before repeating the cycle, which is clearly too much work just to be allowed to have an application which will inevitably be perpetually broken (not to mention the psychological toll of such an obvious exercise in futility). Whether this is because of lack of time, indifference, or outright hostility towards non-GCM users, is ultimately of no practical consequence; the situation is still the same whatever the reasons.

Thanks very much, @dpapavas! To me it would make sense to maintain a minimal fork and keep attempting to upstream the changes, but that didn't go so well in the past (https://github.com/LibreSignal/LibreSignal).

Thanks a ton, @dpapavas, for trying so hard to get the non-GCM fixes into Signal upstream. I totally understand your frustration. I want to let you know that there are quite a few Signal users desperately waiting for your fixes to get merged upstream. Please read this as motivation to continue the uphill struggle.

At least in Germany, installing LineageOS without Google Services is becoming more and more popular among privacy-aware people. Even if not all of them are affected by this issue with dangling net sockets, there are still users for whom Signal doesn't work reliably at the moment.

At the same time I can see that other issues have higher priority for the one and only Signal Android developer, who does an awesome job here (:heart: @greyson-signal). But maybe @greyson-signal (or @moxie0) could give an estimation on whether they would be willing to spend some time on reviewing (and accepting) the PRs once they got the targetSdk moved to 26?

I'm afraid you may be missing my point. The purpose of my previous post, wasn't to vent my frustration with the "uphill struggle" of trying to fix Signal's issues on non-GCM devices (although perhaps I didn't entirely resist the temptation to do a bit of that too). Nevertheless, my purpose was to argue, as objectively as I could, that you shouldn't expect this to ever happen.

Currently, whether Signal works on your device or not is a matter of chance. Perhaps you have a working network configuration and Signal works all day, perhaps you don't and Signal works until you step out of the house. If you stay inside, it may even work all day for any device, but it's still a matter of circumstance. Once this fix is merged, if that ever happens, it will still be a matter of chance, because any subsequent change made in the course of Signal's development is unlikely to consider whether it might break non-GCM functionality, much less test that it doesn't do so.

Now, although I personally find the concept of having functionality in an application that you then explicitly ignore during your development and testing cycle deeply confusing, we could nevertheless be practical about it and hope that some "free software folk" will come along and fix any new problems as they arise. In other words, one could hope that the "non-GCM" community would do the necessary work to ensure that Signal works adequately on its devices, but given the months a typical review process on such issues seems to take, whatever the reason, this would clearly be less reasonable hope and more wishful thinking.

So if the fix is merged, you can perhaps expect Signal to work for a few months, perhaps even a year, until it doesn't. Then you can expect several months of barely any, or no operation at all, until another fix is merged. If you want to be optimistic about it, you can hope for several months of more-or-less trouble-free operation, followed by a few of no operation, if the problem is identified, fixed and merged relatively quickly. If you want to be pessimistic about it, you can well expect the opposite, but the basic concept is still the same.

In the end, whether I continue the "uphill struggle" or not, or whether others do so, is not really important. It only shifts the odds a bit. If you're going to use Signal, you should always expect it to work some of the time, depending on chance and circumstance, because this will be so, either by design, or out of indifference.

What happens when you have a trusted credential giving out personal information?



@dpapavas I think I got your point.

I disagree that it's the Signal developers' duty to support non-GCM devices as first-class citizens (even though I would love them to do this). They have to do testing on a finite number of devices, and apparently no non-GCM device is part of this (yet).

But I agree that Signal should care if others report that Signal functions are broken on non-GCM devices, and especially if others put a huge amount of energy and time into debugging and suggesting fixes.

Besides, I hope that once your fixes (or similar logic to unbreak dangling net sockets) get merged, they're kept and not thrown away with the next round of code restructuring. So my hope would be that this particular problem will not pop up again anytime soon. You're right: as long as Signal developers don't actually test on non-GCM devices, there's no guarantee that this won't happen again. But apparently there are enough non-GCM Signal users who care and report bugs in case new problems arise.

I think the linked issues mentioned early on in this issue were resolved, so I'll close this.

Unfortunately, this issue isn't resolved yet.

I'm running Signal 4.31.6 (beta) on LineageOS 14.1 without GApps (phone is an HTC One M9) and I still suffer from the same bug: when switching networks (e.g. between wifi and mobile internet), Signal often doesn't detect the switch and message receiving is delayed literally for hours. The only thing that helps is killing and restarting the app.

@greyson-signal From which version on is this fixed?

The error still always occurs when switching networks (e.g. between WLAN and mobile Internet).

My workaround is still: Stop the Signal process with a long press on the back button and restart it. After that everything works fine again.

Thanks a lot to @dpapavas for trying so hard to get the non-GCM fixes into Signal upstream.

Device info

Device: Fairphone 2 (FP2)
Android version: 8.1.0 (LineageOS 15.1 without Google services)
Signal (APK update) version: 4.30.8

@greyson-signal Original reporter here. Unfortunately, even with some of the PRs merged, this issue (“network connection seems to get stuck without Google services”) is definitely not resolved on my device. There are still some PRs created by @dpapavas which have not been merged yet (signalapp/libsignal-service-java#62, #8230)!

Even though the situation seems (subjectively) perhaps a bit better than when I first opened this issue, the fact is that messages still fail to arrive for minutes to hours after a network change on a regular basis!

Updated device info

Device: OnePlus One
Android version: 8.1.0 (LineageOS 15.1) without Google services
Signal version: 4.30.8


If you need anything from me (e. g. a new debug log with a more recent version of Signal), please let me know.

Be advised that some change, probably introduced around version 4.30, seems to have broken Signal for non-GCM devices again (completely, not just after network changes; see #8402). I'm not sure whether it will affect all devices (it would seem the most probable), or to what extent it will affect Signal's operation, but it would probably be wise to postpone updating the app for as long as possible.

LineageOS 15 and 16 without gapps, messages not sent. Latest Signal.

Probably #8402 but replying here because the comment above seems relevant.

@dpapavas Can you perhaps clarify what you mean by “broken Signal for non-GCM devices completely”? Do you mean just with respect to this network change issue or completely non-functional? I’m currently on 4.30.8 and haven’t upgraded so far because of potential issues with more recent versions.

Perhaps it will somehow vary from device to device, but as far as I can remember, in my case it basically broke the functionality introduced in my previous patch, that woke the device up, in order to send keep-alives and keep the connection open. This would probably mean that Signal would not be able to receive messages for the most part, much as it was before my patch.

If you're using 4.30.8, you should have been affected though, so either you haven't noticed it, which would be unlikely, or you're not affected for some reason, or I may be wrong. The latter would be likely, as I haven't looked much into it, except to find a quick fix, but others seem to be affected too (see #8402).

(Upon closer inspection of #8402, it seems like the problem might also (or just) prevent sending messages. As I said, I'm not sure as I didn't look into its repercussions too much, just for a way of getting rid of the exception.)

@dpapavas Hm, thanks. As I said before, the problem in this issue has never really gone away for me even with your partial patches that were merged into Signal, so I likely just haven’t noticed the difference. If the new problems were introduced before 4.30.8, then I’m hopefully fine (I was mainly worried about not being able to send messages).

@dpapavas, thanks a lot for your research, your patches and your kind endurance in pushing that further (to get the code pulled).
Unfortunately, I totally agree with the opinion you stated in this three-month-old comment. The bottom line for me (as somebody who refuses to use a proprietary web service in this context) is that Signal is not currently, and very likely never will be, an option for me, and that I can't recommend it to other people anymore (even if they in fact currently still use GCM). It is absolutely great of Open Whisper Systems and Moxie Marlinspike to provide this code base under a truly free license, especially the server, which is even AGPL. But it is a pity, while I naturally accept their resource restrictions and their very own priorities etc., that the only "solution" seems to be to fork due to the lack of cooperation.

For sure there have been a lot of uncooperative requests from "the free software folks", but you have to be able to distinguish -- and I really expect that from Open Whisper Systems. Your contributions here have been cooperative.
If we fail on this and a fork happens, it would be far from useful. The code bases would diverge quite fast; time, energy and other resources would be wasted on both sides; and in the best case there would be two non-federating systems in the end. That is in fact not the idea behind free software -- at least not in the long term.

I would be very glad if Open Whisper Systems found a way to cooperate without investing too much of their very limited resources. IMHO that is possible, especially in the context of free software. Perhaps some Free Software Support Board, staffed with free software people, could be established between Open Whisper Systems and the free software community. Of course its members would have to be chosen by Open Whisper Systems, accepted by the free software community, and able to represent the needs of both sides. Since the licenses used are indeed free software licences, I would hope that it is possible to get cooperative "free software people" on board. Of course this construct only works sustainably if these people are really able to exert influence -- but hey, Open Whisper Systems is free to choose them.

The bottom line, mainly addressed to Open Whisper Systems (just because they can't be replaced, whereas there are a lot of free software people from whom cooperative ones can be chosen): Please cooperate. Benefit from the community. Of course, this only works if you understand their needs. The most important step is already done right: you use strong free software licences (no MIT, BSD etc.). That is indeed a very strong sign (to the community) that you want to cooperate.

[Some technical question will follow in next comment.]

@dpapavas:
You stated in libsignal-service-java!62 that some devices do not suffer from the problem because their specific netd implementation recycles (web)sockets by itself after a change of connectivity.
Do you know more about the working implementations? Is it possible to patch netd instead of Signal? That solution could probably only be used by users who have root access or build their own system, but it would solve the issue for all (or at least several) applications, wouldn't it?

I was assuming netd was related because of log lines such as the following, on a device that was not affected:

09-19 11:23:19.644 328 1265 I Netd : Destroyed 10 sockets on 192.168.removed in 4.5 ms

I didn't find what the difference between the two devices was, that is, whether it was a matter of netd version or its configuration, but I didn't look too much into it. It would have been the correct approach technically, but it wouldn't be applicable for most users.
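For anyone who wants to check whether their own device logs show netd cleaning up sockets after a network change, here is a minimal sketch in Python. The pattern is modelled only on the single log line quoted above, so the wording on other netd versions is an assumption, and `find_socket_destroy_events` is a hypothetical helper name.

```python
import re

# Pattern modelled on the one netd line quoted in this thread; other netd
# versions may phrase the message differently (assumption).
NETD_DESTROY_RE = re.compile(
    r"Netd\s*:\s*Destroyed (?P<count>\d+) sockets? on (?P<target>\S+) "
    r"in (?P<ms>[\d.]+) ms"
)


def find_socket_destroy_events(log_lines):
    """Return (socket_count, target, milliseconds) for each matching line."""
    events = []
    for line in log_lines:
        match = NETD_DESTROY_RE.search(line)
        if match:
            events.append(
                (
                    int(match.group("count")),
                    match.group("target"),
                    float(match.group("ms")),
                )
            )
    return events
```

Passing it the lines of an `adb logcat -d` dump taken right after switching networks should show whether such a line appears. No match would be consistent with the old sockets being left in place, but it does not prove it.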

So I'm not sure whether this specific problem could be fixed that way, but I don't think it would be worth the effort in any case. If someone can troubleshoot and fix a system component, such as netd, they should also be able to build Signal and install a patched version. The latter would likely be a more viable solution, for reasons I've mentioned already. Signal seems already to be broken again for non-GCM devices in general, not just after switching networks. Even if the patch for it ever gets applied, it will likely have been broken by then in other ways.

Using a patched version is therefore the only way I see of having a somewhat functioning Signal without GCM, at least for as long as someone is forced or willing to investigate the problems and write the patches.

@dpapavas, thanks for the reply.

I didn't find what the difference between the two devices was, that is, whether it was a matter of netd version or its configuration, but I didn't look too much into it.

Did you mention which devices you used? I can't find it either in this issue or in the detailed description of libsignal-service-java!62.

Signal seems already to be broken again for non-GCM devices in general [...]

A friend of mine is using the current version without issues except the network switching problem. What issues do you mean?

Did you mention which devices you have used?

I know for sure that the whole network switching problem doesn't occur on Xiaomi Mi A1 phones with LineageOS 15, and it definitely occurs on HTC One M8 and M9 with LineageOS 14.1 (all without GApps).

Did you mention which devices you used? I can't find it either in this issue or in the detailed description of libsignal-service-java!62.

My own Xiaomi Mi 4 had the problem and a Motorola Moto3 didn't have it. Both devices were running LineageOS 14.1, without Google apps.

A friend of mine is using the current version without issues except the network switching problem. What issues do you mean?

See here and in the next couple of messages.

@mejo- , @dpapavas, thanks for your reply.

My own Xiaomi Mi 4 had the problem and a Motorola Moto3 didn't have it. Both devices were running LineageOS 14.1, without Google apps.

Huh, same SW!? Does that mean that something not within the OS influences that? Or do you mean it depends on some HW specific parts of the OS?
The comment of @mejo- implied (at least for me) that Lineage v15 had changed something in this regard which fixed it.

I looked into the changes made to netd and tried it on an incomplete development version of Replicant (which is based on LineageOS 14.1) and a plain LineageOS 14.1-20170927, both on Samsung S3 i9305 devices but different ones: Replicant did work (at least for my test case, more on that later), LineageOS didn't.
The LineageOS build is quite old, but the i9305 is not supported by Lineage anymore (some unsolved issues) and that is the latest version I came across. But since the current LineageOS v14.1 did not work either (on an S3 i9300, which is still supported), I assumed that the age of the build is not the issue.
The version of netd code on Replicant is older (it seems netd code was refactored afterwards).

I have found this interesting commit (cherry-picked for Lineage v15):
https://github.com/LineageOS/android_system_netd/search?q=f32fc598b01ba8d59873b0a1085716fd84678b54&type=Commits

I am unsure whether the improvement on some devices is caused by a change in netd (which was my first assumption).
In the meantime I am no longer sure that I am really talking about the same issue as you. I am able to reproduce it with the following steps:

  • Ensure that sending and receiving works
  • Send message from device A
  • Switch off Wifi on device A
  • Send message from device A
  • Send message from device B
  • Wait 2 seconds
  • Switch on Wifi on device A again

After that Signal "is out of sync". Device B is able to send (which A receives), but there is no acknowledgement. Messages from device A to B are not delivered. After restarting Signal on device A, past messages and acknowledgements are delivered, and everything works again.
Can you please confirm that this is really your issue??
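The steps above could be partly scripted for repeated testing. This is only a sketch under assumptions: it assumes device A is the single device attached over adb and that `svc wifi enable` / `svc wifi disable` may be run from the adb shell on that build. The message-sending steps stay manual, and `reproduction_plan` and `run_plan` are hypothetical names.

```python
import subprocess
import time

# Toggle wifi on the attached device via adb (assumption: one device attached,
# and the shell user is allowed to run `svc wifi` on this build).
WIFI_OFF = ["adb", "shell", "svc", "wifi", "disable"]
WIFI_ON = ["adb", "shell", "svc", "wifi", "enable"]


def reproduction_plan(wait_seconds=2):
    """The steps listed above as (kind, payload) pairs, in the same order."""
    return [
        ("manual", "Ensure that sending and receiving works"),
        ("manual", "Send message from device A"),
        ("command", WIFI_OFF),
        ("manual", "Send message from device A"),
        ("manual", "Send message from device B"),
        ("wait", wait_seconds),
        ("command", WIFI_ON),
    ]


def run_plan(plan, run=subprocess.run, prompt=input, sleep=time.sleep):
    """Walk through the plan; the callables can be swapped out for a dry run."""
    for kind, payload in plan:
        if kind == "manual":
            # A human performs this step, then confirms.
            prompt(f"{payload}, then press Enter: ")
        elif kind == "wait":
            sleep(payload)
        else:
            run(payload, check=True)
```

Calling `run_plan(reproduction_plan())` prompts for the manual steps and toggles wifi in between. Whether Signal ends up "out of sync" afterwards still has to be checked by hand.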

Huh, same SW!? Does that mean that something not within the OS influences that? Or do you mean it depends on some HW specific parts of the OS?
The comment of @mejo- implied (at least for me) that Lineage v15 had changed something in this regard which fixed it.

I'm not very familiar with the development process of LineageOS, but I was assuming that either different ports of the same version (i.e. 14.1 for Mi4 and Moto3) could have different system component versions, or perhaps netd can be configured to some extent and the configuration between ports varies.

I can't see how it could be due to hardware differences, but I can't rule it out either.

Can you please confirm that this is really your issue??
It might be; I'm not sure about the "messages from device A to B are not delivered" and the "which A receives" parts.

The problem my patch was trying to address, was that a network change left the old sockets intact and bound to the old IP address. Signal never detected anything out of the ordinary, so it kept listening to a severed connection. As such it couldn't receive messages and notifications from the server. I believe it could send though, as there was a backup path which, after failing to send on the severed main connection, temporarily opened a new one and sent through that.

I'm not really sure though and much of that might depend on the device in various ways. You might try to look through the logs after such a session and look for anything out of the ordinary.
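To illustrate the failure mode described above (a socket that stays bound to the old address while nothing notices), here is a minimal sketch of the kind of check that could flag such a connection as stale. It is not Signal's code and not the patch itself. `ConnectionWatchdog`, its method names and the 30-second default timeout are all assumptions made for illustration.

```python
import time


class ConnectionWatchdog:
    """Decide when a long-lived connection should be closed and reopened.

    Hypothetical helper: the connection counts as stale if the local address
    changed since it was opened, or if no keep-alive response has arrived
    within the timeout.
    """

    def __init__(self, local_address, keepalive_timeout=30.0,
                 clock=time.monotonic):
        self._clock = clock
        self._bound_address = local_address
        self._timeout = keepalive_timeout
        self._last_response = clock()

    def on_keepalive_response(self):
        # The server answered, so the connection is known to be alive now.
        self._last_response = self._clock()

    def should_reconnect(self, current_address):
        # A changed address means the old socket is bound to a dead route.
        if current_address != self._bound_address:
            return True
        # Otherwise fall back to the keep-alive deadline.
        return self._clock() - self._last_response > self._timeout

    def on_reconnected(self, new_address):
        self._bound_address = new_address
        self._last_response = self._clock()
```

The address comparison catches a wifi/mobile switch immediately, while the keep-alive deadline covers cases where the address stays the same but the connection is severed anyway. Either check only helps if something wakes the app up often enough to run it.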

@doak I’ve used Signal on a OnePlus One with both LineageOS 14.1 and 15.1, and I haven’t noticed any difference in behavior with respect to this issue. If there were any relevant changes between LineageOS 14.1 and 15.1, they don’t seem to have affected this issue.

Thanks for all your input. I have to have a closer look, it's still weird.

@dpapavas:

The problem my patch was trying to address, was that a network change left the old sockets intact and bound to the old IP address [...]

That sounds like the same issue I am able to reproduce with the steps mentioned.

@dpapavas Are there any updates with the Signal build? Does it still fully work without GCM?

@dpapavas Are there any updates with the Signal build? Does it still fully work without GCM?

Not sure which Signal build you refer to. If it's the official build, probably not, because it never fully worked without GCM. Patching was required, and the patch has been open as a pull request (#8230) for more than a year now, happily being ignored. Others have made similar efforts (signalapp/libsignal-service-java#70) which seem to share the same fate. See the linked PRs for more information.

This is too bad. Signal could have been so much.

This bug still happens, with 4.50.5 _with_ GApps / Android 6.0 / Huawei EMUI 4.0.3!

Does anyone care that it makes Signal halfway unusable, or is the project dead? :/

I believe this bug was fixed in e3b66dc7e8906b678576c71d01d98aff17c6aefc.

@navid-zamani This thread is specific to devices without GApps. It's unclear what your specific symptoms are, but you may be experiencing different problems related to delayed notifications. Check out #8692.

Also, be sure to first check out the support page on fixing delayed notifications:
https://support.signal.org/hc/en-us/articles/360007318711-Troubleshooting-Notifications

Huawei is known for being super aggressive with battery optimizations, leading to things like delayed notifications.

is the project dead?

Feel free to check the commit history :) I think the team is doing great work.
