Android: Reduce connectivity check

Created on 3 May 2019  路  11Comments  路  Source: nextcloud/android

Currently we use connectivity check very often, but e.g. if the data connection did not changed, we do not have to check again.

Also it was reported that some users fire 6x status.php checks in a second:

99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:23:47 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:06 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:07 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:50 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:35:50 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:22 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
99.99.x.x - user@server [02/May/2019:15:47:53 +0200] "GET /status.php HTTP/1.1" 200 145 "-" "Mozilla/5.0 (Android) ownCloud-android/3.6.0" 0
enhancement high

All 11 comments

I have a refactoring of this ready on my branch somewhere.

The main challenge is that the check uses synchronous getter that is called liberally all over the place. To fix this problem we'd probably need to migrate to a newer, event-based network status API.

Synchronous getters are marked as @Deprecated BTW.

PR #4008 should lay a foundation for this.

a newer, event-based network status API.

That would be a good idea.
Do I understand this correct: every time we get a notification about a network change we try to reach server and store this info.
We assume that the info is valid until a network change occurs again?

That is correct. This should limit amount of queries greatly.

On May 13, 2019 1:21:37 PM UTC, Tobias Kaminsky notifications@github.com wrote:

a newer, event-based network status API.

That would be a good idea.
Do I understand this correct: every time we get a notification about a
network change we try to reach server and store this info.
We assume that the info is valid until a network change occurs again?

--
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/nextcloud/android/issues/3947#issuecomment-491819620

I started working on an event based network listener.

I started working on it, but I need to consult you before I move forward.

The check is called synchronously in few background operations. That means that whenever we run a remote op:

  1. pro: we have a current state of network and server connectivity
  2. con: this is rather expensive

My initial simplistic approach was to simply wait for platform network status updates, but I see that this check is also verifying server availability by GET-ting status.php. Generally this looks for me like re-invented 503 Service Unavailable HTTP code. I'll assume there was a reason for this.

I'm trying to understand:

  1. why we do this check in the first place
  2. what value it brings for the user to have it
  3. what is the UI/UX implication of it or lack of it?
  4. can we drop it from mobile client?

The issues I see with it right now:

  1. it is unreliable, as it can change between network ops (upstream network outages, admin actions, etc)
  2. moving from polled to event-based API will make this check event more unreliable
  3. polling status.php just after network change is unreliable as many public networks blocks connections until user accepts TOS (walled garden) or grant you access to the world few seconds after device registers itself (ex. by updating firewall rules), so I expect a lot of edge issues around it

My proposed solution:

  1. use platform API to listen for network changes
  2. assume the server is available if network is available
  3. ops fail gracefully if it's not reachable (including "maintenance mode")

I'm not sure if our UX is ready for 3.

@tobiasKaminsky @mario @AndyScherzinger Can you brainstorm this?

We poke status.php to see if we're behind a walled garden on not pretty much.

We use status.php for old (

My proposed solution:

  1. use platform API to listen for network changes
  2. assume the server is available if network is available
  3. ops fail gracefully if it's not reachable (including "maintenance mode")

In 2. we will get problems if user is connected to wifi, but not to internet (e.g. walled garden, or simply a local network)

  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

Sorry, if I did not get your point, but why cannot we use event-based check:
If platform api notifies us about a network change, we test with status.php/204 once and use the result of this until a new network change is detected

This way we do not have calls that often?
Also we could re-trigger uploads if we detect a switch from "no connection to server" to "server is available".

@tobiasKaminsky this approach will not work, reason being that network change will not be trigger if for example your "walled garden" session timeouts.

In 2. we will get problems if user is connected to wifi, but not to internet (e.g. walled garden, or simply a local network)

What kind of issues do we expect? I guess that the op will simply fail to be re-scheduled during next sync window.

In comparision, what happens if we're in walled garden? If I understand the code correctly, op is aborted and re-scheduled.

It looks for me like it's the same thing, except that we make 1 redundant call to status.php on happy path:

Here is the pseudo-code of what I believe is happening:

isOnline = checkStatus()
if !isOnline {
    reshedule()
    return
}
success = sync()
if !success {
    reschedule()
} else {
   showUpdatesOnScreen()
}

I think we could do it a bit simpler:

success = sync()
if !success {
    reschedule()
} else {
   showUpdates()
}
  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

I'm trying to understand what this means for the user.

Is there any existing UX that depends on this status that we're at risk of breaking?

This is exactly what I'm trying to avoid here - ie. UX regressions - but also I'm trying to figure out how much space for change we have.

Sorry, if I did not get your point, but why cannot we use event-based check:
If platform api notifies us about a network change, we test with status.php/204 once and use the result of this until a new network change is detected

Internet access might depend on other factors, such as acceptance of terms-of-service or delayed firewall updates In such case:

  1. device connects to temporarily walled network
  2. app checks status.php and caches "not accessible" state
  3. user accepts TOS, firewall updates, etc
  4. internet access is granted
  5. app is not aware of it

To mitigate 5. we'd need to poll the status again, which means we're back at square 1 or we risk random complaints of "doesn't-work-at-my-workplace-but-works-in-train"-type.

This way we do not have calls that often?

As you see above, we'd still need to call it often. We can throttle it of course, but:

  1. it adds extra complexity (let's think about it from QA perspective for a moment)
  2. is still not reliable, with occasional hiccups that are tricky to reproduce

Also we could re-trigger uploads if we detect a switch from "no connection to server" to "server is available".

This would not be reliable because "having network" and "having clear connection to server" are separate concepts and the latter one ("server") is not signalled asynchronously. We need to perform expensive polling again. Making polling less frequent makes it less and less reliable.

I'd honestly rely on other mechanisms to handle that and treat "walled garden" simply as yet another random network issue. If app can recover from that gracefully, status.php would not be needed for rescheduling.

But I don;t know what UX relies on that status.php. If this is only sync restart, than I'd say we're safe to explore alternatives.

Thanks for your very detailed answer!

  1. we do support "server offline", "server in maintenance mode", both in uploads and if you browse files.

I'm trying to understand what this means for the user.

Is there any existing UX that depends on this status that we're at risk of breaking?

On FDA we check result of sync and show matching info:
https://github.com/nextcloud/android/blob/1cb125a4642e76f75847dc9ef0919a4c0265abe6/src/main/java/com/owncloud/android/ui/activity/FileDisplayActivity.java#L1404-L1423

Thanks for explaining why my idea is not working :-)

I'd honestly rely on other mechanisms to handle that and treat "walled garden" simply as yet another random network issue. If app can recover from that gracefully, status.php would not be needed for rescheduling.

How could we recover gracefully from walled garden / no internet connection without polling to status.php/204?

What about this

  • we do not check for status.php/204 on uploads/calls
  • if we detect that we have no internet connection/server offline/server maintenance mode

    • we set matching UI info

    • we do exponential polling

    • if we have server connection again, we restart pending uplodas

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ThaDaVos picture ThaDaVos  路  3Comments

toobie83 picture toobie83  路  3Comments

Shagequi picture Shagequi  路  3Comments

eppfel picture eppfel  路  3Comments

Tie-fighter picture Tie-fighter  路  3Comments