Doze (1) can disable network access at any moment, once the screen is off. This could result in pending uploads or downloads being abruptly interrupted (and not automatically retried, at this moment).
We need to study (through experiments) the real impact on our app and adapt to it.
(1) https://developer.android.com/training/monitoring-device-state/doze-standby.html
UPDATE:
Doze and App Standby (both introduced in Android 6.0, and more aggressive from Android 7.0) disable network access when certain conditions are met. These conditions never apply while the app is open, but the ownCloud Android app does a lot of work in the background.
The main problems we detected concern pending uploads being interrupted and never retried: they are simply left as failed in the uploads view, leaving it to the user to retry them. Downloads can easily be interrupted too, but the feedback we have from users is focused on uploads, probably because uploading multiple files at once is a more common use case than downloading multiple files at once.
Another operation that might be interrupted is the automated sync of a full account, but this is retried by the system automatically, so it shouldn't be a problem. Besides, we have no feedback saying otherwise.
The solution applied is to schedule retries of uploads and downloads that failed due to lack of network, using the JobScheduler to run them whenever a Wifi network is available. We "hardcoded" the Wifi requirement to make the feature available sooner, but we expect to make this detail configurable by the user in the scope of #41.
It is worth noting that retries will not be done as soon as Wifi is back. The Android system decides when exactly the retry is done, provided that a Wifi network is available. This is a constraint imposed by JobScheduler, and a trend that Google is introducing step by step. More constraints of this kind are coming with Android O, and we'll need to assume that we'll have less and less control over background operation.
BUGS & IMPROVEMENTS
Downloads
[X] Android 5: Instant uploads get frozen after a connection loss https://github.com/owncloud/android/issues/1684#issuecomment-295306015 [FIXED] @davivel
Uploads
[X] Several uploads pending, not all are resumed https://github.com/owncloud/android/issues/1684#issuecomment-291869388
https://github.com/owncloud/android/issues/1684#issuecomment-294438205
Maybe we should make our current services foreground services.
This way, App Standby and Doze would keep our application alive until these services finish, because it seems that foreground services are allowed to continue executing in Doze mode:
> Maybe we should make our current services foreground services.
Seems pretty aggressive. Doze was introduced to help users protect their battery life from network-greedy apps. Adapting to Doze is the way to ensure that we are not one of those greedy apps.
I would only go down that path as a short-term measure, if the effect of Doze is really catastrophic. For the long term, we need to ensure the oC app is a polite citizen.
On the other hand, I don't remember if having a notification being updated with the progress of transfers already makes the service a "foreground service" (writing from memory). Will check the docs.
> On the other hand, I don't remember if having a notification being updated with the progress of transfers already makes the service a "foreground service" (writing from memory).
No, that's not the case. The only 'Service' we set to foreground is MediaService, for audio playback (pretty standard). The rest of our services are background services, and should stay so.
This needs some more research & testing, and there is not enough time left for this release. It needs to be moved to 2.3.0, sorry.
This needs to be addressed more aggressively.
cc @michaelstingl , @jesmrec , @davigonz
@jesmrec I include here the steps to test this issue (retries only work on Android 5 or higher):
Execute the following command to simulate unplugging the device:
adb shell dumpsys battery unplug
Turn off the screen by locking the device.
Execute the following command to step through the different states of Doze mode. Execute it several times until you see Stepped to deep: IDLE in your CLI.
adb shell dumpsys deviceidle step
The device is now in idle mode, and the upload or download operation(s) fail at this point.
Turn on the device screen by unlocking the device.
The upload or download operation(s) should be retried. (Sometimes it takes a long time to retry; the Android system decides when to start a job, so go for a coffee 😄 )
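The "execute it several times" step above can be automated. What follows is only a sketch: the class and method names are hypothetical, and it assumes the command prints a line ending in ": IDLE" (as in the "Stepped to deep: IDLE" message mentioned above) once deep idle is reached, while other states such as IDLE_PENDING must not match. The command source is passed in as a Supplier so the loop can be tried without a device.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

final class DozeStepper {

    /**
     * Calls step until its output ends with ": IDLE".
     * Returns the number of steps taken, or -1 if maxSteps was exhausted.
     */
    static int stepUntilIdle(Supplier<String> step, int maxSteps) {
        for (int i = 1; i <= maxSteps; i++) {
            String out = step.get();
            // endsWith(": IDLE") rejects IDLE_PENDING and IDLE_MAINTENANCE
            if (out != null && out.trim().endsWith(": IDLE")) {
                return i;
            }
        }
        return -1;
    }

    /** Runs "adb shell dumpsys deviceidle step" and returns its first non-empty output line. */
    static String adbStep() {
        try {
            Process p = new ProcessBuilder("adb", "shell", "dumpsys", "deviceidle", "step")
                    .redirectErrorStream(true)
                    .start();
            String first = null;
            try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) {   // drain all output
                    if (first == null && !line.trim().isEmpty()) {
                        first = line;
                    }
                }
            }
            p.waitFor();
            return first;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

With a device attached and adb on the PATH, `DozeStepper.stepUntilIdle(DozeStepper::adbStep, 10)` would replace the manual repetition; the limit of 10 is an arbitrary choice.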
Updated first message with our current knowledge and a summary of what's done.
Steps:
adb shell dumpsys battery unplug
adb shell dumpsys deviceidle step until IDLE
Current behaviour
The app crashes and the notification remains forever (it cannot be swiped away)
Expected behaviour
Details view is opened with progress bar
Tested with Android 7.0 (Huawei 6P)
This is not an important one; it's an improvement
Perform these steps
Current behaviour
Some of the uploads are retried and others end up in the failed section. It is necessary to switch to the files view (or any other view) for the failed uploads to be retried. Otherwise, they remain as failed
Expected behaviour
All the interrupted uploads are in the current section.
Steps
adb shell dumpsys battery unplug
adb shell dumpsys deviceidle step until IDLE
Current behaviour
Uploads fail with "App was terminated" and are not resumed
Expected behaviour
uploads are resumed
Tested with Android 7.0 (Huawei 6P)
After doing some uploads, the app crashes frequently with the following stacktrace. I will try to get more specific steps:
04-05 18:54:40.236 23697-23697/com.owncloud.android E/JobService: Error while executing job: 1379435768
04-05 18:54:40.237 23697-23697/com.owncloud.android D/AndroidRuntime: Shutting down VM
04-05 18:54:40.239 23697-23697/com.owncloud.android E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.owncloud.android, PID: 23697
java.lang.RuntimeException: java.lang.IllegalArgumentException: Null parameter!
at android.app.job.JobService$JobHandler.handleMessage(JobService.java:147)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:154)
at android.app.ActivityThread.main(ActivityThread.java:6077)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:865)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:755)
Caused by: java.lang.IllegalArgumentException: Null parameter!
at com.owncloud.android.files.services.FileUploader$UploadRequester.retry(FileUploader.java:286)
at com.owncloud.android.files.services.RetryUploadJobService.onStartJob(RetryUploadJobService.java:30)
at android.app.job.JobService$JobHandler.handleMessage(JobService.java:143)
at android.os.Handler.dispatchMessage(Handler.java:102)
at android.os.Looper.loop(Looper.java:154)
at android.app.ActivityThread.main(ActivityThread.java:6077)
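Following these steps:
Download several files
Switch connection off
Switch device off
Switch device on
Switch connection on
the pending downloads are not retried. Would that be correct? Not sure if it is a bug. @davigonz
When the server uses HTTPS with an untrusted certificate and a download is interrupted, the error in the notification is "SSL initialization failed" instead.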
@jesmrec
> Following these steps:
> Download several files
> Switch connection off
> Switch device off
> Switch device on
> Switch connection on
> the pending downloads are not retried. Would it be correct? not sure if it is a bug?
It's not a bug actually. There's a parameter in the job scheduler to persist the jobs and prevent them from being deleted after a device restart. Let me enable this option and check how it works.
Steps
Current behaviour
Not all are resumed
Expected behaviour
All pending downloads resumed
I was able to reproduce the crash in https://github.com/owncloud/android/issues/1684#issuecomment-291926676 with the following steps:
App crashes
@davigonz @davivel
Jumping in for some bug fixing...
@jesmrec , the steps you mentioned about the crash in your last comment make sense. Fixing.
Uploads and downloads not retried, fixed.
If Doze puts the device in idle state in the middle of a transfer, the transfer is interrupted before ConnectivityManager is aware of the connectivity loss. Since the app checks whether connectivity is available right after the failure to decide if the transfer should be retried, it gets a false positive, and doesn't reschedule the first interrupted transfer.
Added an extra check to test directly whether the device is in idle mode. If so, the transfer is rescheduled as well.
That should work for the moment. We should refactor services in the app and design a better approach for retries (resyncs) in the near future.
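A minimal sketch of the decision described above, written in plain Java so it can be run outside Android. The class and method names are hypothetical and this is not the app's actual code. On a device, the two flags would presumably come from ConnectivityManager and from PowerManager.isDeviceIdleMode() (available from API 23).

```java
final class DozeRetryPolicy {

    /** Snapshot of what the device reports right after a transfer fails. */
    static final class DeviceState {
        final boolean networkReportedAvailable; // may still be true under Doze (false positive)
        final boolean idleMode;                 // true if Doze has put the device in idle mode

        DeviceState(boolean networkReportedAvailable, boolean idleMode) {
            this.networkReportedAvailable = networkReportedAvailable;
            this.idleMode = idleMode;
        }
    }

    /**
     * Reschedule when the network is reported as lost, or when the device is
     * idle even though the network is still reported as available.
     */
    static boolean shouldReschedule(DeviceState state) {
        return !state.networkReportedAvailable || state.idleMode;
    }
}
```

The second condition is what covers the first interrupted transfer: connectivity still looks fine at that moment, but the idle flag already tells the truth.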
> It's not a bug actually. There's a parameter in the job scheduler to persist the jobs and prevent them from being deleted after a device restart. Let me enable this option and check how it works.
When the upload/download is running and the device is restarted, the upload/download is not resumed. It is only resumed if the connection is switched off before the restarting. Is this the expected behaviour with the parameter of the job scheduler?
@davigonz @davivel
The problem related with instant uploads persists https://github.com/owncloud/android/issues/1684#issuecomment-291920392
Reproduced with Nexus 5X
Regarding the uploads: https://github.com/owncloud/android/issues/1684#issuecomment-291869388
Works properly, except in one case: when the app is in the uploads view at the moment the screen is switched off. When you switch to another view, they are resumed, so the behaviour is not bad at all.
> When the upload/download is running and the device is restarted, the upload/download is not resumed. It is only resumed if the connection is switched off before the restarting. Is this the expected behaviour with the parameter of the job scheduler?
The jobs that retry the uploads/downloads are scheduled only when the connection is lost, so in the case you refer to, the behaviour is as expected.
@jesmrec
> Works properly, except in one case: when the app is in the uploads view at the moment the screen is switched off. When you switch to another view, they are resumed, so the behaviour is not bad at all.
We don't have control over the exact moment the uploads are retried. Whether the system is considering some specific condition of the app state that is related to leaving the uploads view... there is nothing documented; it can be a coincidence, or just that the network is used by another part of the app and that encourages the scheduler to start the retries.
Unless there is some error reported in the logs, I would not research this detail any further for now.
> The problem related with instant uploads persists #1684 (comment)
> Reproduced with Nexus 5X
Checking this.
I could reproduce the problem with instant uploads, but the uploads are retried later, when the user leaves the uploads view where the error "App was terminated" is shown.
At this point, the feature is validated with Android 7, where Doze/App Standby is more aggressive with uploads and downloads.
Now, it's time to check with Android 6.
Android 6:
Downloads are not resumed after IDLE mode. If the screen is not unlocked immediately (after waiting a while), the download gets stuck. Checked with one or two downloads.
Galaxy Tab S
Debugged the previous one and found the following conditions:
To handle this issue we would have to change our network library (too hard) or extend the scope of the issue to a general retry policy, instead of just supporting Doze. In any case, we would need much more time.
OK, so it is out of scope, but in any case we should not forget it. Apart from that, Android 6 works properly with the retries.
So, time for regression testing on the older supported Android versions, 5 & 4.
@davivel @davigonz
Android 5:
When the connection is lost and then re-established, the download in progress is lost. It fails after a couple of minutes, while the remaining downloads finish correctly.
Android 5:
Instant uploads interrupted by a connection loss are never retried, and get stuck in the Current section of the uploads view. They can be neither removed nor moved forward.
Steps:
1.- Enable instant uploads
2.- Take some pics/videos
3.- Set airplane mode
4.- Unset airplane mode
Current: uploads stuck
Expected: uploads resumed
In the master branch, the uploads are moved to the failed section with "Connection error"
Android 4: behaviour is the expected one in both uploads and downloads for this version:
About https://github.com/owncloud/android/issues/1684#issuecomment-295275970 , the situation is similar to Android 6. If the network is recovered before the timeout exception is triggered, the transfer is not rescheduled when the exception is processed, because at that moment there is connectivity. We need to define a serious policy for resyncs to have a download like this recovered.
I made some extra improvements to address the pending bugs. At this moment:
First, about downloads / uploads never retried: I extended the conditions of the retry, so that if they finish due to a socket timeout, they are scheduled to be retried even if the network is available again at the moment the exception is processed. Please notice that the retry can still take a long time to happen, because Android does it whenever it sees fit; I waited about 20 minutes for a scheduled retry, on a device that was not especially busy and with Wifi connectivity ready all the time. In summary: if the transfer doesn't appear as stuck, it will be retried sooner or later.
Second, about stuck transfers: stuck downloads should not happen anymore (there was a buffer not correctly released), stuck uploads in HTTPS connections should not happen anymore, but stuck uploads in HTTP connections could still happen. I will open a separate issue with the details. In summary, if the network is lost right when a PUT operation is writing to the socket, and then the network connectivity is recovered, the PUT operation will block its thread indefinitely. I was able to set a write timeout for HTTPS connections, but can't do something similar for HTTP. Other things can be tried, but I think we should move this forward and deal with the HTTP case later.
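The extended retry condition from the first point above could look roughly like the sketch below. It is plain Java with hypothetical names, not the actual change made in the app, and it assumes the failure surfaces as an IOException whose cause chain may contain a java.net.SocketTimeoutException.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

final class TransferRetryRule {

    /**
     * A socket timeout always schedules a retry, even if the network is
     * already back when the exception is processed. Any other failure is
     * retried only when the network is unavailable at that moment.
     */
    static boolean shouldScheduleRetry(IOException failure, boolean networkAvailableNow) {
        // Walk the cause chain, since the timeout may be wrapped by another exception
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof SocketTimeoutException) {
                return true;
            }
        }
        return !networkAvailableNow;
    }
}
```

The point of the rule is that a timeout is itself evidence of a lost connection, so it is trusted more than a connectivity check made after the fact.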
Ok, then we will need a separate issue for the HTTP case in Android 5. The other cases are already tested and work as expected. @davivel please do not forget to open it with the details.
Feature approved, now uploads and downloads will be retried if Doze decides to interrupt them.
CC @davivel @davigonz @michaelstingl
Pending corner case in https://github.com/owncloud/android/issues/1950