Describe the bug
I am testing uploading photos to S3 on a very bad network connection. I am finding that the transfer starts, reports 0% progress then 100% but then the upload state changes to FAILED and yet onError is not called. This is surprising to me. I can confirm the upload did not succeed. It never appears to attempt a retry or otherwise report any other information about this transfer.
What is the expected behavior here? How can I ensure that photo uploads don't get "lost" - do I need to manually retry them? The iOS SDK automatically retries them and I configured the retryLimit to be 10, is there a similar approach with the Android SDK? Thanks!
To Reproduce
Steps to reproduce the behavior:
Which AWS service(s) are affected?
S3 TransferUtility
Expected behavior
The upload should succeed, and retry if it fails, eventually succeeding. It should not report 100% progress unless the upload succeeded.
Environment(please complete the following information):
Device Information (please complete the following information):
Logs
2018-12-13 16:12:47.594 13106-13106/com.x.x D/RestHelper: ID:7 bytesCurrent: 0 bytesTotal: 125024 0%
2018-12-13 16:13:11.970 13106-13106/com.x.x D/RestHelper: ID:7 bytesCurrent: 125024 bytesTotal: 125024 100%
2018-12-13 16:13:27.002 13106-13106/com.x.x D/RestHelper: ID:7 bytesCurrent: 125024 bytesTotal: 125024 100%
2018-12-13 16:13:27.027 13106-13106/com.x.x D/RestHelper: Upload state changed to FAILED
Code
TransferUtility transferUtility =
TransferUtility.builder()
.context(context.getApplicationContext())
.awsConfiguration(AWSMobileClient.getInstance().getConfiguration())
.s3Client(new AmazonS3Client(AWSMobileClient.getInstance()))
.build();
final String filename = "image_" + randomAlphaNumeric(10) + ".jpg";
File file = scaleImage(uri, context.getCacheDir().getPath() + File.separator + filename);
TransferObserver transferObserver = transferUtility.upload("bucketname", filename, file);
transferObserver.setTransferListener(new TransferListener() {
@Override
public void onStateChanged(int id, TransferState state) {
Log.d(TAG, "Upload state changed to " + state);
}
@Override
public void onProgressChanged(int id, long bytesCurrent, long bytesTotal) {
float percentDonef = ((float) bytesCurrent / (float) bytesTotal) * 100;
int percentDone = (int)percentDonef;
Log.d(TAG, "ID:" + id + " bytesCurrent: " + bytesCurrent
+ " bytesTotal: " + bytesTotal + " " + percentDone + "%");
}
@Override
public void onError(int id, Exception ex) {
Log.e(TAG, ex);
}
});
If helpful, this is how I initialize it in my startup activity:
AWSMobileClient.getInstance().initialize(getApplicationContext(), new Callback<UserStateDetails>() {
@Override
public void onResult(UserStateDetails result) {
Log.d(TAG, "AWSMobileClient initialized. User State is " + result.getUserState());
TransferUtility transferUtility =
TransferUtility.builder()
.context(getApplicationContext())
.awsConfiguration(AWSMobileClient.getInstance().getConfiguration())
.s3Client(new AmazonS3Client(AWSMobileClient.getInstance()))
.build();
transferUtility.resumeAllWithType(UPLOAD); //resume any uploads that were paused on app termination
}
@Override
public void onError(Exception e) {
Log.e(TAG, "Initialization error.", e);
}
});
getApplicationContext().startService(new Intent(getApplicationContext(), TransferService.class));
@hipwelljo Sorry for the delayed response. I am splitting this question into two separate issues:
_Issue-1: (FEATURE-REQUEST)_ TransferUtility does not retry the failed transfers. This is a feature that is currently not supported by TransferUtility in Android. I will take this feature request to the team for prioritization. As a workaround, I would suggest you to retry the upload/download when you encounter a failure. You can decide the condition for success as: STATE == TransferState.COMPLETED && PROGRESS_PERCENTAGE == 100 and if it's not met it should be a failure. In the failure case, you can retry the transfer by calling the upload and download methods again.
_Issue-2: (BUG)_ There is inconsistency in the error/state/progress reporting mechanism in TransferUtility. If I understand correctly, the transfer progress is 100% where bytesTransferred is equal to bytesTotal, however the upload failed with onStateChanged reporting FAILED state and onError was not invoked. If it's multi-part upload (>= 5MB), my suspicion would be that the multi-part complete request might have failed when the network connection is poor. In this case, it should be a bug and I will look into the reporting mechanism.
Can I know the size of the file being uploaded?
Thanks for the info on item 1! For item 2, yes that is exactly correct - the transfer progress is reported to be 100% where transferred equals the total even though the upload did fail, onStateChanged reported the FAILED state, and onError was not invoked. It is however not a multipart upload since the file size is under 1 MB - only 122 KB in that example.
@hipwelljo Thank you for the response. Can you paste the logs from Logcat when the item#2 occurs? I couldn't reproduce the issue so it would be beneficial if you could share the logs to see where the failure is.
Oh interesting it looks like it's trying to do a multipart even though this photo was just 58 KB.
12-28 16:22:02.513 24459-24459/com.mycompany.myapp D/RestHelper: Upload state changed to IN_PROGRESS
12-28 16:22:02.518 24459-24500/com.mycompany.myapp D/EGL_emulation: eglMakeCurrent: 0xb4369100: ver 2 0
12-28 16:22:02.530 24459-24500/com.mycompany.myapp D/EGL_emulation: eglMakeCurrent: 0xb4369100: ver 2 0
12-28 16:22:02.530 24459-24459/com.mycompany.myapp D/RestHelper: ID:2 bytesCurrent: 0 bytesTotal: 59380 0%
12-28 16:22:02.537 24459-24500/com.mycompany.myapp D/EGL_emulation: eglMakeCurrent: 0xb4369100: ver 2 0
12-28 16:22:02.538 24459-24500/com.mycompany.myapp D/OpenGLRenderer: endAllStagingAnimators on 0xa33e5e00 (RippleDrawable) with handle 0xa31f4fd0
12-28 16:22:02.844 24459-24500/com.mycompany.myapp D/EGL_emulation: eglMakeCurrent: 0xb4369100: ver 2 0
12-28 16:22:06.551 24459-24459/com.mycompany.myapp D/RestHelper: ID:2 bytesCurrent: 59380 bytesTotal: 59380 100%
12-28 16:22:21.578 24459-24459/com.mycompany.myapp D/RestHelper: ID:2 bytesCurrent: 59380 bytesTotal: 59380 100%
12-28 16:22:21.580 24459-24683/com.mycompany.myapp E/UploadTask: Error encountered during multi-part upload: 2 due to Unable to execute HTTP request: Read timed out
com.amazonaws.AmazonClientException: Unable to execute HTTP request: Read timed out
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4659)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1752)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:221)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:99)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:55)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:818)
Caused by: java.net.SocketTimeoutException: Read timed out
at com.android.org.conscrypt.NativeCrypto.SSL_read(Native Method)
at com.android.org.conscrypt.OpenSSLSocketImpl$SSLInputStream.read(OpenSSLSocketImpl.java:699)
at com.android.okio.Okio$2.read(Okio.java:113)
at com.android.okio.RealBufferedSource.indexOf(RealBufferedSource.java:147)
at com.android.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:94)
at com.android.okhttp.internal.http.HttpConnection.readResponse(HttpConnection.java:179)
at com.android.okhttp.internal.http.HttpTransport.readResponseHeaders(HttpTransport.java:101)
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:628)
at com.android.okhttp.internal.http.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:388)
at com.android.okhttp.internal.http.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:332)
at com.android.okhttp.internal.http.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:496)
at com.android.okhttp.internal.http.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)
at com.android.okhttp.internal.http.HttpsURLConnectionImpl.getResponseMessage(HttpsURLConnectionImpl.java:25)
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4659)聽
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1752)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:221)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:99)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:55)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:237)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)聽
at java.lang.Thread.run(Thread.java:818)聽
12-28 16:22:21.603 24459-24459/com.mycompany.myapp D/RestHelper: Upload state changed to FAILED
FWIW, I reproduced that last one using the Network Link Conditioner in macOS with the android emulator, not a physical device this time.
@hipwelljo From the stacktrace, I see uploadSinglePartAndWaitForCompletion which means its single-part. The error message might be misleading as it has the word multi-part. Sorry about that. I will check on the error message.
Do you have TransferService declared in the AndroidManifest? Are you starting the service from your app?
Ok good catch.
Yes, TransferService is in the manifest like so:
<service android:name="com.amazonaws.mobileconnectors.s3.transferutility.TransferService" android:enabled="true" />
Yes, it is started from the first activity - see startService in the original report in this thread - the last line there.
@hipwelljo I would suggest to start the service from Application.onCreate()
public class MyApplication extends Application {
// Overriding this method is totally optional!
@Override
public void onCreate() {
super.onCreate();
// Required initialization logic here!
// Network service
getApplicationContext().startService(new Intent(getApplicationContext(), TransferService.class));
}
}
Can you try with this?
Okay I've moved that startService call into an Application subclass, but I continue to get the same outcome. Do you recommend moving the initialize call into there as well, or keep that in the first activity?
In reproducing this issue more today, I am seeing the upload does in fact succeed, which explains why it reports 100% progress, so it seems the only issue is the incorrect state update of FAILED when it should be COMPLETED.
@hipwelljo We can debug this issue in the call.
To document this for anyone following along, as discussed on the call, it was suspected maybe the poor network results in a failure to verify the photo was uploaded to s3 due to a timeout - s3 has a 20s socket timeout. That could maybe explain why the photo was uploaded successfully yet the sdk couldn't confirm that and thus it reports it FAILED. Karthik is investigating to see if another request is made for single-part uploads. As a temporary workaround, it was suggested to check for a FAILED state and then find out if the object exists in s3 via a metadata HEAD request and if not then retry it.
I can confirm that we do not make additional request for single-part uploads. The retry functionality is a feature request for TransferUtility in Android. Meanwhile as @hipwelljo pointed out, one could check for the FAILED state and resume or start a new upload transfer.
Thanks for checking into that @kvasukib! Do you have any other ideas as to why sometimes the state changes to FAILED even though the photo did upload to S3 on a poor connection like edge?
@hipwelljo I would check the device, API level and network conditions. Can you try the same experiment on a different device or network condition? You could also check if there any firewall policies set on the network?
@kvasukib The issue only occurs with a poor network condition like Edge data speeds (but many of our customers will be in that scenario) that I've seen. I do only have the one physical device available to test with, besides emulators which I can reproduce this issue on those as well (Pixel and Pixel 2, let me know if you would like me to try others). I can confirm there's not any firewall policies set on the network and have reproduced it on two different WiFi networks.
@hipwelljo @kvasukib Hi guys, first of all sorry for my English.
I have the same issue simulating 3G conditions (1024 kbps Upload bandwidth).
I think resume or restart the failed upload is not worth it. Under the same conditions timeout will take place again.
I receive "Failed" signal but observing with a proxy I can see that all parts have been "Completed". Very strange. However I stop receiving "InProgress" updates from the SDK.
The size of the uploaded video is 110 Mb.
Any solution or alternative workaround? Thanks in advance.
Timeout log
01-25 10:49:35.041 31095-31521/io.n3n.cubicam E/UploadPartTask: Encountered error uploading part
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:51)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:29)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
at java.lang.Thread.run(Thread.java:761)
Caused by: java.net.SocketTimeoutException: timeout
at com.android.okhttp.okio.Okio$3.newTimeoutException(Okio.java:212)
at com.android.okhttp.okio.AsyncTimeout.exit(AsyncTimeout.java:250)
at com.android.okhttp.okio.AsyncTimeout$2.read(AsyncTimeout.java:217)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at com.android.okhttp.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.android.okhttp.internal.http.HttpConnection.readResponse(HttpConnection.java:191)
at com.android.okhttp.internal.http.HttpTransport.readResponseHeaders(HttpTransport.java:80)
at com.android.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:906)
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:782)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:463)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:405)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:517)
at com.android.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)
at com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseMessage(HttpsURLConnectionImpl.java)
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)聽
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:51)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:29)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:237)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)聽
at java.lang.Thread.run(Thread.java:761)聽
@gilbertdigio Sorry for the inconvenience caused.
Do this happen every time you upload the file?
What version of the SDK are you using?
@kvasukib Every time I repeat the test under those conditions. Also increasing the bandwidth to 2048 kbps.
I'm using 2.11.0 version of the SDK.
@hipwelljo @gilbertdigio
I have fixed this issue in 2.11.1 version of the SDK. It contains improved exception handling and pause/resume handling and fixes the progress reporting issue when a transfer is paused and resumed. Please upgrade to 2.11.1 and let us know if it fixes the issue.
Root cause analysis:
1) When a transfer is paused by the user or due to network drop, the thread that is managing the multipart transfer is interrupted. When the thread is interrupted and encounters an exception, there is a case where we skip setting the transfer to PAUSED or WAITING_FOR_NETWORK, and instead moves to FAILED.
2) When a transfer is paused by the user or due to network drop, the thread that is managing the multipart transfer is interrupted. When a multi-part transfer (N parts) is interrupted, there are X number of parts that may have been completed, Y number of parts that may been in progress and Z number of parts that are yet to be started (X + Y + Z = N). When a part that is in progress is interrupted, and then resumed at a later point, the part is retried from the beginning, however the progress made in the previous attempt that is interrupted is not reset from the total progress. This caused the progress reported to go over 100%.
Description of the fix:
The fix involves
1) improving the exception handling of transfers:
Improved the state, progress and error reporting when the transfers are interrupted.
WAITING_FOR_NETWORK when the TransferNetworkLossHandler is used.FAILED and the exception is reported via TransferListener.onError callback.2) Fix the progress reporting to not go beyond 100% and reset when a part failed.
See 2.11.1 for more information.
Hi @kvasukib , I have updated to 2.11.1.
I have also instantiated TransferNetworkLossHandler before retrieving TransferUtility but I still receive timeout error under the same scenario. (simulating 3G conditions 1024 kbps Upload bandwidth.)
TransferNetworkLossHandler.getInstance(context.applicationContext)
sTransferUtility = TransferUtility.builder().context(context.applicationContext)
.s3Client(getS3Client(context.applicationContext))
.build()
It`s supposed that I should receive WAITING_FOR_NETWORK state but I receive FAILED state instead. Maybe I'm missing something.
I receive multiple timeout errors from the pending parts of the transfers. However I'm watching the transfer with a proxy and none of the pending parts is marked as error.
Timeout log
02-05 14:05:20.321 29475-2010/io.n3n.cubicam.develop E/UploadPartTask: Upload part interrupted: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
02-05 14:05:20.378 29475-2010/io.n3n.cubicam.develop E/UploadPartTask: Encountered error uploading part
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:60)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:30)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
at java.lang.Thread.run(Thread.java:818)
Caused by: java.net.SocketTimeoutException: timeout
at com.android.okhttp.okio.Okio$3.newTimeoutException(Okio.java:207)
at com.android.okhttp.okio.AsyncTimeout.exit(AsyncTimeout.java:250)
at com.android.okhttp.okio.AsyncTimeout$2.read(AsyncTimeout.java:217)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at com.android.okhttp.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.android.okhttp.internal.http.HttpConnection.readResponse(HttpConnection.java:191)
at com.android.okhttp.internal.http.HttpTransport.readResponseHeaders(HttpTransport.java:80)
at com.android.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:905)
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:789)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:443)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:388)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:497)
at com.android.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)
at com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseMessage(HttpsURLConnectionImpl.java:25)
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)聽
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:60)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:30)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:237)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)聽
at java.lang.Thread.run(Thread.java:818)聽
Any suggestions? thanks in advance, and sorry for my English.
@gilbertdigio Basically I expect the part task to encounter this condition: https://github.com/aws-amplify/aws-sdk-android/blob/master/aws-android-sdk-s3/src/main/java/com/amazonaws/mobileconnectors/s3/transferutility/UploadPartTask.java#L72
This is where we check if a network loss handler is available and registered. Can you check if you see the log messages from https://github.com/aws-amplify/aws-sdk-android/blob/master/aws-android-sdk-s3/src/main/java/com/amazonaws/mobileconnectors/s3/transferutility/UploadPartTask.java#L75 and https://github.com/aws-amplify/aws-sdk-android/blob/master/aws-android-sdk-s3/src/main/java/com/amazonaws/mobileconnectors/s3/transferutility/UploadPartTask.java#L82 lines in your Logcat window? If you get these messages, it tell you that a network loss handler is available and registered.
Also have you registered the network loss handler in AndroidManifest.xml?
<receiver
android:name="com.amazonaws.mobileconnectors.s3.transferutility.TransferNetworkLossHandler">
<intent-filter>
<action android:name="android.net.conn.CONNECTIVITY_CHANGE" />
</intent-filter>
</receiver>
Hi @kvasukib,
Sorry for the delay. I already registered the TransferNetworkLossHandler in Manifest but I think that isn't the problem.
The only logs that I see are Timeout Exceptions:
02-06 10:35:51.945 26028-26400/io.n3n.cubicam.develop E/UploadPartTask: Upload part interrupted: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
02-06 10:35:51.963 26028-26400/io.n3n.cubicam.develop E/UploadPartTask: Encountered error uploading part
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:60)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:30)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
at java.lang.Thread.run(Thread.java:818)
Caused by: java.net.SocketTimeoutException: timeout
at com.android.okhttp.okio.Okio$3.newTimeoutException(Okio.java:207)
at com.android.okhttp.okio.AsyncTimeout.exit(AsyncTimeout.java:250)
at com.android.okhttp.okio.AsyncTimeout$2.read(AsyncTimeout.java:217)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at com.android.okhttp.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.android.okhttp.internal.http.HttpConnection.readResponse(HttpConnection.java:191)
at com.android.okhttp.internal.http.HttpTransport.readResponseHeaders(HttpTransport.java:80)
at com.android.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:905)
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:789)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:443)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:388)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:497)
at com.android.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)
at com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseMessage(HttpsURLConnectionImpl.java:25)
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)聽
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:60)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:30)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:237)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)聽
at java.lang.Thread.run(Thread.java:818)聽
... and multiple Aborted Exception:
02-06 10:41:56.153 26028-26404/io.n3n.cubicam.develop E/UploadPartTask: Upload part interrupted: com.amazonaws.AbortedException:
02-06 10:41:56.176 26028-26404/io.n3n.cubicam.develop E/UploadPartTask: Encountered error uploading part
com.amazonaws.AbortedException:
at com.amazonaws.internal.SdkInputStream.abortIfNeeded(SdkInputStream.java:59)
at com.amazonaws.auth.AwsChunkedEncodingInputStream.read(AwsChunkedEncodingInputStream.java:159)
at java.io.InputStream.read(InputStream.java:162)
at com.amazonaws.http.UrlHttpClient.write(UrlHttpClient.java:244)
at com.amazonaws.http.UrlHttpClient.writeContentToConnection(UrlHttpClient.java:171)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:75)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)
at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3672)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:60)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadPartTask.call(UploadPartTask.java:30)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
at java.lang.Thread.run(Thread.java:818)
The connection isn't lost, I can continue making queries before and after the exceptions.
I have increased AmazonS3Client timeouts to 200 seconds (ConnectionTimeout and SocketTimeout)
In need to upload videos under bad network conditions (upload bandwith = 512kbps or less).
Is there any limitation in the SDK about this?
Thanks in advance
@gilbertdigio The connection timeout from S3 is 20 seconds. S3 will close the connection after 20 seconds of idle time. This can happen under bad network conditions as you said. One option would be to catch this exception and resume the transfer. Have you tried that?
Hi @kvasukib I have updated to 2.12.0 and tried this out. It does seem to be better! I was able to successfully upload a photo on an edge simulated connection. However if I simulated an even slower connection - Out Bandwidth of 40 instead of 200 - I see it reached 100% progress then the state changed to FAILED as I saw previously. Here's the logs I see leaving it to retry for a while:
2019-02-12 10:41:47.841: Will upload image_UCdMaE7pnQ.jpg
2019-02-12 10:41:47.892: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:41:58.990: ID:5 bytesCurrent: 81420 bytesTotal: 81420 100%
2019-02-12 10:42:14.001: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:42:14.025: Upload state changed to FAILED for 5
2019-02-12 10:42:14.025: Retrying photo upload for 5
2019-02-12 10:42:14.063: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:42:44.488 Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:42:44.510: Upload state changed to FAILED for 5
2019-02-12 10:42:44.511: Retrying photo upload for 5
2019-02-12 10:42:44.553: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:43:13.976: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:43:13.997: Upload state changed to FAILED for 5
2019-02-12 10:43:13.998: Retrying photo upload for 5
2019-02-12 10:43:14.039: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:43:43.811: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:43:43.842: Upload state changed to FAILED for 5
2019-02-12 10:43:43.842: Retrying photo upload for 5
2019-02-12 10:43:43.886: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:44:13.141: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:44:13.162: Upload state changed to FAILED for 5
2019-02-12 10:44:13.162: Retrying photo upload for 5
2019-02-12 10:44:13.202: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:44:21.570: Upload state changed to WAITING_FOR_NETWORK for 5
2019-02-12 10:44:25.875: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:44:36.816: Upload state changed to WAITING_FOR_NETWORK for 5
2019-02-12 10:44:45.916: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:45:11.691: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:45:11.715: Upload state changed to FAILED for 5
2019-02-12 10:45:11.716: Retrying photo upload for 5
2019-02-12 10:45:11.763: Upload state changed to IN_PROGRESS for 5
2019-02-12 10:45:45.435: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-02-12 10:45:45.455: Upload state changed to FAILED for 5
2019-02-12 10:45:45.456: Retrying photo upload for 5
2019-02-12 10:45:45.491: Upload state changed to IN_PROGRESS for 5
This is concerning. I disabled the network link conditioner to go back to regular networking conditions and it did not successfully upload. The last lines logged were
Upload state changed to FAILED for 5
2019-02-12 10:47:34.069: Retrying photo upload for 5
2019-02-12 10:47:47.987: Failed to upload 5: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
@hipwelljo Thank you for the feedback. I made improvements to the exception handling logic but I haven't optimized anything for a poor network condition except when you use a TransferNetworkLossHander. Are you using the TransferService or just the TransferNetworkLossHander in the AndroidManifest.xml?
From the file size, I am assuming it's a single-part upload. If a single-part upload encounters an exception, it gets into the catch block here.
Can you enable a breakpoint at the catch entry level to see the type of exception. I would expect the log message if you had a transfer network loss handler registered.
The suspicion I have is that, we check if the exception that was encountered is an interruptedexception before checking for network availability. See https://github.com/aws-amplify/aws-sdk-android/blob/master/aws-android-sdk-s3/src/main/java/com/amazonaws/mobileconnectors/s3/transferutility/UploadTask.java#L283
If the type of exception you encountered is not a type of exception we check for, you will be getting a FAILED state. Can you confirm this?
Thanks @hipwelljo for the detailed analysis!
I am using just the TransferService in the manifest, not TransferNetworkLossHander.
I checked the type of exception. When it fails I see the exception is one of the following:
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
com.amazonaws.AmazonClientException: Unable to execute HTTP request: SSL handshake timed out
com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucket-name.s3.us-west-2.amazonaws.com": No address associated with hostname
Note that for the first two exceptions noted, I was simulating a slower than edge connection (out bandwidth of 40), and the last one I was simulating 100% network loss.
In all of these cases, it does not check if the network is not connected because RetryUtils.isInterrupted(e) is false, and the state changes to FAILED as you suspected.
@hipwelljo Thank you for the prompt response. Yeah! I expected that to happen. I am working on a change to remove checking for the type of exception. Regardless of the type of exception, if the network is not connected, we should set it to WAITING_FOR_NETWORK. I will notify here once the change is approved and released. Thank you for your patience.
@hipwelljo Sorry for the delay. I will continue my progress on the fix for this issue and have an update soon.
Hello, @kvasukib ,
I'm developer of TruVideo project and we use your lib for uploading video/photo files to S3 Bucket.
We encountered the same problem in case bad connection.
The most frequent issue I see in logs is next:
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:441)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1748)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:259)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:113)
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:58)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
at java.lang.Thread.run(Thread.java:764)
Caused by: java.net.SocketTimeoutException: timeout
at com.android.okhttp.okio.Okio$3.newTimeoutException(Okio.java:212)
at com.android.okhttp.okio.AsyncTimeout.exit(AsyncTimeout.java:261)
at com.android.okhttp.okio.AsyncTimeout$2.read(AsyncTimeout.java:215)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
at com.android.okhttp.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
at com.android.okhttp.internal.http.Http1xStream.readResponse(Http1xStream.java:186)
at com.android.okhttp.internal.http.Http1xStream.readResponseHeaders(Http1xStream.java:127)
at com.android.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:737)
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:609)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:471)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:407)
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:534)
at com.android.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)
at com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseMessage(Unknown Source:0)
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)聽
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1748)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:259)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:113)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:58)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:266)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)聽
at java.lang.Thread.run(Thread.java:764)聽
Caused by: java.net.SocketTimeoutException: Read timed out
at com.google.android.gms.org.conscrypt.NativeCrypto.SSL_read(Native Method)
at com.google.android.gms.org.conscrypt.NativeSsl.read(:com.google.android.gms@[email protected] (040306-234885623):4)
at com.google.android.gms.org.conscrypt.ConscryptFileDescriptorSocket$SSLInputStream.read(:com.google.android.gms@[email protected] (040306-234885623):7)
at com.android.okhttp.okio.Okio$2.read(Okio.java:136)
at com.android.okhttp.okio.AsyncTimeout$2.read(AsyncTimeout.java:211)
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)聽
at com.android.okhttp.okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)聽
at com.android.okhttp.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)聽
at com.android.okhttp.internal.http.Http1xStream.readResponse(Http1xStream.java:186)聽
at com.android.okhttp.internal.http.Http1xStream.readResponseHeaders(Http1xStream.java:127)聽
at com.android.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:737)聽
at com.android.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:609)聽
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:471)聽
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:407)聽
at com.android.okhttp.internal.huc.HttpURLConnectionImpl.getResponseMessage(HttpURLConnectionImpl.java:534)聽
at com.android.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseMessage(DelegatingHttpsURLConnection.java:109)聽
at com.android.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseMessage(Unknown Source:0)聽
at com.amazonaws.http.UrlHttpClient.createHttpResponse(UrlHttpClient.java:92)聽
at com.amazonaws.http.UrlHttpClient.execute(UrlHttpClient.java:85)聽
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:371)聽
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:212)聽
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4655)聽
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1748)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:259)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:113)聽
at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:58)聽
at java.util.concurrent.FutureTask.run(FutureTask.java:266)聽
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)聽
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)聽
at java.lang.Thread.run(Thread.java:764)聽
As I know from the issue description the Socket Timeout is 20 sec.
My thought that it's too short.
Could you please provide the information if this Timeout can be increased? Are you going to add possibility to set this value by config?
We use the recommended approach for uploading files:
TransferObserver thumbnailTransfer = transferUtility.upload(s3BucketName, descriptor.thumbnailS3Key, descriptor.thumbnailFile);
thumbnailTransfer.setTransferListener(new TransferListener() {
@Override
public void onStateChanged(int id, TransferState state) {
if (state == TransferState.COMPLETED) {
} else if (state == TransferState.FAILED) {
} else if (state == TransferState.WAITING_FOR_NETWORK) {
}
@Override
public void onProgressChanged(int id, long bytesCurrent, long bytesTotal) {
}
@Override
public void onError(int id, Exception ex) {
}
});
Thank you for advice,
Inna
@dmillermx Sorry for the inconvenience caused. The 20 second timeout is set by Amazon S3 and the SDK cannot control the timeout set by Amazon S3.
However, I am trying to make a change where I could catch this exception and set to WAITING_FOR_NETWORK so the user could resume later. I will let you know when I finish my change so you could test it.
Thank you for response.
@kvasukib how is it going on this? Could you please send an update as to when we can expect a fix for this? We are having a lot of customers experiencing this issue. Thank you!
@dmillermx I am working on the fix. Will have an update soon. Thank you for your patience.
Np thanks @kvasukib, please keep us updated and we can test this out asap
Thanks again
Hey @kvasukib how is it going? Anything that we can do to assist you to push this along? We are still hurting bad from these issues.
Thank you
@dmillermx I have cut a PR: https://github.com/aws-amplify/aws-sdk-android/pull/865. For immediate fix, you can clone the PR and build from source and test it in your app. We aim to get this approved and released in one of our upcoming releases.
@kvasukib sounds good, we have a new sprint that starts Monday. We will give it a shot, thank you!
Hello @dmillermx
We have fixed this issue with SDK release 2.13.1. Please try it out and let us know if it works for you.
Best,
Rohan
Thanks for the update!
I tested it just now and it did successfully upload after it failed many times on a really poor connection. It never did say it reached 100%, it just failed repeatedly then successfully uploaded once I disabled the poor network speed simulation.
I tested it a second time and I saw the behavior I previously reported. It reached 100%, failed, I retried it and it failed 6 more times, but then it would not upload even after I disabled the simulated poor network conditions - it was just stuck.
The third time I tested it I saw it fail once, report 100%, fail, then I disabled the simulated speeds and it completed.
The fourth time I tested it I saw it report 100%, fail 15x, then I disabled the simulation and it completed.
The fifth time I tested it I left it alone for about 15 minutes and it behaved like my first test - failed many times, didn't say 100%, upon disabling it succeeded.
The sixth time I tested it I saw it report 100% progress, then it failed 6x, then didn't do anything after that even after I disabled the simulated speeds. I let it sit like that for 10 minutes before giving up. See logs below.
So it seems this is improved but is not working consistently. Notably it's not ever changing to WAITING_FOR_NETWORK. 馃
2019-04-05 20:39:19.436 17188-17188/com.company.app D/RestHelper: Will upload image_DH23gSjbQp.jpg
2019-04-05 20:39:19.603 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:39:39.031 17188-17188/com.company.app D/RestHelper: ID:4 bytesCurrent: 41094 bytesTotal: 41094 100%
2019-04-05 20:39:54.027 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
2019-04-05 20:39:54.038 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:39:54.038 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
2019-04-05 20:39:54.055 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:40:04.539 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucketname.s3.us-west-2.amazonaws.com": No address associated with hostname
2019-04-05 20:40:04.550 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:40:04.551 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
2019-04-05 20:40:04.564 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:40:05.055 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucketname.s3.us-west-2.amazonaws.com": No address associated with hostname
2019-04-05 20:40:05.067 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:40:05.067 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
2019-04-05 20:40:05.084 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:40:05.408 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucketname.s3.us-west-2.amazonaws.com": No address associated with hostname
2019-04-05 20:40:05.423 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:40:05.423 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
2019-04-05 20:40:05.436 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:40:05.791 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucketname.s3.us-west-2.amazonaws.com": No address associated with hostname
2019-04-05 20:40:05.801 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:40:05.801 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
2019-04-05 20:40:05.814 17188-17188/com.company.app D/RestHelper: Upload state changed to IN_PROGRESS for 4
2019-04-05 20:40:06.038 17188-17188/com.company.app E/RestHelper: Failed to upload 4: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Unable to resolve host "bucketname.s3.us-west-2.amazonaws.com": No address associated with hostname
2019-04-05 20:40:06.057 17188-17188/com.company.app D/RestHelper: Upload state changed to FAILED for 4
2019-04-05 20:40:06.057 17188-17188/com.company.app D/RestHelper: Retrying photo upload for 4
@hipwelljo Thank you for the detailed feedback. We only detect when the network disconnects and currently TransferUtility does not detect poor network condition. We set the state to WAITING_FOR_NETWORK only when the network disconnects and we get a callback from Android about the network disconnect.
Okay. Are you working towards a fix for these issues we're encountering reported here?
1) On a really slow connection a photo may successfully upload 100% but the SDK may then report it failed instead of completed
2) Sometimes retrying the photo upload will not fail or succeed - it'll get stuck
@hipwelljo When the state is reported as FAILED, do you get the onError callback with the exception?
Hello @kvasukib
We've tested the last version 2.13.1 and it still fails on bad network connection:
The error: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
We uploading 2 files: first one is picture, the second one is video file, here the log if it will help you.
Uploading file /data/user/0/com.truvideo.staging/files/Camera/Videos/ed666241cf2fd5c9d1ba2b781769987_mp4_thumb.png
TransferStatusUpdater: Creating a new progress listener for transfer: 5
onStateChanged: 5, state=IN_PROGRESS
Error encountered while uploading thumb: com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
onStateChanged: 5, state=FAILED
Uploading video [/data/user/0/com.truvideo.staging/files/Camera/Videos/ed666241cf2fd5c9d1ba2b781769987.mp4]
onStateChanged: 6, state=IN_PROGRESS
TransferStatusUpdater: Creating a new progress listener for transfer: 6
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
onProgressChanged: 6, bytesCurrent=765862, bytesTotal=765862
onStateChanged: 6, state=FAILED
@kvasukib Yes, onError is where I log Log.e(TAG, "Failed to upload " + id + ": " + ex);
@hipwelljo @dmillermx From the logs (onProgressChanged: 6, bytesCurrent=765862, bytesTotal=765862) I can see that the bytesCurrent and bytesTotal are equal and the file has been uploaded successfully but the TransferUtility reported the transfer as FAILED.
When you get the onError callback, what is the exception that is being passed into the onError callback?
@kvasukib, onError callback we see next in log:
com.amazonaws.AmazonClientException: Unable to execute HTTP request: timeout
Thank you
@dmillermx Does this happen for all transfers (single-part, multi-part uploads and downloads)?
Currently our logic when the transfer errors out with an exception is the following:-
CANCELEDPAUSEDWAITING_FOR_NETWORKFAILEDFAILEDWe don't have a mechanism at this point to detect a poor network condition when the transfer errors out. We are only able to detect when the network is completely disconnected.
Yes, that same exception dmillermx posted. This is single part uploads since all our photo uploads are small file sizes. We don't utilize the others in this app.
This scenario where the state changes to FAILED even though it may have actually succeeded on a bad connection isn't as worrying to me as the issue where sometimes retrying the photo upload will neither fail nor succeed, just gets lost.
@hipwelljo Can you elaborate on what you mean when the upload gets lost?
As shown in the logs in this reply, from my 6th test, I retried the upload but nothing ever happened after that (I left it for 10 minutes) - state never changed to IN_PROGRESS or FAILED. So the end result is the UI shows a progress bar filled to 100% but it will never complete.
Does this happen for all transfers (single-part, multi-part uploads and downloads)?
During testing with emulation of bad connection it happens always - 100% for single-part and for multi-part uploads.
@dmillermx @hipwelljo I will take this as a feature request to the team as we may have to implement a retry mechanism to retry the transfer when the network condition is poor by detecting the request timeout.
Meanwhile, we would suggest you to retry the transfer based on the exception thrown in the onError and check if the object exists on Amazon S3 by using AmazonS3Client.doesObjectExist(bucket, key) to successfully upload an object.
@kvasukib Thank you, we'll try.
@kvasukib I am already retrying the transfer manually when the state changes to FAILED, which is called right after onError. After doing that though it sometimes gets stuck and nothing happens after that.
I retry the upload like so:
@Override
public void onStateChanged(int id, TransferState state) {
Log.d(TAG, "Upload state changed to " + state + " for " + id);
if (state == TransferState.COMPLETED) {
//...
} else if (state == TransferState.FAILED) {
//try again - sdk doesn't automatically retry
Log.d(TAG, "Retrying photo upload for " + id);
TransferUtility transferUtility = getTransferUtility(context);
TransferObserver newTransferObserver = transferUtility.resume(id);
newTransferObserver.setTransferListener(this);
}
}
@hipwelljo @dmillermx Sorry for the delayed response. Can you try attaching a new TransferListener in the observer for every retry? The transfer listeners registered for a transfer will be cleared when the transfer terminates (successfully or not).
My solution for this problem would be to retry the transfer when this exception occurs. The retry can be done at the SDK level or at the app level with the resume functionality. Regardless of the solution being in SDK or app, the best we could do is to retry the transfer for N number of times or until it eventually succeeds.
On a side note, can you find the number of available processors on the device where you are experiencing this problem? When you do a multi-part upload, we create two thread pools: Pool-1 of size N + 1 and Pool-2 of size N + 1, where N is the number of available processors on the mobile device found through (Runtime.getRuntime().availableProcessors()). Pool-1 is used for single-part uploads, downloads and Pool-2 is used for multi-part uploads. When the network condition is not stable (poor but still connected and available) and there are a lot of transfers competing for the network resources, there is a lot of contention in the network and the connection may be left idle for more than 20 seconds. My suggestion would be to lower the number of threads in the thread-pool using TransferUtilityOptions or try single-part uploads in the retry mechanism.
Hi @kvasukib I was able to look at this again. I should always be performing single-part uploads because the file sizes are so small (<200 KB). I attached a new TransferListener for each retry, instead of this, and so far I have not seen the issue where retrying the upload causes it to get lost and not ever fail nor succeed. It is always failing on repeat until it eventually succeeds in my tests.
This still seems to be an issue. Setting an emulator into a poor network state will make the transfer fail (which is expected). Once a good network is reestablished the transfer will resume itself and complete. However, the TransferObserver won't have its TransfertState updated. The issue is that we can't seem to accurately tell if an upload has actually failed (for real) or has failed initially but retried by the SDK. I've also setup a RetryPolicy to not retry and have 0 max retries, but yet the SDK still retries to upload on timeouts.
Hi @Shakezulla57,
TransferNetworkLossHandler currently only supports recovering from complete loss of network, and does not support retrying upon slow/unstable connection.
I've also setup a RetryPolicy to not retry and have 0 max retries, but yet the SDK still retries to upload on timeouts.
This is an unrelated multipart upload bug that may have been resolved recently in #1536 . Would you be able to check if you are experiencing this issue with our newest SDK release and submit a new ticket? Thank you.