It's unclear exactly what this would mean; some investigation is required. For uploads, I imagine we'd buffer the content and, if it's larger than 5 MB, do a multipart upload with retries of individual parts. For downloads, this might be a special input stream returned to the caller that can reconnect on failure by issuing a range GET to retrieve the remaining bytes.
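To make the upload idea concrete, here is a rough sketch of the buffering logic, not a proposal for the actual API. `UploadSink` and `BufferedUploader` are hypothetical names that stand in for the real S3 calls, and the part size is a constructor argument so the logic can be exercised with small buffers. Because each part sits in memory, a failed part can be retried without resetting the source stream, and the total length never needs to be known up front. Aborting an incomplete multipart upload after a failure is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the S3 calls; not an SDK interface. */
interface UploadSink {
    void putObject(byte[] body) throws IOException;
    String uploadPart(int partNumber, byte[] body) throws IOException; // returns a part tag
    void completeMultipart(List<String> partTags) throws IOException;
}

class BufferedUploader {
    /** 5 MB threshold from the description above. */
    static final int DEFAULT_PART_SIZE = 5 * 1024 * 1024;

    private final UploadSink sink;
    private final int partSize;
    private final int maxAttempts;

    BufferedUploader(UploadSink sink, int partSize, int maxAttempts) {
        if (partSize < 1 || maxAttempts < 1) {
            throw new IllegalArgumentException("partSize and maxAttempts must be >= 1");
        }
        this.sink = sink;
        this.partSize = partSize;
        this.maxAttempts = maxAttempts;
    }

    /** Uploads a stream of unknown length; returns the part count (0 means a single put). */
    int upload(InputStream in) throws IOException {
        byte[] first = readChunk(in);
        // Only look for more data if the first buffer filled up completely.
        byte[] next = first.length == partSize ? readChunk(in) : new byte[0];
        if (next.length == 0) {
            sink.putObject(first); // content fits in one buffer: no multipart needed
            return 0;
        }
        List<String> tags = new ArrayList<>();
        int partNumber = 1;
        tags.add(uploadWithRetry(partNumber++, first));
        while (next.length > 0) {
            tags.add(uploadWithRetry(partNumber++, next));
            next = next.length == partSize ? readChunk(in) : new byte[0];
        }
        sink.completeMultipart(tags);
        return tags.size();
    }

    /** Reads up to partSize bytes; a shorter result means end of stream was reached. */
    private byte[] readChunk(InputStream in) throws IOException {
        byte[] buf = new byte[partSize];
        int filled = 0;
        while (filled < partSize) {
            int n = in.read(buf, filled, partSize - filled);
            if (n < 0) break;
            filled += n;
        }
        return filled == partSize ? buf : Arrays.copyOf(buf, filled);
    }

    /** Retries a buffered part; the source stream is never touched again. */
    private String uploadWithRetry(int partNumber, byte[] body) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return sink.uploadPart(partNumber, body);
            } catch (IOException e) {
                last = e;
            }
        }
        throw new IOException("part " + partNumber + " failed after " + maxAttempts + " attempts", last);
    }
}
```

A real implementation would presumably take its retry count and backoff from the client configuration rather than a bare `maxAttempts`, and it holds two part-sized buffers at once, which is the price of detecting the end of the stream before choosing between a single put and multipart.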
I'd like to vote for automatic reconnect on download. You'd have something that looks like a BufferedInputStream. In one mode, each buffer-sized chunk would be fetched with its own range GET and written straight into the byte array. That way, a client that is slow to process the stream would not get connection reset errors. With a good buffer size (maybe even an array of buffers, so several can be loaded in parallel), it should be very fast while staying stable. If a connection read error occurs in the short time it takes to copy the range into the byte buffer, I would want it to use the client's configured retry policy to retry.
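Roughly what I have in mind, as a sketch only. `RangeFetcher` and `RangedInputStream` are made-up names, the fetcher stands in for whatever issues the range GET, and `maxAttempts` is a placeholder for the client's retry configuration. It assumes the object length is known up front (for example from a HEAD request), and the parallel prefetch idea is not shown.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical abstraction over a range GET; not an SDK interface. */
interface RangeFetcher {
    /** Returns bytes [start, endInclusive] of the object. */
    byte[] fetch(long start, long endInclusive) throws IOException;
}

class RangedInputStream extends InputStream {
    private final RangeFetcher fetcher;
    private final long length;      // total object size, assumed known
    private final int chunkSize;    // bytes requested per range GET
    private final int maxAttempts;  // placeholder for the client's retry policy
    private byte[] buffer = new byte[0];
    private int bufferPos = 0;
    private long nextOffset = 0;

    RangedInputStream(RangeFetcher fetcher, long length, int chunkSize, int maxAttempts) {
        if (chunkSize < 1 || maxAttempts < 1) {
            throw new IllegalArgumentException("chunkSize and maxAttempts must be >= 1");
        }
        this.fetcher = fetcher;
        this.length = length;
        this.chunkSize = chunkSize;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public int read() throws IOException {
        if (bufferPos >= buffer.length && !fill()) return -1;
        return buffer[bufferPos++] & 0xFF;
    }

    @Override
    public int read(byte[] dst, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (bufferPos >= buffer.length && !fill()) return -1;
        int n = Math.min(len, buffer.length - bufferPos);
        System.arraycopy(buffer, bufferPos, dst, off, n);
        bufferPos += n;
        return n;
    }

    /** Fetches the next chunk with its own short-lived request, retrying on failure. */
    private boolean fill() throws IOException {
        if (nextOffset >= length) return false;
        long end = Math.min(nextOffset + chunkSize, length) - 1;
        int expected = (int) (end - nextOffset + 1);
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                byte[] chunk = fetcher.fetch(nextOffset, end);
                if (chunk.length != expected) {
                    throw new IOException("short range response: " + chunk.length + " of " + expected);
                }
                buffer = chunk;
                bufferPos = 0;
                nextOffset = end + 1;
                return true;
            } catch (IOException e) {
                last = e; // the same range is simply requested again
            }
        }
        throw new IOException("range " + nextOffset + "-" + end + " failed after " + maxAttempts + " attempts", last);
    }
}
```

Since no connection is held open between chunks, a slow consumer only delays the next request rather than letting an open connection time out.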
Support for retrying with non-resettable InputStreams would be nice. See https://github.com/aws/aws-sdk-java/issues/427
Additionally, it looks to me like TransferManager doesn't support doing multi-part uploads of data with an unknown size. This makes it pretty useless for big data.
Any updates on that issue?
@laymain
We are focusing on getting SDK 2.0 ready with the low-level clients. Most of the high-level libraries (TransferManager, DynamoDB mapper) won't be available until early 2019.
I understand, thank you for the answer.
Tracking in #37