Query/Question
BlobClient blobClient = container.getBlobClient(objectKey);
blobClient.uploadWithResponse(new BufferedInputStream(inputStream), contentLength, /*parallelTransferOptions*/ null, httpHeaders, userMetaData, AccessTier.HOT, /*blobRequestConditions*/ null, /*timeout*/ null, Context.NONE);
My InputStream is an S3ObjectInputStream, which doesn't support mark/reset, so as suggested I wrapped it in a BufferedInputStream. However, the request always fails with com.azure.core.exception.UnexpectedLengthException, and the error message always says: Request body emitted n+1 bytes, more than the expected n bytes.
Why is this not a Bug or a feature Request?
I don't know whether this is a bug, since I'm not familiar with how the stream is being read.
Hi, @gitmohit. Thank you for posting this question. In our latest release, we added the ability to pass a non-markable stream to the blobClient.upload method. Can you try using this latest version and removing the BufferedInputStream wrapper before we investigate further?
@rickle-msft The InputStream being passed (S3ObjectInputStream) overrides available() as follows:
    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }
Since this always returns a non-zero value, the check below in the Azure SDK always trips, even when the stream is fully consumed. It is taken from Utility's convertStreamToByteBuffer:
    if (data.available() > 0) {
        long totalLength = currentTotalLength[0] + (long) data.available(); // available used here
        throw LOGGER.logExceptionAsError(new UnexpectedLengthException(
            String.format("Request body emitted %d bytes, more than the expected %d bytes.", totalLength, length),
            totalLength, length));
    }
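The interaction described above can be reproduced without S3 or Azure. The following is a minimal sketch, not the SDK's actual code: NeverZeroAvailableStream is a hypothetical stand-in for S3ObjectInputStream, and totalSeenByAvailableCheck is a hypothetical helper that only mirrors the shape of the quoted check.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for S3ObjectInputStream: available() never returns 0.
class NeverZeroAvailableStream extends FilterInputStream {
    NeverZeroAvailableStream(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }
}

class AvailableCheckDemo {
    // Reads up to `length` bytes, then computes the total an
    // available()-based check would report (bytes read + available()).
    static long totalSeenByAvailableCheck(InputStream data, int length) throws IOException {
        byte[] buffer = new byte[length];
        int read = 0;
        while (read < length) {
            int n = data.read(buffer, read, length - read);
            if (n == -1) {
                break; // stream ended early
            }
            read += n;
        }
        return read + (long) data.available();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[10];

        InputStream raw = new NeverZeroAvailableStream(new ByteArrayInputStream(payload));
        System.out.println(totalSeenByAvailableCheck(raw, payload.length)); // 11

        // Wrapping in BufferedInputStream does not help: its available()
        // adds the wrapped stream's available() to its own buffered count.
        InputStream buffered = new BufferedInputStream(
            new NeverZeroAvailableStream(new ByteArrayInputStream(payload)));
        System.out.println(totalSeenByAvailableCheck(buffered, payload.length)); // 11
    }
}
```

Both cases report n+1 bytes for an n-byte payload, which matches the error message in the original question.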
@rickle-msft Did you get a chance to take a look at the above snippet?
@somanshreddy Sorry, I was away for a couple of weeks. That seems rather odd to me. I'm not very familiar with S3 or this type. Do you know why they return 1 when the estimate is 0?
@rickle-msft No issues, thanks for the reply. This is the explanation for their weird fallback to 1 if the estimate is 0.
    /**
     * Returns the value of super.available() if the result is nonzero, or 1
     * otherwise.
     * <p>
     * This is necessary to work around a known bug in
     * GZIPInputStream.available(), which returns zero in some edge cases,
     * causing file truncation.
     * <p>
     * Ref: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7036144
     */
    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }
Interesting. That workaround seems fundamentally incompatible with the check in the Azure SDK. We have that check to guard against data corruption/loss in case someone accidentally specifies a stream size that's smaller than the amount of data in the stream. In that case we tell the customer "there's more data here than you told us about and we don't want to leave anything behind."
Because both behaviors exist for data protection but don't work well together, I would like to say that our recommendation is to wrap the S3 stream and undo the workaround if you aren't using GZIPInputStream at all. But undoing the workaround would probably mean converting 1 back to 0 in some cases where the value is legitimately 1.
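To make the caveat concrete, here is a minimal sketch of such a wrapper. UndoAvailableFloor is a hypothetical class name, not part of either SDK, and the sketch assumes the only thing to reverse is the "0 becomes 1" mapping.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that maps an available() result of 1 back to 0.
// It cannot tell a substituted 1 from a genuine 1.
class UndoAvailableFloor extends FilterInputStream {
    UndoAvailableFloor(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 1 ? 0 : estimate;
    }
}

class UndoAvailableFloorDemo {
    public static void main(String[] args) throws IOException {
        // Exactly one byte really is left in this stream.
        InputStream oneByteLeft = new UndoAvailableFloor(
            new ByteArrayInputStream(new byte[] {42}));

        System.out.println(oneByteLeft.available()); // 0, although a byte remains
        System.out.println(oneByteLeft.read());      // 42
    }
}
```

The wrapper reports 0 while a byte is still readable, so an available()-based length check would silently miss one trailing byte in that case.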
@somanshreddy I think we've agreed internally that we should switch to using read() instead of available() and check for -1, which should get rid of the exception you're seeing.
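As an illustration of that idea only (this is not the SDK's actual change), a read()-based check might look like the sketch below. ReadBasedLengthCheck, endsExactlyAt, and neverZeroAvailable are hypothetical names introduced for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadBasedLengthCheck {
    // Consumes exactly `length` bytes, then probes with a single read().
    // Returns true if the stream ended exactly at `length`,
    // false if at least one more byte was present.
    static boolean endsExactlyAt(InputStream data, long length) throws IOException {
        long remaining = length;
        byte[] buffer = new byte[8192];
        while (remaining > 0) {
            int n = data.read(buffer, 0, (int) Math.min(buffer.length, remaining));
            if (n == -1) {
                throw new IOException("Stream ended " + remaining + " bytes early");
            }
            remaining -= n;
        }
        return data.read() == -1; // -1 means end of stream, regardless of available()
    }

    // Hypothetical stand-in for a stream whose available() never returns 0.
    static InputStream neverZeroAvailable(InputStream in) {
        return new FilterInputStream(in) {
            @Override
            public int available() throws IOException {
                int estimate = super.available();
                return estimate == 0 ? 1 : estimate;
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream exact = neverZeroAvailable(new ByteArrayInputStream(new byte[10]));
        System.out.println(endsExactlyAt(exact, 10)); // true

        InputStream tooLong = neverZeroAvailable(new ByteArrayInputStream(new byte[11]));
        System.out.println(endsExactlyAt(tooLong, 10)); // false
    }
}
```

Because read() returns -1 only at end of stream, the check no longer depends on what available() reports, so it still detects an undersized length while tolerating streams that never report 0.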
That sounds good. Thank you @rickle-msft