Azure-sdk-for-java: BlobClient.uploadWithResponse is always failing with error message `Request body emitted n+1 bytes, more than the expected n bytes`

Created on 18 Nov 2020 · 8 Comments · Source: Azure/azure-sdk-for-java

Query/Question

BlobClient blobClient = container.getBlobClient(objectKey);
blobClient.uploadWithResponse(new BufferedInputStream(inputStream), contentLength, /*parallelTransferOptions*/ null, httpHeaders, userMetaData, AccessTier.HOT, /*blobRequestConditions*/ null, /*timeout*/ null, Context.NONE);

My InputStream is an S3ObjectInputStream, which doesn't support mark/reset, so as suggested I wrapped it in a BufferedInputStream. The request always fails with com.azure.core.exception.UnexpectedLengthException, and the error message is always `Request body emitted n+1 bytes, more than the expected n bytes`.

Why is this not a Bug or a feature Request?
I don't know if it is a bug, as I am not familiar with how the stream is being read.

Setup (please complete the following information if applicable):

OS: linux
IDE : IntelliJ
Version of the Library used: azure-storage-blob v12.9.0-beta.1

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report

[x] Query Added
[x] Setup information Added

Client Storage customer-reported

Source

gitmohit

👍1

Most helpful comment

Hi, @gitmohit. Thank you for posting this question. In our latest release, we added the ability to pass a non-markable stream to the blobClient.upload method. Can you try using this latest version and removing the BufferedInputStream wrapper before we investigate further?

rickle-msft on 18 Nov 2020

👍2

All 8 comments

rickle-msft on 18 Nov 2020

👍2

@rickle-msft The InputStream passed ( S3ObjectInputStream ) has the below overridden version of available()

    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }

Since this never returns zero, the check below in the Azure SDK always throws, even when the stream is exhausted. It is taken from Utility's convertStreamToByteBuffer:

    if (data.available() > 0) {
        long totalLength = currentTotalLength[0] + (long) data.available();  // available used here
        throw LOGGER.logExceptionAsError(new UnexpectedLengthException(
            String.format("Request body emitted %d bytes, more than the expected %d bytes.",
                totalLength, length),
            totalLength, length));
    }
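
To make the interaction concrete, here is a minimal, self-contained sketch. It does not use the Azure or AWS SDKs: `NeverZeroAvailableStream` and `AvailableCheckDemo` are hypothetical stand-ins, and `reportedLength` only imitates the shape of the check quoted above, assuming the SDK has already read the expected number of bytes before calling `available()`.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a stream whose available() never reports 0,
// mirroring the override quoted above.
class NeverZeroAvailableStream extends FilterInputStream {
    NeverZeroAvailableStream(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }
}

class AvailableCheckDemo {
    // Simplified imitation of the check (an assumption, not the SDK code):
    // read `length` bytes, then add whatever available() still claims.
    static long reportedLength(InputStream data, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            int n = data.read(buffer, total, length - total);
            if (n == -1) {
                break; // stream ended before `length` bytes were read
            }
            total += n;
        }
        return total + (long) data.available();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[16];
        long plain = reportedLength(new ByteArrayInputStream(payload), payload.length);
        long wrapped = reportedLength(
            new NeverZeroAvailableStream(new ByteArrayInputStream(payload)), payload.length);
        System.out.println("plain stream reports " + plain + " bytes");
        System.out.println("never-zero stream reports " + wrapped + " bytes");
    }
}
```

With a 16-byte payload, the plain stream reports 16 bytes and the never-zero stream reports 17, which matches the "n+1 bytes, more than the expected n bytes" pattern in the error message.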

somanshreddy on 28 Dec 2020

@rickle-msft Did you get a chance to take a look at the above snippet?

somanshreddy on 6 Jan 2021

@somanshreddy Sorry, I was away for a couple of weeks. That seems rather odd to me. I'm not very familiar with S3 or this type. Do you know why they return 1 when the estimate is 0?

rickle-msft on 11 Jan 2021

@rickle-msft No issues, thanks for the reply. This is the explanation for their weird fallback to 1 if the estimate is 0.

    /**
     * Returns the value of super.available() if the result is nonzero, or 1
     * otherwise.
     * <p>
     * This is necessary to work around a known bug in
     * GZIPInputStream.available(), which returns zero in some edge cases,
     * causing file truncation.
     * <p>
     * Ref: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7036144
     */
    @Override
    public int available() throws IOException {
        int estimate = super.available();
        return estimate == 0 ? 1 : estimate;
    }

somanshreddy on 11 Jan 2021

Interesting. That workaround seems fundamentally incompatible with the check in the Azure SDK. We have that check to guard against data corruption/loss in case someone accidentally specifies a stream size that's smaller than the amount of data in the stream. In that case we tell the customer "there's more data here than you told us about and we don't want to leave anything behind."

Because both behaviors exist to protect data but don't work together, our recommendation would have to be that you wrap the S3 stream and undo the workaround, provided you aren't using GZIPInputStream at all. But undoing the workaround would probably mean converting 1 back to 0 in some cases where the value is legitimately 1.
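
A minimal sketch of such a wrapper might look like the following. `UndoNeverZeroStream` is a hypothetical name, not a class from either SDK, and the sketch assumes the wrapped stream never reports 0 from `available()`.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that reverses the "return 1 instead of 0" workaround.
class UndoNeverZeroStream extends FilterInputStream {
    UndoNeverZeroStream(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        int estimate = super.available();
        // Caveat from the discussion above: a legitimate estimate of 1 is
        // also reported as 0, so this is lossy by construction.
        return estimate == 1 ? 0 : estimate;
    }
}
```

The wrapper would go directly around the S3 stream, inside any BufferedInputStream. The trade-off is that it weakens the SDK's excess-data check whenever exactly one unread byte is legitimately reported.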

rickle-msft on 11 Jan 2021

@somanshreddy I think we've agreed internally that we should switch to using read() instead of available() and check for -1, which should get rid of the exception you're seeing.
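
As a rough illustration of that idea (not the actual SDK change), the sketch below probes for leftover data with `read()` instead of `available()`. `ReadBasedCheck` and `hasExcessData` are hypothetical names, and the reading loop is an assumption about how the expected bytes get consumed first.

```java
import java.io.IOException;
import java.io.InputStream;

class ReadBasedCheck {
    // Consumes up to `length` bytes, then probes with read().
    // Returns true only if the stream still yields a byte afterwards.
    static boolean hasExcessData(InputStream data, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            int n = data.read(buffer, total, length - total);
            if (n == -1) {
                return false; // stream ended early, so there is no excess
            }
            total += n;
        }
        // read() returns -1 at end of stream regardless of what
        // available() claims, so a never-zero override has no effect here.
        return data.read() != -1;
    }
}
```

Note that the probe consumes one byte when excess data exists. That is acceptable in this sketch because that case would end in an exception anyway.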

rickle-msft on 12 Jan 2021

👍1

That sounds good. Thank you @rickle-msft

somanshreddy on 14 Jan 2021
