Azure-sdk-for-java: [BUG] Getting an invalid MD5 Exception while uploading a Blob even when MD5 is valid

Created on 25 Aug 2020 · 12 Comments · Source: Azure/azure-sdk-for-java

Describe the bug
Unable to set MD5 on an object during upload even though the MD5 is valid. The MD5 sent is Base64 encoded yet the error message says that the value isn't. The same value works in the v4.0.0 SDK but fails in the v12.7.0 SDK.

Is it that we no longer have to send the Base64 encoded version and should send the 16 byte array MD5 directly?

Exception or Stack Trace

Caused by: com.azure.storage.blob.models.BlobStorageException: Status code 400, "<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidMd5</Code><Message>The MD5 value specified in the request is invalid. MD5 value must be 128 bits and base64 encoded.

RequestId:8c9c9d76-c01e-006f-0638-7a98a2000000

Time:2020-08-24T17:02:06.1907490Z</Message></Error>"

To Reproduce
Upload a blob with an MD5 value using BlobHttpHeaders.

Code Snippet

String md5 = "3b2e740dae9b434b6ec9922a54a6919f";
byte[] md5Bytes = Hex.decodeHex(md5.toCharArray());
byte[] md5Base64Bytes = Base64.encodeBase64(md5Bytes);

blob.uploadWithResponse(new BufferedInputStream(inputStream), size, /*parallelTransferOptions*/ null, new BlobHttpHeaders().setContentMd5(md5Base64Bytes),
                    metadata, /*tier*/ null, new BlobRequestConditions(), TIMEOUT, Context.NONE);

Expected behavior
The blob should be uploaded successfully.

Screenshots
N/A

Setup

  • OS: Ubuntu 18.04
  • IDE : IntelliJ 19.1.4
  • SDK: azure-storage-blob v12.7.0

Additional context
This same Base64 MD5 works in the v4.0.0 SDK when we use the below lines

String md5base64 = new String(md5Base64Bytes);
properties.setContentMD5(md5base64);

I tried the above request WITHOUT the base64 encoding. And the request was successful.

byte[] md5Bytes = Hex.decodeHex(md5.toCharArray());
// byte[] md5Base64Bytes = Base64.encodeBase64(md5Bytes);  // This line was commented out

blob.uploadWithResponse(new BufferedInputStream(inputStream), size, /*parallelTransferOptions*/ null, new BlobHttpHeaders().setContentMd5(md5Bytes),
                    metadata, /*tier*/ null, new BlobRequestConditions(), TIMEOUT, Context.NONE);

Information Checklist
Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report

  • [x] Bug Description Added
  • [x] Repro Steps Added
  • [x] Setup information Added
Client Storage customer-reported question

All 12 comments

Thanks for filing this issue @somanshreddy. Someone from the storage team will follow up shortly.

/cc @gapra-msft @rickle-msft

Hi @somanshreddy, Thank you for posting this issue. We will work on investigating this issue and get back to you.

Hi @somanshreddy, I was able to reproduce your issue, and it looks like the problem is that md5Base64Bytes contains more than 128 bits. I'm not sure what v4 does with your MD5 byte array (it might send only the first 128 bits, or something like that).

Could you capture a Fiddler trace or debug your program to see what is sent to the service in v4 versus what is sent in v12 when you don't Base64-encode the bytes?

So it was sending a 16 byte array before the Base64 encoding. And after Base64 encoding it is a 24 byte array.

This is the MD5 = 3b2e740dae9b434b6ec9922a54a6919f

Hex decoded array which is a 16 byte array :
byte[] md5Bytes = {59, 46, 116, 13, -82, -101, 67, 75, 110, -55, -110, 42, 84, -90, -111, -97}

Array after Base64 Encoding is 24 bytes :
byte[] md5Base64Bytes = {79, 121, 53, 48, 68, 97, 54, 98, 81, 48, 116, 117, 121, 90, 73, 113, 86, 75, 97, 82, 110, 119, 61, 61}

In V4, the above 24 byte array was converted to a String -> "Oy50Da6bQ0tuyZIqVKaRnw==" and then we set it on the blob.

Now the parameters of the blob http header setter methods take a byte[] array. So I skipped the last string conversion and directly tried to set the 24 byte base 64 encoded array but that failed. Since they mentioned 128 bits, I decided to send the 16 byte array ( 16 x 8 == 128 ) and that worked.
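For code that still holds the Base64 string it used with v4, one option is to decode it back to the raw 16 byte digest before handing it to a byte[] setter. The following is only a sketch using java.util.Base64 from the JDK; the class name V4Md5Migration and the method rawDigestFromV4String are hypothetical and not part of either SDK.

```java
import java.util.Base64;

class V4Md5Migration {
    // Hypothetical helper: turn the Base64 string used with v4 back into
    // the raw digest bytes, and reject anything that is not 128 bits.
    static byte[] rawDigestFromV4String(String md5Base64) {
        byte[] raw = Base64.getDecoder().decode(md5Base64);
        if (raw.length != 16) {
            throw new IllegalArgumentException(
                "expected a 16-byte MD5 digest, got " + raw.length + " bytes");
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[] raw = rawDigestFromV4String("Oy50Da6bQ0tuyZIqVKaRnw==");
        System.out.println(raw.length); // prints 16
    }
}
```

The length check fails fast on the client side instead of waiting for the service to answer with InvalidMd5.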

Hi @somanshreddy, Took a more detailed look into the code and it looks like we internally take your byte array and base 64 encode it into a string before we send it to the service, so just passing in the md5 bytes is what the API expects.
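To illustrate why the 24 byte array is rejected, here is a rough JDK-only sketch. It assumes, as described above, that the client Base64-encodes whatever byte array it receives; wireValue is a hypothetical stand-in for that internal step, not the SDK's actual code, and Md5HeaderSketch and hexToBytes are likewise made up for this example.

```java
import java.util.Base64;

class Md5HeaderSketch {
    // Decode a hex string (such as md5sum output) into raw bytes.
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Hypothetical stand-in for the encoding the client is said to do internally.
    static String wireValue(byte[] contentMd5) {
        return Base64.getEncoder().encodeToString(contentMd5);
    }

    public static void main(String[] args) {
        byte[] raw = hexToBytes("3b2e740dae9b434b6ec9922a54a6919f");
        // Raw digest: 16 bytes in, a 24-character header value out.
        System.out.println(raw.length + " bytes -> " + wireValue(raw));

        // Pre-encoded digest: 24 bytes in, so the value gets encoded twice
        // and no longer decodes to 128 bits on the service side.
        byte[] preEncoded = Base64.getEncoder().encode(raw);
        System.out.println(preEncoded.length + " bytes -> " + wireValue(preEncoded));
    }
}
```

With the raw digest the header value comes out as Oy50Da6bQ0tuyZIqVKaRnw==, the same string v4 was given directly. With the pre-encoded array the value is 32 characters long and decodes to 24 bytes (192 bits).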

I think this sufficiently resolves your issue. If you don't have any more concerns, feel free to close the issue.

Thanks

Yes, this is as suspected. So, this means that the error message is outdated, right?

I think the error message is correct since it mentions that "MD5 value must be 128 bits and base64 encoded."

Since the md5 passed in in the error case was not 128 bits, it did not pass the check.

But the MD5 wasn't base 64 encoded right?

So MD5 is always 128 bits. If we perform Base64 encoding, it would break the input into 6 bit groups, and each group would be 1 byte after Base64 encoding. 21 such bytes would represent only 21 x 6 == 126 bits, so we need a minimum of 22 bytes to store the MD5. 2 bytes are added as padding so we have a total of 24 bytes.
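The arithmetic above can be checked with a small sketch. The class and method names are hypothetical; the only library call is java.util.Base64 from the JDK.

```java
import java.util.Base64;

class Base64LengthSketch {
    // One output character per started 6-bit group, before padding.
    static int unpaddedLength(int inputBytes) {
        return (inputBytes * 8 + 5) / 6;
    }

    // Output is emitted in 4-character blocks, one per started 3 input bytes.
    static int paddedLength(int inputBytes) {
        return 4 * ((inputBytes + 2) / 3);
    }

    public static void main(String[] args) {
        int md5Bytes = 16; // 128 bits
        System.out.println(unpaddedLength(md5Bytes)); // prints 22
        System.out.println(paddedLength(md5Bytes));   // prints 24
        // Cross-check against the JDK encoder on a 16-byte input.
        System.out.println(Base64.getEncoder().encode(new byte[md5Bytes]).length); // prints 24
    }
}
```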

So I think the error message should be "MD5 value must be 128 bits". Because the internal SDK performs the Base 64 encoding.

Hi @somanshreddy That error you are seeing is actually returned by the service, so it is correct to be more specific, in case people are developing against the REST endpoint.

Okay. Can the SDK documentation at least be updated with this requirement? Otherwise those who are upgrading from a previous version would have issues

Yes, absolutely. I will create an issue to update the documentation to address this. Thank you for the feedback!

Since your issue seems to have been resolved, I will go ahead and close this issue. Please feel free to reopen/create issues if you need further support.
