Azure-storage-azcopy: Precheck Missing MD5 for FailIfDifferentOrMissing

Created on 31 Dec 2019 · 2 Comments · Source: Azure/azure-storage-azcopy

Currently, the remote MD5 value is checked for both the _different_ and _missing_ cases only after the file download has finished.

When the remote MD5 is missing, the download can run for hours and only then fail.

Request:
Would it be possible to pre-check that the remote MD5 value exists and fail fast, before the download starts? This would save time.

MD5 check for --check-md5=FailIfDifferentOrMissing:
https://github.com/Azure/azure-storage-azcopy/blob/d4e856e572f7f63521329703d18d9f96bac21613/ste/md5Comparer.go#L64-L65

Repro:
azcopy.exe cp "https://lilablobssc.blob.core.windows.net/snapshotserengeti-v-2-0/SnapshotSerengeti_S10_v2_0_part2.zip" "F:\datasets\SnapshotSerengeti\raw\" --check-md5=FailIfDifferentOrMissing --overwrite=false

The source file is 166 GB and publicly accessible.

Error: _(only in log; should be raised to user level)_
000 : no MD5 was stored in the Blob/File service against this file. So the downloaded data cannot be MD5-validated. This application is currently configured to treat missing MD5 hashes as errors. When Checking MD5 hash. X-Ms-Request-Id:

All 2 comments

Whoof, yeah, hours is pretty rough for this kind of a thing. I think we could perform that early during transfer init. I'll pop open a PR for that here in a moment.

Closing since this work has now been done.
