Quick test :
Archive created with ZSTDMT 1.1.4 (x64) cannot be unpacked with ZSTDMT/ZSTD 1.1.4 (x64).
It says :
Error 36 : Decoding error : Destination buffer is too small
ZSTD x86 was not tested.
At the moment I have no more time for further testing.
I guess we will need a few more details.
Was the bug observed while using the API or the cli ?
If using the API, could you list the precise methods used during compression and decompression ?
Quick guess :
Could this issue be related to #634 ? #634 was recently fixed in the dev branch.
Note that the project includes a fuzzer test suite which checks decodability of ZSTDMT : https://github.com/facebook/zstd/blob/dev/tests/zstreamtest.c#L792 .
Since v1.1.4 was released, this fuzzer test has been running 24/7 on a server, with address sanitizer enabled. It has not yet found a problem. You may have found a specific corner case which is not triggered by the test.
For example, the fuzzer test decodes data with the streaming API (the same as the cli), which does not need the content size in header.
To complete the test suite, the one-pass function ZSTD_decompress() is now also tested with zstdmt :
https://github.com/facebook/zstd/blob/dev/tests/fuzzer.c#L193
because this one requires the content size in the header, and can trap the #634 issue.
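For illustration, here is a minimal sketch (not the actual test code) contrasting the two decoding paths; buffers and sizes are assumed to be supplied by the caller :

```c
#include <zstd.h>

/* Streaming decoding : does not need the content size from the frame
   header, since output is produced chunk by chunk.
   (Sketch assumes a single frame and elides full error handling.) */
size_t decode_streaming(void* dstChunk, size_t chunkCapacity,
                        const void* src, size_t srcSize)
{
    ZSTD_DStream* const ds = ZSTD_createDStream();
    ZSTD_inBuffer in = { src, srcSize, 0 };
    size_t result = 0;
    ZSTD_initDStream(ds);
    while (in.pos < in.size) {
        ZSTD_outBuffer out = { dstChunk, chunkCapacity, 0 };
        result = ZSTD_decompressStream(ds, &out, &in);
        if (ZSTD_isError(result)) break;
        /* consume out.pos bytes from dstChunk here */
    }
    ZSTD_freeDStream(ds);
    return result;
}

/* One-pass decoding : the destination buffer is typically sized from the
   content size announced in the frame header, so a bogus header makes it
   fail with "Destination buffer is too small". */
size_t decode_onepass(void* dst, size_t dstCapacity,
                      const void* src, size_t srcSize)
{
    return ZSTD_decompress(dst, dstCapacity, src, srcSize);
}
```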
I don't know what the API or CLI is.
I'm just an end user.
The "bug" which i mentioned above is the following.
Downloaded the 1.1.4 x64 binary and created an archive with it:
Zstdmt --ultra -22 -T2 mydata.dat -o mydata.zstd
Mydata.dat is a ~510 MB file.
The result is corrupted,etc ,as i wrote above.
Thata all what i can say. :/
Maybe a more experienced user can help you.
This is the CLI (Command Line Interface).
To analyze the problem further, I need to reproduce it.
For example, is it possible to access the data that cause this issue ?
My mistake. Very sorry !
The -T2 option caused the bug. :/
Otherwise it works fine.
> The -T2 option caused the bug.
-T2 is supposed to trigger "2 threads" during compression.
It's ineffective during decompression.
At the -22 --ultra setting, the second thread is triggered only if the source size is > 512 MB (using default settings; there are ways to change that, but let's keep it simple for now).
If you mean that using -T2 setting produces a broken archive, it's really a bug, that's worthwhile to investigate.
I've been testing it locally with a few very large files, and none of them is broken so far. But your sample may trigger a rare corner case that is not observed here.
A simple way for us to investigate is to get access to mydata.dat, so that we can reproduce the issue.
If that's not possible, we'll have to find another way to reproduce the problem,
such as generating synthetic data which would fail the same way.
ZSTDMT V1.1.4
MD5 : 699220469abf7a5eb58be0641c8d5c2c
SHA1 : d9576260d2ca5b28e32058bdf31a91b38f72b30c
ZSTD V1.1.4
MD5 : 2a77adb3d8123b763a0f47d04a1555d9
SHA1 : 6fc33026a7979637005f8223d323de6a50624a2d
zstdmt.exe --ultra -22 -T2 mydata.dat -o td.zstd
C:\x>zstdmt --test td.zstd
Error 36 : Decoding error : Destination buffer is too small
C:\x>zstd --test td.zstd
Error 36 : Decoding error : Destination buffer is too small
And the data file : (1,073,744,011 bytes)
https://www.dropbox.com/s/02g6rwrk1f03r5q/testdata.dat?dl=0
I've reproduced the error.
Excellent ! Let's investigate and fix it.
zstdmt is writing a bogus header. It has frame header descriptor 0x24, which means single segment with a checksum, and a 1-byte frame content size field. Then the frame content size is 0x00.
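For reference, a tiny sketch that decodes this descriptor byte, following the zstd frame format specification :

```c
#include <stdio.h>

/* Decode the bogus frame header descriptor 0x24 = 0b00100100,
   per the zstd frame format specification. */
int main(void)
{
    unsigned const fhd = 0x24;
    unsigned const fcsFlag       = fhd >> 6;        /* 0 => FCS field is 1 byte, since Single_Segment is set */
    unsigned const singleSegment = (fhd >> 5) & 1;  /* 1 => single segment, FCS field is present */
    unsigned const checksumFlag  = (fhd >> 2) & 1;  /* 1 => 4-byte content checksum at frame end */
    unsigned const dictIDFlag    = fhd & 3;         /* 0 => no dictionary ID */
    printf("FCS flag=%u, single segment=%u, checksum=%u, dictID=%u\n",
           fcsFlag, singleSegment, checksumFlag, dictIDFlag);
    /* The next header byte is the 1-byte frame content size, and it's 0x00 :
       the frame claims an empty content, which is the bug. */
    return 0;
}
```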
I'm not successful at reproducing the issue with testdata.dat.
Though, I tested it on a Mac with the latest zstdmt version (dev branch).
Could it be an issue specific to the Windows binary ?
_Edit_ : I could reproduce it with v1.1.4, which means it was incidentally fixed later in dev branch.
_Edit_ : I also confirm that header seems wrong, with header descriptor 0x24.
It doesn't occur on the dev branch, only the tag v1.1.4. I'm bisecting to find the commit that introduced it.
It was fixed by commit ca5a8bb.
The error occurs in builds prior to that commit whenever zstdmt knows the content size; a workaround is to use cat file | zstdmt -T2 -o file.zst.
Indeed, this commit is the only one to impact streaming zstdmt after v1.1.4.
So this is a replay of bug #634, which I did not expect to impact the cli too.
I initially thought that it was strange, because this code is supposed to be applied starting from the second job. But in fact, when there is no dictionary, this code is called even for the first job. And that's where the header is written.
ZSTD_compressBegin_advanced(..., 0) used to mean "source size unknown". But, due to other changes in the code, it now means "source size == 0". Hence it breaks.
The change to 0 (to mean "source size unknown") was made in anticipation of a later scenario involving srcSize control. But srcSize control is currently applied at CStream level only, which means it does not apply yet. So it's not strictly necessary yet.
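A minimal sketch of the semantic shift (illustrative; the advanced API requires ZSTD_STATIC_LINKING_ONLY) :

```c
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

/* Old meaning : pledgedSrcSize==0  =>  "source size unknown".
   New meaning : pledgedSrcSize==0  =>  "source size is exactly 0". */
void begin_first_job(ZSTD_CCtx* cctx, ZSTD_parameters params)
{
    /* Previously harmless when the size was unknown; now this writes a
       frame header announcing a 0-byte content (the bogus 0x24 / 0x00
       header observed above), even though the real source is not empty. */
    size_t const r = ZSTD_compressBegin_advanced(cctx, NULL, 0, params, 0);
    (void)r;  /* error handling elided in this sketch */
}
```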
Maybe I should add a warning about this problem and work-around in the v1.1.4 release notice.
_Edit_ : warning notification posted on v1.1.4 release.
I was surprised that such a bug would evade our multithreading fuzzer test suite :
https://github.com/facebook/zstd/blob/dev/tests/zstreamtest.c#L820
It appears these tests relied on the old assumption that pledgedSrcSize==0 means "unknown" while pledgedSrcSize>0 means "here is the size to compress".
But we recently changed the API behavior to be able to introduce srcSize==0, so it's no longer the case : now contentSizeFlag is the value to set in order to tell if pledgedSrcSize must be interpreted or not.
But the test was not setting this flag (since it relied on the previous behavior).
Hence, the fuzzer was only testing the path "streaming with unknown size", which works fine.
On the other hand, the cli tries to pass the size whenever it can, so it triggers the bug.
I changed the test to be able to trap such bug in the future :
https://github.com/facebook/zstd/pull/648
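In essence, the corrected test now also exercises the "known source size" path, roughly like this (a simplified sketch, not the exact PR code) :

```c
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

/* Under the new semantics, pledgedSrcSize is only interpreted when
   fParams.contentSizeFlag is set. */
void init_with_known_size(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize)
{
    ZSTD_parameters params = ZSTD_getParams(1, pledgedSrcSize, 0);
    params.fParams.contentSizeFlag = 1;  /* trust pledgedSrcSize, write it into the header */
    size_t const r = ZSTD_initCStream_advanced(zcs, NULL, 0, params, pledgedSrcSize);
    (void)r;  /* error handling elided in this sketch */
}
```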
We should also start running automated cli tests for zstdmt.
I've uploaded fixed binaries of zstdmt for Windows, attached to the release :
https://github.com/facebook/zstd/releases/tag/v1.1.4
Named zstd-v1.1.4-win64-fix.zip.
@sebbellob could you please test if it works better for you ?
ASAP. Within 12 hrs.
I did a quick test :
Every possible combination works fine with the x64 fix. (testdata.dat)
Notice : the help panel displays v1.1.5
OTOH : I don't know if it's relevant or not, but it looks like v1.1.3 fails for me on a large file (44 GB, --ultra -22, single-thread version) : Decoded : 38026 MB... Error 39 : Read error : premature end
Yesterday I started some batch compressions; it was the first archive, and the batch finished successfully. Maybe there was a crash, I don't know, because the system powered down successfully during the night.
Today I'll do it again, without powering down.
> Error 39 : Read error : premature end
It should be this one : https://github.com/facebook/zstd/blob/dev/programs/fileio.c#L914
This is generally an I/O error _during decompression_. It means the reader was expecting more data, but was cut off before the end.
Possible causes : the media was stopped before the end, or the compressed data is actually damaged (its end part is missing).
Quick way to check second hypothesis is to do a decompression test :
zstd -t filenameToTest.zst
Maybe you're right, but :
1.1.3 x64 ends the process with some kind of COMPRESSION ERROR at ~97%, while the Win64 fix version finishes successfully ! (Same file)
I'll post the details from home.
OK,
so the decoder failed to decode a file created with v1.1.3, with a "premature end" error,
simply because the archive created with v1.1.3 is incomplete,
because compression failed before reaching the end.
That sounds like a credible scenario.
What we don't know is why v1.1.3 failed at 97%.
In particular, we don't know if this is reliably reproducible, or a one-off fluke.
Well, while compressing with 1.1.3 x64, the error message was :
Read : 38045 / 1181 MB ==> 98.08%Error 26 : Compression error during frame end : Src size incorrect
x64 1.1.4 MT fix : finished successfully; tested, and it's fine.
x64 1.1.3 : creates the error under all circumstances (3rd time!!) (used for many days without any bug; this is the first bad archive)
I don't know how you could reproduce it, due to my 44 GB file... Parameters were --ultra -22.
Is it worth posting errors about 1.1.3 ?
Edit :
First 1.1.4 fix :
@ testing : OK
@ decompression : use --no-sparse
Latest build :
@ testing : OK
@ decompression : OK
Testing compression on the above-mentioned 44 GB file. (--ultra -22)
@EDIT :
1.1.4 MT x64 MD5 : ade5f75dc9d9135fe9ae5eba4bf4aba7
1.1.4 MT x64 SHA1 : f3959a71d07124b39a21bb3de894500044feadde
1.1.4 x64 MD5 : df518584f42e517d4ed79cac4158f49e
1.1.4 x64 SHA1 : eb999c7645fbcc1d3649477f13d53d109425e0ff
Everything works fine !
Thanks @sebbellob , you did some very thorough investigation.
> Read : 38045 / 1181 MB ==> 98.08%Error 26 : Compression error during frame end : Src size incorrect
This is this error :
https://github.com/facebook/zstd/blob/v1.1.3/programs/fileio.c#L402
It happens because the size of the file was incorrectly reported (1181 MB, while it should be closer to 44 GB). Therefore, when closing the frame, the size discrepancy is detected, and compression fails.
If this diagnosis is correct, that means this problem is unrelated to compression level, or multithreading. In fact, the same issue should happen at compression level 1 and with a single thread (it should be fast to test).
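An illustrative sketch of the suspected failure mode, assuming the streaming API with a pledged size :

```c
#define ZSTD_STATIC_LINKING_ONLY
#include <zstd.h>

/* The size pledged at init (wrongly reported, ~1181 MB) doesn't match the
   ~44 GB actually consumed, so closing the frame fails with
   error 26 : "Src size incorrect". */
void sketch_size_mismatch(ZSTD_CStream* zcs, void* dst, size_t dstCapacity,
                          unsigned long long reportedFileSize)
{
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    ZSTD_initCStream_srcSize(zcs, 22, reportedFileSize);
    /* ... feed the whole (much larger) file through ZSTD_compressStream() ... */
    size_t const r = ZSTD_endStream(zcs, &out);
    if (ZSTD_isError(r)) {
        /* ZSTD_getErrorName(r) reports the size discrepancy */
    }
}
```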
File size is provided by this function :
https://github.com/facebook/zstd/blob/v1.1.3/programs/util.h#L186
It was changed in v1.1.4 :
https://github.com/facebook/zstd/blob/dev/programs/util.h#L246
The difference is that a new mode has been added for MinGW (patch by @ds77).
And the pre-compiled Windows binary distributed with the release is compiled with MinGW.
At this stage, it seems the most likely explanation : binaries compiled with MinGW have a buggy file-size reporting function for very large files.
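(Incidentally, ~1181 MiB is what remains of a ~44 GB size after truncation to the low 32 bits, which is consistent with this hypothesis.) A rough sketch of a 64-bit-safe size query, in the spirit of the util.h fix linked above :

```c
#include <sys/stat.h>

/* Rough sketch of a 64-bit-safe file size query. Plain stat() with
   32-bit MinGW runtimes uses a 32-bit st_size, silently truncating
   the size of files larger than 4 GB. */
unsigned long long getFileSize(const char* filename)
{
#if defined(_MSC_VER)
    struct _stat64 statbuf;
    if (_stat64(filename, &statbuf)) return 0;
#elif defined(__MINGW32__) && defined(__MSVCRT__)
    struct _stati64 statbuf;   /* the 64-bit MinGW mode added in v1.1.4 */
    if (_stati64(filename, &statbuf)) return 0;
#else
    struct stat statbuf;
    if (stat(filename, &statbuf)) return 0;
#endif
    return (unsigned long long)statbuf.st_size;
}
```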
It's a pity the issue was not picked up by the automated tests.
We do have a test for large files (> 4 GB) : https://github.com/facebook/zstd/blob/dev/tests/playTests.sh#L477
But it uses "pipe mode" to avoid creating any file on local system for the test.
Consequently, it does not have the size of the source file.
Hence, file size is "unknown", so it cannot be "wrong".
(Pipe mode is, btw, a work-around to have mingw v1.1.3 binaries work with large files).
To solve this test limitation, we would have to create a large file on disk, in order to force the system to report its size, and test if size reporting is buggy.
Problem is, most CI environments enforce strict disk space limits.
It's not clear if a file > 4 GB is allowed to be created on such environments (Travis CI, Appveyor CI and Circle CI).
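One possible way around the space limit (a hedged sketch, not something the test suite currently does) would be a sparse file, which reports a large logical size while using little real disk space :

```c
#define _FILE_OFFSET_BITS 64   /* ensure 64-bit off_t on 32-bit Unix */
#include <stdio.h>

/* Hedged sketch : create a file whose logical size is > 4 GB without
   writing 4 GB of data. On filesystems with hole support (most Unix
   filesystems; NTFS needs the file marked sparse) this stays small on
   disk, yet the size query must still report the full logical size. */
int createLargeFile(const char* filename)
{
    FILE* const f = fopen(filename, "wb");
    if (f == NULL) return 1;
#if defined(_WIN32)
    _fseeki64(f, 5LL << 30, SEEK_SET);   /* seek to a 5 GiB offset */
#else
    fseeko(f, (off_t)5 << 30, SEEK_SET);
#endif
    fputc(0, f);                          /* one real byte at the end */
    fclose(f);
    return 0;
}
```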
In the meantime, I'll remove the v1.1.3 Windows binaries from the release tab, since they are buggy.
Thanks for your help and investigation @sebbellob .
We have added new tests in our CI environment, so hopefully, such cases will be trapped in the future.