Zstd: Significantly slower compression (@21) with non-aligned inputs

Created on 1 Aug 2016 · 5 Comments · Source: facebook/zstd

While testing code that uses ZSTD, I noticed that adding just a few extra bytes to the end of a large input significantly increases compression time.

Basically what I do is:

  • Copy a 1 KB static buffer repeatedly into a single contiguous buffer until it reaches the size I want
  • Compress that contiguous buffer with ZSTD_compress(...) at compression level 21
  • Decompress the compressed buffer into a new buffer
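
The buffer construction in the first step can be sketched roughly as below. This is only an illustration, not the code from the original test: `make_tiled_buffer` is a hypothetical helper name, and the truncation of the last copy is assumed from the description in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original test code): allocate `total`
 * bytes and fill them with repeated copies of `chunk`. If `total` is not a
 * multiple of `chunk_size`, the final copy is truncated.
 * Returns NULL on allocation failure; the caller frees the result. */
static unsigned char *make_tiled_buffer(const unsigned char *chunk,
                                        size_t chunk_size, size_t total)
{
    assert(chunk != NULL && chunk_size > 0);

    /* Allocate at least one byte so a zero-length request still yields a
     * valid pointer. */
    unsigned char *buf = malloc(total ? total : 1);
    if (buf == NULL)
        return NULL;

    for (size_t off = 0; off < total; off += chunk_size) {
        size_t remaining = total - off;
        size_t n = remaining < chunk_size ? remaining : chunk_size;
        memcpy(buf + off, chunk, n);
    }
    return buf;
}
```

With a 1024-byte chunk and a total of 1 * 1024 * 1024 + 23 bytes, the last copy holds only the first 23 bytes of the chunk. The resulting buffer is what would then be passed to ZSTD_compress(...) at level 21.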

Test code:

FB_ROUNDTRIP_ZSTD(1 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(1 * 1024 * 1024 + 23);
FB_ROUNDTRIP_ZSTD(2 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(2 * 1024 * 1024 + 64);
FB_ROUNDTRIP_ZSTD(33 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(33 * 1024 * 1024 + 128);

Output:

Creating contiguous buffer (1048576)...done 0.002302s
Compressing...done 0.015497s
Decompressing...done 0.003197s
Creating contiguous buffer (1048599)...done 0.000638s
Compressing...done 0.191606s
Decompressing...done 0.003058s
Creating contiguous buffer (2097152)...done 0.000771s
Compressing...done 0.020405s
Decompressing...done 0.005225s
Creating contiguous buffer (2097216)...done 0.000727s
Compressing...done 0.818459s
Decompressing...done 0.005411s
Creating contiguous buffer (34603008)...done 0.012068s
Compressing...done 0.137107s
Decompressing...done 0.086206s
Creating contiguous buffer (34603136)...done 0.012254s
Compressing...^CPress any key to continue . . .

I have to kill it once it starts compressing the 34603136-byte buffer, because it takes minutes.

In fact, compression levels other than 21 don't seem to show any significant difference in compression time between the two kinds of input. This testing is obviously quite ad hoc and anecdotal, but the results are quite consistent.

This might just be a weird pathological case caused by how I construct the buffer: it is built from repeated copies of the same 1 KB chunk of data, with the last copy truncated. Still, I thought it was worth mentioning here in case this is an edge case that could be patched.

All 5 comments

Hi @Jake-Shadle

Thanks for the report.
Yes, you are right, this is something we should look into.
It's quite possible this is an edge case we need to take care of.
We might need a sample to better investigate the issue. Is one available?

Sure, I can put something together later tonight; I just need to disentangle what I have into a standalone sample. Thanks for the quick response!

I couldn't reproduce the problem. What version of zstd did you use? Please provide your file.

The latest update in the "dev" branch includes a change by @inikep which should fix this issue.

Fixed in v0.8.1.
