When doing some testing where ZSTD is involved, I noticed that adding just a few hundred bytes to the end of a large input would significantly impact compression time.
Basically what I do is:
Test code:
FB_ROUNDTRIP_ZSTD(1 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(1 * 1024 * 1024 + 23);
FB_ROUNDTRIP_ZSTD(2 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(2 * 1024 * 1024 + 64);
FB_ROUNDTRIP_ZSTD(33 * 1024 * 1024);
FB_ROUNDTRIP_ZSTD(33 * 1024 * 1024 + 128);
Output:
Creating contiguous buffer (1048576)...done 0.002302s
Compressing...done 0.015497s
Decompressing...done 0.003197s
Creating contiguous buffer (1048599)...done 0.000638s
Compressing...done 0.191606s
Decompressing...done 0.003058s
Creating contiguous buffer (2097152)...done 0.000771s
Compressing...done 0.020405s
Decompressing...done 0.005225s
Creating contiguous buffer (2097216)...done 0.000727s
Compressing...done 0.818459s
Decompressing...done 0.005411s
Creating contiguous buffer (34603008)...done 0.012068s
Compressing...done 0.137107s
Decompressing...done 0.086206s
Creating contiguous buffer (34603136)...done 0.012254s
Compressing...^CPress any key to continue . . .
I have to kill it once it starts compressing the _34603136_ buffer because it takes minutes.
In fact, none of the compression levels other than 21 show any significant difference in compression time. This testing is obviously ad hoc and anecdotal, but the results are quite consistent.
This might just be a weird pathological case caused by how I construct the buffer: it is the same 1k chunk of data repeated, with the last chunk truncated. Still, I thought it was worth mentioning here in case this is an edge case that could be patched.
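For anyone trying to reproduce this, the buffer construction described above can be sketched roughly as follows. This is a minimal sketch, not the original test harness: the names `make_chunk` and `make_repeating_buffer` and the chunk contents are hypothetical, since the actual 1k chunk was not posted.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 1024

/* Fill one 1 KiB chunk with arbitrary bytes.
 * Assumption: the real chunk contents are unknown; this pattern is a stand-in. */
static void make_chunk(unsigned char chunk[CHUNK_SIZE])
{
    for (size_t i = 0; i < CHUNK_SIZE; i++)
        chunk[i] = (unsigned char)((i * 31u + 7u) & 0xFFu);
}

/* Allocate `size` bytes consisting of the same chunk repeated back to back.
 * When `size` is not a multiple of CHUNK_SIZE, the final copy is truncated.
 * Returns NULL on allocation failure; the caller frees the result. */
static unsigned char *make_repeating_buffer(size_t size)
{
    unsigned char chunk[CHUNK_SIZE];
    unsigned char *buf = (unsigned char *)malloc(size ? size : 1);
    if (buf == NULL)
        return NULL;

    make_chunk(chunk);
    for (size_t off = 0; off < size; off += CHUNK_SIZE) {
        size_t remaining = size - off;
        size_t n = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
        memcpy(buf + off, chunk, n);
    }
    return buf;
}
```

Calling `make_repeating_buffer(2 * 1024 * 1024 + 64)` would give a buffer shaped like the fourth test case; it could then be handed to `ZSTD_compress` at level 21 and timed to see whether the slowdown appears.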
Hi @Jake-Shadle
Thanks for the report.
Yes, you are right, this is something we should look into.
It's quite possible this is an edge case we need to handle.
We might need a sample to investigate the issue properly. Can you share one?
Sure, I can put something together later tonight, will just need to disentangle what I have into a standalone sample. Thanks for the quick response!
I couldn't reproduce the problem. What version of zstd did you use? Please provide your file.
The latest update in the "dev" branch includes a fix by @inikep that should resolve this issue.
fixed in v0.8.1