Zstd: Compression with negative levels worse on latest version compared to 1.3.4

Created on 8 Feb 2019 · 7 Comments · Source: facebook/zstd

We are trying to update from 1.3.4 to 1.3.8. Most of our compression is done streaming at level -3, which seems to work well for us. With version 1.3.8 compression at -3 is faster (5% - 10%) but the compression rate is worse. In one case the rate went from 0.27 to 0.55. I had a look at individual parameters, but I can only improve the rate by switching strategy from fast to dfast, at level -3. Sadly, now the compression speed is significantly lower than with 1.3.4.

Whatever I try, I don't seem to be able to get the same speed and compression rate with 1.3.8 as we get with 1.3.4 at level -3. What could be the cause of this? We compress large amounts of small images, no more than a few kb each, using ZSTD_compressStream2().

All 7 comments

I can't reproduce the issue with the zstd CLI, which also uses ZSTD_compressStream2() under the hood. Could you please show me how you are using the ZSTD_compressStream2() API? I shouldn't need the data, but if you can extract a code snippet to share, that will make reproducing it much easier.

Negative compression levels turn off Huffman coding of literals, starting at level -1. As the level becomes more negative, we also start sampling positions during compression with an increasing stepSize, but that sampling only works with ZSTD_fast. That explains why compression got significantly slower with ZSTD_dfast, along with the fact that ZSTD_dfast is a slower strategy to begin with.
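
As a rough illustration of why sampling positions trades ratio for speed, here is a toy sketch. It is not zstd's ZSTD_fast code: the function name count_sampled_matches, the 2-byte key and the direct-indexed table are all assumptions made for the example. It only counts how many visited positions see a byte pair that was already seen at an earlier visited position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model, NOT zstd's ZSTD_fast: visit every `step`-th position, look up
 * the 2-byte sequence found there in a table of previously visited positions,
 * and count how many lookups hit an earlier occurrence. */
size_t count_sampled_matches(const unsigned char *src, size_t n, size_t step)
{
    static uint32_t lastPos[1 << 16];  /* position + 1 per 2-byte key, 0 = unseen */
    size_t found = 0;

    assert(step >= 1);
    memset(lastPos, 0, sizeof lastPos);
    for (size_t i = 0; i + 2 <= n; i += step) {
        uint32_t key = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
        if (lastPos[key] != 0) found++;   /* same two bytes were seen before */
        lastPos[key] = (uint32_t)(i + 1); /* remember this visit */
    }
    return found;
}
```

On the 16-byte input "abcdefghabcdefgh", a step of 1 finds 7 repeated pairs, a step of 4 finds 2, and a step of 3 finds none. Fewer positions inspected means less work, but also fewer matches.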

I vaguely remember that there was a bug in v1.3.4 which made it possible to request negative compression levels but only get the "faster sampling" side, without disabling Huffman compression of literals. So the requested negative compression level was not the "real" one, but something in between, compressing more and slower (but still faster than level +1).
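
To get a feel for what skipping Huffman coding of literals can cost, here is a toy estimate. It is not zstd's actual literals coder: huffman_bits is a hypothetical helper that builds a textbook Huffman code over byte frequencies and returns the payload size in bits, ignoring the table header and any code-length limit. Raw literals always cost 8 bits per byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT zstd's literals coder: number of bits a textbook Huffman
 * code needs for the bytes in `src` (table/header cost ignored). */
uint64_t huffman_bits(const unsigned char *src, size_t n)
{
    uint64_t count[256] = {0};
    uint64_t weight[256];
    size_t live = 0;
    uint64_t total = 0;

    assert(src != NULL || n == 0);
    for (size_t i = 0; i < n; i++) count[src[i]]++;
    for (int s = 0; s < 256; s++)
        if (count[s] > 0) weight[live++] = count[s];

    if (live == 1) return n;  /* one distinct symbol: charge 1 bit per byte */

    /* Repeatedly merge the two lightest nodes. The sum of all merged
     * weights equals the total encoded length in bits. */
    while (live > 1) {
        size_t a = 0, b = 1;  /* a = lightest, b = second lightest */
        if (weight[b] < weight[a]) { a = 1; b = 0; }
        for (size_t i = 2; i < live; i++) {
            if (weight[i] < weight[a]) { b = a; a = i; }
            else if (weight[i] < weight[b]) { b = i; }
        }
        uint64_t merged = weight[a] + weight[b];
        total += merged;
        weight[a] = merged;            /* merged node replaces a */
        weight[b] = weight[live - 1];  /* last node fills b's slot */
        live--;
    }
    return total;
}
```

For the 7 bytes "aaaabbc" this gives 10 bits, against 56 bits when the literals are emitted raw. Real data and the real coder will behave differently, but the direction of the effect is the same.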

To achieve this effect, it was necessary to invoke an experimental entry point, such as ZSTD_compress_advanced().

This "bug" was fixed in v1.3.5+, but maybe it actually gave a comfortable speed / compression trade-off for your use case?

A quick workaround could be to try compression levels -2, -1 and +1 of v1.3.8, and see if one of them fits your needs.

Thanks for your replies. All negative levels have a much worse compression rate in 1.3.8 than in 1.3.4. Switching to -2 or -1 makes little difference. Switching to +1 gives us virtually the same compression rates as 1.3.4, but slightly faster, so that's clear progress. The negative levels are also faster than the same levels in 1.3.4, but compression is much worse, on the order of 20%-40%.

We are not doing anything out of the ordinary when using zstd. Initially we didn't change the code at all, just replaced the library. Later I switched to the advanced API, hoping that would make a difference, but it didn't change the results at all.

Is there a way we could try enabling Huffman compression for negative levels again, so we can see if this indeed makes the difference?

We never used ZSTD_compress_advanced() btw.
We create a context like this:

zs = ZSTD_createCStream();
ZSTD_initCStream(zs, compressionLevel);

then compress a stream with this call:

ZSTD_compressStream(zs, &outBuf, &inBuf);

and then either ZSTD_endStream(zs, &outBuf)
or ZSTD_flushStream(zs, &outBuf)

I'll add an advanced parameter to enable Huffman compression. The proposed experimental parameter is approximately:

typedef enum {
  ZSTD_lcm_huffman,       /**< Always use Huffman compression */
  ZSTD_lcm_uncompressed,  /**< Never use Huffman compression (emit raw literals) */
  ZSTD_lcm_auto           /**< Automatically determine based on the compression level (enabled only for positive levels) */
} ZSTD_literalsCompressionMode_e;

Wow, that's fantastic. Thank you very much. I'm looking forward to trying it out.

Thanks, that works great. We now get excellent results.
