Zstd: Document/show compression levels' --zstd options

Created on 31 Dec 2018 · 6 Comments · Source: facebook/zstd

Please provide a table with
{compression level: --zstd options set by that level}
in man zstd(1), or provide a way to ask the zstd program.

Without this, it's hard to know what we are doing with
zstd --zstd=.

feature request

All 6 comments

That's a good point @hfuru .

For the time being, a possible workaround is to consult these tables.

If you have access to the API, there is also this function in the _experimental_ section which goes straight to the point.

A difficulty is that default parameters depend on both compression level and source size, making it more complex to request.

Possible implementation idea: if the point is to _show_ (as opposed to _request_) the advanced compression parameters being used for the current compression, maybe this could be displayed as part of the -vv (very verbose) output?

I want to tune the compression level some scripts use. That only
makes sense if my scripts know more than zstd does.

So, the tables with a filesize key look like the answer here.
Maybe until you modify them in a later zstd version and my tuning
becomes obsolete:-)

It also sounds nice to see the params in -vv.. output, yes.
But if so, please add a "(params depend on file size)" note. Otherwise
that info becomes a trap instead: Users could start with what -vv..
shows, benchmark compression for a few files, tweak the params, and
miss out because zstd with default params would have been smarter with
real-world input files.

Back to tuning, I guess the flexible way is to be able to say
zstd -level --zstd=params --zstd-defaults=params
This would add the zstd-defaults to --zstd, except when
"zstd -level" would use better values.

I'll write a zstd wrapper doing something like that. With your
tables inverted a bit: {zstd-level: {filesize: params, ...}, ...}.
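
A minimal sketch of what such an inverted lookup could look like in a wrapper script. The size-class boundaries, the `TABLE` contents, and the `lookup` helper are all hypothetical placeholders for illustration; real values would have to be copied from the zstd tables for the version in use.

```python
from bisect import bisect_left
from typing import Dict, List, Optional

# Assumed upper bounds (in bytes) of the small-input size classes.
# Anything larger, or of unknown size, falls into the last class.
SIZE_BOUNDS = [16 * 1024, 128 * 1024, 256 * 1024]

# {level: [params per size class]}. The numbers below are made-up
# placeholders, NOT zstd's real defaults.
TABLE: Dict[int, List[Dict[str, int]]] = {
    3: [{"wlog": 14}, {"wlog": 17}, {"wlog": 18}, {"wlog": 21}],
}

def lookup(level: int, size: Optional[int]) -> Dict[str, int]:
    """Return the params for a level and an input size (None = unknown)."""
    row = TABLE[level]
    if size is None:
        return row[-1]  # unknown size: use the largest size class
    return row[bisect_left(SIZE_BOUNDS, size)]
```

Keeping the size classes in a sorted list means the wrapper only needs one binary search per file, and a later zstd release that changes its tables only requires replacing `TABLE`.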

Question: When does zstd know the filesize - when the input is a
regular file? That should be documented too.

Also, it looks like --sizehint= or --zstd=sizehint=...
might be a useful option, for pipes/sockets where zstd does not
know what size to expect. Or call it maxsize_hint / minsize_hint
to clarify whether it's better to guess too high or too low a size.

The zstd library knows the file size when using a one-pass function where the whole input is available, or if it is explicitly provided when starting a streaming operation. The zstd CLI knows the size when the input is a regular file.

When zstd doesn't know the size, it assumes the largest size. Zstd will lose some speed, and will use a bit more memory than necessary, but it will compress just as well. Zstd uses the size to downscale its tables/window size when it knows larger tables/window won't help because the file is small enough.
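
The downscaling described above can be illustrated with a simplified sketch. This is not zstd's actual code: the function name, the `min_log` floor, and the starting value of 21 in the usage note are assumptions chosen for illustration. It only shows the idea of capping the window at the smallest power of two that covers the input.

```python
from typing import Optional

def downscale_window_log(window_log: int,
                         src_size: Optional[int],
                         min_log: int = 10) -> int:
    """Cap window_log at the smallest power of two covering src_size.

    src_size=None models an unknown size: the level's window is kept.
    min_log is an assumed lower bound, not taken from the thread.
    """
    if src_size is None:
        return window_log
    # Exponent of the smallest power of two >= src_size.
    needed = max(min_log, (max(src_size, 1) - 1).bit_length())
    return min(window_log, needed)
```

For a 13822-byte input and an assumed starting window log of 21, this returns 14, since 2^14 = 16384 is the first power of two that holds the whole file. With an unknown size the value stays at 21, which costs memory but not compression ratio.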

A size hint could improve the (de)compression speed/memory usage. However, if the size hint was too small, compression could be significantly degraded, so it would be a "sharp / advanced" feature. It could be a reasonable flag if it was called --max-size-hint though.

zstd (the CLI) now supports a --size-hint=# option: https://github.com/facebook/zstd/blob/dev/programs/zstdcli.c#L145

This is only useful for piped data, though. When the size can be determined (typically when the input is a regular file), the actual size takes precedence over the hint.
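
A wrapper script could apply that rule when building the command line. The sketch below is an assumption about how one might do it, not part of zstd: `zstd_args` is a hypothetical helper, and only the `--size-hint=#` flag and the `-#` level syntax come from the CLI itself. It adds the hint only when the input is not a regular file.

```python
import os
from typing import List, Optional

def zstd_args(level: int,
              input_path: Optional[str] = None,
              expected_size: Optional[int] = None) -> List[str]:
    """Build a zstd command line; input_path=None means read from stdin."""
    args = ["zstd", f"-{level}"]
    # For a regular file zstd can determine the size itself,
    # so the hint would be redundant.
    size_known = input_path is not None and os.path.isfile(input_path)
    if expected_size is not None and not size_known:
        args.append(f"--size-hint={expected_size}")
    if input_path is not None:
        args.append(input_path)
    return args
```

The resulting list can be passed to `subprocess.run` with the pipe attached as `stdin`. As noted earlier in the thread, a hint that is too small can degrade compression, so the expected size is better guessed high than low.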

The new CLI option --show-default-cparams can be used to display the default cparams before running compression.

zstd --show-default-cparams  appveyor.yml

appveyor.yml (13822 bytes)
 - windowLog    : 14
 - chainLog     : 14
 - searchLog    : 2
 - minMatch     : 4
 - targetLength : 0
 - strategy     : ZSTD_dfast
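
For scripts that want these values programmatically, the output can be parsed into a dictionary. This is a sketch that assumes the output keeps the ` - name : value` layout of the sample above; the format is not guaranteed to be stable across zstd versions, and `parse_cparams` is a hypothetical helper.

```python
import re
from typing import Dict, Union

# Sample output copied from the comment above.
SAMPLE = """appveyor.yml (13822 bytes)
 - windowLog    : 14
 - chainLog     : 14
 - searchLog    : 2
 - minMatch     : 4
 - targetLength : 0
 - strategy     : ZSTD_dfast
"""

# Matches lines such as " - windowLog    : 14".
_PARAM_LINE = re.compile(r"^\s*-\s*(\w+)\s*:\s*(\S+)\s*$")

def parse_cparams(text: str) -> Dict[str, Union[int, str]]:
    """Collect ' - name : value' lines; numeric values become ints."""
    params: Dict[str, Union[int, str]] = {}
    for line in text.splitlines():
        match = _PARAM_LINE.match(line)
        if match:
            name, value = match.groups()
            params[name] = int(value) if value.isdigit() else value
    return params
```

In a real script, `text` would be the captured output of `zstd --show-default-cparams <file>`. The header line with the file name and size is skipped because it does not start with a dash.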