Please provide a table with
{compression level: --zstd options set by that level}
in man zstd(1), or provide a way to ask the zstd program.
Without this, it's hard to know what we are doing with
zstd --zstd=
That's a good point, @hfuru.
For the time being, a possible workaround is to consult these tables.
If you have access to the API, there is also this function in the _experimental_ section which goes straight to the point.
A difficulty is that default parameters depend on both compression level and source size, making it more complex to request.
Possible implementation idea: if the point is to _show_ (as opposed to _request_) the advanced compression parameters used for the current compression, maybe they could be displayed as part of the -vv (very verbose) output?
I want to tune the compression level some scripts use. That only
makes sense if my scripts know more than zstd does.
So, the tables with a filesize key look like the answer here.
Maybe until you modify them in a later zstd version and my tuning
becomes obsolete:-)
It also sounds nice to see the params in -vv.. output, yes.
But if so, please add a "(params depend on file size)" note. Otherwise
that info becomes a trap instead: Users could start with what -vv..
shows, benchmark compression for a few files, tweak the params, and
miss out because zstd with default params would have been smarter with
real-world input files.
Back to tuning, I guess the flexible way is to be able to say
zstd -level --zstd=params --zstd-defaults=params
This would add the zstd-defaults to --zstd, except when
"zstd -level" would use better values.
I'll write a zstd wrapper doing something like that. With your
tables inverted a bit: {zstd-level: {filesize: params, ...}, ...}.
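
A minimal sketch of such a wrapper's lookup, assuming the inverted layout described above. `DEFAULTS`, `lookup_params` and `zstd_option` are hypothetical names, and the size tiers and every parameter value in the table are placeholders for illustration, not zstd's real defaults; a real wrapper would fill the table from the tables of the zstd version it targets.

```python
import bisect

# Hypothetical inverted table: {level: [(max_filesize, params), ...]},
# each list sorted by max_filesize. ALL values are placeholders, not the
# real zstd defaults.
DEFAULTS = {
    3: [
        (16 * 1024, {"wlog": 14, "clog": 14, "slog": 2}),
        (128 * 1024, {"wlog": 17, "clog": 15, "slog": 2}),
        (float("inf"), {"wlog": 21, "clog": 16, "slog": 1}),
    ],
}

def lookup_params(level, filesize):
    """Return the params of the first tier whose bound covers filesize."""
    tiers = DEFAULTS[level]
    bounds = [bound for bound, _ in tiers]
    index = bisect.bisect_left(bounds, filesize)
    return tiers[index][1]

def zstd_option(params):
    """Format a params dict as a --zstd=key=value,... argument."""
    return "--zstd=" + ",".join(f"{key}={value}" for key, value in params.items())
```

For example, `zstd_option(lookup_params(3, 13822))` builds the argument for a small file from the first tier of the placeholder table.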
Question: When does zstd know the filesize - when the input is a
regular file? That should be documented too.
Also, it looks like --sizehint=
might be a useful option, for pipes/sockets where zstd does not
know what size to expect. Or call it maxsize_hint / minsize_hint
to clarify whether it's better to guess too high or too low.
The zstd library knows the file size when using a one-pass function where the whole input is available, or if it is explicitly provided when starting a streaming operation. The zstd CLI knows the size when the input is a regular file.
When zstd doesn't know the size, it assumes the largest size. Zstd will lose some speed, and will use a bit more memory than necessary, but it will compress just as well. Zstd uses the size to downscale its tables/window size when it knows larger tables/window won't help because the file is small enough.
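
The downscaling can be pictured with a simplified model. This is a sketch of the idea only, not the library's actual code: the function name is made up, the lower bound of 10 is an assumption, and only the window is modelled here, although the other tables are resized too.

```python
WINDOWLOG_MIN = 10  # assumed lower bound on windowLog

def downscaled_window_log(window_log, src_size=None):
    """Shrink windowLog when the source is known to be small.

    window_log: the value the compression level would use by default.
    src_size:   source size in bytes, or None when it is unknown.
    """
    if src_size is None:
        # Unknown size: keep the full window, as if the input were large.
        return window_log
    # Smallest exponent e such that 2**e >= src_size, clamped from below.
    needed = max(WINDOWLOG_MIN, (max(src_size, 1) - 1).bit_length())
    # A window larger than the whole input cannot find more matches.
    return min(window_log, needed)
```

With an assumed starting windowLog of 21, a 13822-byte input gives 14 in this model, while an unknown size leaves the value at 21.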
A size hint could improve the (de)compression speed/memory usage. However, if the size hint was too small, compression could be significantly degraded, so it would be a "sharp / advanced" feature. It could be a reasonable flag if it was called --max-size-hint though.
zstd (the CLI) now supports a --size-hint=# option : https://github.com/facebook/zstd/blob/dev/programs/zstdcli.c#L145
This is only useful for piped data, though. When the size can be known (typically when the input is a regular file), the actual size takes precedence over the hint.
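
The precedence described here can be sketched as below. This illustrates the rule as stated in this thread, not the CLI's actual code; `known_input_size` and its return convention are hypothetical.

```python
import os
import stat

def known_input_size(fd, size_hint=None):
    """Return (size, origin) for an open file descriptor.

    origin is "file" when the descriptor is a regular file, "hint" when
    only a caller-supplied hint is available, and "unknown" otherwise.
    """
    info = os.fstat(fd)
    if stat.S_ISREG(info.st_mode):
        # A regular file has a reliable size, so any hint is ignored.
        return info.st_size, "file"
    if size_hint is not None:
        # Pipe, socket, etc.: the hint is the only information available.
        return size_hint, "hint"
    return None, "unknown"
```

A wrapper script could apply the same check to decide whether passing --size-hint is worthwhile at all.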
The new CLI option can be used to display the default cparams before running compression:
zstd --show-default-cparams appveyor.yml
appveyor.yml (13822 bytes)
- windowLog : 14
- chainLog : 14
- searchLog : 2
- minMatch : 4
- targetLength : 0
- strategy : ZSTD_dfast
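
For scripts that want to consume this output, a small parser could look like the sketch below. It assumes the output keeps exactly the layout shown above (a header line followed by `- name : value` lines), which is not guaranteed across zstd versions; `parse_cparams` is a hypothetical helper.

```python
import re

def parse_cparams(text):
    """Parse '- name : value' lines into a dict; other lines are skipped."""
    params = {}
    for line in text.splitlines():
        match = re.match(r"\s*-\s*(\w+)\s*:\s*(\S+)", line)
        if match:
            key, value = match.groups()
            # Numeric values become ints; names such as ZSTD_dfast stay strings.
            params[key] = int(value) if value.isdigit() else value
    return params

# Sample copied from the output quoted in this thread.
sample = """appveyor.yml (13822 bytes)
- windowLog : 14
- chainLog : 14
- searchLog : 2
- minMatch : 4
- targetLength : 0
- strategy : ZSTD_dfast
"""
```

Calling `parse_cparams(sample)` yields a dict keyed by parameter name, which a wrapper can compare against its own tuned values.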