I am trying to add multithreading to examples/streaming_compression.c. I added a few lines in compressFile_orDie() (full code here):
#ifdef ZSTD_MULTITHREAD
assert(!ZSTD_isError(ZSTD_CCtx_setParameter(cstream, ZSTD_c_nbWorkers, 2)));
#endif
// size_t const initResult = ZSTD_initCStream(cstream, cLevel);
// if (ZSTD_isError(initResult)) {
// fprintf(stderr, "ZSTD_initCStream() error : %s \n",
// ZSTD_getErrorName(initResult));
// exit(11);
// }
It seems that https://github.com/facebook/zstd/blob/470344d33e1d52a2ada75d278466da8d4ee2faf6/lib/compress/zstd_compress.c#L4021 creates the mtctx only if the cctx has not been initialized yet (i.e. cctx->streamStage == zcss_init), so I commented out the ZSTD_initCStream() call and set ZSTD_c_nbWorkers to 2.
But this modified version of streaming_compression dies after calling ZSTD_endStream() and exits with the message "not fully flushed". This can be reproduced on a large enough input, for example silesia.tar.
Maybe I misunderstand how ZSTD is meant to be used. How do I use streaming compression with multiple threads?
Hey @bennyyip, there are two things that should help you here:
1. Set the number of workers with ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, 2) where the other parameters are set.
2. ZSTD_endStream(), or equivalently ZSTD_compressStream2() with ZSTD_e_end, doesn't make maximum forward progress, just some progress, so you need to call it until it returns 0. We realized that this was confusing behavior, so in commit 48a6427d22f290157b8acc3f7c03c0f762a768be we made ZSTD_endStream() block until either its output buffer is full or the stream is complete.

We'll be releasing a new version of zstd on Monday with a focus on the advanced API. Please let me know if you find any of the new documentation confusing, so I can improve it.
@terrelln Thanks for your quick reply! I will open an issue if I find anything confusing in the new documentation.