Zstd: pzstd uses significantly more memory

Created on 18 Sep 2016 · 6 Comments · Source: facebook/zstd

$ command time -f '%Es %MKB' zstd -2 -f 2GB
0:25.86s 3232KB
$ command time -f '%Es %MKB' pzstd -2 -f -n 1 2GB
0:23.30s 19668KB
$ command time -f '%Es %MKB' pzstd -2 -f -n 2 2GB
0:13.84s 26696KB

This CPU has 2 cores with 4 threads, so there's not much benefit beyond 2 threads.

$ command time -f '%Es %MKB' pzstd -2 -f -n 4 2GB
0:12.55s 52724KB
$ command time -f '%Es %MKB' pzstd -2 -f -n 16 2GB
0:13.76s 171532KB

All 6 comments

The memory usage is expected to scale linearly with the number of threads, since each thread needs its own Zstd (de)compression context.

However, I believe the constant factor can be reduced. I'll play with the memory usage and see how much I can reduce it before performance starts getting impacted.
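To make the linear-scaling claim concrete, here is a rough sketch that fits a straight line to the peak-memory figures from the pzstd runs reported at the top of this issue. The helper name `least_squares_fit` is made up for illustration, and four data points from one machine are only a ballpark, not a characterization of pzstd in general.

```python
# Peak memory (KB) reported by `time` for the pzstd -n runs above,
# keyed by the requested thread count.
measurements = {1: 19668, 2: 26696, 4: 52724, 16: 171532}


def least_squares_fit(points):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_fit(sorted(measurements.items()))
print(f"about {slope:.0f} KB per additional thread, {intercept:.0f} KB fixed")
```

On these numbers the fit comes out at roughly 10 MB per additional thread, which is the "constant factor" that could be reduced.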

Commit 1c209a4febff1de239160bfdc0961b4b64eda16a reduced the memory usage to 60-75% of the previous amount.

The input gets broken up into chunks of size 4 times Zstd's window size. If we halved that, we would almost halve memory usage again. However, that may come at the cost of compression ratio. More experimentation should be done here, and perhaps we should make the multiple of window size configurable.
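As a small illustration of the chunk-size arithmetic, the sketch below computes the chunk size for a given window multiple. The function `chunk_size` and the window log of 20 (a 1 MiB window) are assumptions chosen for the example; they are not taken from the pzstd source or from the runs in this issue.

```python
def chunk_size(window_log, multiplier=4):
    """Bytes per input chunk: `multiplier` windows of 2**window_log bytes."""
    return multiplier * (1 << window_log)


WINDOW_LOG = 20  # assumed for illustration (1 MiB window)

for multiplier in (4, 2):
    size = chunk_size(WINDOW_LOG, multiplier)
    print(f"multiplier {multiplier}: {size // (1 << 20)} MiB per chunk")
```

Halving the multiplier halves each chunk; how much of that carries over to total memory usage, and what it costs in compression ratio, would still need to be measured.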

Is this issue still open, or is it considered answered by 1c209a4?

I guess that depends on how ambitious you are. I reported it because I thought that some reduction was probably low-hanging fruit, which it was. In context, memory usage is quite low even with pzstd.

PS. Why is pzstd consistently faster with a single core? Is it because zstd prints more status information? If so, that seems like significant overhead. I got this in the new results too.

$ command time -f '%Es %MKB' zstd -2 -f 2GB
0:26.67s 3196KB
$ command time -f '%Es %MKB' pzstd -2 -p 1 -f 2GB
0:23.13s 18656KB
$ command time -f '%Es %MKB' pzstd -2 -p 2 -f 2GB
0:13.88s 27424KB
$ command time -f '%Es %MKB' pzstd -2 -p 4 -f 2GB
0:13.34s 43380KB
$ command time -f '%Es %MKB' pzstd -2 -p 16 -f 2GB
0:14.22s 121220KB

Why is pzstd consistently faster with a single core?

pzstd uses one separate thread for overlapped I/O, which is not counted in -p 1.

Since v1.1.1, both zstd and pzstd are expected to use less memory at equivalent compression levels.
