Paper: Implementation of zstd lib

Created on 8 Sep 2016 · 15 Comments · Source: PaperMC/Paper

Hello,

Some days ago, I heard about a new library for compression, zstd (http://facebook.github.io/zstd/).
When I saw the compression speed improvement, I wondered whether it would be a good idea to use it in Minecraft instead of zlib.

This wouldn't really improve the compression ratio, and therefore the file size, but it could perhaps improve performance by reducing the time spent compressing world and player data in the .mca files.
I don't think changing the structure of the worlds files would be a good idea, just the compression library.

I'll start working on it on my side and I'll tell you if I can see a real improvement.
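For context on what "just the compression library" means in practice: each chunk in an .mca file is stored as a zlib (deflate) blob, written and read roughly like the stdlib round trip below. Swapping codecs means replacing only these two helpers (the class and method names here are illustrative, not Minecraft's):

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Stand-in for the per-chunk payloads in .mca files, which are
// zlib-compressed today. A codec swap touches only these two spots,
// not the surrounding region-file layout.
public class ChunkCodec {
    public static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static byte[] inflate(byte[] compressed) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
            inflater.end();
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt chunk data", e);
        }
    }
}
```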

Labels: rejected by team, prs welcome, input wanted, feature


All 15 comments

If we were to swap to another compression lib, I'd say to look at LZ4 instead.

Write compression doesn't matter as much, since it's async and adds no user-facing latency.

However, read speed does affect overall chunk loading speed. That chart shows LZ4 to be blazing fast at decompression, so if we were to put in the effort of managing an alternative compression format, we should rather go for LZ4.

But this would be better suited for Mojang, so I'll see what Grum thinks.

Anybody got figures on how much zlib actually compresses world data? Might be worth offering a config option to disable zlib if the savings aren't significant.

lz4-java includes 3 different options for using the library:

  • JNI bindings to the original C implementation by Yann Collet,
  • a pure Java port of the compression and decompression algorithms,
  • a Java port that uses the sun.misc.Unsafe API in order to achieve compression and decompression speeds close to the C implementation.

I think the JNI bindings would be the best option, since we're going for the best performance.

Where in the code would this be implemented? I'd love to work on this.

Edit: There are also JNI bindings for zstd.
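For reference, the three lz4-java backends listed above map directly onto `LZ4Factory` methods. A minimal sketch, assuming the net.jpountz lz4-java dependency is on the classpath (`Lz4Probe` and `chunkBytes` are illustrative names, not anything in Paper):

```java
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class Lz4Probe {
    public static byte[] roundTrip(byte[] chunkBytes) {
        // Pick a backend explicitly; fastestInstance() would instead choose
        // the best one available at runtime (JNI if it loads, then Unsafe,
        // then the safe pure-Java port).
        LZ4Factory factory = LZ4Factory.nativeInstance();    // JNI bindings
        // LZ4Factory factory = LZ4Factory.unsafeInstance(); // sun.misc.Unsafe port
        // LZ4Factory factory = LZ4Factory.safeInstance();   // pure-Java port

        LZ4Compressor compressor = factory.fastCompressor();
        byte[] compressed = compressor.compress(chunkBytes);

        // LZ4 block decompression needs the original length up front,
        // so it would have to be stored alongside the compressed blob.
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        return decompressor.decompress(compressed, chunkBytes.length);
    }
}
```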

@aikar the interest of zstd, from my point of view, is that it keeps one of the better compression ratios and, at the same time, improves compression/decompression speed. The major problem with LZ4 is the compression ratio: about 0.8 lower than zstd and zlib. That can't be used for huge servers (mini-games or small maps can manage, but 50k x 50k servers will explode their storage).

@AlfieC this library is really new; no one has used it on Minecraft yet.

@phase I didn't search for very long, but it seems all the compression/decompression work for world and player data goes through here: https://github.com/Bukkit/mc-dev/blob/master/net/minecraft/server/NBTCompressedStreamTools.java (yes I know, it's outdated ...). I'm not sure there is a lot of work involved; I figured we just have to change the library, not the data format, but I'm not an expert.
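That class really does reduce to wrapping the NBT data streams in GZIP streams, so "changing the library" means swapping two stream wrappers. A stdlib sketch of that shape (`NbtIo`, `writeCompressed`, and `readCompressed` are illustrative names; the real class operates on NBT tag objects, not raw byte arrays):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Mirrors the shape of NBTCompressedStreamTools: the payload is written
// through a compressing stream wrapper. Swapping libraries would mean
// swapping the GZIP streams for the new codec's equivalents.
public class NbtIo {
    public static byte[] writeCompressed(byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new GZIPOutputStream(bytes)))) {
                out.writeInt(payload.length);
                out.write(payload);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static byte[] readCompressed(byte[] stored) {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new GZIPInputStream(
                        new ByteArrayInputStream(stored))))) {
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return payload;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```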

@SpyL1nk you and @aikar are talking about using the compression in two different places. Sending chunks over the wire vs. saving them to disk. You want more speed when sending over the wire and a better ratio when storing to disk.

Edit: oh maybe you weren't, still you should consider both cases: the former obviously requires changes from mojang.

@Deamon5550 Oh, I didn't understand it like that. I was only talking about the server side, not compression over the network, which would require client-side modification. And I don't want a better ratio when storing to disk, even if that would be nice; I only want to improve the compression/decompression speed of chunks, and other data, when the server loads or saves them (server side only).

For a protocol-extension adding network compression, LZ4 is probably the best option.
However, we'll have to hack further junk into the handshake packet, and modify Waterfall to support it.
I don't think this is a good idea at this point, although it could _potentially_ result in a massive performance increase for latency/ping if it gets client side support.

@Techcable this issue is about the compression of region files tho ;)

If we really do this, we have to provide a way to migrate existing worlds to the new compression algorithm and back to standard.

@Techcable an LZ4 implementation is probably a Mojang issue, but you're right, it could result in a massive performance increase.

@MiniDigger that was my intention too: a little script which just has to decompress from zlib and compress to zstd.
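Such a converter is essentially a decode-with-old, encode-with-new pipe per chunk. A sketch of that core, using java.util.zip codecs as stand-ins on both ends since zstd-jni is a third-party dependency (`ChunkMigrator` and its interfaces are illustrative; a real converter would plug zstd-jni's `ZstdOutputStream` in as the encoder and also rewrite the region-file headers):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;

public class ChunkMigrator {
    // Pluggable codec ends: the old format's decoder, the new format's encoder.
    interface Decoder { InputStream wrap(InputStream in) throws IOException; }
    interface Encoder { OutputStream wrap(OutputStream out) throws IOException; }

    // Decompress one stored chunk with `from` and recompress it with `to`.
    public static byte[] recompress(byte[] stored, Decoder from, Encoder to) {
        try {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            try (InputStream in = from.wrap(new ByteArrayInputStream(stored));
                 OutputStream out = to.wrap(result)) {
                in.transferTo(out);
            }
            return result.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Produces a zlib blob like the ones stored in .mca files today.
    public static byte[] deflate(byte[] raw) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (DeflaterOutputStream def = new DeflaterOutputStream(out)) {
                def.write(raw);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Running the same pipe with the decoders and encoders swapped gives the "back to standard" direction for free.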

@SpyL1nk
Agreed, Mojang is the one who ultimately needs to make this change. However, I've been considering a protocol-extension mechanism for Waterfall, to add better encryption/authentication support. It could be possible for Waterfall to automatically detect that Paper supports LZ4 compression and use it instead. Additionally, if the protocol-extension method were designed properly, it could be implemented as a Forge mod too, allowing clients to opt into the new compression system. Knowing how much PvPers love placebo performance boosts, I bet they'd love to opt into a _real_ performance boost, with order-of-magnitude gains.
@MiniDigger
If someone manages to provide an _optional_ patch, with a converter able to convert back and forth between the new format, I'm all for it.
However, IIRC the region-file loading code isn't very abstracted, and it'd be a large diff to handle a new format transparently.

It wouldn't be a large diff at all. It would simply be conditional on which compression stream wrapper to use to/from.

The only important detail is accurately determining which format the data was saved in when it comes time to decompress; the config will then control which format data is saved in.

Network level I don't think is needed. Internal networks can already potentially benefit from turning off compression entirely (given that LAN interface bandwidth is not a concern, though this theory is not proven yet).
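The detection point described above already exists in the region format: each stored chunk is prefixed with a compression-type byte (1 = gzip, 2 = zlib), so the conditional stream wrapper is a matter of extending one dispatch. A stdlib sketch; the id 100 and the `ZstdInputStream` line are purely hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

public class ChunkCompression {
    // 1 = gzip and 2 = zlib are the ids region files use today; a new
    // codec would claim an unused id (100 here is hypothetical).
    public static InputStream wrapForRead(int compressionType, InputStream in) {
        try {
            switch (compressionType) {
                case 1: return new GZIPInputStream(in);
                case 2: return new InflaterInputStream(in);
                // case 100: return new ZstdInputStream(in); // hypothetical
                default:
                    throw new IllegalArgumentException(
                            "Unknown chunk compression type: " + compressionType);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helpers so the dispatch can be exercised without a region file.
    public static byte[] gzip(byte[] raw) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(raw);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static byte[] readAll(InputStream in) {
        try (in) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```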

@Techcable maybe an option to disable network compression entirely if Waterfall is detected and have Waterfall handle compression itself. It could drastically reduce CPU usage on the server, since we can easily add more Waterfalls but not more Paper instances.

@AlfieC the server properties have a network-compression-threshold option that can be set to -1 to completely disable compression. This is suggested for local BungeeCord/Waterfall setups.
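For reference, that vanilla option lives in server.properties:

```properties
# Disable packet compression entirely; suggested when the proxy
# (BungeeCord/Waterfall) runs on the same host or LAN as the server.
network-compression-threshold=-1
```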

Oh - I must have missed this. Apologies.

I don't see anyone on the team taking this up considering it doesn't have significant promising results.

But if someone wishes to PR something like this, we can review it on that basis, but it needs to be sufficiently durable to ensure no data could ever be lost.
