The newly added Brotli compression algorithm, tracked by dotnet/runtime#23936, defaults to quality 11, as seen here:
https://github.com/dotnet/corefx/blob/8f9cfa462dfc2ec5bdbbe2462693036861fb0199/src/System.IO.Compression.Brotli/src/System/IO/Compression/BrotliUtils.cs#L13
From a quick look at Brotli's code (https://github.com/google/brotli), it seems levels 5 and 6 are intended as the optimal ratio/speed trade-off, and the Brotli team appears to tune the algorithm to keep it that way.
I tested against a 518,461,845 byte file I use for compression benchmarks.
It seems that defaulting to 11 makes the algorithm very slow for the common case.
That being said, CompressionLevel only has 3 levels: NoCompression, Fastest, and Optimal.
I would suggest having 4 levels: NoCompression, Fastest, Optimal, and Best.
Note that I've also ordered the levels to reflect their compression ratio, which is currently not the case. If needed, I can create a separate issue for this suggestion.
‘Optimal’ and ‘best’ sound like synonyms to me.
@jnm2 Optimal does mean the balance between speed and ratio, and best does mean the highest ratio, but you are right; it is not intuitive to developers. Words like "minimal", "normal" and "maximum" are used in common compression applications. Most applications have about 5 levels named along those lines.
I think it would make more sense to use those levels than the current ones. However, this issue is really about the default quality level of 11 in the BrotliEncoder, which is just crazy.
@asthana86 could you label this with area-System.IO.Compression instead? I would also say that it should be marked as a bug as this setting prevents all use of the new algorithm.
@Genbox
Optimal does mean the balance between speed and ratio, and best does mean the highest ratio, but you are right; it is not intuitive to developers.
It's just as easy for me to talk about the best balance between speed and ratio, versus the optimal compression ratio. 😊
@joshfree, @ianhays, anything that needs to be done here for 2.1?
Google's C brotli implementation uses 11 as the default quality. Our C# brotli code is built on top of Google's C code, so the default compression level is the same.
If you want another value like 5 or 6, you can always just pass an integer for the exact level you want, whether using BrotliEncoder or BrotliStream.
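For example, the explicit-quality path through the static BrotliEncoder helpers looks roughly like this (a minimal sketch; quality 5 is chosen per the ratio/speed discussion above, and the payload is a stand-in):

```csharp
using System;
using System.IO.Compression;
using System.Text;

byte[] input = Encoding.UTF8.GetBytes("example payload to compress");
byte[] output = new byte[BrotliEncoder.GetMaxCompressedLength(input.Length)];

// Quality runs 0-11; window 22 is Brotli's default window size.
if (BrotliEncoder.TryCompress(input, output, out int written, quality: 5, window: 22))
{
    Console.WriteLine($"Compressed {input.Length} bytes to {written} bytes.");
}
```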
Optimal does mean the balance between speed and ratio, and best does mean the highest ratio, but you are right; it is not intuitive to developers.
Optimal does not mean optimal balance, it means optimal compression ratio, i.e. best compression. See the docs for CompressionLevel.
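For reference, this is the shape of the enum at the time of this discussion (the values and doc summaries are from the framework; SmallestSize was only added much later, in .NET 6):

```csharp
namespace System.IO.Compression
{
    public enum CompressionLevel
    {
        // Per the docs: compression should be optimal, even if it takes longer.
        Optimal = 0,
        // Compression should complete as quickly as possible, even if suboptimal.
        Fastest = 1,
        // No compression should be performed.
        NoCompression = 2
    }
}
```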
Closing this as by-design.
FWIW, I agree with you about the need for a middle-ground CompressionLevel that represents the best balance between speed and compression ratio. There were enough compat concerns about adding a new enum that we decided against it.
However, that did result in us flip-flopping: we made Optimal for DeflateStream map to 9 before changing it back to 6 for balance. For DeflateStream there are heavily diminishing returns in compression ratio past level 6, along with huge speed hits, so we determined it wasn't worth it.
For Brotli, I chose 11 as the Optimal for two main reasons:
- Optimal is meant as the absolute best compression without regard to compression time.
- For the scenarios we expect Brotli to be primarily used in (server-side compression, browser-based decompression, preference for smallest files), every little bit of compression ratio is important.

Just for fun (because it's far too breaking to do), if I was designing CompressionLevel as a new enum, I would have made it:

- NoCompression (Deflate 0, Brotli 0)
- Fastest (Deflate 1, Brotli 1)
- Balanced (Deflate 6, Brotli 9)
- Best (Deflate 9, Brotli 11)
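A hypothetical sketch of how such an enum could map onto the native levels (ProposedCompressionLevel and the mapping methods are illustrative names, not .NET API):

```csharp
using System;

// Hypothetical enum from the suggestion above -- not part of .NET.
public enum ProposedCompressionLevel { NoCompression, Fastest, Balanced, Best }

public static class CompressionLevelMapping
{
    // Deflate levels run 0-9; Brotli quality runs 0-11.
    public static int ToDeflateLevel(ProposedCompressionLevel level) => level switch
    {
        ProposedCompressionLevel.NoCompression => 0,
        ProposedCompressionLevel.Fastest => 1,
        ProposedCompressionLevel.Balanced => 6,
        ProposedCompressionLevel.Best => 9,
        _ => throw new ArgumentOutOfRangeException(nameof(level))
    };

    public static int ToBrotliQuality(ProposedCompressionLevel level) => level switch
    {
        ProposedCompressionLevel.NoCompression => 0,
        ProposedCompressionLevel.Fastest => 1,
        ProposedCompressionLevel.Balanced => 9,
        ProposedCompressionLevel.Best => 11,
        _ => throw new ArgumentOutOfRangeException(nameof(level))
    };
}
```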
It is not acceptable to choose the highest level as the default for a library like .NET Core. You say the use cases are server-side compression, browser-based decompression, and a preference for the smallest files; however, those are irrelevant: this is just a technology, and users will use it for whatever they want.
From the announcement here: https://blogs.msdn.microsoft.com/dotnet/2018/05/07/announcing-net-core-2-1-rc-1/ it says: "The BrotliStream behavior is the same as that of DeflateStream or GZipStream to allow easily converting DeflateStream/GZipStream code to use BrotliStream."
That's not the case! We switched from DeflateStream to BrotliStream, and our application took between 30x and 20,000x the time to compress the same data due to this default value. Compatibility is not just about APIs; it is also about similar performance characteristics, especially when there is a global announcement that BrotliStream is better than DeflateStream and people can just switch...
I understand you want to stay consistent with the original C code, but there is no need to carry Google's default choice into .NET Core, especially considering the huge impact it will have on applications worldwide once 2.1 is out.
Just for fun (because it's far too breaking to do), if I was designing CompressionLevel as a new enum, I would have made it:
- NoCompression (Deflate 0, Brotli 0)
- Fastest (Deflate 1, Brotli 1)
- Balanced (Deflate 6, Brotli 9)
- Best (Deflate 9, Brotli 11)
Was this considered "far too breaking" only for a point release, or would it also be too breaking for 3.0?
Although it's possible to explicitly set the desired Brotli quality level in BrotliEncoder, there are other places, such as ASP.NET Core's BrotliCompressionProvider, where configuration is restricted to the CompressionLevel enum.
In fact, the Response Compression Middleware docs recommend using CompressionLevel.Optimal with Brotli, which could be catastrophic for performance:
The Brotli Compression Provider defaults to the fastest compression level (CompressionLevel.Fastest), which might not produce the most efficient compression. If the most efficient compression is desired, configure the middleware for optimal compression.
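If you're on the middleware, here's a sketch of pinning the Brotli provider explicitly (placed in Startup.ConfigureServices or wherever services are registered; it uses the real BrotliCompressionProviderOptions.Level setting):

```csharp
using System.IO.Compression;
using Microsoft.AspNetCore.ResponseCompression;
using Microsoft.Extensions.DependencyInjection;

public void ConfigureServices(IServiceCollection services)
{
    // Register the Brotli provider and pin it to Fastest (quality 1)
    // so nothing falls back to Optimal (quality 11).
    services.AddResponseCompression(options =>
    {
        options.Providers.Add<BrotliCompressionProvider>();
    });
    services.Configure<BrotliCompressionProviderOptions>(options =>
    {
        options.Level = CompressionLevel.Fastest;
    });
}
```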
It's also not ideal that zlib level 9 is no longer available at all.
The revised enum suggested above would really clear things up.
For BrotliStream and likely other places that are restricted to CompressionLevel, you can use the workaround described by @taritsyn in dotnet/runtime#1556: new BrotliStream(dest, (CompressionLevel) level).
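A sketch of that workaround in context (file names are placeholders; it relies on BrotliStream passing undefined enum values straight through as the quality, which newer .NET versions validate and reject):

```csharp
using System.IO;
using System.IO.Compression;

using FileStream source = File.OpenRead("input.bin");
using FileStream dest = File.Create("output.br");

// CompressionLevel is an int-backed enum, so the cast smuggles an explicit
// Brotli quality of 9 through an API that only accepts CompressionLevel.
using var brotli = new BrotliStream(dest, (CompressionLevel)9);
source.CopyTo(brotli);
```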
I moved from another library that implemented a BrotliStream, and performance got 30x worse. I did some (rough) testing, and for my data (a 45 MB JSON file) I got a massive 385x difference between Fastest and Optimal.


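A minimal sketch of the kind of timing comparison behind those numbers (the file name is a placeholder for your own test data):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

// "data.json" is a placeholder; substitute any large test file.
byte[] data = File.ReadAllBytes("data.json");

foreach (var level in new[] { CompressionLevel.Fastest, CompressionLevel.Optimal })
{
    using var output = new MemoryStream();
    var sw = Stopwatch.StartNew();
    using (var brotli = new BrotliStream(output, level, leaveOpen: true))
    {
        brotli.Write(data, 0, data.Length);
    }
    sw.Stop();
    Console.WriteLine($"{level}: {sw.ElapsedMilliseconds:N0} ms -> {output.Length:N0} bytes");
}
```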
This issue makes the provided .NET functionality completely unusable. Quality 1 is pretty much the same as gzip, with perhaps some speedup, and quality 11 is too slow to ever be practical.
The comment by @ianhays absolutely baffles me. Are you guys making an API as a brain exercise in design or do you expect people to actually use it?
The only way to use it is through a hack, for example: new BrotliStream(target, (CompressionLevel)9);
Optimal should always be 9 on Brotli. There is never any reason to use anything higher; for anything higher, you should use a different algorithm.