ImageSharp: Faster Jpeg Huffman Scan Decoding

Created on 30 May 2018  ·  15 Comments  ·  Source: SixLabors/ImageSharp

Prerequisites

  • [x] I have written a descriptive issue title
  • [x] I have verified that I am running the latest version of ImageSharp
  • [x] I have verified that the problem exists in both DEBUG and RELEASE mode
  • [x] I have searched open and closed issues to ensure it has not already been reported

Description

With #571 now included in the codebase we have an opportunity to further enhance the performance of our jpeg decoder.

There is one obvious candidate for improvement within the decoder: The decoding of the SOS (Start of Scan) segment.

Our approach works but is naïve. We only read/decode one byte at a time and do not use optimized tables for Huffman code lookup. There are further optimizations available with AC Huffman decoding that we also do not do.

I believe that with the correct implementation we can reduce our decoding time by at least 10ms when compared to our current benchmarks.

Inspiration for better code

I'm looking for help here. I've made several attempts to port from both sources but have failed miserably so far.

Update

@bitbank2 has kindly written up some information with pointers to demonstrate how to optimise entropy decoding.

http://bitbanksoftware.blogspot.com/2018/06/optimizing-jpeg-entropy-decoding.html

Labels: performance, jpeg, help needed

All 15 comments

@JimBobSquarePants Regarding StbImage, take a look at StbSharp. Maybe it could be of help.

@hypeartist Very useful indeed! I've got something I can debug against now. Thanks!

I had another look at StbImage and managed to decode our Calliphora jpeg. It's a lot faster but there's inaccuracy in the spectral output compared to our current implementation. I think it might be better to try to improve our existing decoder instead of attempting a new port.

@JimBobSquarePants I could give it a try and port mozjpeg to C# if you like. What do you think?

If you could pull that off I would be blown away! Have a look at our code; it should just be the Huffman decoder you need to port.

@JimBobSquarePants It's much easier for me to port the whole thing so you can strip off the unneeded bits. :)

It's totally up to you but we've already got all the marker parsing code in place plus we have SIMD optimized IDCT and colorspace conversion code.

As far as I understand the Mozjpeg source it's only two jdhuff files to port

@JimBobSquarePants Ok. I got you. Already grabbed the source and started to examine. Will write you back asap.

Brilliant thanks!

@JimBobSquarePants isn't "Faster Jpeg Huffman Decoding" a better title for this?

Gonna post some up-to-date profiler results tonight, but TryDecodeHuffman() and TryReadBit() are our major Jpeg Decoder bottlenecks as far as I remember.

@hypeartist in my opinion the fastest way to look for improvement opportunities is doing a comparative debug/analysis against other decoders. Doing a full port is very time consuming + having something fast in languages like C, C++, go, rust etc. doesn't guarantee the same code will be fast in C#. (We've been there several times!)
If you can figure out something, please let us know! Any help is appreciated.

@antonfirsov Perhaps, yeah... Naming is hard, it's the Scan segment we're decoding but it's Huffman encoded.

Those two sections will definitely be slowing us down.

  • TryReadBit(): We should be working with a 4-byte buffer that gets cleared out when we hit a restart marker.

  • TryDecodeHuffman(): We should be using a LUT for most of the returned results. Something like 95% of the code values should hit that LUT.
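The buffered bit reading described in the first bullet could look roughly like the sketch below. It is an illustration only, loosely following the approach used by stb_image-style decoders: the `ScanBitReader` type and all of its members are hypothetical names, not ImageSharp's API. It assumes the entropy-coded data is already in a byte array, treats `0xFF 0x00` as a stuffed literal `0xFF`, records any other `0xFF xx` pair as a marker, and does not handle runs of `0xFF` fill bytes.

```csharp
using System;

// Hypothetical sketch: type and member names are illustrative, not ImageSharp's API.
public sealed class ScanBitReader
{
    private readonly byte[] data;
    private int position;
    private uint buffer; // valid bits are left-aligned; the next bit is the MSB
    private int count;   // number of valid bits currently in the buffer

    public ScanBitReader(byte[] data) => this.data = data;

    // The marker code that ended the entropy-coded data, or 0 if none was hit yet.
    public byte Marker { get; private set; }

    // Reads the next n bits (1 <= n <= 16) as an unsigned value.
    public int ReadBits(int n)
    {
        if (n < 1 || n > 16) throw new ArgumentOutOfRangeException(nameof(n));
        if (this.count < n) this.Fill();
        int result = (int)(this.buffer >> (32 - n));
        this.buffer <<= n;
        this.count -= n;
        return result;
    }

    // Call after handling a restart marker (RST0..RST7): drops leftover padding bits.
    public void Reset()
    {
        this.buffer = 0;
        this.count = 0;
        this.Marker = 0;
    }

    // Tops the 32-bit buffer up to at least 25 valid bits, one byte at a time.
    private void Fill()
    {
        while (this.count <= 24)
        {
            int b = 0; // zero bits are fed once a marker or the end of data is reached
            if (this.Marker == 0 && this.position < this.data.Length)
            {
                b = this.data[this.position++];
                if (b == 0xFF)
                {
                    int next = this.position < this.data.Length ? this.data[this.position++] : 0;
                    if (next != 0)
                    {
                        this.Marker = (byte)next; // a real marker, not a stuffed 0xFF00
                        b = 0;
                    }
                }
            }

            this.buffer |= (uint)b << (24 - this.count);
            this.count += 8;
        }
    }
}
```

Because the reader fills ahead, `Marker` may be set before the decoder has consumed every bit that precedes the marker. A caller would typically finish the current restart interval, check `Marker`, and then call `Reset()` before decoding the next interval.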

There are established practices that we should definitely be trying to adapt from MozJpeg; I just haven't managed to get them working with restart markers.
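For the lookup-table idea, a minimal sketch of a two-level Huffman decode is shown below. It is not the MozJpeg or ImageSharp implementation: `HuffmanLookup`, `FastBits` and the `peek16` parameter are hypothetical, and the 9-bit lookahead width is an assumption. Codes of up to 9 bits resolve with a single array read. Longer codes fall back to a per-length comparison against the canonical code ranges.

```csharp
using System;

// Hypothetical sketch: type and member names are illustrative, not ImageSharp's API.
public sealed class HuffmanLookup
{
    private const int FastBits = 9; // lookahead width; an assumed tuning value
    private readonly short[] fast = new short[1 << FastBits]; // (length << 8) | symbol, or -1
    private readonly int[] maxCode = new int[17]; // exclusive upper bound per length, left-aligned to 16 bits
    private readonly int[] delta = new int[17];   // first value index minus first code, per length
    private readonly byte[] values;

    // counts[i] is the number of codes of length i + 1, as stored in a JPEG DHT segment.
    public HuffmanLookup(byte[] counts, byte[] values)
    {
        this.values = values;
        Array.Fill(this.fast, (short)-1);
        int code = 0, index = 0;
        for (int len = 1; len <= 16; len++)
        {
            this.delta[len] = index - code;
            for (int i = 0; i < counts[len - 1]; i++, code++, index++)
            {
                if (len > FastBits) continue;

                // Every FastBits-wide pattern that starts with this code maps to it.
                int first = code << (FastBits - len);
                int span = 1 << (FastBits - len);
                for (int j = 0; j < span; j++)
                {
                    this.fast[first + j] = (short)((len << 8) | values[index]);
                }
            }

            this.maxCode[len] = code << (16 - len);
            code <<= 1;
        }
    }

    // peek16 holds the next 16 bits of the stream. Returns the decoded symbol;
    // length receives the number of bits the caller should consume.
    public int Decode(int peek16, out int length)
    {
        short entry = this.fast[peek16 >> (16 - FastBits)];
        if (entry >= 0)
        {
            length = entry >> 8;
            return entry & 0xFF;
        }

        // Slow path: the code is longer than FastBits, so test each remaining length.
        length = FastBits + 1;
        while (length <= 16 && peek16 >= this.maxCode[length]) length++;
        if (length > 16) throw new InvalidOperationException("Invalid Huffman code.");
        return this.values[(peek16 >> (16 - length)) + this.delta[length]];
    }
}
```

The table costs 1 KB per Huffman table at this width. A wider lookahead raises the hit rate at the expense of cache footprint, so the right value would need to be measured against the benchmarks.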

Performance profile for running all the JpegProfilingBenchmarks.DecodeJpeg_PdfJs() (baseline) cases together (some AggressiveInlining were removed to get more information about the calls, but not all):

[image: profiler screenshot]

@antonfirsov @hypeartist @saucecontrol

So I revisited this problem this morning and took another look at porting the huffman decoder from StbSharp.

Check out the ScanDecoder.cs class in the new-jpeg-scan-decoder branch.

I'm actually getting somewhere!

I'm working on baseline currently, with 6/10 tests passing with spectral accuracy. I think the failing tests are due to me not reading the correct byte following a marker (I could be wrong though).

I could really do with another pair of eyes on the problem as I think once we have baseline ported, progressive will follow swiftly. It's definitely worth it imo as without any additional optimisation the port is already yielding healthy performance improvements (PdfJs Port).

BenchmarkDotNet=v0.10.14, OS=Windows 10.0.17134
Intel Core i7-6600U CPU 2.60GHz (Skylake), 1 CPU, 4 logical and 2 physical cores
Frequency=2742192 Hz, Resolution=364.6718 ns, Timer=TSC
.NET Core SDK=2.1.300
  [Host]     : .NET Core 2.0.7 (CoreCLR 4.6.26328.01, CoreFX 4.6.26403.03), 64bit RyuJIT
  Job-JQBLQX : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3110.0
  Job-UIHOCS : .NET Core 2.0.7 (CoreCLR 4.6.26328.01, CoreFX 4.6.26403.03), 64bit RyuJIT

LaunchCount=1  TargetCount=3  WarmupCount=3

                           Method | Runtime |                    TestImage |      Mean |      Error |    StdDev | Scaled | ScaledSD |    Gen 0 | Allocated |
--------------------------------- |-------- |----------------------------- |----------:|-----------:|----------:|-------:|---------:|---------:|----------:|
   'Decode Jpeg - System.Drawing' |     Clr |  Jpg/baseline/Calliphora.jpg |  7.280 ms |   2.821 ms | 0.1594 ms |   1.00 |     0.00 | 117.1875 | 254.47 KB |
       'Decode Jpeg - ImageSharp' |     Clr |  Jpg/baseline/Calliphora.jpg | 36.378 ms |  12.269 ms | 0.6932 ms |   5.00 |     0.12 |        - |  52.63 KB |
 'Decode Jpeg - ImageSharp PdfJs' |     Clr |  Jpg/baseline/Calliphora.jpg | 28.817 ms |  33.441 ms | 1.8895 ms |   3.96 |     0.22 |        - |  25.25 KB |
                                  |         |                              |           |            |           |        |          |          |           |
   'Decode Jpeg - System.Drawing' |    Core |  Jpg/baseline/Calliphora.jpg |  8.807 ms |  13.427 ms | 0.7587 ms |   1.00 |     0.00 | 117.1875 | 254.11 KB |
       'Decode Jpeg - ImageSharp' |    Core |  Jpg/baseline/Calliphora.jpg | 37.305 ms |  12.295 ms | 0.6947 ms |   4.26 |     0.31 |        - |  47.73 KB |
 'Decode Jpeg - ImageSharp PdfJs' |    Core |  Jpg/baseline/Calliphora.jpg | 29.534 ms |  19.468 ms | 1.1000 ms |   3.37 |     0.26 |        - |   21.5 KB |
                                  |         |                              |           |            |           |        |          |          |           |
   'Decode Jpeg - System.Drawing' |     Clr | Jpg/baseline/jpeg420exif.jpg | 18.796 ms |  12.260 ms | 0.6927 ms |   1.00 |     0.00 | 343.7500 | 757.89 KB |
       'Decode Jpeg - ImageSharp' |     Clr | Jpg/baseline/jpeg420exif.jpg | 88.237 ms |  20.475 ms | 1.1569 ms |   4.70 |     0.15 | 250.0000 | 564.65 KB |
 'Decode Jpeg - ImageSharp PdfJs' |     Clr | Jpg/baseline/jpeg420exif.jpg | 61.836 ms |  15.687 ms | 0.8863 ms |   3.29 |     0.10 | 250.0000 | 535.01 KB |
                                  |         |                              |           |            |           |        |          |          |           |
   'Decode Jpeg - System.Drawing' |    Core | Jpg/baseline/jpeg420exif.jpg | 19.141 ms |  16.113 ms | 0.9104 ms |   1.00 |     0.00 | 343.7500 | 757.04 KB |
       'Decode Jpeg - ImageSharp' |    Core | Jpg/baseline/jpeg420exif.jpg | 94.172 ms | 130.098 ms | 7.3508 ms |   4.93 |     0.37 | 250.0000 | 548.71 KB |
 'Decode Jpeg - ImageSharp PdfJs' |    Core | Jpg/baseline/jpeg420exif.jpg | 64.507 ms |  37.116 ms | 2.0971 ms |   3.38 |     0.16 | 250.0000 | 522.28 KB |

9/10 working now. Only MultiScanBaselineCMYK.jpg to go.

Got both baseline and progressive working!! Latest changes pushed to the branch, will cleanup code asap.
