Rawtherapee: pink color cast when using input profile when rawtherapee is compiled with gcc-10 and -O3

Created on 10 May 2020  路  40Comments  路  Source: Beep6581/RawTherapee

Hello,

Here are 3 screenshots: https://cloud.xwing.info/s/zBMSRnrL2dXnAkx
I think it will speak by itself:

  • _rawtherapee-O2-input-profile.png_: using an input profile with rawtherapee build with gcc-10 and -O2 -> it鈥檚 fine
  • _rawtherapee-O3-input-profile.png_: using an input profile with rawtherapee build with gcc-10 and -O3 -> bogus pink color cast
  • _rawtherapee-O3-no-input-profile.png_: using no input profile with rawtherapee build with gcc-10 and -O3 -> colors are a bit off since there is no input profile, but it鈥檚 OK

I can reproduce this as soon as input profile in enabled, with any choice (either the embedded one, the one provide by RT, or any custom DCP), and with any raw file I could test (I tested Canon 400D, 5D, Fuji X20, X100s and X-T30).

I have not yet tested an old version that may work to bisect this, I just started using gcc-10 last week, so this is a new discover for me. This may hide some bug as gcc-10 seems a bit sensitive than gcc-9 when dealing with optimisations.

Version: 5.8-402-gb5af6eec6
Branch: dev
Commit: b5af6eec6
Commit date: 2020-05-09
Compiler: gcc 10.1.0
Processor: Intel(R)\ Core(TM)\ i5-4690\ CPU\ @\ 3.50GHz
System: Linux
Bit depth: 64 bits
Gtkmm: V3.24.2
Lensfun: V0.3.2.0
Build type: Gentoo
Build flags: -march=native -O3 -mtune=native -pipe -std=c++11 -Werror=unused-label -Werror=delete-incomplete -fno-math-errno -flto -Wall -Wuninitialized -Wcast-qual -Wno-deprecated-declarations -Wno-unused-result -Wunused-macros -fopenmp -Werror=unknown-pragmas -ftree-vectorize
Link flags: -Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed -Wl,--sort-common -flto
OpenMP support: yes
MMAP support: ON
Build OS: Linux 5.6.11-gentoo x86_64
Build date: Sat, 09 May 2020 18:43:34 +0000 UTC
Build epoch: 1589049814
Build UUID: 9ae996ea-2840-4d76-8f25-1be0cb097757

compilation

Most helpful comment

As gcc 10.2 is announced for this week, I enabled -ftree-loop-vectorize for gcc versions != 10.1

All 40 comments

Windows 10 : same problem with GCC 10. I discovered the problem yesterday. You made more investigation than me.
Going back to 9.3 solves the problem.
So either it is a bug inside GCC or some GCC utilities, or a latent bug inside RT revealed by GCC10 optimisationas you wrote.
I am going to build 5.8 with GCC10 on windows.

Same problem building ART https://bitbucket.org/agriggio/art/issues/82/red-color-cast-on-all-raw-files-with

I will update to gcc 10 now and have a look...

@heckflosse My findings are, that when you dump camMatrix here the middle line is off with GCC 10.

@Floessie I will try to bisect the switches set by -O3. Maybe that helps to find the culprit:

    Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the following optimization flags:

    -fgcse-after-reload 
    -fipa-cp-clone
    -floop-interchange 
    -floop-unroll-and-jam 
    -fpeel-loops 
    -fpredictive-commoning 
    -fsplit-loops 
    -fsplit-paths 
    -ftree-loop-distribution 
    -ftree-loop-vectorize 
    -ftree-partial-pre 
    -ftree-slp-vectorize 
    -funswitch-loops 
    -fvect-cost-model 
    -fvect-cost-model=dynamic 
    -fversion-loops-for-strides

If you printf Thumbnail::cam2xyz in Thumbnail::init() it is correct again. Typical heisenbug.

@Floessie Do your thumbnails look correct? Here they do not, at least not all...

I built on MSYS2 using GCC 10 on my x64 machine with additional warnings enabled. These warnings stand out, but I have no idea if they may be relevant.

C:/msys64/home/Roel/rtdev/rtengine/dcraw.cc: In member function 'void DCraw::identify()':
C:/msys64/home/Roel/rtdev/rtengine/dcraw.cc:9465:3: warning: missing initializer for member 'DCraw::identify()::<unnamed struct>::offset' [-Wmissing-field-initializers]
 9465 |   };
      |   ^

C:/msys64/home/Roel/rtdev/rtgui/options.cc: In member function 'void Options::setDefaults()':
C:/msys64/home/Roel/rtdev/rtgui/options.cc:365:30: warning: missing initializer for member 'SHELLFLAGSTATE::fShowExtensions' [-Wmissing-field-initializers]
  365 |     SHELLFLAGSTATE sft = { 0 };
      |                              ^
(and for a bunch of other members)

C:/msys64/home/Roel/rtdev/rtgui/rtimage.cc: In copy constructor 'RTImage::RTImage(RTImage&)':
C:/msys64/home/Roel/rtdev/rtgui/rtimage.cc:41:1: warning: base class 'struct sigc::trackable' should be explicitly initialized in the copy constructor [-Wextra]
   41 | RTImage::RTImage (RTImage &other) : surface(other.surface), pixbuf(other.pixbuf)
      | ^~~~~~~
C:/msys64/home/Roel/rtdev/rtgui/rtimage.cc:41:1: warning: base class 'class Glib::ObjectBase' should be explicitly initialized in the copy constructor [-Wextra]
C:/msys64/home/Roel/rtdev/rtgui/rtimage.cc:41:1: warning: base class 'class Gtk::Image' should be explicitly initialized in the copy constructor [-Wextra]

There are also some deprecated GTK function calls in main.cc, thumbbrowserbase.cc and rtwindow.cc but I think these should still be harmless.

Edit: my thumbnails are also broken. It only seems to be an issue for raw files. Tiffs and jpegs are unaffected.

@Thanatomanic

my thumbnails are also broken. It only seems to be an issue for raw files. Tiffs and jpegs are unaffected.

not all of my raw thumbs are broken. Will check, if there is a pattern...

@heckflosse I saw a few good ones too, except, after clearing the cache, they all went pink as well. So there must be a general problem for raws.

I can confirm @casta's initial report:

I can reproduce this as soon as input profile in enabled, with any choice (either the embedded one, the one provide by RT, or any custom DCP), and with any raw file I could test (I tested Canon 400D, 5D, Fuji X20, X100s and X-T30).

So this seems to be related to either dcraw, camconst.json handling or dcp processing in general?

This is supported by the fact that the White Balance module seems to be well-behaved. Picking a grey value from a shot of a ColorChecker card doesn't cause any wrong temperature or tint values.

@heckflosse

Do your thumbnails look correct? Here they do not, at least not all...

They all look wrong.

I've been playing around with my i386 Qemu VM:

  • -cpu pentium3: :heavy_check_mark:
  • -cpu Nehalem: :heavy_check_mark:
  • -cpu SandyBridge: :x:
  • -cpu SandyBridge,-avx: :heavy_check_mark:

Further tests with AVX CPU:

  • -mno-avx: :x:
  • -mprefer-avx128: :x:
  • -mprefer-vector-width=128: :x:

I checked the -mno-avx disassembly: no AVX instructions, indeed. But the build was done with -DPROC_TARGET_NUMBER=2 (-march=native). Without -DPROC_TARGET_NUMBER=2:

  • -mno-avx: :heavy_check_mark:

Here's the i386 disassembly of Thumbnail::init() with -march=native -mno-avx on a CPU without AVX:

init-no-avx.txt

And the same on a CPU with AVX:

init-avx.txt

What else shall I test?

Thanks for working on this. I'm also getting a pink/red color cast when I compile it on Windows 10. It doesn't matter if it's dev or releases.

Version: 5.8-430-g5db57952e
Branch: dev
Commit: 5db57952e
Commit date: 2020-05-13
Compiler: gcc 10.1.0
Processor: undefined
System: Windows
Bit depth: 64 bits
Gtkmm: V3.24.2
Lensfun: V0.3.2.0
Build type: release
Build flags: -std=c++11 -march=native -Werror=unused-label -Werror=delete-incomplete -fno-math-errno -Wall -Wuninitialized -Wcast-qual -Wno-deprecated-declarations -Wno-unused-result -Wunused-macros -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG -ftree-vectorize
Link flags: -march=native
OpenMP support: ON
MMAP support: ON
Build OS: MINGW64_NT-10.0-18363 3.0.7-338.x86_64 x86_64
Build date: Wed, 13 May 2020 14:08:23 +0000 UTC
Build epoch: 1589378903
Build UUID: b8d31e1f-7737-47d9-adf5-b9bdde5d153f

Bisecting showed that it's caused by -ftree-loop-vectorize. Imho it's a bug in gcc 10. I reverted to gcc 9.3...

Heisenbug all right. If you make the following change and select 'Camera standard' as profile, the colors are normal (on my Windows machine at least).

```diff
diff --git a/rtengine/rawimagesource.cc b/rtengine/rawimagesource.cc
index f94070f2d..74e66b9c9 100644
--- a/rtengine/rawimagesource.cc
+++ b/rtengine/rawimagesource.cc
@@ -1142,6 +1142,7 @@ int RawImageSource::load (const Glib::ustring &fname, bool firstFrameOnly)
for (int j = 0; j < 3; j++)
for (int k = 0; k < 3; k++) {
imatrices.xyz_cam[i][j] += xyz_sRGB[i][k] * imatrices.rgb_cam[k][j];

  • printf("%g %g %g\n",imatrices.xyz_cam[i][j],xyz_sRGB[i][k],imatrices.rgb_cam[k][j]);
    }
 camProfile = ICCStore::getInstance()->createFromMatrix (imatrices.xyz_cam, false, "Camera");

Interestingly enough, again xyz_sRGB is involved, which is a constexpr double[][]. I already converted it to a classic static const double[][] but that didn't do the trick.

Edit: It's basically the same computation as in Thumbnail::init(), isn't it?

@heckflosse Confirmed with -fno-tree-loop-vectorize.

There seem to have been only a few commits to the GCC code directly related to the tree vectorization since GCC9.3 was released in the beginning of March:
https://github.com/gcc-mirror/gcc/commits/master/gcc/tree-vectorizer.c
https://github.com/gcc-mirror/gcc/commits/master/gcc/tree-vectorizer.h

No idea if this helps us finding a possible bug in our own code, or a place where we do not conform the specs somehow.

Quick question. How does one downgrade to to GCC 9 on Windows 10 msys64? I also want to add it to the ignore packages list for now.

I can't seem to figure out how to do it. I've tried a lot of pacman commands that are similar to Archlinux commands, but can't seem to get it to work.

^ Thanks a lot!

To prevent this issue from causing trouble for unfamiliar users, @agriggio put in some safeguards in CMakeListst.txt in ART, which I took the liberty of using for us as well

Just for information: Processing time of for example Capture Sharpening a 26 MP file on my machine increased from 670 to 1130 ms when using -fno-tree-loop-vectorize

IMHO there's no other way than reducing the testcase and to open an issue on the GCC bugzilla. As I already had that pleasure (and I mean it: they were very friendly and helpful) maybe someone else volunteers this time. I'll be mostly AFK next week.

@Floessie I will try that

@heckflosse Creating the bugzilla account takes some time as it's moderated. Let's hope this time it's really a compiler bug...

@Floessie

Let's hope this time it's really a compiler bug...

If I change

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            for (int k = 0; k < 3; k++) {
                imatrices.xyz_cam[i][j] += xyz_sRGB[i][k] * imatrices.rgb_cam[k][j];
            }

to

    for (int k = 0; k < 3; k++)
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++) {
                imatrices.xyz_cam[i][j] += xyz_sRGB[i][k] * imatrices.rgb_cam[k][j];
            }

which should do exactly the same, it works fine.

Edit: I just detected that it works fine for some files, but not for all...

This may be relevant https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93334
@heckflosse The code just before the triple loop has a memset to zero the array.

Just for curiosity I tried this variant (without memset, should do exactly the same as the current code) and get the same color cast

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            imatrices.xyz_cam[i][j] = xyz_sRGB[i][0] * imatrices.rgb_cam[0][j] + xyz_sRGB[i][1] * imatrices.rgb_cam[1][j] + xyz_sRGB[i][2] * imatrices.rgb_cam[2][j];
        }
    }

Here's a simple example code to reproduce the bug.

#include <iostream>

constexpr double xyz_sRGB[3][3] = {
    {0.4360747,  0.3850649, 0.1430804},
    {0.2225045,  0.7168786,  0.0606169},
    {0.0139322,  0.0971045,  0.7141733}
};

int main() {
    double rgb_cam[3][3] = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}, {7.0, 8.0, 9.0}};
    double xyz_cam[3][3] = {{0.0, 0.0, 0.0}, {0.0, 0.0, 0.0}, {0.0, 0.0, 0.0}};


    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            for (int k = 0; k < 3; k++) {
                xyz_cam[i][j] += xyz_sRGB[i][k] * rgb_cam[k][j];
            }

        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++) {
                std::cout << xyz_cam[i][j] << std::endl;
            } 
return 0;
}

works fine using gcc 9.3 -O3, gcc10.1 -O2, but not using gcc10.1 -O3

I removed the RT bug label, because it's not a bug in RT code and we already have a fix (though slowing down processing) for gcc 10.1 builds.

Richard fixed it, but godbolt hasn't yet caught up.

I don't completely understand their commit message.

This fixes the case where we try to fold a read from an
array initalizer and happen to cross the boundary of
multiple CTORs which isn't really supported. For the
interesting cases like the testcase we actually handle
the folding by encoding the whole initializer.

Where do the multiple CTORs come from? Are we doing something that "isn't really supported"?

but godbolt hasn't yet caught up.

Now it has

Are we doing something that "isn't really supported

My guess is that they were doing something which isn't supported by the vectorizer, and also that they wouldn't have fixed the bug so quickly if we did something unsupported.

@Thanatomanic

Do not recurse to folding a CTOR that does not fully cover the asked for object.

Seems to be something along the line like filling 9 values into three buckets of four values (in our example) or six values into two buckets (in Richard's example).

As gcc 10.2 is announced for this week, I enabled -ftree-loop-vectorize for gcc versions != 10.1

I built with gcc 10.2 on msys2. Works fine. Closing as fixed \o/

Was this page helpful?
0 / 5 - 0 ratings

Related issues

heckflosse picture heckflosse  路  5Comments

Floessie picture Floessie  路  5Comments

heckflosse picture heckflosse  路  5Comments

jade-nl picture jade-nl  路  4Comments

elecprog picture elecprog  路  5Comments