I get a SIGBUS error as soon as I enable the Wavelet tool on the following image:
http://rawtherapee.com/shared/test_images/hdrmerge_045.dng
when using the attached pp3 as a starting point.
I'm including also the output of bt full. This is on a Debug build of the current dev (git changeset id: 81917406d5968aa2ab0c40d3856a874d627d832b). AboutThisBuild reports:
Compiler: gcc-6 6.2.0
Processor: x86_64
System: Linux
Bit depth: 64 bits
Gtkmm: V3.18.0
Build type: Debug
Build flags: -fdiagnostics-color=always -std=c++11 -march=native -Werror=unused-label -g
Link flags: -march=native
OpenMP support: OFF
MMAP support: ON
Works fine here
Version: 5.0-r1-gtk3-299-g8191740
Branch: dev
Commit: 8191740
Commit date: 2017-03-27
Compiler: gcc 5.3.0
Processor: undefined
System: Windows
Bit depth: 64 bits
Gtkmm: V3.18.0
Build type: Release
Build flags: -std=c++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -O3 -DNDEBUG
Link flags: -march=native
OpenMP support: ON
MMAP support: ON
Can't reproduce
Version: 5.0-r1-gtk3-299-g81917406
Branch: dev
Commit: 81917406
Commit date: 2017-03-27
Compiler: gcc 6.3.0
Processor: undefined
System: Windows
Bit depth: 64 bits
Gtkmm: V3.22.0
Build type: release
Build flags: -mwin32 -m64 -mthreads -msse2 -std=c++11 -march=native -Werror=unused-label -fopenmp -Werror=unknown-pragmas -mwindows -Wno-aggressive-loop-optimizations -DNDEBUG -O3
Link flags: -m64 -mthreads -static-libgcc -march=native -mwindows -s -O3
OpenMP support: ON
MMAP support: ON
@heckflosse @TooWaBoo thanks for testing. I see you are both on windows, so it's not so strange it can't be reproduced. It's probably a memory corruption (perhaps an OOB access?) where exactly it gets triggered can change on different platforms. I'll also add that on a Release build I had to work harder to get the crash, and didn't manage to do so in a reliable/reproducible manner. But with my current debug build, it happens consistently
@agriggio I will run it through valgrind
No crash here on Linux.
chprovBuffer = <error reading variable chprovBuffer (Cannot access memory at address 0x80000cc662c0)>
Strange address.
==9289== Thread 7:
==9289== Use of uninitialised value of size 8
==9289== at 0xE42858: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) [clone ._omp_fn.0] (ipwavelet.cc:1084)
==9289== by 0x93679CE: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0xE35F7B: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) (ipwavelet.cc:669)
==9289== by 0xCF2941: rtengine::ImProcCoordinator::updatePreviewImage(int, rtengine::Crop*) (improccoordinator.cc:687)
==9289== by 0xCF5E30: rtengine::ImProcCoordinator::process() (improccoordinator.cc:1309)
==9289== by 0xCF9400: sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>::operator()() const (in /home/ingo/rt5/build2/debug/rawtherapee)
==9289== by 0xCF912D: sigc::adaptor_functor<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator> >::operator()() const (adaptor_trait.h:251)
==9289== by 0xCF8D10: sigc::internal::slot_call0<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103)
==9289== by 0x5CD7A4C: ??? (in /usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1.3.0)
==9289== by 0x59FBF04: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4002.0)
==9289== by 0x979D181: start_thread (pthread_create.c:312)
==9289== by 0x9AAD47C: clone (clone.S:111)
==9289==
==9289== Use of uninitialised value of size 8
==9289== at 0x9367990: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0x43370300423E38FF: ???
==9289== by 0x93679CE: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0xE35F7B: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) (ipwavelet.cc:669)
==9289== by 0xCF2941: rtengine::ImProcCoordinator::updatePreviewImage(int, rtengine::Crop*) (improccoordinator.cc:687)
==9289== by 0xCF5E30: rtengine::ImProcCoordinator::process() (improccoordinator.cc:1309)
==9289== by 0xCF9400: sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>::operator()() const (in /home/ingo/rt5/build2/debug/rawtherapee)
==9289== by 0xCF912D: sigc::adaptor_functor<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator> >::operator()() const (adaptor_trait.h:251)
==9289== by 0xCF8D10: sigc::internal::slot_call0<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103)
==9289== by 0x5CD7A4C: ??? (in /usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1.3.0)
==9289== by 0x59FBF04: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4002.0)
==9289== by 0x979D181: start_thread (pthread_create.c:312)
==9289==
==9289== Use of uninitialised value of size 8
==9289== at 0xE42858: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) [clone ._omp_fn.0] (ipwavelet.cc:1084)
==9289== by 0x93679CE: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0xE35F7B: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) (ipwavelet.cc:669)
==9289== by 0xBF2136: rtengine::Crop::update(int) (dcrop.cc:911)
==9289== by 0xCF3575: rtengine::ImProcCoordinator::updatePreviewImage(int, rtengine::Crop*) (improccoordinator.cc:791)
==9289== by 0xCF5E30: rtengine::ImProcCoordinator::process() (improccoordinator.cc:1309)
==9289== by 0xCF9400: sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>::operator()() const (in /home/ingo/rt5/build2/debug/rawtherapee)
==9289== by 0xCF912D: sigc::adaptor_functor<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator> >::operator()() const (adaptor_trait.h:251)
==9289== by 0xCF8D10: sigc::internal::slot_call0<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103)
==9289== by 0x5CD7A4C: ??? (in /usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1.3.0)
==9289== by 0x59FBF04: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4002.0)
==9289== by 0x979D181: start_thread (pthread_create.c:312)
==9289==
==9289== Use of uninitialised value of size 8
==9289== at 0x9367990: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0x43739980437DE73F: ???
==9289== by 0x93679CE: GOMP_parallel (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==9289== by 0xE35F7B: rtengine::ImProcFunctions::ip_wavelet(rtengine::LabImage*, rtengine::LabImage*, int, rtengine::procparams::WaveletParams const&, rtengine::WavCurve const&, rtengine::WavOpacityCurveRG const&, rtengine::WavOpacityCurveBY const&, rtengine::WavOpacityCurveW const&, rtengine::WavOpacityCurveWL const&, LUT<float>&, bool, int) (ipwavelet.cc:669)
==9289== by 0xBF2136: rtengine::Crop::update(int) (dcrop.cc:911)
==9289== by 0xCF3575: rtengine::ImProcCoordinator::updatePreviewImage(int, rtengine::Crop*) (improccoordinator.cc:791)
==9289== by 0xCF5E30: rtengine::ImProcCoordinator::process() (improccoordinator.cc:1309)
==9289== by 0xCF9400: sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>::operator()() const (in /home/ingo/rt5/build2/debug/rawtherapee)
==9289== by 0xCF912D: sigc::adaptor_functor<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator> >::operator()() const (adaptor_trait.h:251)
==9289== by 0xCF8D10: sigc::internal::slot_call0<sigc::bound_mem_functor0<void, rtengine::ImProcCoordinator>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103)
==9289== by 0x5CD7A4C: ??? (in /usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1.3.0)
==9289== by 0x59FBF04: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4002.0)
I don't understand how these variables can be uninitialised. I'll make a new valgrind run with --track-origins=yes
@heckflosse did you try disabling openmp? Perhaps the debugging becomes a bit easier with less things going on in parallel... (it usually is for me at least)
Would a Coverity build help? I could make a new one, but I got the feeling that the last few weren't used...
I wasn't aware we could use Coverity tools -- I'd certainly take a look (time permitting).
I'll also try valgrind on my debug build.
when running through valgrind I could not reproduce it... :frowning_face:
This is one of the inexplicable errors that we often attribute to the compiler. Believe it or not, but this fixes it:
diff --git a/rtengine/ipwavelet.cc b/rtengine/ipwavelet.cc
index 882dd8cd..4bb559b4 100644
--- a/rtengine/ipwavelet.cc
+++ b/rtengine/ipwavelet.cc
@@ -1045,8 +1045,8 @@ SSEFUNCTION void ImProcFunctions::ip_wavelet(LabImage * lab, LabImage * dst, int
if(numtiles > 1 || (numtiles == 1 /*&& cp.avoi*/)) {//in all case since I add contrast curve
//calculate mask for feathering output tile overlaps
- float Vmask[height + overlap] ALIGNED16;
- float Hmask[width + overlap] ALIGNED16;
+ std::vector<float> Vmask(height + overlap);
+ std::vector<float> Hmask(width + overlap);
if(numtiles > 1) {
for (int i = 0; i < height; i++) {
@heckflosse Removing ALIGNED16 alone didn't help. Maybe you come up with a better solution. My guess is that there's something wrong with the stack frame calculation. We should have a look at the assembly. wavclCurve is still intact right before the if, and it's broken after the float arrays on the stack. No wonder that Alberto got a SIGBUS.
Debugging it caused me quite a headache. Valgrind is grindingly slow.
HTH,
Flössie
great job @Floessie! I wonder whether the c++11 aligned memory facilities can be of any help here in case they are needed for performance?
@agriggio My understanding is, that they are based on the same mechanism as ALIGNED16 under the hood. Actually, I don't quite understand, why ALIGNED16 should be needed in this place other than to enable better SSE code by the compiler.
Nevertheless, omitting ALIGNED16 didn't help, so it's not really a matter of the attribute.
@Floessie Maybe we ran simply ran out of stack space?
@heckflosse Is it 2MB on Linux? Well, I tested with -fstack-protector-all but that didn't find anything, so I guess, we can rule it out...
@Floessie @agriggio
We don't need the ALIGNED16 in this place, but it also should not hurt to enable better SSE code by the compiler.
@heckflosse any reason to use stack based arrays here? is there really a measurable performance difference in this context?
@agriggio No, not in this context. Feel free to change it :+1:
So, this should give you 16-byte aligned arrays allocated on the heap:
std::unique_ptr<char[]> Vmask_mem(new char[sizeof(float)/sizeof(char) * (height + overlap) + (16-1)]);
std::unique_ptr<char[]> Hmask_mem(new char[sizeof(float)/sizeof(char) * (width + overlap) + (16-1)]);
float *Vmask = reinterpret_cast<float *>((reinterpret_cast<uintptr_t>(Vmask_mem.get()) + 15) & ~uintptr_t(0x0F));
float *Hmask = reinterpret_cast<float *>((reinterpret_cast<uintptr_t>(Hmask_mem.get()) + 15) & ~uintptr_t(0x0F));
(see e.g. https://stackoverflow.com/questions/227897/how-to-allocate-aligned-memory-only-using-the-standard-library)
However, I'd take @Floessie's patch, unless there is a measurable performance difference.
Note also that there are other parts of the wavelet code that use stack-allocated arrays, which might be prone to the same failures (maybe... I haven't really checked, just pointing this out).
@agriggio It's much simpler with std::align().
Note also that there are other parts of the wavelet code that use stack-allocated arrays, which might be prone to the same failures (maybe... I haven't really checked, just pointing this out).
As you're the only one who can see the SIGBUS in gdb, why not dig a bit deeper? Set a breakpoint on the allocation line and print wavclCurve, then step after the allocation and print it again. See if &wavclCurve changed in between.
@heckflosse I also tried ASAN and gdb, but can't see it there. Pretty mysterious. For my tests I #undefed __SSE2__, _RT_NESTED_OPENMP, and _OPENMP in ipwavelet.cc, but still there can be other threads. wavclCurve isn't changed concurrently and should be passed as const LUTf&, btw. wavcontlutili isn't used at all and there are unused instances (LUTs e.g.) on the path to ip_wavelet() that should be cleaned. Ever tried a -Wall build? It's horrific.
HTH,
Flössie
@Floessie thanks for the std::align pointer, but I'm actually not sure how to use it correctly :-) Do you have an example?
Regarding debugging, I can try to do that, but don't hold your breath waiting for me, these are busy days... :-/
@agriggio Scratch that. Search for AlignedBuffer in the code. :smile:
@Floessie I will try to make ipwavelet.cc -Wall-clean
@Floessie :+1:, this should do then:
AlignedBuffer<float> Vmask_buf(height + overlap);
AlignedBuffer<float> Hmask_buf(width + overlap);
float *Vmask = Vmask_buf.data;
float *Hmask = Hmask_buf.data;
@agriggio AFAIK, AlignedBuffer has an operator [](), so it's just
AlignedBuffer<float> Vmask(height + overlap);
AlignedBuffer<float> Hmask(width + overlap);
But again: The alignment is not the culprit here. And another thought: How is the compiler supposed to know, that the buffer is well aligned for SSE? Whenever we use an AlignedBuffer it's because we use it explicitly for SSE code...
@heckflosse
I will try to make ipwavelet.cc
-Wall-clean
:+1:
For what it's worth: My -Wall comment was targeted at the codebase as a whole. Just try it: You can spend a whole day just reading all those warnings. :wink:
Maybe we ran simply ran out of stack space?
Does anyone know, how to check the stack space in GDB? I'd rather like to know than to rely on -fstack-protector...
The alignment is not the culprit here.
hi @Flossie, that's exactly my point :-) I was trying to preserve the
previous alignment and only move the array to the heap... I have no idea
though whether this gives any measurable improvement, and that's why I was
suggesting to take your initial solution with std::vector. but
alignedbuffer is equally concise, so both are fine. just please get rid of
that stack allocation, because it leads to troubles, at least on my machine
with my compiler...
oh btw on Linux the stack size is not fixed at compile time, you can change
it with ulimit. I'll try to see if it makes a difference, just to confirm
that it is the culprit
I checked, but it doesn't. E.g:
../rtengine/ipwavelet.cc:1056:34: error: no match for ‘operator[]’ (operand types are ‘AlignedBuffer<float>’ and ‘int’)
Vmask[i] = 1;
We could easily add one though...
Also, enlarging the stack doesn't help :-(
I'm now replacing all stack-allocated arrays in ipwavelet.cc with vectors
(or AlignedBuffers), I'll let you know how that goes.
I've added a new branch ipwavelet-heap-buffers. It seems to work here. Please have a look at the code whenever you can, thanks!
Code-wise it's fine. But I have the gut feeling that it's not supportive for performance. @heckflosse?
We still don't know, if it's really a stack overflow, though.
Honestly, I'd be very surprised if this had a measurable performance impact. The ipwavelet code is already using a bunch of other heap-allocated arrays anyway... But I agree to hear @heckflosse 's take on this!
Some tips for debugging:
diff --git a/rtengine/ipwavelet.cc b/rtengine/ipwavelet.cc
index 882dd8cd..94f2c1ec 100644
--- a/rtengine/ipwavelet.cc
+++ b/rtengine/ipwavelet.cc
@@ -55,6 +55,10 @@
#define epsilon 0.001f/(TS*TS) //tolerance
+#undef __SSE2__
+#undef _RT_NESTED_OPENMP
+#undef _OPENMP
+
namespace rtengine
{
@@ -1048,6 +1052,10 @@ SSEFUNCTION void ImProcFunctions::ip_wavelet(LabImage * lab, LabImage * dst, int
float Vmask[height + overlap] ALIGNED16;
float Hmask[width + overlap] ALIGNED16;
+ if(wavclCurve && cp.finena) {
+ printf("ok\n");
+ }
+
if(numtiles > 1) {
for (int i = 0; i < height; i++) {
Vmask[i] = 1;
break ipwavelet.cc:1051print &wavclCurvewavclCurve again. It should be different than before.wavclCurve: watch *0x55555cb7a6d0HTH,
Flössie
@agriggio @Floessie
We should avoid allocation inside of loops I guess. I commented here
There are a lot of places with this stack-allocations of line buffers.
If you like, I can fix them all (because I introduced them) ;-)
@agriggio @heckflosse Did you see, that the right side of the image is broken in 1:1 view?
@Floessie Good catch! It's broken even with neutral profile.
@Floessie @agriggio
I exported the file twice in queue as 16-bit tiff and checked for differences. There are no differences.
That means it is very likely (I would say >99.9%) that the broken part is not caused by processing in rt but is already in dng file. And as it's a float dng file and code to read that file checks for NaN we probably can exclude NaN in dng file as well.
@heckflosse :+1: I'll move the allocations
@Floessie what do you mean the image is "broken"? do you have a screenshot? I'm not seeing anything strange...
@agriggio

ah, ok. I had played a bit more with the pp3 in the meantime, enabling straightening among other things. So those pixels were cut out from my view...
@Floessie
For what it's worth: My -Wall comment was targeted at the codebase as a whole. Just try it: You can spend a whole day just reading all those warnings
Yes, it's really bad. I will start a task now with the goal to make one compilation unit per day -Wall-clean.
I will open a new Issue for that.
I am not the one
On Thu, Mar 30, 2017 at 7:41 AM, Ingo Weyrich notifications@github.com
wrote:
@Floessie https://github.com/Floessie
For what it's worth: My -Wall comment was targeted at the codebase as a
whole. Just try it: You can spend a whole day just reading all those
warningsYes, it's really bad. I will start a task now with the goal to make one
compilation unit per day -Wall-clean.
I will open a new Issue for that.—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/Beep6581/RawTherapee/issues/3785#issuecomment-290432253,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFC4Pb-OSJrXsPNSndnEtNmTeYrRcL1Tks5rq78QgaJpZM4MqNZl
.
@flossie Hi! No, you must be my swedish gal pal from the 70s. :grin:
Joking apart: Sorry you have to follow this thread now, but @agriggio mistakenly mentioned you, when he wanted to mention me. Just ignore the mails...
maybe try not to mention me, I got these emails many times
On Thu, Mar 30, 2017 at 1:03 PM, Floessie notifications@github.com wrote:
@flossie https://github.com/flossie Hi! No, you must be my swedish gal
pal from the 70s. 😁Joking apart: Sorry you have to follow this thread now, but @agriggio
https://github.com/agriggio mistakenly mentioned you, when he wanted to
mention me. Just ignore the mails...—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/Beep6581/RawTherapee/issues/3785#issuecomment-290528580,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AFC4PYoM9ArtPGqd6PoMLeuuRG7qlv1nks5rrAqdgaJpZM4MqNZl
.
@flossie hello, the emails have a link at the bottom to mute the thread.
oh I see, it's not my fault, and I need to try to do something?
and this is not the only thread, I was evolved in multiple threads before
@flossie sorry, it was a typo of mine while replying from the phone. Now you are marked as subscribed to the thread, and I have no way of removing you (or at least, I don't know how to do it). You have to do it yourself I'm afraid. And sorry for the inconvenience.
@agriggio You should mention @ghost now :grin:
Candidate fix in 14f544f
See also the last comment in #3789
Most helpful comment
@agriggio You should mention @ghost now :grin: