Rawtherapee: Fattal

Created on 4 Nov 2017  ·  111Comments  ·  Source: Beep6581/RawTherapee

Parameters

Regarding the slider names, I will quote from http://osp.wikidot.com/parameters-for-photographers

Alpha (Default 0.1)

The parameter alpha is a threshold: details whose derivative is less than alpha are amplified, those whose derivative is greater than alpha are decreased. In other words, decreasing alpha the quantity of details that are made more evident is decreased. Increasing alpha will make the details more evident, but the picture will look more unrealistic and noisy (because noise will be amplified as well). The authors said to have used the value of 0.1 in each test with best results.

Beta (Default 0.8)

The parameter beta expresses how much the algorithm will be effective. Setting beta=1, the algorithm will perform no operation on the HDRI, there will be only a linear shrink of the dynamics (i.e. you are just making a gamma correction with gamma=1). Decreasing beta you will increase the effectiveness of the algorithm, in other words you will increase the compression of the dynamics making the details (and noise) much more evident. Low values of beta will create a very bright image with clear details and an unrealistic look, choosing a beta close to 1 you will get more realistic effects. The authors said to have used values between 0.8 and 0.9 with good results.

If we were to rename the sliders, I would suggest "Threshold" and "Amount".

Good defaults for us seem to be alpha = 1 and beta = 0.8.

The code would need to be double-checked first to make sure that alpha and beta still do what the docs say they do. I presume the above was written based on the paper by Fattal, Lischinski and Werman, whereas our code was originally written by Grzegorz Krawczyk for pfstmo, then underwent changes by several generations of Luminance HDR developers, most recently by @fcomida

Gradient Domain High Dynamic Range Compression: http://www.cs.huji.ac.il/~danix/hdr/

Position in the Pipeline

The output of this tool seems to be normalized or balanced in some way. In it's current position in the pipeline this is a problem because it results in images which are generally too bright, and mitigating that using the Exposure or L*a*b* tools is not possible.

imgur-2017_11_04-14 26 33

Notice the highlight compression and curve:
imgur-2017_11_04-14 26 47

The only way I found to control the overexposure is by using the Wavelets > Final Touchup "After" contrast curve, but that really is not the way this should work.
imgur-2017_11_04-14 28 07

Samples

I uploaded some high dynamic range samples to help in testing:
http://rawtherapee.com/shared/test_images/great_bernera_1.hdr.dng
http://rawtherapee.com/shared/test_images/great_bernera_2.hdr.dng
http://rawtherapee.com/shared/test_images/karin_library_1.hdr.dng
http://rawtherapee.com/shared/test_images/karin_library_2.hdr.dng
http://rawtherapee.com/shared/test_images/soderasen_1.hdr.dng
http://rawtherapee.com/shared/test_images/soderasen_2.hdr.dng
http://rawtherapee.com/shared/test_images/skane_1.hdr.dng
http://rawtherapee.com/shared/test_images/london_bridge_moving_1-3.hdr.dng

enhancement

All 111 comments

@Beep6581 thanks for the docs! I've just pushed a changeset that moves Fattal earlier in the pipeline, before RGB processing and right after distortion/perspective/ca correction. Now the sliders and curves should behave as expected (if not, please let me know).

I agree about changing the names, but if we go for "Beta => Amount" I also suggest that in the GUI we expose 1/beta, so that increasing it will increase the amount... :slightly_smiling_face:

P.S: Fattal is by far the best tone-mapping operator that I have seen. Kudos to the LuminanceHDR guys!

@Beep6581 @agriggio Since we already use Strength here and there in other tools, could we use it instead of Amount ?

Since Amount (strength) and 1/beta does not refere to anything (other than what is clearly visible as being fill light, in photographic term), I don't think we need to annoy users with 1/beta in the label.

Btw, I'm ready to commit a patch for the GUI, I'm just waiting for the answer to my question.

@Hombre57 I was suggesting to implement amount/strength as 1/beta, not to show this label to users :-)
otherwise, I would find it very counterintuitive that to increase the amount you need to move the slider to the right...

sorry, meant "to the left". I have a problem distinguishing between left and right (seriously, I think this syndrome has even a name...)

I recently implemented global post-saturation and post-gamma. I'm open for
renaming parameters in GUI and changing their behavior (if a parameter is
named contrast, users intuitively expect increasing 'contrast' yields an
image with a stronger contrast, obviously).
Since LHDR is out there for many years there are users accustomed to the
old behavior though, probably with many pre-sets...

@Beep6581 @fcomida I used Amount (Beta) finally. No need to change anything in LHDR, however RT's documentation can explain that the most visible effect of pulling down Beta is to relighten dark areas, hence the _fill light_ terminology. We are accused to produce a software for scientists and engineers, not for photographers because of terminology, we have to explain things for several audience.

One more test image (not meant to be nice, but to test sunset) : https://filebin.net/4jflbkhd4yuybxf8

Do not put in test_images.

Cool to see the addition of this TMO to RT!

@agriggio Short notice about the crash caused by using fftw in more than one thread and the lock: Maybe we need to use the same lock we use in denoise also in fattal to avoid fftw being called in denoise while it is called in fattal or vice versa. Otherwise we may get crash situations which are very hard to detect...

Because of fftw usage in fattal I would like to suggest this tests before merging fattal into dev:

1) Apply fattal to all files of a folder. Close rt and start rt. Should not crash.

2) Put all the fattal files to queue, start the queue, while queue is processing, open fattal files. Should not crash

3) Put all the fattal files in queue. Enable Denoise and also Luminance Denoise inside of Denoise (this should call fftw in denoise) for a file. Open this file in editor, zoom to 100%, start the queue, go back to editor and pan around. Should not crash.

Ingo

@heckflosse
I did the tests above, and all 3 passed fine.
I just had one segfault in 2., while I was zooming in/out on a file and turning fattal on/off while the queue was processing. Unfortunately I had verbose=false so I didn't see any additional information. I tried to trigger the crash again but couldn't so far.

@sguyader Thanks for testing!!!

@agriggio Please don't waste time on speeding up findMaxMinPercentile. I've an about 100x speedup for this function on hold.

@agriggio I pushed the findMaxMinPercentile speedup

@agriggio
I am absent 4 days, and when I come back there is a new interesting feature in RT. Alberto is shooting faster than his shadow. :)

I tested on several files:

  • overall this version of TM is less aggressive than that already present in RT and gives more natural results.
  • what surprises a little is that: alpha = 0 and beta = 0 there is an action, it is not very intuitive
  • more after many tests on the same images between "Tone mapping" and "HDR tone mapping" I found a slowdown of the computer, without crash, then can not stop normally, but it must be for the reasons mentioned in issue #4170

@heckflosse awesome! :+1:
@Desmis thanks for testing! three main motivation for the new tool is that it works much better (for me) with images with lots of dynamic range. regarding the 0,0 setting, I don't know why it has an effect, did you try on luminance HDR to see if it happens also there? (neutral would be 1.0,1.0 there). anyway I think it's not a big deal, and if you prefer we could just set a lower bound of 1,1 in the GUI s that people won't be surprised that it still has an effect :-)

@agriggio
I don't test with luminance HDR
If I set 1, 1, it's the same thing. For me no problem, we can easily enabled / disabled "HDR tone mapping". I think write in Rawpedia is enough :)

jacques

I don't experience a slow-down, but if you do, you can use GBD to attach to the already-running process gdb -p 123, stop it ctrl+c, and see what it's doing bt full.

@Desmis yes it is the same thing, bit since the sliders are not at 0 it should be less unintuitive that they still have an effect ;-) and I agree with you: if you want no effect at all, you can just disable the tool

I was totally unaware of this new branch and gave it a short try. First of all: 👍 I also like Fattal best in LHDR. Thanks, Alberto, for porting this over.

  • Although I have the latest commit with Ingo's optimization, it takes 6 to 7 seconds in 1:1 from moving one of the sliders to see a result here (albeit just a two-core VM). The processing bar shows 100% all the time.
  • Are you planning for further TMOs? If so, shouldn't that already be reflected in the PP3 param names and class interfaces. No idea how concretely, but having "fattal" on the top level in naming could hinder adding other TMOs, doesn't it?

Best,
Flössie

I reviewed the other LHDR TMOs again yesterday, you can find sample HDR images, intermediate LHDR output TIFFs and final JPEGs after tweaking in RT here: https://filebin.net/t8wlo4kvkglulf30

@Floessie, thanks for testing! Regarding your questions:

  • 1:1 preview is slow because Fattal is a global operator, so to have meaningful results you need to apply it to the full image. This means that even if you open a crop detail window, RT will work on the full 1:1 image if Fattal is enabled. There are some things that can be done to speed this up, but I'm a bit reluctant to invest time on this now (not ruling out the possibility completely, just unlikely) because I think that a proper mid-term solution is to rewrite and unify all the processing pipelines in RT; then, it would be much easier to optimize this kind of stuff (I'm not committing to a timeline for this though)

  • I personally have no interest in porting other TMOs. I tried filmic (not in LHDR) but it didn't do much for me. I like Fattal a lot more than the others that LHDR provides

@agriggio

There are some things that can be done to speed this up, but I'm a bit reluctant to invest time on this now

I don't know what you have in mind here, but we could at least cache some data to reuse between Preview and Crop (if not already done, didn't checked that yet). If you think it's too much data to store, we could make it optional in "Preferences / Performance".

PS : and if you don't have time to do it, I should be able to do it, though memory management guru here might have better technic.

@Floessie About internal names of TMO, in don't think it will be a problem to assign a different _root_ name to each TMO which will be independent tool anyway (if one ever want to add more, however this one is so effective that I personally don't need other). In such case, having a dedicated ToneMap tab could make sense in a GUI p.o.v. (not necessary actually)

@Hombre57

I don't think it will be a problem to assign a different root name to each TMO which will be independent tool anyway...

Okay, I thought the way to go would be to stick with the "HDR Tone Mapping" box and have a drop-down for the different TMOs. A dedicated tab feels a bit like an overkill to me. Just my 2¢.

I don't know what you have in mind here, but we could at least cache some data to reuse between Preview and Crop...

As I read "cache" here: If you're looking for a generic LRU cache, there is one in cache.h (shameless self-promotion 😁).

Okay, I thought the way to go would be to stick with the "HDR Tone Mapping" box and have a drop-down for the different TMOs

That would also be my expectation... except that I agree that we don't need more :-)

I don't know what you have in mind here, but we could at least cache some data to reuse between Preview and Crop...

we could do also that, but I didn't think about it. What I have in mind is a general restructuring of the pipeline, but this is a longer-term solution. For the specific issue here, what I had in mind is to use the full image only until Fattal is applied (which is relatively early in the pipeline), and then crop afterwards. In principle it's easy, but the code in dcrop.cc is a bit messy so doing this properly would require some care...

@Floessie

can you try to build with this #ifdefs removed:

RT_FFTW3F_OMP

The fftw3 lib on Windows works fine with multithreaded fftw even when the fftw3f_omp is not installed.
Maybe we can remove this condition also on Linux.

Maybe we can remove this condition also on Linux.

and then it will fail at link time on my machine :-)

I'll post a patch here in a moment, so you can test a more general solution.

@heckflosse Ingo can you try this?

diff --git a/CMakeLists.txt b/CMakeLists.txt
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -343,11 +343,30 @@
 endif()

 # check for libfftw3f_omp
+include(CheckCSourceCompiles)
 if(OPENMP_FOUND)
-    find_library(fftw3f_omp fftw3f_omp PATHS ${FFTW3F_LIBRARY_DIRS})
-    if(fftw3f_omp)
+    set(CMAKE_REQUIRED_INCLUDES ${FFTW3F_INCLUDE_DIRS})
+    set(CMAKE_REQUIRED_LIBRARIES)
+    foreach(l ${FFTW3F_LIBRARIES})
+        find_library(_f ${l} PATHS ${FFTW3F_LIBRARY_DIRS})
+        set(CMAKE_REQUIRED_LIBRARIES ${CMAKE_REQUIRED_LIBRARIES} ${_f})
+    endforeach()
+    check_c_source_compiles(
+"#include <fftw3.h>
+int main()
+{
+    fftwf_init_threads();
+    fftwf_plan_with_nthreads(1);
+    return 0;
+}" _fftw3f_multithread)
+    if(_fftw3f_multithread)
         add_definitions(-DRT_FFTW3F_OMP)
-        set(FFTW3F_LIBRARIES ${FFTW3F_LIBRARIES} ${fftw3f_omp})
+    else()
+        find_library(fftw3f_omp fftw3f_omp PATHS ${FFTW3F_LIBRARY_DIRS})
+        if(fftw3f_omp)
+            add_definitions(-DRT_FFTW3F_OMP)
+            set(FFTW3F_LIBRARIES ${FFTW3F_LIBRARIES} ${fftw3f_omp})
+        endif()
     endif()
 endif()

@agriggio Works fine with the patch.

@heckflosse great, feel free to commit. Also, could you please add a comment explaining the general idea behind your opitmized findMinMaxPercentile implementation? It's not completely obvious to me... (and that's an understatement ;-)

@agriggio I will commit your patch and some small speedups. I also will add a comment for findMinMaxPercentile. The general idea is derived from a histogram based median search. But I will explain it a bit more detailed in code.

@agriggio I will explain it also here a bit more detailed.

findMaxMinPercentile picks 2 values out of a large amount of values
minLum is the minPrct * size smallest value and
maxLum is the maxPrct * size smallest value

To do that, the original version sorted the input values and just picked at position minPrct * size and maxPrct * size.

If minPrct would be 0.5 that would be a median search for example.
To find the median of a large number of numbers a histogram based median search is one of the faster solutions (at least much faster than sorting the values and picking the middle one). Now, the principle of a histogram based median search can also be used for other values than 0.5. That's what I did ;-)

It does not give exactly the same values, but in my tests they have been very very close.

Edit: I will improve it a bit to get even closer

@heckflosse thanks!

@Floessie

Okay, I thought the way to go would be to stick with the "HDR Tone Mapping" box and have a drop-down for the different TMOs.

This is still possible if necessary. Lens geometry has such main folder but each sub-tool are defined in root namespace.

@agriggio I pushed your patch and a speedup for rescale_bilinear.
More small speedups will follow.

There's a problem in dcrop I guess, try this :

  1. open an image and enable Fattal tool
  2. set Amount to 20
  3. set zoom rate to "Fit to screen" (the whole image must be seen)
  4. disable the tool -> it doesn't refresh properly
  5. enable the tool again -> the effect is cumulated, while Navigator and thumbnail are ok

@heckflosse By reverting your last patch, it works fine.

@Hombre57 I think this is my fault. It seems that using an intermediate image was actually necessary after all, at least in some cases. I'll fix it ASAP.

@Hombre57 can you try now?

@heckflosse here's a patch to do median filtering only on the darkest pixels. Can you give it a look when you have time? Thanks!

diff --git a/rtengine/FTblockDN.cc b/rtengine/FTblockDN.cc
--- a/rtengine/FTblockDN.cc
+++ b/rtengine/FTblockDN.cc
@@ -74,8 +74,13 @@
 extern MyMutex *fftwMutex;


-void ImProcFunctions::Median_Denoise(float **src, float **dst, const int width, const int height, const Median medianType, const int iterations, const int numThreads, float **buffer)
+namespace {
+
+template <bool useUpperBound>
+void do_median_denoise(float **src, float **dst, float upperBound, const int width, const int height, const ImProcFunctions::Median medianType, const int iterations, const int numThreads, float **buffer)
 {
+    typedef ImProcFunctions::Median Median;
+    
     int border = 1;

     switch (medianType) {
@@ -156,13 +161,17 @@
             switch (medianType) {
                 case Median::TYPE_3X3_SOFT: {
                     for (; j < width - border; ++j) {
-                        medianOut[i][j] = median(
-                                              medianIn[i - 1][j],
-                                              medianIn[i][j - 1],
-                                              medianIn[i][j],
-                                              medianIn[i][j + 1],
-                                              medianIn[i + 1][j]
-                                          );
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            medianOut[i][j] = median(
+                                                  medianIn[i - 1][j],
+                                                  medianIn[i][j - 1],
+                                                  medianIn[i][j],
+                                                  medianIn[i][j + 1],
+                                                  medianIn[i + 1][j]
+                                              );
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
+                        }
                     }

                     break;
@@ -170,17 +179,21 @@

                 case Median::TYPE_3X3_STRONG: {
                     for (; j < width - border; ++j) {
-                        medianOut[i][j] = median(
-                                              medianIn[i - 1][j - 1],
-                                              medianIn[i - 1][j],
-                                              medianIn[i - 1][j + 1],
-                                              medianIn[i][j - 1],
-                                              medianIn[i][j],
-                                              medianIn[i][j + 1],
-                                              medianIn[i + 1][j - 1],
-                                              medianIn[i + 1][j],
-                                              medianIn[i + 1][j + 1]
-                                          );
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            medianOut[i][j] = median(
+                                                  medianIn[i - 1][j - 1],
+                                                  medianIn[i - 1][j],
+                                                  medianIn[i - 1][j + 1],
+                                                  medianIn[i][j - 1],
+                                                  medianIn[i][j],
+                                                  medianIn[i][j + 1],
+                                                  medianIn[i + 1][j - 1],
+                                                  medianIn[i + 1][j],
+                                                  medianIn[i + 1][j + 1]
+                                              );
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
+                        }
                     }

                     break;
@@ -188,21 +201,25 @@

                 case Median::TYPE_5X5_SOFT: {
                     for (; j < width - border; ++j) {
-                        medianOut[i][j] = median(
-                                              medianIn[i - 2][j],
-                                              medianIn[i - 1][j - 1],
-                                              medianIn[i - 1][j],
-                                              medianIn[i - 1][j + 1],
-                                              medianIn[i][j - 2],
-                                              medianIn[i][j - 1],
-                                              medianIn[i][j],
-                                              medianIn[i][j + 1],
-                                              medianIn[i][j + 2],
-                                              medianIn[i + 1][j - 1],
-                                              medianIn[i + 1][j],
-                                              medianIn[i + 1][j + 1],
-                                              medianIn[i + 2][j]
-                                          );
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            medianOut[i][j] = median(
+                                                  medianIn[i - 2][j],
+                                                  medianIn[i - 1][j - 1],
+                                                  medianIn[i - 1][j],
+                                                  medianIn[i - 1][j + 1],
+                                                  medianIn[i][j - 2],
+                                                  medianIn[i][j - 1],
+                                                  medianIn[i][j],
+                                                  medianIn[i][j + 1],
+                                                  medianIn[i][j + 2],
+                                                  medianIn[i + 1][j - 1],
+                                                  medianIn[i + 1][j],
+                                                  medianIn[i + 1][j + 1],
+                                                  medianIn[i + 2][j]
+                                              );
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
+                        }
                     }

                     break;
@@ -210,8 +227,7 @@

                 case Median::TYPE_5X5_STRONG: {
 #ifdef __SSE2__
-
-                    for (; j < width - border - 3; j += 4) {
+                    for (; !useUpperBound && j < width - border - 3; j += 4) {
                         STVFU(
                             medianOut[i][j],
                             median(
@@ -245,35 +261,38 @@
                     }

 #endif
-
                     for (; j < width - border; ++j) {
-                        medianOut[i][j] = median(
-                                              medianIn[i - 2][j - 2],
-                                              medianIn[i - 2][j - 1],
-                                              medianIn[i - 2][j],
-                                              medianIn[i - 2][j + 1],
-                                              medianIn[i - 2][j + 2],
-                                              medianIn[i - 1][j - 2],
-                                              medianIn[i - 1][j - 1],
-                                              medianIn[i - 1][j],
-                                              medianIn[i - 1][j + 1],
-                                              medianIn[i - 1][j + 2],
-                                              medianIn[i][j - 2],
-                                              medianIn[i][j - 1],
-                                              medianIn[i][j],
-                                              medianIn[i][j + 1],
-                                              medianIn[i][j + 2],
-                                              medianIn[i + 1][j - 2],
-                                              medianIn[i + 1][j - 1],
-                                              medianIn[i + 1][j],
-                                              medianIn[i + 1][j + 1],
-                                              medianIn[i + 1][j + 2],
-                                              medianIn[i + 2][j - 2],
-                                              medianIn[i + 2][j - 1],
-                                              medianIn[i + 2][j],
-                                              medianIn[i + 2][j + 1],
-                                              medianIn[i + 2][j + 2]
-                                          );
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            medianOut[i][j] = median(
+                                                  medianIn[i - 2][j - 2],
+                                                  medianIn[i - 2][j - 1],
+                                                  medianIn[i - 2][j],
+                                                  medianIn[i - 2][j + 1],
+                                                  medianIn[i - 2][j + 2],
+                                                  medianIn[i - 1][j - 2],
+                                                  medianIn[i - 1][j - 1],
+                                                  medianIn[i - 1][j],
+                                                  medianIn[i - 1][j + 1],
+                                                  medianIn[i - 1][j + 2],
+                                                  medianIn[i][j - 2],
+                                                  medianIn[i][j - 1],
+                                                  medianIn[i][j],
+                                                  medianIn[i][j + 1],
+                                                  medianIn[i][j + 2],
+                                                  medianIn[i + 1][j - 2],
+                                                  medianIn[i + 1][j - 1],
+                                                  medianIn[i + 1][j],
+                                                  medianIn[i + 1][j + 1],
+                                                  medianIn[i + 1][j + 2],
+                                                  medianIn[i + 2][j - 2],
+                                                  medianIn[i + 2][j - 1],
+                                                  medianIn[i + 2][j],
+                                                  medianIn[i + 2][j + 1],
+                                                  medianIn[i + 2][j + 2]
+                                              );
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
+                        }
                     }

                     break;
@@ -283,7 +302,7 @@
 #ifdef __SSE2__
                     std::array<vfloat, 49> vpp ALIGNED16;

-                    for (; j < width - border - 3; j += 4) {
+                    for (; !useUpperBound && j < width - border - 3; j += 4) {
                         for (int kk = 0, ii = -border; ii <= border; ++ii) {
                             for (int jj = -border; jj <= border; ++jj, ++kk) {
                                 vpp[kk] = LVFU(medianIn[i + ii][j + jj]);
@@ -296,15 +315,19 @@
 #endif

                     std::array<float, 49> pp;
-
+                    
                     for (; j < width - border; ++j) {
-                        for (int kk = 0, ii = -border; ii <= border; ++ii) {
-                            for (int jj = -border; jj <= border; ++jj, ++kk) {
-                                pp[kk] = medianIn[i + ii][j + jj];
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            for (int kk = 0, ii = -border; ii <= border; ++ii) {
+                                for (int jj = -border; jj <= border; ++jj, ++kk) {
+                                    pp[kk] = medianIn[i + ii][j + jj];
+                                }
                             }
+                            
+                            medianOut[i][j] = median(pp);
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
                         }
-
-                        medianOut[i][j] = median(pp);
                     }

                     break;
@@ -314,7 +337,7 @@
 #ifdef __SSE2__
                     std::array<vfloat, 81> vpp ALIGNED16;

-                    for (; j < width - border - 3; j += 4) {
+                    for (; !useUpperBound && j < width - border - 3; j += 4) {
                         for (int kk = 0, ii = -border; ii <= border; ++ii) {
                             for (int jj = -border; jj <= border; ++jj, ++kk) {
                                 vpp[kk] = LVFU(medianIn[i + ii][j + jj]);
@@ -327,15 +350,19 @@
 #endif

                     std::array<float, 81> pp;
-
+                    
                     for (; j < width - border; ++j) {
-                        for (int kk = 0, ii = -border; ii <= border; ++ii) {
-                            for (int jj = -border; jj <= border; ++jj, ++kk) {
-                                pp[kk] = medianIn[i + ii][j + jj];
+                        if (!useUpperBound || medianIn[i][j] <= upperBound) {
+                            for (int kk = 0, ii = -border; ii <= border; ++ii) {
+                                for (int jj = -border; jj <= border; ++jj, ++kk) {
+                                    pp[kk] = medianIn[i + ii][j + jj];
+                                }
                             }
+                        
+                            medianOut[i][j] = median(pp);
+                        } else {
+                            medianOut[i][j] = medianIn[i][j];
                         }
-
-                        medianOut[i][j] = median(pp);
                     }

                     for (; j < width; ++j) {
@@ -382,6 +409,21 @@
     }
 }

+} // namespace
+
+
+void ImProcFunctions::Median_Denoise(float **src, float **dst, const int width, const int height, const Median medianType, const int iterations, const int numThreads, float **buffer)
+{
+    do_median_denoise<false>(src, dst, width, height, 0.f, medianType, iterations, numThreads, buffer);
+}
+
+
+void ImProcFunctions::Median_Denoise(float **src, float **dst, float upperBound, const int width, const int height, const Median medianType, const int iterations, const int numThreads, float **buffer)
+{
+    do_median_denoise<true>(src, dst, upperBound, width, height, medianType, iterations, numThreads, buffer);
+}
+
+
 void ImProcFunctions::Tile_calc(int tilesize, int overlap, int kall, int imwidth, int imheight, int &numtiles_W, int &numtiles_H, int &tilewidth, int &tileheight, int &tileWskip, int &tileHskip)

 {
diff --git a/rtengine/improcfun.h b/rtengine/improcfun.h
--- a/rtengine/improcfun.h
+++ b/rtengine/improcfun.h
@@ -305,6 +305,7 @@


     void Median_Denoise ( float **src, float **dst, int width, int height, Median medianType, int iterations, int numThreads, float **buffer = nullptr);
+    void Median_Denoise ( float **src, float **dst, float upperBound, int width, int height, Median medianType, int iterations, int numThreads, float **buffer = nullptr);
     void RGB_denoise (int kall, Imagefloat * src, Imagefloat * dst, Imagefloat * calclum, float * ch_M, float *max_r, float *max_b, bool isRAW, const procparams::DirPyrDenoiseParams & dnparams, const double expcomp, const NoiseCurve & noiseLCurve, const NoiseCurve & noiseCCurve, float &chaut, float &redaut, float &blueaut, float &maxredaut, float & maxblueaut, float &nresi, float &highresi);
     void RGB_denoise_infoGamCurve (const procparams::DirPyrDenoiseParams & dnparams, const bool isRAW, LUTf &gamcurve, float &gam, float &gamthresh, float &gamslope);
     void RGB_denoise_info (Imagefloat * src, Imagefloat * provicalc, bool isRAW, LUTf &gamcurve, float gam, float gamthresh, float gamslope, const procparams::DirPyrDenoiseParams & dnparams, const double expcomp, float &chaut, int &Nb, float &redaut, float &blueaut, float &maxredaut, float & maxblueaut, float &minredaut, float & minblueaut, float &chromina, float &sigma, float &lumema, float &sigma_L, float &redyel, float &skinc, float &nsknc, bool multiThread = false);
diff --git a/rtengine/tmo_fattal02.cc b/rtengine/tmo_fattal02.cc
--- a/rtengine/tmo_fattal02.cc
+++ b/rtengine/tmo_fattal02.cc
@@ -1196,7 +1196,6 @@
 #else
         int num_threads = 1;
 #endif
-        Array2Df Yr_med(w, h);
         float r = float(std::max(w, h)) / float(RT_dimension_cap);
         Median med;
         if (r >= 3) {
@@ -1208,18 +1207,7 @@
         } else {
             med = Median::TYPE_3X3_STRONG;
         }
-        Median_Denoise(Yr, Yr_med, w, h, med, 1, num_threads);
-
-#ifdef _OPENMP
-        #pragma omp parallel for if (multiThread)
-#endif
-        for (int y = 0; y < h; y++) {
-            for (int x = 0; x < w; x++) {
-                if (Yr(x, y) <= luminance_noise_floor) {
-                    Yr(x, y) = Yr_med(x, y);
-                }
-            }
-        }
+        Median_Denoise(Yr, Yr, luminance_noise_floor, w, h, med, 1, num_threads);
     }


@agriggio

FTblockDN.cc:423:39: error: 'threshold' was not declared in this scope
     do_median_denoise<true>(src, dst, threshold, width, height, medianType, iterations, numThreads, buffer);

I fixed it already (was a typo in the patch). Can you try again please?

Builds fine now. Zoom from fit to 100% for a 100 MP file now takes 14.5 seconds (15.6 seconds before patch).

Thanks for testing. Is median filtering in fattal still a bottleneck, or is the time spent elsewhere?

@agriggio Median is no bottleneck after your changes. I have a speedup on hold which reduces the processing time to ~12.7 seconds for my test file.

I detected something strange. To reproduce:
Put the same file 5 times into the queue
Start the queue

I get this:

ToneMapFattal02 took 12681 ms
ToneMapFattal02 took 18094 ms
ToneMapFattal02 took 17972 ms
ToneMapFattal02 took 18108 ms
ToneMapFattal02 took 18068 ms

The first file in queue is processed a lot faster than the next files.
It's clearly reproducible here on my windows system.
Can you try that on your system as well?

Well, modern CPUs are crazy and as soon as you add in multithreading, various levels of caches, NUMA, OS scheduling, and whatnot... all bets are off :-)
Here on my laptop, if I run on battery I get anything from ~24 seconds to ~10 seconds for a 50 mpix file (the largest raw I could find on the web) depending on the cpu frequency scaling governor and other interesting stuff (including thermal management in the CPU).

As an anecdote, here we had a similar variance problem on the local HPC cluster. A guy from Intel came over to understand what was going on, but despite various attempts we could never nail down the issue for sure...

@agriggio I just tested. The variance in processing time is clearly caused by fftw_plan and fftw_execute...

@heckflosse but is it something under the control of fftw, or just something that manifests itself in there but comes from the lower levels? FWIW, I don't observer the same pattern here, although I do see some variance as I wrote above.

@agriggio I can only reproduce the pattern with multithreaded fftwf usage. With single threaded fftwf usage the processing times are almost equal (though very slow then)

@agriggio In fattal code there are some of these patterns:

Array2Df  *X = new Array2Df(width,height);
Array2Df  *Y = new Array2Df(width,height);

do someting with X and Y...

delete X;

Array2Df  *Z = new Array2Df(width,height);

do something...

meaning, a buffer of size width*height is allocated immediately after a buffer of same size is deleted.
On Windows (don't know about Linux) deleting large buffers is quite expensive (I read that they are filled with zero at deletion).

Would you mind to reuse existing buffers instead of deleting old ones and creating new ones with a different name?

Edit: I would take that task.

@heckflosse can you try this on your 100Mp file? WARNING: it will increase the memory usage, but it's suppose to give a nice performance boost... If it does, then we can think about reducing the peak memory needed

EDIT: try the one below instead

Actually, try this one:

diff --git a/rtengine/tmo_fattal02.cc b/rtengine/tmo_fattal02.cc
--- a/rtengine/tmo_fattal02.cc
+++ b/rtengine/tmo_fattal02.cc
@@ -1147,6 +1147,22 @@
     return r * ws[1][0] + g * ws[1][1] + b * ws[1][2];
 }

+
+inline int round_up_pow2(int dim)
+{
+    // from https://graphics.stanford.edu/~seander/bithacks.html
+    assert(dim > 0);
+    unsigned int v = dim;
+    v--;
+    v |= v >> 1;
+    v |= v >> 2;
+    v |= v >> 4;
+    v |= v >> 8;
+    v |= v >> 16;
+    v++;
+    return v;
+}
+
 } // namespace


@@ -1218,7 +1234,18 @@
                   << ", detail_level = " << detail_level << std::endl;
     }

-    tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);
+    {
+        int w2 = round_up_pow2(w);
+        int h2 = round_up_pow2(h);
+        w2 = ((w2 - w - w/10 > w - w2/2) ? w2 / 2 : w2) + 1;
+        h2 = ((h2 - h - h/10 > h - h2/2) ? h2 / 2 : h2) + 1;
+        Array2Df yrs(w2, h2);
+        rescale_bilinear(Yr, yrs, multiThread);
+        Array2Df ls(w2, h2);
+        tmo_fattal02(w2, h2, yrs, ls, alpha, beta, noise, detail_level, multiThread);
+        rescale_bilinear(ls, L, multiThread);
+    }
+//    tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);

 #ifdef _OPENMP
     #pragma omp parallel for if(multiThread)

Beside in our implementation the actual array is allocated dynamically in
the constructor and deallocated in the destructor, so no need of something
like Array2Df *somearray = new Array2Df ....

On 8 November 2017 at 14:03, Alberto Griggio notifications@github.com
wrote:

Actually, try this one:

diff --git a/rtengine/tmo_fattal02.cc b/rtengine/tmo_fattal02.cc--- a/rtengine/tmo_fattal02.cc+++ b/rtengine/tmo_fattal02.cc@@ -1147,6 +1147,22 @@
return r * ws[1][0] + g * ws[1][1] + b * ws[1][2];
}
++inline int round_up_pow2(int dim)+{+ // from https://graphics.stanford.edu/~seander/bithacks.html+ assert(dim > 0);+ unsigned int v = dim;+ v--;+ v |= v >> 1;+ v |= v >> 2;+ v |= v >> 4;+ v |= v >> 8;+ v |= v >> 16;+ v++;+ return v;+}+
} // namespace

@@ -1218,7 +1234,18 @@
<< ", detail_level = " << detail_level << std::endl;
}

  • tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);+ {+ int w2 = round_up_pow2(w);+ int h2 = round_up_pow2(h);+ w2 = ((w2 - w > w - w2/2) ? w2 / 2 : w2) + 1;+ h2 = ((h2 - h > h - h2/2) ? h2 / 2 : h2) + 1;+ Array2Df yrs(w2, h2);+ rescale_bilinear(Yr, yrs, multiThread);+ Array2Df ls(w2, h2);+ tmo_fattal02(w2, h2, yrs, ls, alpha, beta, noise, detail_level, multiThread);+ rescale_bilinear(ls, L, multiThread);+ }+// tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);

#ifdef _OPENMP
#pragma omp parallel for if(multiThread)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/Beep6581/RawTherapee/issues/4168#issuecomment-342811445,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AKgh5mtoB8RA9DgxpOTj6c4XCvpu49LJks5s0aaygaJpZM4QSC3j
.

@agriggio Processing time with my latest optimizations without your patch is 11.7 seconds. With my latest opts and your patch is 4.2 seconds.

But it destroys some fine detail (left is without, right is with patch)
fattal

Edit: < 100% preview looks really soft with the patch, especially the thumbs

@heckflosse yes, I was afraid about that. Can you try with this and see if the details are restored? The drawback is that memory usage will grow... but again, I think we can handle it later on.

-        w2 = ((w2 - w - w/10 > w - w2/2) ? w2 / 2 : w2) + 1;
-        h2 = ((h2 - h - h/10 > h - h2/2) ? h2 / 2 : h2) + 1;
+        w2 += 1;
+        h2 += 1;

@agriggio
14.2 seconds now, which is slower than before patch. Details are a bit better than last patch but thumb is still very soft.

Details are a bit better than last patch

ok, but that's a bit vague... is the loss of detail still there?

but thumb is still very soft.

don't care about that, we can fix it later easily

There is still a tiny loss of detail, but it's acceptable now imho

There is still a tiny loss of detail, but it's acceptable now imho

Ok. Then here's a new version to try:

diff --git a/rtengine/tmo_fattal02.cc b/rtengine/tmo_fattal02.cc
--- a/rtengine/tmo_fattal02.cc
+++ b/rtengine/tmo_fattal02.cc
@@ -1142,11 +1142,68 @@
 }


+void rescale_nearest(const Array2Df &src, Array2Df &dst, bool multithread)
+{
+    const int width = src.getCols();
+    const int height = src.getRows();
+    const int nw = dst.getCols();
+    const int nh = dst.getRows();
+
+#ifdef _OPENMP
+    #pragma omp parallel for if (multithread)
+#endif
+    for (int y = 0; y < nh; ++y) {
+        int sy = y * height / nh;
+        for (int x = 0; x < nw; ++x) {
+            int sx = x * width / nw;
+            dst(x, y) = src(sx, sy);
+        }
+    }
+}
+
+
 inline float luminance(float r, float g, float b, TMatrix ws)
 {
     return r * ws[1][0] + g * ws[1][1] + b * ws[1][2];
 }

+
+inline int round_up_pow2(int dim)
+{
+    // from https://graphics.stanford.edu/~seander/bithacks.html
+    assert(dim > 0);
+    unsigned int v = dim;
+    v--;
+    v |= v >> 1;
+    v |= v >> 2;
+    v |= v >> 4;
+    v |= v >> 8;
+    v |= v >> 16;
+    v++;
+    return v;
+}
+
+inline int find_fast_dim(int dim)
+{
+    int d1 = round_up_pow2(dim);
+    std::vector<int> d = {
+        (d1/16) * 9,
+        (d1/8) * 5,
+        (d1/16) * 11,
+        (d1/4) * 3,
+        (d1/16) * 13,
+        (d1/8) * 7,
+        d1
+    };
+    for (size_t i = 0; i < d.size(); ++i) {
+        if (d[i] >= dim) {
+            return d[i];
+        }
+    }
+    assert(false);
+    return dim;
+}
+
 } // namespace


@@ -1218,7 +1275,15 @@
                   << ", detail_level = " << detail_level << std::endl;
     }

-    tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);
+    {
+        int w2 = find_fast_dim(w) + 1;
+        int h2 = find_fast_dim(h) + 1;
+        Array2Df buf(w2, h2);
+        rescale_nearest(Yr, buf, multiThread);
+        tmo_fattal02(w2, h2, buf, buf, alpha, beta, noise, detail_level, multiThread);
+        rescale_nearest(buf, L, multiThread);
+    }
+//    tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);

 #ifdef _OPENMP
     #pragma omp parallel for if(multiThread)

@agriggio That's great. 7 seconds and good details 👍 :

Left is before todays changes, right with your latest patch:
fattal01

@heckflosse excellent. I just tried on a 50 Mpix GFX 50S file (found on the web) and I still see a tiny bit of detail loss however. So, I was wondering whether it makes sense to expose another parameter "High quality" to the user. What do you think? (100% crop, left is old, right is new)

detail

@agriggio I've to clean my glasses first...

@agriggio Alberto, here's the 100 Mpix file I use for my tests

I can see a very tiny bit of detail loss in your example, bit that's pixel peeping par excellence

LOL, maybe you are right... OTOH, I'm not the one who developed pixelshift :stuck_out_tongue_winking_eye:

Anyway, here's a slightly improved version (I've changed only the vector of good dimensions in find_fast_dim -- actually, we might want to tablulate this more extensively...):

diff --git a/rtengine/tmo_fattal02.cc b/rtengine/tmo_fattal02.cc
--- a/rtengine/tmo_fattal02.cc
+++ b/rtengine/tmo_fattal02.cc
@@ -1142,11 +1142,73 @@
 }


+void rescale_nearest(const Array2Df &src, Array2Df &dst, bool multithread)
+{
+    const int width = src.getCols();
+    const int height = src.getRows();
+    const int nw = dst.getCols();
+    const int nh = dst.getRows();
+
+#ifdef _OPENMP
+    #pragma omp parallel for if (multithread)
+#endif
+    for (int y = 0; y < nh; ++y) {
+        int sy = y * height / nh;
+        for (int x = 0; x < nw; ++x) {
+            int sx = x * width / nw;
+            dst(x, y) = src(sx, sy);
+        }
+    }
+}
+
+
 inline float luminance(float r, float g, float b, TMatrix ws)
 {
     return r * ws[1][0] + g * ws[1][1] + b * ws[1][2];
 }

+
+inline int round_up_pow2(int dim)
+{
+    // from https://graphics.stanford.edu/~seander/bithacks.html
+    assert(dim > 0);
+    unsigned int v = dim;
+    v--;
+    v |= v >> 1;
+    v |= v >> 2;
+    v |= v >> 4;
+    v |= v >> 8;
+    v |= v >> 16;
+    v++;
+    return v;
+}
+
+inline int find_fast_dim(int dim)
+{
+    int d1 = round_up_pow2(dim);
+    std::vector<int> d = {
+        d1/128 * 65,
+        d1/64 * 33,
+        d1/512 * 273,
+        d1/16 * 9,
+        d1/8 * 5,
+        d1/16 * 11,
+        d1/128 * 91,
+        d1/4 * 3,
+        d1/64 * 49,
+        d1/16 * 13,
+        d1/8 * 7,
+        d1
+    };
+    for (size_t i = 0; i < d.size(); ++i) {
+        if (d[i] >= dim) {
+            return d[i];
+        }
+    }
+    assert(false);
+    return dim;
+}
+
 } // namespace


@@ -1209,7 +1271,6 @@
         }
         Median_Denoise(Yr, Yr, luminance_noise_floor, w, h, med, 1, num_threads);
     }
-    

     float noise = alpha * 0.01f;

@@ -1218,7 +1279,15 @@
                   << ", detail_level = " << detail_level << std::endl;
     }

-    tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);
+    //tmo_fattal02(w, h, Yr, L, alpha, beta, noise, detail_level, multiThread);
+    {
+        int w2 = find_fast_dim(w) + 1;
+        int h2 = find_fast_dim(h) + 1;
+        Array2Df buf(w2, h2);
+        rescale_nearest(Yr, buf, multiThread);
+        tmo_fattal02(w2, h2, buf, buf, alpha, beta, noise, detail_level, multiThread);
+        rescale_nearest(buf, L, multiThread);
+    }

 #ifdef _OPENMP
     #pragma omp parallel for if(multiThread)

@agriggio I will test in a few minutes.

btw: changing this

+    {
+        int w2 = find_fast_dim(w) + 1;
+        int h2 = find_fast_dim(h) + 1;
+        Array2Df buf(w2, h2);
+        rescale_nearest(Yr, buf, multiThread);
+        tmo_fattal02(w2, h2, buf, buf, alpha, beta, noise, detail_level, multiThread);
+        rescale_nearest(buf, L, multiThread);
+    }

to this

+        int w2 = find_fast_dim(w) + 1;
+        int h2 = find_fast_dim(h) + 1;
+        Array2Df buf(w2, h2);
+        rescale_nearest(Yr, buf, multiThread);
+        tmo_fattal02(w2, h2, buf, buf, alpha, beta, noise, detail_level, multiThread);
+        Array2Df L(w, h);
+        rescale_nearest(buf, L, multiThread);

and of course removing
Array2Df L(w, h);

some lines above should reduce peak memory usage a bit

@agriggio your latest modification gave a 10% speedup compared to the one before, mesaured whole fattal

@heckflosse great! If we are happy about speed and quality, then we can start getting rid of all those intermediate buffers now :smile:

BTW I'll push so that we do not have the patch floating around

ok

@agriggio I've a lot of small speedups on hold, which I would like to push after you pushed

done

@agriggio I will need about an hour

@agriggio pushed
Edit: processing time with your latest changes was 8.1 seconds. Now it's 5.2 seconds

I tested last version. Very impressive indeed.
There is one thing I don't understand.
I use the 100Mpix image. I observe that processing time (and memory usage) increase with zoom factor.
At 100% zoom processing is very long though we observe a very small part of image.
With a detail window, it is the same, and the display is very delayed.

@agriggio Alberto, with your latest optimizations for fftw usage the strange pattern in queue processing has gone 👍

@gaaned92 This is because the whole image is computed for preview, even though you only see a crop of that full image. Since its size depend on the scale, the closest you are from 1:1, the longer it will take.

@agriggio @heckflosse What an impressive council of wizards. :it: :fire: :de:

Here's my humble suggestion, don't know if it's of worth: In find_fast_dim() it could be faster to use a (const) std::array<> instead of a std::vector<>. The latter needs heap allocation, the former probably not.

Best,
Flössie

@Hombre57 you wrote:
Since its size depend on the scale, the closest you are from 1:1, the longer it will take.
Surely there is still something I miss.
Until now I thought that the size of the whole image (the raw image) was independant of the scale of the preview. So whatever the scale, the processing time of the whole image should be the same.
Thanks for clarifying.

thanks @Floessie, it's a good idea :+1:

Concerning the processing time in preview mode:

If we would move fattal directly after demosaic, the processing time in preview mode would be reduced by a large amount because only fattal would be processed in full size but not the subsequent tools.

I'm not sure it's a good idea to have it before CA and noise removal, but maybe it's worth trying. I think a better solution though would be to spend our energy in restructuring the pipeline (sorry if I insist on this...)

It would be after raw-CA but before noise removal. I agree that we should restructure the pipeline as well.
Hombre suggested in irc to cache the result of fattal so we don't need to recalculate it every time (we could keep it at the current place in pipeline then).
Though the real culprit is not fattal anymore but the fact that every tool after fattal works on the image size required by fattal (in preview)

@agriggio Another thing: Do you have a link to an example file which showed the artifacts in dark regions (the ones you use the median to avoid them)?

the recent playraw on the pier at dusk

I think that caching may be a good idea, combined with cropping right after fattal. I'll give this a try.
regarding CA: after raw CA yes, but what about CA in the lens tab? (either manual or profiled)

Here's a tentative patch to make dcrop.cc faster when using fattal. Still not optimal, but better solutions might require nontrivial refactorings...

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -690,14 +690,27 @@
     // has to be called after setCropSizes! Tools prior to this point can't handle the Edit mechanism, but that shouldn't be a problem.
     createBuffer (cropw, croph);

+    bool need_fattal = (todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled;
+
     // transform
     if (needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) {
+        int tcx = cropx;
+        int tcy = cropy;
+        int tcw = cropw;
+        int tch = croph;
+        if (need_fattal) {
+            tcx = trafx;
+            tcy = trafy;
+            tcw = trafw;
+            tch = trafh;
+        }
+        
         if (!transCrop) {
-            transCrop = new Imagefloat (cropw, croph);
+            transCrop = new Imagefloat (tcw, tch);
         }

         if (needstransform)
-            parent->ipf.transform (baseCrop, transCrop, cropx / skip, cropy / skip, trafx / skip, trafy / skip, skips (parent->fw, skip), skips (parent->fh, skip), parent->getFullWidth(), parent->getFullHeight(),
+            parent->ipf.transform (baseCrop, transCrop, tcx / skip, tcy / skip, trafx / skip, trafy / skip, skips (parent->fw, skip), skips (parent->fh, skip), parent->getFullWidth(), parent->getFullHeight(),
                                    parent->imgsrc->getMetaData(),
                                    parent->imgsrc->getRotateDegree(), false);
         else {
@@ -716,14 +729,28 @@
     }

     std::unique_ptr<Imagefloat> fattalCrop;
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
+    if (need_fattal) {
         Imagefloat *f = baseCrop;
         if (f == origCrop) {
             fattalCrop.reset(baseCrop->copy());
             f = fattalCrop.get();
         }
         parent->ipf.ToneMapFattal02(f);
-        baseCrop = f;
+        Imagefloat *c = new Imagefloat(cropw, croph);
+#ifdef _OPENMP
+        #pragma omp parallel for
+#endif
+        for (int y = 0; y < croph; ++y) {
+            int cy = y + cropy;
+            for (int x = 0; x < cropw; ++x) {
+                int cx = x + cropx;
+                c->r(y, x) = f->r(cy, cx);
+                c->g(y, x) = f->g(cy, cx);
+                c->b(y, x) = f->b(cy, cx);
+            }
+        }
+        fattalCrop.reset(c);
+        baseCrop = c;
     }

     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
@@ -1156,13 +1183,6 @@
     orw = bw;
     orh = bh;

-    if (check_need_full_image(parent->params)) {
-        orx = bx1 = 0;
-        ory = by1 = 0;
-        orw = bw = parent->fullw;
-        orh = bh = parent->fullh;
-    }
-    
     ProcParams& params = parent->params;

     parent->ipf.transCoord (parent->fw, parent->fh, bx1, by1, bw, bh, orx, ory, orw, orh);
@@ -1202,6 +1222,13 @@
         orh = min (y2 - y1, parent->fh - ory);
     }

+    if (check_need_full_image(parent->params)) {
+        orx = 0;
+        ory = 0;
+        orw = parent->fullw;
+        orh = parent->fullh;
+    }    
+    
     PreviewProps cp (orx, ory, orw, orh, skip);
     int orW, orH;
     parent->imgsrc->getSize (cp, orW, orH);

@agriggio OK for having CA in lens tab before fattal, but yesterday Hombre detected that Graduated Filter before fattal leads to unpredictable results. We should consider moving fattal after CA but before Graduated Filter.

@heckflosse I think you (and @Hombre57) are right. In fact, I suspect that also vignetting will behave in a weird manner if performed before fattal. So, how about separating CA from the rest of transformations, and put fattal in the middle?

Or maybe for the time being just move Fattal before ImProcFunctions::transform and see how it goes...

@agriggio Yes, vignetting from exposure tab should also be after fattal

@agriggio Let's try moving fattal before transform at least...

@heckflosse trying...

@heckflosse here's a candidate patch for moving fattal before transform. can you please test it when you have time?

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -690,6 +690,20 @@
     // has to be called after setCropSizes! Tools prior to this point can't handle the Edit mechanism, but that shouldn't be a problem.
     createBuffer (cropw, croph);

+    std::unique_ptr<Imagefloat> fattalCrop;
+    bool need_cropping = false;
+    
+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        Imagefloat *f = baseCrop;
+        if (f == origCrop) {
+            fattalCrop.reset(baseCrop->copy());
+            f = fattalCrop.get();
+        }
+        parent->ipf.ToneMapFattal02(f);
+        need_cropping = (cropx || cropy || trafw != cropw || trafh != croph);
+        baseCrop = f;
+    }    
+    
     // transform
     if (needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) {
         if (!transCrop) {
@@ -707,6 +721,25 @@
         if (transCrop) {
             baseCrop = transCrop;
         }
+    } else if (need_cropping) {
+        if (!transCrop) {
+            transCrop = new Imagefloat;
+        }
+        transCrop->allocate(cropw, croph);
+
+#ifdef _OPENMP
+        #pragma omp parallel for
+#endif
+        for (int y = 0; y < croph; ++y) {
+            int cy = y + cropy;
+            for (int x = 0; x < cropw; ++x) {
+                int cx = x + cropx;
+                transCrop->r(y, x) = baseCrop->r(cy, cx);
+                transCrop->g(y, x) = baseCrop->g(cy, cx);
+                transCrop->b(y, x) = baseCrop->b(cy, cx);
+            }
+        }
+        baseCrop = transCrop;
     } else {
         if (transCrop) {
             delete transCrop;
@@ -715,17 +748,6 @@
         transCrop = nullptr;
     }

-    std::unique_ptr<Imagefloat> fattalCrop;
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *f = baseCrop;
-        if (f == origCrop) {
-            fattalCrop.reset(baseCrop->copy());
-            f = fattalCrop.get();
-        }
-        parent->ipf.ToneMapFattal02(f);
-        baseCrop = f;
-    }
-    
     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {

         const int W = baseCrop->getWidth();
@@ -1156,13 +1178,6 @@
     orw = bw;
     orh = bh;

-    if (check_need_full_image(parent->params)) {
-        orx = bx1 = 0;
-        ory = by1 = 0;
-        orw = bw = parent->fullw;
-        orh = bh = parent->fullh;
-    }
-    
     ProcParams& params = parent->params;

     parent->ipf.transCoord (parent->fw, parent->fh, bx1, by1, bw, bh, orx, ory, orw, orh);
@@ -1202,6 +1217,13 @@
         orh = min (y2 - y1, parent->fh - ory);
     }

+    if (check_need_full_image(parent->params)) {
+        orx = 0;
+        ory = 0;
+        orw = parent->fullw;
+        orh = parent->fullh;
+    }    
+    
     PreviewProps cp (orx, ory, orw, orh, skip);
     int orW, orH;
     parent->imgsrc->getSize (cp, orW, orH);
diff --git a/rtengine/improccoordinator.cc b/rtengine/improccoordinator.cc
--- a/rtengine/improccoordinator.cc
+++ b/rtengine/improccoordinator.cc
@@ -385,15 +385,26 @@

     readyphase++;

+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        Imagefloat *fattalprev = orig_prev->copy();
+        ipf.ToneMapFattal02(fattalprev);
+        if (oprevi != orig_prev) {
+            delete oprevi;
+        }
+        delete orig_prev;
+        orig_prev = fattalprev;
+        oprevi = nullptr;
+    } 
+
     progress ("Rotate / Distortion...", 100 * readyphase / numofphases);
     // Remove transformation if unneeded
     bool needstransform = ipf.needsTransform();
-
+    
     if (!needstransform && ! ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) && orig_prev != oprevi) {
         delete oprevi;
         oprevi = orig_prev;
     }
-
+    
     if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) ) {
         if (!oprevi || oprevi == orig_prev) {
             oprevi = new Imagefloat (pW, pH);
@@ -407,15 +418,6 @@
         }
     }

-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *fattalprev = oprevi->copy();
-        ipf.ToneMapFattal02(fattalprev);
-        if (oprevi != orig_prev) {
-            delete oprevi;
-        }
-        oprevi = fattalprev;
-    } 
-
     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
         const int W = oprevi->getWidth();
         const int H = oprevi->getHeight();
diff --git a/rtengine/rtthumbnail.cc b/rtengine/rtthumbnail.cc
--- a/rtengine/rtthumbnail.cc
+++ b/rtengine/rtthumbnail.cc
@@ -1090,6 +1090,10 @@

     ipf.firstAnalysis (baseImg, params, hist16);

+    if (params.fattal.enabled) {
+        ipf.ToneMapFattal02(baseImg);
+    }
+    
     // perform transform
     if (ipf.needsTransform()) {
         Imagefloat* trImg = new Imagefloat (fw, fh);
@@ -1102,10 +1106,6 @@
         baseImg = trImg;
     }

-    if (params.fattal.enabled) {
-        ipf.ToneMapFattal02(baseImg);
-    }
-
     // update blurmap
     SHMap* shmap = nullptr;

diff --git a/rtengine/simpleprocess.cc b/rtengine/simpleprocess.cc
--- a/rtengine/simpleprocess.cc
+++ b/rtengine/simpleprocess.cc
@@ -810,6 +810,10 @@

         ipf.firstAnalysis (baseImg, params, hist16);

+        if (params.fattal.enabled) {
+            ipf.ToneMapFattal02(baseImg);
+        }
+                
         // perform transform (excepted resizing)
         if (ipf.needsTransform()) {
             Imagefloat* trImg = nullptr;
@@ -833,10 +837,6 @@
         //ImProcFunctions ipf (&params, true);
         ImProcFunctions &ipf = * (ipf_p.get());

-        if (params.fattal.enabled) {
-            ipf.ToneMapFattal02(baseImg);
-        }
-        
         if (params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
             const int W = baseImg->getWidth();
             const int H = baseImg->getHeight();

@agriggio I just tried your dcrop patch. That already improves the preview time (at 100%) a lot. I will try to improve it further for using detail windows.
I will try your latest patch now.

@heckflosse ok, thanks! Note that my latest patch subsumes my earlier one (i.e. just drop the previous one...)

@agriggio How about moving std::unique_ptr<Imagefloat> fattalCrop; to improccordinator.h and access it from dcrop using parent->fattalCrop; ? That should make opening detail windows faster (at least for the 2nd to nth detail window)

@heckflosse feel free to go ahead, my confidence with that part of the code is not the highest

@agriggio Ok, I will try

hold on, I found a bug. A refined patch will follow shortly

Here's a better patch

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -690,6 +690,20 @@
     // has to be called after setCropSizes! Tools prior to this point can't handle the Edit mechanism, but that shouldn't be a problem.
     createBuffer (cropw, croph);

+    std::unique_ptr<Imagefloat> fattalCrop;
+    bool need_cropping = false;
+    
+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        Imagefloat *f = baseCrop;
+        if (f == origCrop) {
+            fattalCrop.reset(baseCrop->copy());
+            f = fattalCrop.get();
+        }
+        parent->ipf.ToneMapFattal02(f);
+        need_cropping = (cropx || cropy || trafw != cropw || trafh != croph);
+        baseCrop = f;
+    }    
+    
     // transform
     if (needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) {
         if (!transCrop) {
@@ -707,6 +721,25 @@
         if (transCrop) {
             baseCrop = transCrop;
         }
+    } else if (need_cropping) {
+        if (!transCrop) {
+            transCrop = new Imagefloat;
+        }
+        transCrop->allocate(cropw, croph);
+
+#ifdef _OPENMP
+        #pragma omp parallel for
+#endif
+        for (int y = 0; y < croph; ++y) {
+            int cy = y + cropy;
+            for (int x = 0; x < cropw; ++x) {
+                int cx = x + cropx;
+                transCrop->r(y, x) = baseCrop->r(cy, cx);
+                transCrop->g(y, x) = baseCrop->g(cy, cx);
+                transCrop->b(y, x) = baseCrop->b(cy, cx);
+            }
+        }
+        baseCrop = transCrop;
     } else {
         if (transCrop) {
             delete transCrop;
@@ -715,17 +748,6 @@
         transCrop = nullptr;
     }

-    std::unique_ptr<Imagefloat> fattalCrop;
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *f = baseCrop;
-        if (f == origCrop) {
-            fattalCrop.reset(baseCrop->copy());
-            f = fattalCrop.get();
-        }
-        parent->ipf.ToneMapFattal02(f);
-        baseCrop = f;
-    }
-    
     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {

         const int W = baseCrop->getWidth();
@@ -1156,13 +1178,6 @@
     orw = bw;
     orh = bh;

-    if (check_need_full_image(parent->params)) {
-        orx = bx1 = 0;
-        ory = by1 = 0;
-        orw = bw = parent->fullw;
-        orh = bh = parent->fullh;
-    }
-    
     ProcParams& params = parent->params;

     parent->ipf.transCoord (parent->fw, parent->fh, bx1, by1, bw, bh, orx, ory, orw, orh);
@@ -1202,6 +1217,13 @@
         orh = min (y2 - y1, parent->fh - ory);
     }

+    if (check_need_full_image(parent->params)) {
+        orx = 0;
+        ory = 0;
+        orw = parent->fullw;
+        orh = parent->fullh;
+    }    
+    
     PreviewProps cp (orx, ory, orw, orh, skip);
     int orW, orH;
     parent->imgsrc->getSize (cp, orW, orH);
diff --git a/rtengine/improccoordinator.cc b/rtengine/improccoordinator.cc
--- a/rtengine/improccoordinator.cc
+++ b/rtengine/improccoordinator.cc
@@ -385,36 +385,33 @@

     readyphase++;

-    progress ("Rotate / Distortion...", 100 * readyphase / numofphases);
-    // Remove transformation if unneeded
-    bool needstransform = ipf.needsTransform();
-
-    if (!needstransform && ! ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) && orig_prev != oprevi) {
-        delete oprevi;
-        oprevi = orig_prev;
-    }
-
-    if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) ) {
-        if (!oprevi || oprevi == orig_prev) {
-            oprevi = new Imagefloat (pW, pH);
-        }
-
-        if (needstransform)
-            ipf.transform (orig_prev, oprevi, 0, 0, 0, 0, pW, pH, fw, fh, 
-                           imgsrc->getMetaData(), imgsrc->getRotateDegree(), false);
-        else {
-            orig_prev->copyData (oprevi);
-        }
-    }
-
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *fattalprev = oprevi->copy();
+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        Imagefloat *fattalprev = orig_prev->copy();
         ipf.ToneMapFattal02(fattalprev);
         if (oprevi != orig_prev) {
             delete oprevi;
         }
         oprevi = fattalprev;
-    } 
+    } else {
+        oprevi = orig_prev;
+    }
+
+    progress ("Rotate / Distortion...", 100 * readyphase / numofphases);
+    // Remove transformation if unneeded
+    bool needstransform = ipf.needsTransform();
+    
+    if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) ) {
+        assert(oprevi);
+        Imagefloat *op = oprevi;
+        oprevi = new Imagefloat (pW, pH);
+
+        if (needstransform)
+            ipf.transform (op, oprevi, 0, 0, 0, 0, pW, pH, fw, fh, 
+                           imgsrc->getMetaData(), imgsrc->getRotateDegree(), false);
+        else {
+            op->copyData (oprevi);
+        }
+    }

     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
         const int W = oprevi->getWidth();
diff --git a/rtengine/rtthumbnail.cc b/rtengine/rtthumbnail.cc
--- a/rtengine/rtthumbnail.cc
+++ b/rtengine/rtthumbnail.cc
@@ -1090,6 +1090,10 @@

     ipf.firstAnalysis (baseImg, params, hist16);

+    if (params.fattal.enabled) {
+        ipf.ToneMapFattal02(baseImg);
+    }
+    
     // perform transform
     if (ipf.needsTransform()) {
         Imagefloat* trImg = new Imagefloat (fw, fh);
@@ -1102,10 +1106,6 @@
         baseImg = trImg;
     }

-    if (params.fattal.enabled) {
-        ipf.ToneMapFattal02(baseImg);
-    }
-
     // update blurmap
     SHMap* shmap = nullptr;

diff --git a/rtengine/simpleprocess.cc b/rtengine/simpleprocess.cc
--- a/rtengine/simpleprocess.cc
+++ b/rtengine/simpleprocess.cc
@@ -810,6 +810,10 @@

         ipf.firstAnalysis (baseImg, params, hist16);

+        if (params.fattal.enabled) {
+            ipf.ToneMapFattal02(baseImg);
+        }
+                
         // perform transform (excepted resizing)
         if (ipf.needsTransform()) {
             Imagefloat* trImg = nullptr;
@@ -833,10 +837,6 @@
         //ImProcFunctions ipf (&params, true);
         ImProcFunctions &ipf = * (ipf_p.get());

-        if (params.fattal.enabled) {
-            ipf.ToneMapFattal02(baseImg);
-        }
-        
         if (params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
             const int W = baseImg->getWidth();
             const int H = baseImg->getHeight();

@agriggio Graduated filter works fine now as does the navigator preview ;-)

@agriggio I found a bug. To reproduce. Open image, apply fattal, press z to zoom to centre of image.
Result: Upper left corner of the image is shown instead of centre

@agriggio I forgot to mention that to reproduce the bug, graduated Filter has to be enabled

@heckflosse I'll take a look

Updated patch that should fix the issue above. Please note though that:

  1. in principle we could do better wrt. caching/buffering, but

  2. the dcrop.cc code is getting messier every day, and this is not helping

So, I suggest that we call it a day at some point without aiming for perfection at the moment. If we are reasonably happy, let's try to allocate some time for the more general pipeline refactoring. Playing with dcrop in its current state is not much fun, at least for me...

diff --git a/rtengine/dcrop.cc b/rtengine/dcrop.cc
--- a/rtengine/dcrop.cc
+++ b/rtengine/dcrop.cc
@@ -690,14 +690,46 @@
     // has to be called after setCropSizes! Tools prior to this point can't handle the Edit mechanism, but that shouldn't be a problem.
     createBuffer (cropw, croph);

+    std::unique_ptr<Imagefloat> fattalCrop;
+    bool need_cropping = false;
+    bool has_fattal = false;
+    
+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        has_fattal = true;
+        Imagefloat *f = baseCrop;
+        if (f == origCrop) {
+            fattalCrop.reset(baseCrop->copy());
+            f = fattalCrop.get();
+        }
+        parent->ipf.ToneMapFattal02(f);
+        need_cropping = (cropx || cropy || trafw != cropw || trafh != croph);
+        baseCrop = f;
+    }
+    
     // transform
     if (needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) {
+        int tx = cropx;
+        int ty = cropy;
+        int tw = cropw;
+        int th = croph;
+        
+        if (has_fattal) {
+            tx = 0;
+            ty = 0;
+            tw = trafw;
+            th = trafh;
+            if (transCrop) {
+                delete transCrop;
+                transCrop = nullptr;
+            }
+        }
+        
         if (!transCrop) {
-            transCrop = new Imagefloat (cropw, croph);
+            transCrop = new Imagefloat (tw, th);
         }

         if (needstransform)
-            parent->ipf.transform (baseCrop, transCrop, cropx / skip, cropy / skip, trafx / skip, trafy / skip, skips (parent->fw, skip), skips (parent->fh, skip), parent->getFullWidth(), parent->getFullHeight(),
+            parent->ipf.transform (baseCrop, transCrop, tx / skip, ty / skip, trafx / skip, trafy / skip, skips (parent->fw, skip), skips (parent->fh, skip), parent->getFullWidth(), parent->getFullHeight(),
                                    parent->imgsrc->getMetaData(),
                                    parent->imgsrc->getRotateDegree(), false);
         else {
@@ -715,17 +747,26 @@
         transCrop = nullptr;
     }

-    std::unique_ptr<Imagefloat> fattalCrop;
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *f = baseCrop;
-        if (f == origCrop) {
-            fattalCrop.reset(baseCrop->copy());
-            f = fattalCrop.get();
+    if (need_cropping) {
+        Imagefloat *c = new Imagefloat(cropw, croph);
+
+#ifdef _OPENMP
+        #pragma omp parallel for
+#endif
+        for (int y = 0; y < croph; ++y) {
+            int cy = y + cropy;
+            for (int x = 0; x < cropw; ++x) {
+                int cx = x + cropx;
+                c->r(y, x) = baseCrop->r(cy, cx);
+                c->g(y, x) = baseCrop->g(cy, cx);
+                c->b(y, x) = baseCrop->b(cy, cx);
+            }
         }
-        parent->ipf.ToneMapFattal02(f);
-        baseCrop = f;
+        fattalCrop.reset(c);
+        baseCrop = c;
     }

+
     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {

         const int W = baseCrop->getWidth();
@@ -1156,13 +1197,6 @@
     orw = bw;
     orh = bh;

-    if (check_need_full_image(parent->params)) {
-        orx = bx1 = 0;
-        ory = by1 = 0;
-        orw = bw = parent->fullw;
-        orh = bh = parent->fullh;
-    }
-    
     ProcParams& params = parent->params;

     parent->ipf.transCoord (parent->fw, parent->fh, bx1, by1, bw, bh, orx, ory, orw, orh);
@@ -1202,6 +1236,16 @@
         orh = min (y2 - y1, parent->fh - ory);
     }

+    leftBorder  = skips (rqx1 - bx1, skip);
+    upperBorder = skips (rqy1 - by1, skip);
+
+    if (check_need_full_image(parent->params)) {
+        orx = 0;
+        ory = 0;
+        orw = parent->fullw;
+        orh = parent->fullh;
+    }    
+    
     PreviewProps cp (orx, ory, orw, orh, skip);
     int orW, orH;
     parent->imgsrc->getSize (cp, orW, orH);
@@ -1212,9 +1256,6 @@
     int cw = skips (bw, skip);
     int ch = skips (bh, skip);

-    leftBorder  = skips (rqx1 - bx1, skip);
-    upperBorder = skips (rqy1 - by1, skip);
-
     if (settings->verbose) {
         printf ("setsizes starts (%d, %d, %d, %d, %d, %d)\n", orW, orH, trafw, trafh, cw, ch);
     }
diff --git a/rtengine/improccoordinator.cc b/rtengine/improccoordinator.cc
--- a/rtengine/improccoordinator.cc
+++ b/rtengine/improccoordinator.cc
@@ -385,36 +385,33 @@

     readyphase++;

-    progress ("Rotate / Distortion...", 100 * readyphase / numofphases);
-    // Remove transformation if unneeded
-    bool needstransform = ipf.needsTransform();
-
-    if (!needstransform && ! ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) && orig_prev != oprevi) {
-        delete oprevi;
-        oprevi = orig_prev;
-    }
-
-    if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) ) {
-        if (!oprevi || oprevi == orig_prev) {
-            oprevi = new Imagefloat (pW, pH);
-        }
-
-        if (needstransform)
-            ipf.transform (orig_prev, oprevi, 0, 0, 0, 0, pW, pH, fw, fh, 
-                           imgsrc->getMetaData(), imgsrc->getRotateDegree(), false);
-        else {
-            orig_prev->copyData (oprevi);
-        }
-    }
-
-    if ((todo & M_RGBCURVE) && params.fattal.enabled) {
-        Imagefloat *fattalprev = oprevi->copy();
+    if ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.fattal.enabled) {
+        Imagefloat *fattalprev = orig_prev->copy();
         ipf.ToneMapFattal02(fattalprev);
         if (oprevi != orig_prev) {
             delete oprevi;
         }
         oprevi = fattalprev;
-    } 
+    } else {
+        oprevi = orig_prev;
+    }
+
+    progress ("Rotate / Distortion...", 100 * readyphase / numofphases);
+    // Remove transformation if unneeded
+    bool needstransform = ipf.needsTransform();
+    
+    if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled)) ) {
+        assert(oprevi);
+        Imagefloat *op = oprevi;
+        oprevi = new Imagefloat (pW, pH);
+
+        if (needstransform)
+            ipf.transform (op, oprevi, 0, 0, 0, 0, pW, pH, fw, fh, 
+                           imgsrc->getMetaData(), imgsrc->getRotateDegree(), false);
+        else {
+            op->copyData (oprevi);
+        }
+    }

     if ((todo & (M_TRANSFORM | M_RGBCURVE))  && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
         const int W = oprevi->getWidth();
diff --git a/rtengine/rtthumbnail.cc b/rtengine/rtthumbnail.cc
--- a/rtengine/rtthumbnail.cc
+++ b/rtengine/rtthumbnail.cc
@@ -1090,6 +1090,10 @@

     ipf.firstAnalysis (baseImg, params, hist16);

+    if (params.fattal.enabled) {
+        ipf.ToneMapFattal02(baseImg);
+    }
+    
     // perform transform
     if (ipf.needsTransform()) {
         Imagefloat* trImg = new Imagefloat (fw, fh);
@@ -1102,10 +1106,6 @@
         baseImg = trImg;
     }

-    if (params.fattal.enabled) {
-        ipf.ToneMapFattal02(baseImg);
-    }
-
     // update blurmap
     SHMap* shmap = nullptr;

diff --git a/rtengine/simpleprocess.cc b/rtengine/simpleprocess.cc
--- a/rtengine/simpleprocess.cc
+++ b/rtengine/simpleprocess.cc
@@ -810,6 +810,10 @@

         ipf.firstAnalysis (baseImg, params, hist16);

+        if (params.fattal.enabled) {
+            ipf.ToneMapFattal02(baseImg);
+        }
+                
         // perform transform (excepted resizing)
         if (ipf.needsTransform()) {
             Imagefloat* trImg = nullptr;
@@ -833,10 +837,6 @@
         //ImProcFunctions ipf (&params, true);
         ImProcFunctions &ipf = * (ipf_p.get());

-        if (params.fattal.enabled) {
-            ipf.ToneMapFattal02(baseImg);
-        }
-        
         if (params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled) {
             const int W = baseImg->getWidth();
             const int H = baseImg->getHeight();

@agriggio Great! I agree about dcrop and refactoring. I will test your new patch in about two hours.

@agriggio I'll have a look at your proposed patch and modify it if you don't mind.

@Hombre57 sure, go ahead :+1:

I started the patch, but I need to dig into the mechanism again, so I'll finish this tomorrow (Saturday).

Just a few words to say that I'm working on this (since few hours now), but the way that image size and scale are handled is just a mess and should be unified.

Was this page helpful?
0 / 5 - 0 ratings