Rawtherapee: Memory leak of updated preview image

Created on 10 Nov 2018 · 17 Comments · Source: Beep6581/RawTherapee

When transformations are active, zooming in and out of the image many times (for example) causes a huge memory leak (because the original Imagefloat object is never deleted after improccoordinator.cc:445?).

        if ((needstransform || ((todo & (M_TRANSFORM | M_RGBCURVE)) && params.dirpyrequalizer.cbdlMethod == "bef" && params.dirpyrequalizer.enabled && !params.colorappearance.enabled))) {
            assert(oprevi);
            Imagefloat *op = oprevi;
            oprevi = new Imagefloat(pW, pH);

            if (needstransform) {
                ipf.transform(op, oprevi, 0, 0, 0, 0, pW, pH, fw, fh,
                              imgsrc->getMetaData(), imgsrc->getRotateDegree(), false);
            } else {
                op->copyData(oprevi);
            }
        }

Leakage report:

Indirect leak of 26901216 byte(s) in 9 object(s) allocated from:
    #0 0x7fce1ecd3ed0 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xe8ed0)
    #1 0x5604ae8d497c in AlignedBuffer<float>::resize(unsigned long, int) /store1/tmp/RawTherapee-repo/rtengine/alignedbuffer.h:112
    #2 0x5604ae8d497c in rtengine::PlanarRGBData<float>::allocate(int, int) /store1/tmp/RawTherapee-repo/rtengine/iimage.h:700
    #3 0x5604ae8fef1c in rtengine::ImProcCoordinator::updatePreviewImage(int, bool) /store1/tmp/RawTherapee-repo/rtengine/improccoordinator.cc:445
    #4 0x5604ae90cdcb in rtengine::ImProcCoordinator::process() /store1/tmp/RawTherapee-repo/rtengine/improccoordinator.cc:1468
    #5 0x7fce1d50a6a9  (/usr/lib/x86_64-linux-gnu/libglibmm-2.4.so.1+0x516a9)

ps: latest dev build

Labels: patch provided, bug

All 17 comments

This should fix the leak:

diff --git a/rtengine/improccoordinator.cc b/rtengine/improccoordinator.cc
index f68629564..ac330db23 100644
--- a/rtengine/improccoordinator.cc
+++ b/rtengine/improccoordinator.cc
@@ -953,6 +953,11 @@ void ImProcCoordinator::updatePreviewImage(int todo, bool panningRelatedChange)
             hListener->histogramChanged(histRed, histGreen, histBlue, histLuma, histToneCurve, histLCurve, histCCurve, /*histCLurve, histLLCurve,*/ histLCAM, histCCAM, histRedRaw, histGreenRaw, histBlueRaw, histChroma, histLRETI);
         }
     }
+    if (orig_prev != oprevi) {
+        delete oprevi;
+        oprevi = nullptr;
+    }
+

 }
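The ownership problem that the patch addresses can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Image`, `liveImages()` and `updatePreview()` are hypothetical stand-ins, not RawTherapee code, and the actual fix is the explicit `delete` shown in the diff above. The sketch only shows how holding the temporary buffer in a `std::unique_ptr` makes the cleanup automatic on every path out of the function.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter of live Image objects, used to make leaks visible.
inline int& liveImages() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for rtengine::Imagefloat (not the real class).
struct Image {
    std::vector<float> data;

    Image(int w, int h) : data(static_cast<std::size_t>(w) * static_cast<std::size_t>(h)) {
        assert(w > 0 && h > 0);
        ++liveImages();
    }
    Image(const Image&) = delete;
    Image& operator=(const Image&) = delete;
    ~Image() { --liveImages(); }
};

// One preview update. `source` plays the role of orig_prev; the temporary
// buffer plays the role of the newly allocated oprevi.
// Returns the number of pixels in the image that was finally used.
std::size_t updatePreview(const Image& source, bool needsTransform, int pW, int pH) {
    const Image* current = &source;
    std::unique_ptr<Image> transformed;  // owns the temporary, if one is created

    if (needsTransform) {
        transformed = std::make_unique<Image>(pW, pH);
        current = transformed.get();
    }

    return current->data.size();
}   // `transformed` is destroyed here, so the temporary cannot leak
```

Calling `updatePreview(src, true, 8, 8)` in a loop leaves `liveImages()` unchanged afterwards, which is the property the raw-pointer version lacked before the patch.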

Thx, but it does not seem to solve the root issue. The added delete is executed only once or twice when I zoom in and out of the same image endlessly, yet the memory footprint keeps growing about as fast as before.

@Konyicsiva If you put a printf in line 443 of improccoordinator.cc you will see that the oprevi = new Imagefloat(pW, pH); is not called when zooming, so the other leak must be somewhere else.

@Konyicsiva I straightened an image (to have a transform) and zoomed in and out for about 5 minutes while watching the memory footprint. No leak here.

You are right, line 443 is not executed, except twice, just like your added delete.
I compiled it again in debug mode with the sanitizer and could not reproduce the error! Eh.

But if I compile it in release mode with full optimization, it leaks.
I open an image (a 20 Mpix Canon CR2 raw), set some transformation, just zoom in and out, and watch the RSS grow. (The growth stops if I do not play with the zoom.)

while true; do echo -n `date`"  RSS=" ;ps -p `pidof rawtherapee` -h -o rss; sleep 1;done
Sat 10 Nov 17:30:16 CET 2018  RSS=128696
Sat 10 Nov 17:30:17 CET 2018  RSS=218108
Sat 10 Nov 17:30:18 CET 2018  RSS=235400
Sat 10 Nov 17:30:19 CET 2018  RSS=244116
Sat 10 Nov 17:30:20 CET 2018  RSS=252412
Sat 10 Nov 17:30:21 CET 2018  RSS=478260
Sat 10 Nov 17:30:22 CET 2018  RSS=767892
Sat 10 Nov 17:30:23 CET 2018  RSS=767892
Sat 10 Nov 17:30:24 CET 2018  RSS=767892
Sat 10 Nov 17:30:25 CET 2018  RSS=767892
Sat 10 Nov 17:30:26 CET 2018  RSS=1090836
Sat 10 Nov 17:30:27 CET 2018  RSS=1172732
Sat 10 Nov 17:30:28 CET 2018  RSS=1401624
Sat 10 Nov 17:30:29 CET 2018  RSS=1468748
Sat 10 Nov 17:30:30 CET 2018  RSS=1484880
Sat 10 Nov 17:30:31 CET 2018  RSS=1503880
Sat 10 Nov 17:30:32 CET 2018  RSS=1602996
Sat 10 Nov 17:30:33 CET 2018  RSS=1626180
Sat 10 Nov 17:30:34 CET 2018  RSS=1578384
Sat 10 Nov 17:30:35 CET 2018  RSS=1700416
Sat 10 Nov 17:30:36 CET 2018  RSS=1670052
Sat 10 Nov 17:30:37 CET 2018  RSS=1698628
Sat 10 Nov 17:30:38 CET 2018  RSS=1792440
Sat 10 Nov 17:30:39 CET 2018  RSS=1824948
Sat 10 Nov 17:30:40 CET 2018  RSS=1789220
Sat 10 Nov 17:30:41 CET 2018  RSS=1865732
Sat 10 Nov 17:30:42 CET 2018  RSS=2044240
Sat 10 Nov 17:30:43 CET 2018  RSS=2058816
Sat 10 Nov 17:30:44 CET 2018  RSS=2242996
Sat 10 Nov 17:30:45 CET 2018  RSS=2307128
Sat 10 Nov 17:30:46 CET 2018  RSS=2370520
Sat 10 Nov 17:30:47 CET 2018  RSS=2523844
Sat 10 Nov 17:30:48 CET 2018  RSS=2524068
Sat 10 Nov 17:30:49 CET 2018  RSS=2533792
Sat 10 Nov 17:30:50 CET 2018  RSS=2667956
Sat 10 Nov 17:30:51 CET 2018  RSS=2742424
Sat 10 Nov 17:30:52 CET 2018  RSS=2686892
Sat 10 Nov 17:30:53 CET 2018  RSS=2730672
Sat 10 Nov 17:30:54 CET 2018  RSS=2735180
Sat 10 Nov 17:30:55 CET 2018  RSS=2780244
Sat 10 Nov 17:30:56 CET 2018  RSS=2810608
Sat 10 Nov 17:30:57 CET 2018  RSS=2900772
Sat 10 Nov 17:30:58 CET 2018  RSS=2926980
Sat 10 Nov 17:30:59 CET 2018  RSS=3013920
Sat 10 Nov 17:31:00 CET 2018  RSS=2923760
Sat 10 Nov 17:31:01 CET 2018  RSS=3051196
Sat 10 Nov 17:31:02 CET 2018  RSS=2991352
Sat 10 Nov 17:31:03 CET 2018  RSS=3092900
Sat 10 Nov 17:31:04 CET 2018  RSS=3159528
Sat 10 Nov 17:31:05 CET 2018  RSS=3148588
Sat 10 Nov 17:31:06 CET 2018  RSS=3165000
Sat 10 Nov 17:31:07 CET 2018  RSS=3277140
Sat 10 Nov 17:31:09 CET 2018  RSS=3332584
Sat 10 Nov 17:31:10 CET 2018  RSS=3335164
Sat 10 Nov 17:31:11 CET 2018  RSS=3351692
Sat 10 Nov 17:31:12 CET 2018  RSS=3349816
Sat 10 Nov 17:31:13 CET 2018  RSS=3356068
Sat 10 Nov 17:31:14 CET 2018  RSS=3502792
Sat 10 Nov 17:31:15 CET 2018  RSS=3415828
Sat 10 Nov 17:31:16 CET 2018  RSS=3468152
Sat 10 Nov 17:31:17 CET 2018  RSS=3469340
Sat 10 Nov 17:31:18 CET 2018  RSS=3475080
Sat 10 Nov 17:31:19 CET 2018  RSS=3534880
Sat 10 Nov 17:31:20 CET 2018  RSS=3601092
Sat 10 Nov 17:31:21 CET 2018  RSS=3598632
Sat 10 Nov 17:31:22 CET 2018  RSS=3753216
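The same figure that the shell loop above polls with `ps` can also be read from inside a process. The following is a minimal sketch, assuming a Linux system where `/proc/self/status` contains a `VmRSS:` line reported in kB; `parseVmRssKb` and `currentRssKb` are hypothetical helper names and are not part of RawTherapee.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Scans text in the format of /proc/<pid>/status and returns the VmRSS
// value in kB, or -1 if no parsable VmRSS line is found.
long parseVmRssKb(std::istream& status) {
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;  // skips the padding before the number
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Returns the resident set size of the calling process in kB,
// or -1 where /proc/self/status is unavailable (e.g. non-Linux systems).
long currentRssKb() {
    std::ifstream status("/proc/self/status");
    return status ? parseVmRssKb(status) : -1;
}
```

Logging `currentRssKb()` before and after each zoom step would give a trace comparable to the one above, without depending on an external polling loop.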

@Konyicsiva maybe related to #4416 ?

No, exporting the images does not make the RSS bigger. Only editing/zooming makes it leak. I'm going to try to sanitize the optimized build to catch the cause.

@Konyicsiva I just pushed the fix for the leak from first post. Issue stays open for the other leak(s)

I also recently noticed the leak when zooming in and out, but I see it with the neutral profile as well. In addition, about 0.9% of memory (of 8GB) remained allocated for each image after closing the editor tab.
Furthermore, more than 2% remains allocated for each image when processing the queue.

I also commented on #4416. I was somewhat unsure. Now with this issue I think there might really be more leaks. It's a real pain for me, recently I came home with lots of photos, started processing them - and made the 8GB machine go into full swap mode, with no image opened anymore. RT was the only running application.

I just processed 30 images using the queue on Win7. I cannot reproduce the leaks.

I also processed some photos to test the patch for #4979 - this is the result:

[Image: memory_leak]

All images closed, queue empty all the time, still consumes 51%/~4GB of RAM. The only thing I did was loading Auto matched curve and then Standard film curve and compared them by switching back and forth in the history.

Strange. If I switch on the gcc address sanitizer, the bug no longer occurs. But if I compile the project with the very same options apart from removing "-fsanitize=address" from CXX_FLAGS, the leak still gobbles up all the memory. Here is a video. It took about a minute to go from 800MB to 4GB RSS:
youtube/memleak

@Konyicsiva Thanks for the video :+1: I already trusted your findings before viewing the video. But I still can't reproduce it. I'm on Windows. Maybe @Floessie has an idea?

Seems related to #4416, though I couldn't reproduce it at that time. @PkmX provided some good hints there we could build on. And I guess @thirtythreeforty has expertise on this topic. I'm currently busy, sorry.

HTH,
Flössie

Hmm. This smells like the leak I was observing in 5.4. I will feed a debug build to massif and see if I can find where all the allocation is happening.

I have investigated this with an exquisitely painful massif run with low-level page allocation turned on. For those interested, I am providing a sample ms_print_output.txt dump.

Functions that underwent special scrutiny include AlignedBuffer::resize() and wavelet_level::create(); they are slinging a lot of raw pointers. Everything appears to be in order, however.

I believe that with @heckflosse's patch, the actual leak is resolved, and RT is managing memory correctly, at least in this part of the code. The fact remains that the program's resident set size continues to increase, and I think that malloc is managing system memory poorly, because Valgrind finds gigabytes of memory allocated by malloc via mmap. I am not sure why it behaves this way. @PkmX's env var workaround from #4416 in this comment causes RT's memory usage to behave.

I think this issue is resolved, and any further effort should go toward nailing down whatever pathological edge case is being triggered in malloc to cause the inflation.
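One way to probe the hypothesis that freed memory is being retained by the allocator rather than leaked is to ask the allocator to hand free pages back and then compare the RSS before and after. The sketch below is an assumption-laden illustration, not the environment-variable workaround from #4416 (which this thread does not spell out). It uses glibc's `malloc_trim`, which is only available on glibc systems; `churn` and `trimHeap` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim (glibc extension)
#endif

// Allocates `count` buffers of `bytes` each and frees them all on return,
// loosely mimicking repeated allocation of preview buffers.
// Returns the total number of bytes that were allocated.
std::size_t churn(std::size_t count, std::size_t bytes) {
    assert(bytes > 0);
    std::vector<std::unique_ptr<char[]>> blocks;
    blocks.reserve(count);
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        blocks.emplace_back(new char[bytes]());  // zero-initialised so pages are touched
        total += bytes;
    }
    return total;  // every block is freed when `blocks` goes out of scope
}

// Asks the allocator to return free heap memory to the operating system.
// Returns true only if glibc reports that some memory was released;
// always returns false on non-glibc platforms.
bool trimHeap() {
#if defined(__GLIBC__)
    return malloc_trim(0) == 1;
#else
    return false;
#endif
}
```

If the RSS drops noticeably after `trimHeap()`, the growth is more likely allocator retention than a true leak; if it does not, the memory is probably still referenced somewhere.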

As this issue is likely resolved, @heckflosse would you be ok with closing it for 5.5 and eventually opening a new one for 5.6?
