Caffe: Double free or corruption issue when make runtest or train mnist model

Created on 14 Feb 2017  ·  11 Comments  ·  Source: BVLC/caffe

Issue summary

I can build caffe successfully: make all, make pycaffe, and make test all finish without errors.
When I run make runtest, it stops immediately. When I train the mnist model, it stops early and gives the same error.

I didn't change anything; I just cloned the repository and ran make. I have struggled with this issue for a long time. Can anybody help me find out what is wrong? Thanks.

*** Error in `.build_debug/tools/caffe': double free or corruption (out): 0x0000000002119160 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f01f4ea87e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f01f4eb0e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f01f4eb498c]
/usr/lib/x86_64-linux-gnu/libprotobuf.so.9(_ZN6google8protobuf8internal28DestroyDefaultRepeatedFieldsEv+0x1f)[0x7f01f61be8af]
/usr/lib/x86_64-linux-gnu/libprotobuf.so.9(_ZN6google8protobuf23ShutdownProtobufLibraryEv+0x8b)[0x7f01f61bdb3b]
/usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3(+0x20329)[0x7f01d04fd329]
/lib64/ld-linux-x86-64.so.2(+0x10c17)[0x7f01f85e8c17]
/lib/x86_64-linux-gnu/libc.so.6(+0x39ff8)[0x7f01f4e6aff8]
/lib/x86_64-linux-gnu/libc.so.6(+0x3a045)[0x7f01f4e6b045]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf7)[0x7f01f4e51837]
.build_debug/tools/caffe[0x426dd9]

I attached my Makefile.config:
Makefile.config.pdf

I also attached the full debug output for record.
debug output.pdf

Your system configuration

Operating system: Ubuntu 16.04 Desktop
Compiler: gcc
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 5.1
BLAS: atlas
Python or MATLAB version (for pycaffe and matcaffe respectively): anaconda python 2.7

Best,
Weldon

All 11 comments

Sorry, this seems to be a system issue. Please ask installation questions on the mailing list.

From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

_Please do not post usage, installation, or modeling questions, or other requests for help to Issues._
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.

Hi, I ran into this problem recently when running the caffe/ssd branch. The cause turned out to be that caffe was linked against both libprotobuf.so and libprotobuf-lite.so at the same time, which leads to the same memory being freed twice. You can check whether you have this double-link problem by listing the libraries that the built caffe binary links to:

ldd caffe | grep proto

In my case, caffe was linked against libprotobuf.so.10, libprotobuf-lite.so.10 and libmirprotobuf.so.3 at the same time, and the latter two were pulled in through opencv_highgui. After I removed OpenCV's highgui library from caffe's makefile and the functions that use it from the source files, the problem was gone.

Hope this helps and good luck!
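
To make the ldd check above a bit more mechanical, here is a minimal sketch that reads ldd output on standard input and reports whether both protobuf runtimes show up. The function name has_proto_conflict is made up for this example and is not part of caffe.

```shell
#!/bin/sh
# has_proto_conflict: hypothetical helper, not part of caffe.
# Reads `ldd` output on stdin and prints "conflict" when both
# libprotobuf.so and libprotobuf-lite.so appear, otherwise "ok".
has_proto_conflict() {
    input=$(cat)
    # grep -c prints the number of matching lines (0 when there are none).
    full=$(printf '%s\n' "$input" | grep -c 'libprotobuf\.so')
    lite=$(printf '%s\n' "$input" | grep -c 'libprotobuf-lite\.so')
    if [ "$full" -gt 0 ] && [ "$lite" -gt 0 ]; then
        echo "conflict"
    else
        echo "ok"
    fi
}

# Example use against the binary from the error message:
#   ldd .build_debug/tools/caffe | has_proto_conflict
```

A "conflict" result only says that both libraries are loaded into the same process; it does not tell you which dependency brought in libprotobuf-lite.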

@cailile thank you for your comment. I encountered this problem recently and you helped me fix it. The GTK build of opencv_highgui was responsible for bringing in libprotobuf-lite.so. My fix, which does not require changing the source code, was to rebuild OpenCV against Qt5 instead of GTK and then rebuild caffe. On Ubuntu 16.04 the Qt5 package is "qt5-default" and the OpenCV cmake option is WITH_QT.
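
If you want to see for yourself which library drags in libprotobuf-lite, one possible approach is to walk the libraries that ldd resolves and look at each one's direct NEEDED entries. This is only a sketch: resolved_libs and who_needs are names invented here, and it assumes objdump from binutils is installed.

```shell
#!/bin/sh
# resolved_libs: hypothetical helper. Reads `ldd` output on stdin and
# prints the resolved path of every "name => /path (addr)" line.
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# who_needs: hypothetical helper. Prints each library loaded by BINARY
# whose direct dependencies (NEEDED entries) match PATTERN.
# Usage: who_needs PATTERN BINARY
who_needs() {
    pattern=$1
    binary=$2
    ldd "$binary" | resolved_libs | while read -r lib; do
        if objdump -p "$lib" 2>/dev/null | grep NEEDED | grep -q "$pattern"; then
            echo "$lib"
        fi
    done
}

# Example use:
#   who_needs libprotobuf-lite .build_debug/tools/caffe
```

Repeating the call with the name of whatever library it prints lets you follow the chain upwards one step at a time.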

@cailile I have encountered the exact same problem while installing the caffe/ssd branch as mentioned here. However, the solution you described is a bit unclear, and it would really help if you could elaborate on how you solved it. Thanks a lot.

Hi, as far as I can recall, the only place that uses functions from
highgui_core is bbox_util.cpp. Commenting out these lines:
cv::imshow("detections", image);
if (cv::waitKey(1) == 27) {
  raise(SIGINT);
}
should solve the problem.

However, I do think jmuncaster's solution is better, since the root cause
is the libprotobuf-lite dependency pulled in by libgtk-3.0. Rolling back to
Ubuntu 14.04 will also solve this problem, since Ubuntu 14.04 uses gtk-2.0,
which does not pull in libprotobuf-lite.

Best Regards,
Lile

@jontitalukdar Here are some more comments. The solution I currently use is to roll back to Ubuntu 14.04, because simply excluding opencv_highgui when building caffe only solves the problem on the caffe side. Later on, when I wanted to import both caffe and cv2 in Python, the problem came up again. I am not sure whether there is a way for libprotobuf and libprotobuf-lite to run together. @jmuncaster's solution is worth a try. If he had posted it earlier, I might not have had to roll back to Ubuntu 14.04 :)

@cailile Thank you so much for your reply. You are absolutely correct, opencv_highgui causes problems when importing both caffe and cv2 within the same script. Moreover, I had installed opencv in a python virtual environment, which caused some further errors. Removing either of the two, libprotobuf or libprotobuf-lite, might cause further unforeseen problems in the future.

So I tried rebuilding OpenCV using Qt5 instead of GTK as proposed by @jmuncaster , and it worked!
I cleaned the original OpenCV build and then reinstalled it with Qt5.

make clean
mkdir build
cd build/
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=/usr/local -DFORCE_VTK=ON -DWITH_TBB=ON -DWITH_V4L=ON -DWITH_QT=ON -DWITH_OPENGL=ON -DWITH_CUBLAS=ON -DCUDA_NVCC_FLAGS="-D_FORCE_INLINES" -DWITH_GDAL=ON -DWITH_XINE=ON -DBUILD_EXAMPLES=ON ..

I also added the OpenCV library path to the Caffe Makefile.config and then rebuilt ssd/caffe using make.

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial /usr/local/share/OpenCV/3rdparty/lib/

It seems to have worked for me so far. I will keep a close watch in case any other discrepancies crop up, but it works for now!
Thank you so much for your help @cailile :)

@cailile Thank you so much for your solution. The double free or corruption problem is gone now. The side effect is that when caffe is built without highgui, we can't use things like a webcam or write detections out as video.
@jontitalukdar Here is a suggestion: when building OpenCV, I strongly suggest adding -D WITH_GTK=NO. Without it, the build on my computer automatically uses GTK whenever it can find the GTK packages, and I don't know why.
What's more, I can't install qt5-default (apt-get fails with lots of unmet dependencies, and I don't know why), so I used Qt4 instead to compile OpenCV, and it works.
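
Since GTK can apparently get picked up without being asked for, it may help to check what the OpenCV configure step actually recorded. CMake stores option values in CMakeCache.txt in the build directory as lines like WITH_QT:BOOL=ON. Below is a small sketch for reading them back; cache_value is a name made up for this example.

```shell
#!/bin/sh
# cache_value: hypothetical helper. Prints the value stored for a
# variable in a CMakeCache.txt file (lines look like NAME:TYPE=VALUE).
# Usage: cache_value NAME path/to/CMakeCache.txt
cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "$2"
}

# Example use from the OpenCV build directory after running cmake:
#   cache_value WITH_QT  CMakeCache.txt
#   cache_value WITH_GTK CMakeCache.txt
```

If WITH_GTK still shows as enabled, re-run cmake with the option switched off before rebuilding, then run the ldd check on caffe again.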

@cailile Thanks for your suggestion, it worked on my computer, but I have another problem.
The same code ran smoothly when I used it three months ago. When I use it now, it fails with this error.
So what changed during this period?

I solved it according to #5777.

Nice. This also works for the "./upgrade_net_proto_binary" abort problem.
