Incubator-mxnet: The conflict between MXNet and OpenCV

Created on 7 Nov 2017 · 12 Comments · Source: apache/incubator-mxnet

Hi, there.

I found the cause of the conflict between MXNet and OpenCV.

Environment info

Operating System: Arch Linux 4.13.6
MXNet: 3f37577 (Date: Tue Nov 7 02:13:07 2017 +0800)
OpenCV: 3.3.1
Python: 2.7.14/3.6.3
GCC: 6.3.1 20170109
Build config: make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas

Steps to reproduce

  1. I built the MXNet core shared library with make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas
    make/config.mk was left at its defaults.
    The build succeeded.
  2. Then I tried to install the MXNet Python binding:
    cd python
    sudo python setup.py install

The install failed with this error:

*** Error in `python': free(): invalid pointer: 0x000055ec46fe1520 ***

What I have tried to solve it

I deleted all "import cv2" in $(MXNET_PATH)/python/mxnet/{recordio.py, image/{detection.py, image.py}}

Then I ran two tests in the folder $(MXNET_PATH)/python/.

➜  python git:(master) ✗ python 
Python 3.6.3 (default, Oct 24 2017, 14:48:20) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
>>> import cv2
*** Error in `python': free(): invalid pointer: 0x0000564c7470d520 ***
[1]    116917 abort (core dumped)  python

➜  python git:(master) ✗ python
Python 3.6.3 (default, Oct 24 2017, 14:48:20) 
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import mxnet
[src/tcmalloc.cc:283] Attempt to free invalid pointer 0x5568689f8fc0 
[1]    116946 abort (core dumped)  python

src/tcmalloc.cc is part of the gperftools source.
So I think there is a conflict between gperftools and OpenCV.
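The two interactive sessions above can be repeated without losing the shell each time by running each import order in its own child interpreter. The following is only a minimal sketch: the helper name try_import_order is made up for illustration, and it assumes Python 3.5 or later (subprocess.run does not exist in Python 2.7).

```python
import subprocess
import sys


def try_import_order(modules, python=sys.executable):
    """Import `modules` in order inside a fresh interpreter; return its exit code.

    On POSIX a negative return code means the child was killed by a signal,
    for example -6 for SIGABRT, which is how an aborted process is reported.
    """
    # Build a one-liner such as "import mxnet; import cv2".
    code = "; ".join("import " + name for name in modules)
    result = subprocess.run([python, "-c", code],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode
```

Calling try_import_order(["mxnet", "cv2"]) and try_import_order(["cv2", "mxnet"]) would then report one exit code per order. A non-zero code only says the child failed; it can also mean that a module is simply not installed.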

I set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 in $(MXNET_PATH)/make/config.mk and rebuilt MXNet.
This solved the problem.

I think the reason is that gperftools and jemalloc replace the memory allocator, including malloc, whereas python-opencv uses the default allocator.

Some pointers are shared between MXNet and OpenCV, but memory allocated by one allocator (gperftools, jemalloc, glibc) cannot be freed by a different one.
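One way to check this hypothesis on Linux is to look at which allocator libraries are actually mapped into the Python process after the imports. The sketch below is an assumption-laden diagnostic, not something from the MXNet or OpenCV code: the function names are hypothetical, it matches libraries by file name only, and an allocator that was linked statically would not show up.

```python
import os

# Substrings that identify common malloc replacements in a shared-library name.
ALLOCATOR_HINTS = ("tcmalloc", "jemalloc")


def allocators_in_maps(maps_text):
    """Return the sorted allocator library paths named in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no path
        path = fields[5].strip()
        if any(hint in os.path.basename(path) for hint in ALLOCATOR_HINTS):
            found.add(path)
    return sorted(found)


def allocators_in_current_process():
    """Linux only: inspect the libraries mapped into this interpreter."""
    with open("/proc/self/maps") as handle:
        return allocators_in_maps(handle.read())
```

Running allocators_in_current_process() right after import mxnet would list a libtcmalloc or libjemalloc path if the build pulled one in as a shared library, and an empty list otherwise.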

Solutions

There are two ways to use MXNet and OpenCV together.

    1. Use python-opencv with the built-in memory allocator, and set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 in $(MXNET_PATH)/make/config.mk. Then rebuild MXNet, or install MXNet with pip.

    2. Rebuild python-opencv with the same memory allocator as MXNet, for example both python-opencv and MXNet built with the gperftools allocator.
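Before rebuilding, it can help to confirm what the allocator flags in make/config.mk are actually set to. The sketch below reads plain "NAME = value" assignments from the file's text. The helper names are hypothetical, and it deliberately ignores make conditionals and command-line overrides such as make USE_GPERFTOOLS=0, so treat its answer as a hint rather than the final build configuration.

```python
import re

ALLOCATOR_FLAGS = ("USE_GPERFTOOLS", "USE_JEMALLOC")

# Matches "NAME = value" or "NAME := value"; anything after '#' is a comment.
_ASSIGNMENT = re.compile(r"^\s*(\w+)\s*:?=\s*([^#]*)")


def allocator_flags(config_text):
    """Return {flag: value} for the allocator flags assigned in config.mk text.

    A later assignment overrides an earlier one.
    """
    flags = {}
    for line in config_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match and match.group(1) in ALLOCATOR_FLAGS:
            flags[match.group(1)] = match.group(2).strip()
    return flags


def enabled_allocators(config_text):
    """Names of the allocator flags whose value is exactly '1'."""
    return sorted(name for name, value in allocator_flags(config_text).items()
                  if value == "1")
```

With the file contents read from make/config.mk, an empty result from enabled_allocators would correspond to the first solution above.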

All 12 comments

I've seen the same issue when building with cuda9 + cudnn7 with gperftools option on.

I think the reason is that gperftools and jemalloc replace the memory allocator, including malloc, whereas python-opencv uses the default allocator.

Some pointers are shared between MXNet and OpenCV, but memory allocated by one allocator (gperftools, jemalloc, glibc) cannot be freed by a different one.

Out of curiosity, does the crash occur if you import cv2 after mxnet?

@KellenSunderland The crash also occurs when cv2 is imported after mxnet built with gperftools.
I set USE_GPERFTOOLS = 0 and USE_JEMALLOC = 0 and rebuilt mxnet without gperftools. After that there is no crash.
USE_GPERFTOOLS = 1 is the default setting. If gperftools is not installed on the machine, the build does not use it.

@wkcn Thanks for the info. I think your description of the relation between the USE_GPERFTOOLS flag and the crash is clear. I just saw some other similar reports where the crash depended on the order in which opencv is initialized relative to the library that uses gperftools (i.e. mxnet in this case).

@KellenSunderland Thank you!

Hi there,

I can confirm that setting USE_GPERFTOOLS=0 fixes the issue (it was a double free for me). I kept USE_JEMALLOC=1 and both modules load together in any order.

Adam.

USE_JEMALLOC=1 didn't work for me with opencv. I had to disable both (USE_GPERFTOOLS=0 and USE_JEMALLOC=0), otherwise python setup fails.

Maybe this should be reopened?
This affects many people building from source with defaults. (see e.g. Arch AUR)

Possible solutions can be:

  • change the build so that USE_GPERFTOOLS=0 and USE_JEMALLOC=0 are forced if an external opencv is used.
  • include opencv in the build tree, and build it with the appropriate allocator

as already suggested by @wkcn.

You can use LD_PRELOAD to work around this in the meantime.

@tequilaguru Thank you! I will try it next time.

I have met the same problem. I built mxnet-1.2.1 with cuda-9.1 and cudnn7.1, with USE_GPERFTOOLS=1. When I copied libmxnet.so to another machine to run inference, it complained that libtcmalloc.so.4 was missing. After I installed gperftools with yum, the problem occurred when I imported mxnet in Python.

It should work with LD_PRELOAD, either with tcmalloc or jemalloc
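The thread does not spell out the exact command, so the following is only a sketch of how the LD_PRELOAD workaround might be tried from Python. The idea, as far as I understand it, is that preloading one allocator library makes its malloc and free take precedence for every library loaded afterwards, so MXNet and OpenCV end up sharing the same allocator. The helper names and the library path in the usage note are assumptions; the real path depends on the distribution.

```python
import os
import subprocess
import sys


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with `library_path` first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Entries in LD_PRELOAD may be separated by colons.
    env["LD_PRELOAD"] = library_path if not existing else library_path + ":" + existing
    return env


def run_with_preload(library_path, code):
    """Run `code` in a fresh interpreter with the allocator preloaded."""
    completed = subprocess.run([sys.executable, "-c", code],
                               env=preload_env(library_path))
    return completed.returncode
```

For example, run_with_preload("/usr/lib/libtcmalloc.so.4", "import mxnet; import cv2") would start a child interpreter with tcmalloc preloaded, where the path shown is a guess and should be replaced with wherever the library is installed. A return code of 0 would suggest that both modules loaded.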
