Dlib: Segmentation fault (core dumped)

Created on 2 Feb 2018 · 5 Comments · Source: davisking/dlib

Hi,

I have been using the Python bindings of dlib for quite some time without any problems. However, very recently (read: yesterday), I started getting a segmentation fault. I tried debugging the issue with gdb, as suggested by several links I found through an extensive Google search. From what I can tell, the problem arises in dlib.so.

I am primarily working with dlib's face detection module, using get_frontal_face_detector, shape_predictor and face_recognition_model_v1. However, I am fairly certain the fault has nothing to do with these functions. In addition, I am using multi-threading to process images. What is perplexing is that all of my codebase, including the code mentioned below, was running perfectly fine until a week ago.

Below is the output of gdb inspection and my failed attempt at running a basic python program.

(gdb) r server_functions.py
Starting program: /usr/bin/python server_functions.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff286f700 (LWP 129076)]
[New Thread 0x7ffff206e700 (LWP 129077)]
[New Thread 0x7fffef86d700 (LWP 129078)]
[New Thread 0x7fffda2a3700 (LWP 129079)]
[New Thread 0x7fffd7aa2700 (LWP 129080)]
[New Thread 0x7fffd52a1700 (LWP 129081)]

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffdc91e03b in ?? ()
from /usr/local/lib/python2.7/dist-packages/dlib-19.9.99-py2.7-linux-x86_64.egg/dlib.so
(gdb) where
#0  0x00007fffdc91e03b in ?? ()
from /usr/local/lib/python2.7/dist-packages/dlib-19.9.99-py2.7-linux-x86_64.egg/dlib.so
#1  0x00000000004c37ed in PyEval_EvalFrameEx ()
#2  0x00000000004c136f in PyEval_EvalFrameEx ()
#3  0x00000000004c136f in PyEval_EvalFrameEx ()
#4  0x00000000004c136f in PyEval_EvalFrameEx ()
#5  0x00000000004c136f in PyEval_EvalFrameEx ()
#6  0x00000000004b9ab6 in PyEval_EvalCodeEx ()
#7  0x00000000004b97a6 in PyEval_EvalCode ()
#8  0x00000000004b96df in PyImport_ExecCodeModuleEx ()
#9  0x00000000004b2b06 in ?? ()
#10 0x00000000004a4ae1 in ?? ()
#11 0x00000000004a3e84 in PyImport_ImportModuleLevel ()
#12 0x00000000004a59e4 in ?? ()
#13 0x00000000004a577e in PyObject_Call ()
#14 0x00000000004c5e10 in PyEval_CallObjectWithKeywords ()
#15 0x00000000004be6d7 in PyEval_EvalFrameEx ()
#16 0x00000000004b9ab6 in PyEval_EvalCodeEx ()
#17 0x00000000004eb30f in ?? ()
#18 0x00000000004e5422 in PyRun_FileExFlags ()
#19 0x00000000004e3cd6 in PyRun_SimpleFileExFlags ()
#20 0x0000000000493ae2 in Py_Main ()
#21 0x00007ffff7810830 in __libc_start_main (main=0x4934c0 <main>, argc=2,
argv=0x7fffffffe588, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fffffffe578) at ../csu/libc-start.c:291
#22 0x00000000004933e9 in _start ()

Below is the configuration of my systems
OS : Ubuntu 16.04 LTS
python version : 2.7.12
dlib version : 19.9
dlib source : I originally installed the Python bindings using pip. Later, I removed the pip-installed version and installed from source by cloning the GitHub repository and running "python setup.py install".

Looking forward to any pointers that might help me here. Thanks!

All 5 comments

That's typically how errors in multithreading appear. Your program will seem to work but just randomly crash every now and then. You really need to understand multithreading if you want to write multithreaded applications. Either you design it correctly by static analysis or you are going to have these bugs. The python tools for multithreading don't make this particularly easy either since they are not very well thought out.

Anyway, the most important rule when writing multithreaded applications is that you can't have two threads modifying the same thing at the same time. So if you have threads that touch the same instance of any object and you haven't placed appropriate thread synchronization around those points then you have a bug, unless those objects are inherently immutable, which is generally not the case.
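The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SharedDetector` is a hypothetical stand-in for any object that several threads call into (it is not a dlib class), and `run_workers` is a made-up helper. The point is that every access to the shared instance goes through the same `threading.Lock`.

```python
import threading


class SharedDetector(object):
    """Hypothetical stand-in for a shared object (NOT a dlib class)."""

    def __init__(self):
        self.calls = 0

    def detect(self, image):
        # Read-modify-write on shared state: a race if two threads
        # run this at the same time without synchronization.
        current = self.calls
        self.calls = current + 1
        return len(image)


def run_workers(num_threads=4, images_per_thread=1000):
    """Call one shared detector from several threads, guarded by a lock."""
    detector = SharedDetector()
    detector_lock = threading.Lock()

    def worker(images):
        for image in images:
            # Only one thread at a time may touch the shared instance.
            with detector_lock:
                detector.detect(image)

    batches = [["img"] * images_per_thread for _ in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(batch,)) for batch in batches]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return detector.calls
```

An alternative design that avoids locking altogether is to give each thread its own instance of the object, at the cost of extra memory per thread.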

@davisking, thanks for the quick reply and also for the suggestions. Even though I made sure the application was working statically and that there is a proper queuing mechanism along with locks on objects, it seems that there is a problem with my implementation.

I have been facing the same problem since I updated dlib to version 19.10.0. Interestingly, the problem appeared inside a tqdm loop.

for in_file_path in tqdm(glob.glob(os.path.join(directory, "*.jpg")), desc="..."):
    image = cv2.imread(in_file_path)
    # Here: call the object created by dlib.cnn_face_detection_model_v1 on the image read above

Using faulthandler I was able to get the following stack:

Fatal Python error: Segmentation fault

Thread 0x00007f561fdb9700 <python> (most recent call first):
  File "/usr/local/lib/python2.7/dist-packages/tqdm/_tqdm.py", line 97 in run
  File "/usr/lib/python2.7/threading.py", line 801 in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 774 in __bootstrap

Current thread 0x00007f56781eb700 <python> (most recent call first):
  File "detector/face_helper.py", line 51 in build_dlib_cnn_detected_faces
  File "detector/face_helper.py", line 129 in detect_faces
  File "/home/eduardo/simple_data.py", line 46 in write_aligned_faces
  File "/home/eduardo/simple_data.py", line 67 in <module>
  File "/usr/lib/python2.7/runpy.py", line 72 in _run_code
  File "/usr/lib/python2.7/runpy.py", line 174 in _run_module_as_main
[1]    2409 segmentation fault  python simple_data.py

Removing the tqdm wrapper made the issue go away! I am not really sure what happened, but in my case it was not a dlib issue.
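For anyone who wants to capture a similar per-thread stack, here is a minimal sketch of turning faulthandler on from inside a script. It assumes Python 3, where `faulthandler` is part of the standard library (on Python 2.7 it is a separate package with the same name). The helper name `enable_crash_tracebacks` and the log path are made up for illustration.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Dump the Python tracebacks of all threads to log_path on a fatal signal.

    The returned file object must stay open for as long as faulthandler
    is enabled, because the dump is written to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this at the very top of the script, before importing dlib or starting any threads, should make a later SIGSEGV leave a traceback in the log file. On Python 3 the same effect is available without code changes by running `python -X faulthandler script.py`, which writes to stderr.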

@edumucelli I had the same exact experience
Thanks

I am experiencing the same issue. I am using dnn_face_recognition_ex.cpp (so no multithreading and no Python), and I get a segmentation fault when the code calls the frontal face detector:
for (auto face : detector(img))
This code worked fine when I tested it with old pictures of mine. When I tried a selfie from my phone camera (2472x3296 pixels), I got the core dump.
If I downsize the photo, it works. I did not see any documented restriction on image size; is there one?
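As a sketch of the downsizing workaround, the helper below computes target dimensions that keep the aspect ratio while capping the longer side. It is written in Python for brevity even though the report above is about the C++ example. The function name `fit_within` and the `max_side=1024` default are assumptions chosen for illustration; the thread does not state any actual size limit in dlib.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height) with the longer side capped at max_side.

    max_side is an arbitrary illustrative value, not a documented dlib limit.
    Images that already fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / float(longest)
    new_width = max(1, int(round(width * scale)))
    new_height = max(1, int(round(height * scale)))
    return new_width, new_height
```

The resulting size can then be passed to whatever resize routine the image library in use provides, before the detector is called.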

Not sure if it helps, but I tried to get some debug info:
Program received signal SIGSEGV, Segmentation fault.
0x000000000805e856 in dlib::matrix, dlib::row_major_layout>::matrix(dlib::matrix, dlib::row_major_layout> const&) ()
(gdb) bt
#0  0x000000000805e856 in dlib::matrix, dlib::row_major_layout>::matrix(dlib::matrix, dlib::row_major_layout> const&) ()
#1  0x000000000807cbc1 in void std::vector, dlib::row_major_layout>, std::allocator, dlib::row_major_layout> > >::_M_realloc_insert, dlib::row_major_layout> const&>(__gnu_cxx::__normal_iterator, dlib::row_major_layout>*, std::vector, dlib::row_major_layout>, std::allocator, dlib::row_major_layout> > > >, dlib::matrix, dlib::row_major_layout> const&) ()
#2  0x0000000008051ff3 in main ()
