Protobuf: python : segfault in PyImport_Cleanup

Created on 3 Nov 2017 · 11 Comments · Source: protocolbuffers/protobuf

Hello,
We're currently hitting a segfault in protobuf with Python 3.4.2:

protobuf is loaded inside uWSGI, which calls PyImport_Cleanup during shutdown (https://github.com/python/cpython/blob/dbb126103e1c4f2818e0dfc7aa4a689d86565e7a/Python/import.c#L398)
The segfault happens while deallocating a descriptor in protobuf: google::protobuf::python::descriptor::Dealloc

Any help on this is appreciated.

Here is the traceback:

#1  0x00007fc58b8ffb29 in __run_exit_handlers (status=1, listp=0x7fc58bc6d5a8 <__exit_funcs>, 
    run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#2  0x00007fc58b8ffb75 in __GI_exit (status=<optimized out>) at exit.c:104
#3  0x000000000042210f in uwsgi_exit ()
#4  0x0000000000469e51 in uwsgi_segfault ()
#5  <signal handler called>
#6  0x00007fc583437586 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fc583437a6d in std::_Rb_tree_rebalance_for_erase(std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007fc582c92af1 in _M_erase_aux (__position=..., 
    this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_tree.h:1745
#9  erase (__position=..., this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_tree.h:830
#10 _M_erase_aux (__last=..., __first=..., 
    this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_tree.h:1760
#11 erase (__last=..., __first=..., 
    this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_tree.h:848
#12 erase (__x=@0x7fc57277cd60: 0x2df5da0, 
    this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_tree.h:1771
#13 erase (__x=@0x7fc57277cd60: 0x2df5da0, 
    this=0x7fc582ff5160 <google::protobuf::python::interned_descriptors>)
    at /opt/rh/devtoolset-2/root/usr/include/c++/4.8.2/bits/stl_map.h:727
#14 google::protobuf::python::descriptor::Dealloc (self=0x7fc57277cd50)
    at google/protobuf/pyext/descriptor.cc:370
#15 0x00007fc58bf522b7 in free_keys_object (keys=<optimized out>) at ../Objects/dictobject.c:369
#16 PyDict_Clear (op=0x2d5f450) at ../Objects/dictobject.c:1282
#17 0x00007fc58bf522e9 in dict_tp_clear.lto_priv.414 (op=<optimized out>)
    at ../Objects/dictobject.c:2483
#18 0x00007fc58c0bd7c7 in delete_garbage (collectable=collectable@entry=0x7ffd01875850, 
    old=old@entry=0x7fc58c4a3c80 <generations+64>) at ../Modules/gcmodule.c:866
#19 0x00007fc58c0bf18c in collect (generation=2, n_collected=0x0, n_uncollectable=0x0, nofail=1)
    at ../Modules/gcmodule.c:1032
#20 _PyGC_CollectNoFail () at ../Modules/gcmodule.c:1638
#21 0x00007fc58c039739 in PyImport_Cleanup () at ../Python/import.c:483
#22 0x00007fc58bfed4e4 in Py_Finalize () at ../Python/pythonrun.c:616
#23 0x0000000000466d31 in uwsgi_plugins_atexit ()
#24 0x00007fc58b8ffb29 in __run_exit_handlers (status=0, listp=0x7fc58bc6d5a8 <__exit_funcs>, 
    run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#25 0x00007fc58b8ffb75 in __GI_exit (status=<optimized out>) at exit.c:104
#26 0x000000000042210f in uwsgi_exit ()
#27 0x00000000004694a6 in simple_goodbye_cruel_world ()
#28 0x00000000004694d8 in goodbye_cruel_world ()
#29 0x0000000000422df9 in uwsgi_close_request ()
#30 0x0000000000465dad in simple_loop_run ()
#31 0x000000000046a0fb in uwsgi_ignition ()
#32 0x000000000046e805 in uwsgi_worker_run ()
#33 0x000000000046edbc in uwsgi_run ()
#34 0x000000000041ef4e in main ()

Here is some debug output from frame 14:

(gdb) p *self
$12 = {
  ob_base = {
    ob_refcnt = 0, 
    ob_type = 0x7fc582ff11e0 <google::protobuf::python::PyFileDescriptor_Type>
  }, 
  descriptor = 0x2df5da0, 
  pool = 0x7fc583b072d0
}
(gdb) p *(self->ob_base->ob_type )
$13 = {
  ob_base = {
    ob_base = {
      ob_refcnt = 18, 
      ob_type = 0x7fc58c4cd5c0 <PyType_Type>
    }, 
    ob_size = 0
  }, 
  tp_name = 0x7fc582d8cc90 "google.protobuf.pyext._message.FileDescriptor", 
  tp_basicsize = 40, 
  tp_itemsize = 0, 
  tp_dealloc = 0x7fc582c92c00 <google::protobuf::python::file_descriptor::Dealloc(google::protobuf::python::PyFileDescriptor*)>, 
  tp_print = 0x0, 
  tp_getattr = 0x0, 
  tp_setattr = 0x0, 
  tp_reserved = 0x0, 
  tp_repr = 0x7fc58c062910 <object_repr.lto_priv.253>, 
  tp_as_number = 0x0, 
  tp_as_sequence = 0x0, 
  tp_as_mapping = 0x0, 
  tp_hash = 0x7fc58bfe6d90 <_Py_HashPointer>, 
  tp_call = 0x0, 
  tp_str = 0x7fc58c057740 <object_str.lto_priv.254>, 
  tp_getattro = 0x7fc58bf2f420 <PyObject_GenericGetAttr>, 
  tp_setattro = 0x7fc58bf2f1b0 <PyObject_GenericSetAttr>, 
  tp_as_buffer = 0x0, 
  tp_flags = 790528, 
  tp_doc = 0x7fc582d8c763 "A File Descriptor", 
  tp_traverse = 0x0, 
  tp_clear = 0x0, 
  tp_richcompare = 0x7fc58c064470 <object_richcompare.lto_priv.255>, 
  tp_weaklistoffset = 0, 
  tp_iter = 0x0, 
  tp_iternext = 0x0, 
  tp_methods = 0x7fc582ff2020 <google::protobuf::python::file_descriptor::Methods>, 
  tp_members = 0x0, 
  tp_getset = 0x7fc582ff2080 <google::protobuf::python::file_descriptor::Getters>, 
  tp_base = 0x7fc582ff19e0 <google::protobuf::python::descriptor::PyBaseDescriptor_Type>, 
  tp_dict = 0x7fc583b05748, 
  tp_descr_get = 0x0, 
  tp_descr_set = 0x0, 
  tp_dictoffset = 0, 
  tp_init = 0x7fc58c062ea0 <object_init.lto_priv.256>, 
  tp_alloc = 0x7fc58c065420 <PyType_GenericAlloc>, 
  tp_new = 0x0, 
  tp_free = 0x7fc58bf1fdf0 <PyObject_Free>, 
  tp_is_gc = 0x0, 
  tp_bases = 0x7fc583af1828, 
  tp_mro = 0x7fc583b0a2d0, 
  tp_cache = 0x0, 
  tp_subclasses = 0x0, 
  tp_weaklist = 0x7fc583b081d8, 
  tp_del = 0x0, 
  tp_version_tag = 48170, 
  tp_finalize = 0x0
}

All 11 comments

It is a bad idea to call Py_Finalize in the exit handler, because other things might already have been cleaned up by then, so anything that relies on them (STL containers, for example) can break. That seems to be the issue here. The solution is to call Py_Finalize before the exit handlers run, e.g. at the end of main, in uwsgi_exit, or wherever it is appropriate in your application.

Well, it seems you're right.
There is a special case in uwsgi for apps that shouldn't call Py_Finalize: https://github.com/unbit/uwsgi/blob/master/plugins/python/python_plugin.c#L414
An option was added in another commit: https://github.com/avar/uwsgi/commit/f41da12c546717fefc22c389912543bc05488add

Thanks for the pointer!

Note that you usually do not want to skip calling Py_Finalize altogether. Some Python modules really expect to be cleaned up, e.g. to remove file locks. The better solution is to call Py_Finalize earlier, not in the atexit handler.
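
To illustrate the "finalize earlier" advice, here is a minimal C++ sketch. It does not embed Python: fake_finalize() is a stand-in for Py_Finalize(), g_registry is a stand-in for module state such as protobuf's interned_descriptors, and shutdown_runtime() and clean_exit() are hypothetical names, not uwsgi or CPython APIs. The idea is to run the teardown explicitly, while static objects are still alive, and to make it idempotent so a later fallback call does nothing.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Stand-in for state owned by an extension module (hypothetical).
static std::map<std::string, int> g_registry;

static bool g_finalized = false;
static int g_finalize_calls = 0;

// Stand-in for Py_Finalize(): tears down objects that touch g_registry.
void fake_finalize() {
    g_registry.clear();
    ++g_finalize_calls;
}

// Run the teardown explicitly, at most once. Calling it again is a no-op.
void shutdown_runtime() {
    if (g_finalized) return;
    g_finalized = true;
    fake_finalize();
    assert(g_registry.empty());
}

// Use this instead of calling exit() directly: the teardown happens before
// any atexit handler or static destructor has run.
[[noreturn]] void clean_exit(int status) {
    shutdown_runtime();
    std::exit(status);
}
```

With this shape, an atexit handler can still call shutdown_runtime() as a safety net, but on the normal path it finds the work already done and never touches a destroyed container.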

We're also running into this with Python 3.4.2 at the moment. Is there a workaround we can apply until this makes it into an official uwsgi release? Are there any plans or timelines for when it will be released?

Any help would be appreciated. We're currently stuck between a rock and a hard place, since this manifests as a leak that we can't even mitigate with max-worker-lifetime and similar options.

Hello @snoshy, sorry for the late response. We mitigated this issue by publishing a wheel without the cpp compilation on our private pypi. We're now using the python implementation.

@dzen Could you elaborate on that?
We are facing the same problem (segfault in PyImport_Cleanup) while running 16 workers (using lazy-apps=true).
I was thinking of setting the skip-atexit-teardown flag to true, but I am not sure what the consequences would be.


backtrace

* backtrace of 760 *
/usr/local/bin/uwsgi(uwsgi_backtrace+0x35) [0x560eae8ddc05]
/usr/local/bin/uwsgi(uwsgi_segfault+0x23) [0x560eae8ddfc3]
/lib/x86_64-linux-gnu/libc.so.6(+0x33030) [0x7fcb1a30b030]
/usr/local/lib/python3.6/site-packages/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so(+0xa34e0) [0x7fcad12874e0]
/usr/local/lib/libpython3.6m.so.1.0(+0xc5257) [0x7fcb1a974257]
/usr/local/lib/libpython3.6m.so.1.0(+0xcaf79) [0x7fcb1a979f79]
/usr/local/lib/libpython3.6m.so.1.0(+0x1bbbb2) [0x7fcb1aa6abb2]
/usr/local/lib/libpython3.6m.so.1.0(_PyGC_CollectNoFail+0x31) [0x7fcb1aa6b871]
/usr/local/lib/libpython3.6m.so.1.0(PyImport_Cleanup+0x330) [0x7fcb1aa3a0f0]
/usr/local/lib/libpython3.6m.so.1.0(Py_FinalizeEx+0x6b) [0x7fcb1aa4810b]
/usr/local/bin/uwsgi(uwsgi_plugins_atexit+0x81) [0x560eae8dadc1]
/lib/x86_64-linux-gnu/libc.so.6(+0x35910) [0x7fcb1a30d910]
/lib/x86_64-linux-gnu/libc.so.6(+0x3596a) [0x7fcb1a30d96a]
/usr/local/bin/uwsgi(+0x3ecef) [0x560eae893cef]
/usr/local/bin/uwsgi(end_me+0x25) [0x560eae8dae05]
/usr/local/bin/uwsgi(uwsgi_ignition+0x14a) [0x560eae8de1ca]
/usr/local/bin/uwsgi(uwsgi_worker_run+0x275) [0x560eae8e2ad5]
/usr/local/bin/uwsgi(+0x8e0cc) [0x560eae8e30cc]
/usr/local/bin/uwsgi(+0x3b8de) [0x560eae8908de]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fcb1a2f82b1]
/usr/local/bin/uwsgi(_start+0x2a) [0x560eae89090a]

atexit() is only called in the cpp implementation. Since the fix is not so easy, we are sticking with the Python implementation: the Python-only package is uploaded to a private devpi instance.

https://github.com/google/protobuf/blob/master/python/setup.py#L176

Can you double-check that the version number is 3.4.2? I can't find 3.4.2 on PyPI.

Does it work well with 3.4.0?

The problem is that interned_descriptors is defined as a static object, but atexit handlers are called AFTER the C++ destructors of static objects.
We have changed interned_descriptors to a pointer in our internal code, which should fix this issue. Expect to see the fix in the next release.
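
The "static object to pointer" change can be sketched roughly as follows. This is an illustration of the general pattern only, not the actual protobuf patch: Registry, interned(), intern() and forget() are hypothetical names, and the real container type in descriptor.cc may differ. Because the map is heap-allocated and never deleted, no destructor runs for it at exit, so a late Dealloc call still finds a valid container.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical registry mapping a C++ descriptor to its Python wrapper.
using Registry = std::unordered_map<const void*, void*>;

// Problematic pattern (for comparison): a static object whose destructor
// runs during process exit, possibly before the last user is done with it.
//   static Registry interned_objects;

// Sketch of the fix: allocate on first use and intentionally never delete.
// Only the pointer is static, and a pointer has no destructor.
Registry* interned() {
    static Registry* const instance = new Registry();
    return instance;
}

// Register a wrapper; returns false if the key was already interned.
bool intern(const void* key, void* wrapper) {
    assert(key != nullptr);
    return interned()->emplace(key, wrapper).second;
}

// Called from a deallocator; safe even very late in shutdown.
void forget(const void* key) {
    interned()->erase(key);
}
```

The trade-off is a deliberate, bounded leak at process exit, which the operating system reclaims anyway.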

Closing this issue for cleanup. Feel free to reopen if it is still a problem after the next release.

I seem to be seeing this with protobuf version 3.6.1 & Python 3.6.6:

*** backtrace of 14911 ***
uWSGI worker 4(uwsgi_backtrace+0x35) [0x55e235b65d15]
uWSGI worker 4(uwsgi_segfault+0x23) [0x55e235b660c3]
/lib/x86_64-linux-gnu/libc.so.6(+0x33060) [0x7fe416004060]
/home/lex/lib/python3.6/site-packages/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so(+0xbab03) [0x7fe3ff2bdb03]
/opt/python/lib/libpython3.6m.so.1.0(+0x12576c) [0x7fe4166cd76c]
/opt/python/lib/libpython3.6m.so.1.0(+0x14125f) [0x7fe4166e925f]
/opt/python/lib/libpython3.6m.so.1.0(PyDict_Clear+0x13f) [0x7fe4166cfd4f]
/opt/python/lib/libpython3.6m.so.1.0(+0x127c09) [0x7fe4166cfc09]
/opt/python/lib/libpython3.6m.so.1.0(+0x1b8cb0) [0x7fe416760cb0]
/opt/python/lib/libpython3.6m.so.1.0(_PyGC_CollectNoFail+0x2a) [0x7fe4167b54ca]
/opt/python/lib/libpython3.6m.so.1.0(PyImport_Cleanup+0x4b9) [0x7fe4167452d9]
/opt/python/lib/libpython3.6m.so.1.0(Py_FinalizeEx+0x62) [0x7fe4167ad092]
uWSGI worker 4(uwsgi_plugins_atexit+0x81) [0x55e235b62dc1]
/lib/x86_64-linux-gnu/libc.so.6(+0x35940) [0x7fe416006940]
/lib/x86_64-linux-gnu/libc.so.6(+0x3599a) [0x7fe41600699a]
uWSGI worker 4(+0x3f53f) [0x55e235b1b53f]
uWSGI worker 4(+0x89748) [0x55e235b65748]
uWSGI worker 4(+0x89778) [0x55e235b65778]
uWSGI worker 4(uwsgi_close_request+0x561) [0x55e235b1c261]
uWSGI worker 4(simple_loop_run+0xdd) [0x55e235b61c1d]
uWSGI worker 4(simple_loop+0x10) [0x55e235b61a20]
uWSGI worker 4(uwsgi_ignition+0x241) [0x55e235b663c1]
uWSGI worker 4(uwsgi_worker_run+0x275) [0x55e235b6acd5]
uWSGI worker 4(+0x8f2cc) [0x55e235b6b2cc]
uWSGI worker 4(+0x3c0ce) [0x55e235b180ce]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fe415ff12e1]
uWSGI worker 4(_start+0x2a) [0x55e235b180fa]
*** end of backtrace ***
$ pip show protobuf
Name: protobuf
Version: 3.6.1
Summary: Protocol Buffers
Home-page: https://developers.google.com/protocol-buffers/
Author: None
Author-email: None
License: 3-Clause BSD License
Location: /home/lex/lib/python3.6/site-packages
Requires: setuptools, six
Required-by: googleapis-common-protos, google-api-core

Has this been fixed in any later versions?
