Pybind11: Compiling two modules with different compilers leads to segfault in class.h

Created on 26 Jan 2018 · 13 Comments · Source: pybind/pybind11

We have a library that uses pybind11 to wrap its internal C++ code. We now also want to allow external extension modules to be usable with the library. However, we are noticing that when our library is built with one compiler and an extension module with another, there is a segfault within pybind11 upon import.

I am able to reproduce the bug with a small example:

a.cpp (imagine our library)

#include <pybind11/pybind11.h>

namespace py = pybind11;

struct A {
  explicit A(int y) : _y(y) {}
  int f(int x) { return x + _y; }
  int _y;
};

PYBIND11_MODULE(a, m) {
  py::class_<A>(m, "A").def(py::init<int>()).def("f", &A::f);
}

b.cpp (imagine an extension)

#include <pybind11/pybind11.h>

namespace py = pybind11;

struct B {
  explicit B(int y) : _y(y) {}
  int f(int x) { return x + _y; }
  int _y;
};

PYBIND11_MODULE(b, m) {
  py::class_<B>(m, "B").def(py::init<int>()).def("f", &B::f);
}

setup_a.py

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext

ext_modules = [
    Extension('a', ['a.cpp'], include_dirs=['../include'], language='c++'),
]


class BuildExtension(build_ext):
    """A custom build extension for adding compiler-specific options."""

    def build_extensions(self):
        for extension in self.extensions:
            extension.extra_compile_args = ['-g', '-std=c++11']
        build_ext.build_extensions(self)


setup(
    name='a', ext_modules=ext_modules, cmdclass={
        'build_ext': BuildExtension
    })

setup_b.py

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext

ext_modules = [
    Extension('b', ['b.cpp'], include_dirs=['../include'], language='c++'),
]


class BuildExtension(build_ext):
    """A custom build extension for adding compiler-specific options."""

    def build_extensions(self):
        for extension in self.extensions:
            extension.extra_compile_args = ['-g', '-std=c++11']
        build_ext.build_extensions(self)


setup(
    name='b',
    ext_modules=ext_modules,
    cmdclass={
        'build_ext': BuildExtension
    })

Then:

  1. CXX=clang++ CC=clang python setup_a.py install
  2. CXX=g++-7 CC=gcc-7 python setup_b.py install

Then:

$ lldb python
(lldb) target create "python"
Current executable set to 'python' (x86_64).
(lldb) run
Process 61507 launched: '/Users/psag/home/play/x/pybind11/env/bin/python' (x86_64)
Python 3.5.1 (default, Jan 24 2016, 13:26:48)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import a
>>> import b
Process 63580 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x130)
    frame #0: 0x0000000102ac58ea b.cpython-35m-darwin.so`pybind11::detail::make_new_python_type(rec=0x00007fff5fbfdf60) at class.h:564
   561      auto metaclass = rec.metaclass.ptr() ? (PyTypeObject *) rec.metaclass.ptr()
   562                                           : internals.default_metaclass;
   563
-> 564      auto heap_type = (PyHeapTypeObject *) metaclass->tp_alloc(metaclass, 0);
   565      if (!heap_type)
   566          pybind11_fail(std::string(rec.name) + ": Unable to create type object!");
   567
Target 0: (python) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5)
  * frame #0: 0x0000000000000005
    frame #1: 0x000000010236f3b2 b.cpython-35m-darwin.so`pybind11::detail::make_new_python_type(rec=0x00007fff5fbfe000) at class.h:564
    frame #2: 0x0000000102375ac6 b.cpython-35m-darwin.so`pybind11::detail::generic_type::initialize(this=0x00007fff5fbfdff8, rec=0x00007fff5fbfe000) at pybind11.h:887
    frame #3: 0x00000001023762e1 b.cpython-35m-darwin.so`::PyInit_b() [inlined] _ZN8pybind116class_I1BJEEC4IJNS_9metaclassEEEENS_6handleEPKcDpRKT_((null)=<unavailable>, name=<unavailable>, scope=handle @ 0x00007fde97dcf910, this=0x00007fff5fbfdff8) at pybind11.h:1065
    frame #4: 0x0000000102376260 b.cpython-35m-darwin.so`::PyInit_b() [inlined] pybind11_init_b(m=<unavailable>)
    frame #5: 0x0000000102376260 b.cpython-35m-darwin.so`::PyInit_b()
    frame #6: 0x000000010014b844 Python`_PyImport_LoadDynamicModuleWithSpec + 489
    frame #7: 0x000000010014b3f6 Python`_imp_create_dynamic + 252
    frame #8: 0x00000001000d19d5 Python`PyCFunction_Call + 273
    frame #9: 0x00000001001359bc Python`PyEval_EvalFrameEx + 24272
    frame #10: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #11: 0x000000010013902f Python`fast_function + 341
    frame #12: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #13: 0x0000000100138faf Python`fast_function + 213
    frame #14: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #15: 0x0000000100138faf Python`fast_function + 213
    frame #16: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #17: 0x0000000100138faf Python`fast_function + 213
    frame #18: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #19: 0x0000000100138faf Python`fast_function + 213
    frame #20: 0x0000000100135100 Python`PyEval_EvalFrameEx + 22036
    frame #21: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #22: 0x000000010012fad7 Python`PyEval_EvalCodeEx + 78
    frame #23: 0x00000001000babb0 Python`function_call + 377
    frame #24: 0x000000010009905e Python`PyObject_Call + 97
    frame #25: 0x00000001000998b7 Python`_PyObject_CallMethodIdObjArgs + 197
    frame #26: 0x000000010014a8b0 Python`PyImport_ImportModuleLevelObject + 1780
    frame #27: 0x000000010012cbc8 Python`builtin___import__ + 135
    frame #28: 0x00000001000d1900 Python`PyCFunction_Call + 60
    frame #29: 0x000000010009905e Python`PyObject_Call + 97
    frame #30: 0x0000000100137f38 Python`PyEval_CallObjectWithKeywords + 165
    frame #31: 0x0000000100133a10 Python`PyEval_EvalFrameEx + 16164
    frame #32: 0x00000001001386f0 Python`_PyEval_EvalCodeWithName + 1884
    frame #33: 0x000000010012fa83 Python`PyEval_EvalCode + 81
    frame #34: 0x0000000100155461 Python`run_mod + 58
    frame #35: 0x000000010015522e Python`PyRun_InteractiveOneObject + 569
    frame #36: 0x0000000100154b88 Python`PyRun_InteractiveLoopFlags + 209
    frame #37: 0x0000000100154a84 Python`PyRun_AnyFileExFlags + 60
    frame #38: 0x0000000100168d72 Python`Py_Main + 3430
    frame #39: 0x0000000100001e27 python`___lldb_unnamed_symbol1$$python + 224
    frame #40: 0x00007fffa5a38235 libdyld.dylib`start + 1
    frame #41: 0x00007fffa5a38235 libdyld.dylib`start + 1

It seems the metaclass variable is nullptr in this case. This can be confirmed by putting an assertion into that location in class.h.

This is on macOS Sierra, but we see the same on Linux. We also observe this for certain combinations of different GCC versions. In the example above I use Python 3.5, but the same is observable for Python 3.6 and Python 2.7.
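The crash can also be checked without losing the interactive session. The following is a minimal sketch, not part of the original report: it imports a list of modules in a fresh child interpreter and reports how that interpreter exited. The helper names `import_together` and `describe` are made up for illustration.

```python
import subprocess
import sys


def import_together(module_names, python=sys.executable):
    """Import the given modules, in order, in a fresh child interpreter.

    Returns the child's exit status: 0 on success, a positive value if
    Python raised an exception, and (on POSIX) a negative value -N if the
    child was killed by signal N, which is how a segfault shows up.
    """
    code = "; ".join("import " + name for name in module_names)
    result = subprocess.run(
        [python, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit status into a short human-readable summary."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

With the two example modules installed, `describe(import_together(["a", "b"]))` would be expected to report a signal rather than "ok" on an affected setup. Importing each module on its own, and in both orders, helps confirm that only the combination fails.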

All 13 comments

In general you want everything to be built with the same compiler, and same version of that compiler, see:
https://stackoverflow.com/questions/23895081/can-you-mix-c-compiled-with-different-versions-of-the-same-compiler

STL internal data structures are not guaranteed to be compatible across compiler major versions, and definitely not across entirely different compilers. Pybind11 uses STL data structures to organize its internal state, hence it is important that extension modules are also compiled with the same compiler (otherwise, all sorts of corruption can occur).

How do you recommend shipping binary python extension modules that use pybind11 if you don't know what other extensions a user might have installed that may use pybind11 and were built with another compiler version?

Is there a way to completely isolate the pybind11 state across these extensions?

The compiler version isn't quite as critical as the STL (and its version). STL versions are usually, but not always, backwards-compatible with previous versions of the same STL. For instance, you're usually fine mixing modules built with gcc-5/gcc-6/gcc-7/gcc-8/clang-* on Linux, since they all use GCC's libstdc++. Mixing any of those with clang using libc++ (which is the default when using clang under macOS, but not on Linux) is asking for trouble. Very rarely the STL breaks backwards compatibility; IIRC, the last time for libstdc++ was when GCC 5 came out (and was related to C++11 compatibility), so crossing the pre-5 and post-5 gcc boundary is likely another no-no.

What you're hitting, loading one .so built with g++/libstdc++ and another built with clang++/libc++ into the same process at the same time, is just something that can't work in any C++ code making use of the STL.

The only way around it is really to isolate the software: keep all your g++/libstdc++-compiled code separate from your clang++/libc++-compiled code. And while that is a nuisance, it's not something that pybind11 can realistically do anything about.
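To see which side of the libstdc++/libc++ divide a built extension falls on, one can look at the shared libraries it links against (for example with `ldd` on Linux or `otool -L` on macOS). The sketch below is only a rough heuristic and not anything pybind11 provides: it classifies the text of such a listing. The function names and the two sample listings are invented for illustration and are not output from the machines in this thread.

```python
import re

# Patterns for the two C++ standard libraries discussed above. The trailing
# dot in the libc++ pattern keeps "libc++abi" from being counted as libc++.
_RUNTIME_PATTERNS = {
    "libstdc++": re.compile(r"libstdc\+\+"),
    "libc++": re.compile(r"libc\+\+\."),
}


def cxx_runtimes(listing):
    """Return the set of C++ standard libraries named in a linker listing."""
    return {name for name, pattern in _RUNTIME_PATTERNS.items()
            if pattern.search(listing)}


def same_cxx_runtime(listing_a, listing_b):
    """True if both listings name exactly the same C++ standard libraries.

    This is a necessary condition for mixing two modules, not a sufficient
    one: versions of the same library can still be incompatible.
    """
    return cxx_runtimes(listing_a) == cxx_runtimes(listing_b)


# Hypothetical listings, shaped like `otool -L` output.
SAMPLE_CLANG = "a.so:\n\t/usr/lib/libc++.1.dylib\n\t/usr/lib/libSystem.B.dylib"
SAMPLE_GCC = "b.so:\n\t/usr/local/lib/gcc/7/libstdc++.6.dylib\n\t/usr/lib/libSystem.B.dylib"
```

Here `same_cxx_runtime(SAMPLE_CLANG, SAMPLE_GCC)` is false, which matches the clang/g++ combination from the report. A real check would feed in the actual tool output for each installed module.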

Dear all,

I've realized that this has become a bit of a painful problem, particularly when installing external packages where one may not have control over what compiler is being used.

The following commit, currently on master, namespaces pybind11's internal data structures based on the value of the __GXX_ABI_VERSION macro, if it is defined.

https://github.com/pybind/pybind11/commit/bdf1a2cc34815c2f9ee9a5f3b5b05bfadd28dd35

My hope is that this should avoid this kind of breakage in the future. For those of you who are affected, could you let me know if this addresses the problem? My plan then would be to push this into a patch release of pybind11.

Best,
Wenzel
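The idea behind that change can be modeled in a few lines. This is a conceptual sketch in Python, not pybind11's actual implementation: a plain dict stands in for the interpreter-wide location where the shared state lives, and the key name, the helper functions, and the ABI tag values ("1002", "1011") are placeholders chosen for illustration.

```python
BASE_KEY = "__internals_v1"  # hypothetical key prefix, not pybind11's real one


def internals_key(abi_tag=None):
    """Build the lookup key; builds with a known ABI tag get their own suffix."""
    suffix = "" if abi_tag is None else "_abi" + str(abi_tag)
    return BASE_KEY + suffix + "__"


def get_internals(registry, abi_tag=None):
    """Return the state shared by every module built with the same ABI tag.

    The first module to ask creates the state; later modules with the same
    tag find and reuse it, while modules with a different tag never see it.
    """
    return registry.setdefault(internals_key(abi_tag), {"registered_types": {}})


def register_type(registry, abi_tag, type_name, module_name):
    """Record which module registered a type in its ABI-specific state."""
    get_internals(registry, abi_tag)["registered_types"][type_name] = module_name


# One registry per process; two modules built with different ABI tags.
registry = {}
register_type(registry, "1002", "A", "a")
register_type(registry, "1011", "B", "b")
```

After these two calls the registry holds two separate entries, so module b never touches state laid out by module a's compiler. The trade-off is that the two modules also cannot see each other's registered types.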

I'll try it today. Thanks for your help

It solved my problem. Thanks!

Hi @wjakob, do you have a schedule for the patch release?

I've added another commit that provides an even stricter separation: https://github.com/pybind/pybind11/commit/c9f5a464bc8ebe91dee8578b2b4a23d9997ffefe

Released in v2.4.0 now :)

(not a patch release after all, because there are also some minor new features)

Thank you for fixing this, @wjakob! I can confirm that this patch worked for us as well.

This seems resolved. If more stuff needs to be done in this regard, please open a new issue.
