I hope there is some way that I can explain what I am doing, and you can tell me if I am doing something crazy, or if this is a serious bug in pybind11.
I have a factory function that generates instances of classes derived from an ABC, and then lots (150) of methods bounds to the ABC (AbstractState) that is returned. Nearly all these functions are boring POD-type functions, some take enumerated values, etc.. But it seems there is a limit to the number of functions I can have. If I use all 150 functions, I get weird, random seg faults all over the place, but the module compiles properly. If I only enable something like 100 of the methods, the code runs properly.
FWIW, I have no intention to extend the AbstractState in python-land, so I don't think all the trampoline machinery is required, right?
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
namespace py = pybind11;
CoolProp::AbstractState * factory(const std::string &backend, const std::string &fluid_names) {
return CoolProp::AbstractState::factory(backend, fluid_names);
}
void init_CoolProp(py::module &m){
using namespace CoolProp;
py::class_<AbstractState>(m, "AbstractState")
.def("set_T", &AbstractState::set_T)
.def("backend_name", &AbstractState::backend_name)
.def("using_mole_fractions", &AbstractState::using_mole_fractions)
.def("using_mass_fractions", &AbstractState::using_mass_fractions)
.def("using_volu_fractions", &AbstractState::using_volu_fractions)
.def("set_mole_fractions", &AbstractState::set_mole_fractions)
.def("set_mass_fractions", &AbstractState::set_mass_fractions)
.def("set_volu_fractions", &AbstractState::set_volu_fractions)
......
;
m.def("AbstractState", &factory);
}
#if defined(COOLPROP_PYBIND11_MODULE)
PYBIND11_PLUGIN(CoolProp) {
py::module m("CoolProp", "CoolProp module");
init_CoolProp(m);
return m.ptr();
}
#endif
Hi @ibell,
we'd be happy to look at your example, but we will need something reproducible -- the full C++ module that actually causes the issue, and Python code that triggers the issue.
See CONTRIBUTING.md for details on the expected issue ticket format.
-Wenzel
py::class_<AbstractState>(m, "AbstractState")
...
m.def("AbstractState", &factory);
seems a likely cause of the problem: you're telling Python that AbstractState is both a class and a function. I would try renaming it to something else, or expose it as a constructor, or expose it as a static AbstractState method, via .def_static("factory", &AbstractState::factory) and see if that fixes the issue.
You @jagerman are a genius. That does seem to fix this. Could pybind11 perhaps fail in this case and not let it pass through? I imagine I am not the first person to have done something like this...
The problem for me is that I am trying to be 1-1 compatible with the current cython-based API, which had the AbstractState function as a method. I can certainly see why pybind11 got confused about this. Alternatively, I guess I could rename the alias for the class.
@wjakob - yes I know ALL about reproducible examples (non-reproducible examples are usually just PICNIC (problem in chair, not in computer) )
Anyway, I have to say I have been loving pybind11 more and more the more I use it (especially coming from cython), and have been very appreciative of the ridiculously fast response times on issues.
I guess in general, I feel like if two things have the same alias, maybe pybind11 shouldn't allow that at all? But you guys are the ones in charge, so I'll let you decide.
Agreed, would be nice if pybind11 failed early and loud in case of name collisions.
The problem for me is that I am trying to be 1-1 compatible with the current cython-based API, which had the AbstractState function as a method. I can certainly see why pybind11 got confused about this. Alternatively, I guess I could rename the alias for the class.
The current API seems a bit weird: it's defining something that looks and acts like a constructor, but isn't. Could you just wrap it in an actual constructor from pybind11, such as:
.def("__init__", [](AbstractState &a, const std::string &backend, const std::string &fluid_names) {
auto *inst = AbstractState::factory(backend, fluid_names);
new (&a) AbstractState(std::move(*inst));
delete inst;
})
(assuming it's movable (or at least copyable), doesn't have a public constructor, and that the pointer is unmanaged).
I'm leaning towards closing this issue since it's not really a pybind11 problem. Thoughts?
I think this is a pybind issue, in the sense that pybind lets you do something that invalidates pybind's internal state. (Yes, it's something you shouldn't do, but we should catch that rather than segfault). I can reproduce this with this simple case (added to test_issues.cpp/.py):
// Issue 461: registering two things with the same name:
py::class_<Dupe1>(m2, "Dupe1")
.def("get_value", &Dupe1::get_value)
;
//m2.def("Dupe1", [](int v) { return new Dupe1(v); });
m2.def("dupe1_factory", [](int v) { return new Dupe1(v); });
def test_dupe_assignment():
""" Issue 461: overwriting a class with a function """
from pybind11_tests.issues import Dupe1, dupe1_factory
v = dupe1_factory(23)
assert(v.get_value() == 23)
If I uncomment the Dupe1 definition in the .cpp, the call in Python to dupe1_factory segfaults. The issue is, I believe, that the redefinition of Dupe1 is effectively invalidating registered_types_cpp: we think it's a registered type, but the registered Python type has been obliterated by the second definition, hence the segfault when pybind tries to cast the reference returned by dupe1_factory into the no-longer-valid Python type. This is likely not specific to class/function pairs: mapping two different C++ classes to the same Python name, enums, and so on are all likely to exhibit the same problem.
Correcting this is going to be a matter of detecting (in the various initialize methods) whether the thing we're about to define is already defined, and if so, throwing an error instead of overwriting the existing type/function/etc..
(I've got a PR almost ready that should fix this).
I worry that extra checks for every method definition are likely to come with a hefty object code size penalty. It will be important to try to do this in non-templated code somehow.
Fixed in PR #465: with that merged, @ibell's code will now generate a run-time error about the redefinition, as would similar redefinitions via another class with the same name or exception definitions.
@wjakob: the code ends up entirely in the non-templated bits (it even manages to slightly (_very_ slightly) reduce the test .so size by avoiding one of the inc_ref()s in templated code).
Fixed by (merged) PR #465.
Most helpful comment
Fixed in PR #465: with that merged, @ibell's code will now generate a run-time error about the redefinition, as would similar redefinitions via another class with the same name or exception definitions.
@wjakob: the code ends up entirely in the non-templated bits (it even manages to slightly (_very_ slightly) reduce the test .so size by avoiding one of the
inc_ref()s in templated code).