Hi,
I have an issue with the ref counting of python objects that are passed to C++.
I have a C++ class Vector that the use can inherit from (hence I also have a PyVector trampoline class). The user-defined python class should then be passed back to C++ where some actions on this class are performed. Unfortunately, it seems that python takes complete ownership of the class, so that it gets deleted even if the C++ side still needs it.
MFE:
#include <pybind11/pybind11.h>
#include <iostream>
#include <memory>
namespace py = pybind11;
class Vector : public std::enable_shared_from_this<Vector> {
public:
Vector() {}
virtual void print() {
std::cout << "Vector" << std::endl;
}
};
class PyVector : public Vector {
public:
virtual void print() override {
PYBIND11_OVERLOAD(void, Vector, print,);
}
};
class Foo {
public:
Foo(std::shared_ptr<Vector> vec) : _vec(vec){}
std::shared_ptr<Vector> _vec;
void print() {
_vec -> print();
}
};
PYBIND11_MODULE(cmake_example, m) {
py::class_<Vector, std::shared_ptr<Vector>,
PyVector>(m, "Vector")
.def(py::init<>());
py::class_<Foo, std::shared_ptr<Foo>>(m, "Foo")
.def(py::init<std::shared_ptr<Vector>>())
.def("print", &Foo::print);
}
Python code:
import cmake_example as m
class MyVector(m.Vector):
def print(self):
print("MyVector")
def __del__(self):
print("I was deleted")
def make_foo():
x = MyVector()
foo = m.Foo(x)
foo.print()
return foo
myfoo = make_foo()
myfoo.print()
Output:
I was deleted
Vector
Expected output:
MyVector
I was deleted
This is, unfortunately, a limitation of pybind: we keep the C++ instance alive, but can't keep the Python instance alive.
I think it might actually be possible to work around this with a custom holder. This is just brainstorming, but I think it might be possible to create something that works like a shared_ptr, except that it would also hold a py::object member holding a reference to the Python object alive whenever there are 2 or more references, and clear this py::object reference when there is just one. It would still delete as shared_ptr does when refs drop to 0. That way, as long as you are using a pybind11::shared_holder<T> (or whatever), your copy plus the one in pybind internals would keep the python variable alive. Once you're done with it and it gets destroyed, the extra Python reference would be freed, and at this point if there are no other Python references to it anymore, the Python object gets freed (which then in turn destroys the internal holder, and thus the C++ object).
The downside is that it requires a custom holder class, i.e. not just a straight std::shared_ptr.
A second alternative is to store a py::object as a member inside your trampoline class. This, unfortunately, has another big downside: it'll never get destroyed since it's a circular reference (the internal holder keeps the C++ object alive, the C++ object keeps the Python object alive, and the Python object keeps the internal holder alive), and so would require some manual destruction: easy to forget.
Another approach (untested, but I think it ought to work), that might actually work for your specific case here, is to add a trampoline for Foo with a constructor like this:
class PyFoo : public Foo {
public:
PyFoo(std::shared_ptr<Vector> v) : Foo(v), pyobj(py::cast(v)) {}
private:
py::object pyobj;
};
and then bind the constructor using:
py::class_<Foo, PyFoo>(m, "Foo")
.def(py::init_alias<std::shared_ptr<Vector>>())
;
(py::init_alias is like py::init except it always initializes the PyFoo, while py::init only initializes the alias when you are actually inheriting from Foo on the Python side).
That way, the PyFoo keeps the Vector alive via a Python-side reference in the pyobj member instead of just the C++-side reference.
Thanks for the quick response, I have tried your workaround and it does indeed do the job. I do however have quite a large number of classes and functions that require objects like Vec as their input, so going down this route would be painful...
Regarding the custom holder: what would the implications of that be? Does that mean that the C++ code base that I'm wrapping around needs to be rewritten to use py::shared_holder everywhere? Or is it possible to inherit py::shared_holder from std::shared_ptr in a way that the original code does not have to be touched?
Just an idea here: In theory, a shared_ptr can hold any destructor ad a totally unrelated pointer. So it would definetely be possible to create a shared_ptr keeping the python object alive. So, something like this does work:
#include <pybind11/pybind11.h>
#include <iostream>
#include <memory>
namespace py = pybind11;
class Vector : public std::enable_shared_from_this<Vector> {
public:
Vector() {}
virtual void print() {
std::cout << "Vector" << std::endl;
}
};
class PyVector : public Vector {
public:
virtual void print() override {
PYBIND11_OVERLOAD(void, Vector, print,);
}
};
class Foo {
public:
Foo(std::shared_ptr<Vector> vec) : _vec(vec){}
std::shared_ptr<Vector> _vec;
void print() {
_vec -> print();
}
};
static Foo* make_foo_with_ref(py::handle vec_handle) {
// cast to Vector* needs a slight hack, since cast to pointer is disabled by default
// I know that vec_caster used the pointer inside the instance, so this is safe even after the caster is destroyed
py::detail::type_caster<Vector> vec_caster;
if(!vec_caster.load(vec_handle, true)) {
throw py::value_error();
}
Vector* vec_ptr = vec_caster;
// create a shared_ptr referring to the python instance. Will only decrement python refcount on destruction.
// This is the overly complicated version. We could also use a shared_ptr<py::object>, but this saves one indirection
Py_INCREF(vec_handle.ptr());
std::shared_ptr<PyObject> vec_py_ptr(vec_handle.ptr(), [](PyObject* ob) { Py_DECREF(ob); });
// create sharedpointer referring to vec_ptr, but managing vec_py_ptr
std::shared_ptr<Vector> vec_sp(vec_py_ptr, vec_ptr);
// profit
return new Foo(vec_sp);
}
PYBIND11_MODULE(test_sharedptr, m) {
py::class_<Vector, std::shared_ptr<Vector>,
PyVector>(m, "Vector")
.def(py::init<>());
py::class_<Foo, std::shared_ptr<Foo>>(m, "Foo")
.def(py::init<std::shared_ptr<Vector>>())
.def("print", &Foo::print);
m.def("makeFooWithRef", make_foo_with_ref);
}
def make_foo():
x = MyVector()
foo = m.Foo(x)
foo.print()
return foo
myfoo = make_foo()
myfoo.print()
del myfoo
def make_foo_ref():
x = MyVector()
foo = m.makeFooWithRef(x)
foo.print()
return foo
myfoo = make_foo_ref()
myfoo.print()
del myfoo
This does result in the correct output. I'm not sure though if you can add this in a way that keeps backward compability. Maybe only do this if the object in question is inherited in python.
Oh that's neat! Would it be possible to combine this with a custom holder class to have this done automagically? The code below seems to work, but it would be great if I could just declare py_shared_ptr as the custom holder for Vector. From my understanding a custom holder class needs to take in a raw ptr to Vector - could it get the handle instead?
template <typename T> T *handle_to_ptr(py::handle vec_handle) {
// cast to Vector* needs a slight hack, since cast to pointer is
// disabled by default
// I know that vec_caster used the pointer inside the instance, so this
// is safe even after the caster is destroyed
py::detail::type_caster<T> vec_caster;
if (!vec_caster.load(vec_handle, true)) {
throw py::value_error();
}
T *vec_ptr = vec_caster;
return vec_ptr;
}
std::shared_ptr<PyObject> handle_to_py_ptr(py::handle vec_handle) {
// create a shared_ptr referring to the python instance. Will only
// decrement python refcount on destruction. This is the overly
// complicated version. We could also use a shared_ptr<py::object>, but
// this saves one indirection
Py_INCREF(vec_handle.ptr());
std::shared_ptr<PyObject> vec_py_ptr(vec_handle.ptr(),
[](PyObject *ob) { Py_DECREF(ob); });
return vec_py_ptr;
}
template <typename T> class py_shared_ptr : public std::shared_ptr<T> {
public:
py_shared_ptr(py::handle vec_handle)
: std::shared_ptr<T>(handle_to_py_ptr(vec_handle),
handle_to_ptr<T>(vec_handle)) {}
};
PYBIND11_MODULE(cmake_example2, m) {
py::class_<Vector, std::shared_ptr<Vector>, PyVector>(m, "Vector")
.def(py::init<>());
py::class_<Foo, std::shared_ptr<Foo>>(m, "Foo")
.def(py::init([](py::handle vec_handle) {
return new Foo(py_shared_ptr<Vector>(vec_handle));
}))
.def("print", &Foo::print);
}
could it get the handle instead?
If pybind already knows about the object then py::cast(ptr) will give you back the py::object referencing it rather than constructing a new one.
template <typename T> class py_shared_ptr : public std::shared_ptr<T> {
Don't do that: inheriting from most stl classes isn't a good idea (very few classes in the stl are polymorphic). Instead create a custom class with an operator std::shared_ptr<T>() { /* return shared ptr here */ } to allow implicit conversion (so you can pass your py_shared_ptr<T> to something taking a shared_ptr<T> and have it Just Work).
Okay makes sense, I have tried this
template <typename T> class py_shared_ptr {
private:
std::shared_ptr<T> _impl;
public:
py_shared_ptr(T *ptr) {
py::object pyobj = py::cast(ptr);
PyObject* pyptr = pyobj.ptr();
Py_INCREF(pyptr);
std::shared_ptr<PyObject> vec_py_ptr(
pyptr, [](PyObject *ob) { Py_DECREF(ob); });
_impl = std::shared_ptr<T>(vec_py_ptr, ptr);
}
operator std::shared_ptr<T>() { return _impl; }
T* get() const {return _impl.get();}
};
PYBIND11_DECLARE_HOLDER_TYPE(T, py_shared_ptr<T>);
PYBIND11_MODULE(cmake_example2, m) {
py::class_<Vector, py_shared_ptr<Vector>>(m, "Vector")
.def(py::init<>());
}
but it fails with
[ 25%] Building CXX object CMakeFiles/cmake_example2.dir/src/main2.cpp.o
In file included from /Users/florianwechsung/Dropbox/pybind-sharedptr-bug/src/main2.cpp:3:
/Users/florianwechsung/Dropbox/pybind-sharedptr-bug/pybind11/include/pybind11/pybind11.h:1248:71: error: no type named 'element_type' in 'py_shared_ptr<Vector>'
auto sh = std::dynamic_pointer_cast<typename holder_type::element_type>(
~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
/Users/florianwechsung/Dropbox/pybind-sharedptr-bug/pybind11/include/pybind11/pybind11.h:1294:9: note: in instantiation of function template specialization 'pybind11::class_<Vector,
py_shared_ptr<Vector> >::init_holder<Vector>' requested here
init_holder(inst, v_h, (const holder_type *) holder_ptr, v_h.value_ptr<type>());
^
/Users/florianwechsung/Dropbox/pybind-sharedptr-bug/pybind11/include/pybind11/pybind11.h:1034:32: note: in instantiation of member function 'pybind11::class_<Vector,
py_shared_ptr<Vector> >::init_instance' requested here
record.init_instance = init_instance;
^
/Users/florianwechsung/Dropbox/pybind-sharedptr-bug/src/main2.cpp:77:2: note: in instantiation of function template specialization 'pybind11::class_<Vector, py_shared_ptr<Vector>
>::class_<>' requested here
py::class_<Vector, py_shared_ptr<Vector>>(m, "Vector")
^
1 error generated.
make[2]: *** [CMakeFiles/cmake_example2.dir/src/main2.cpp.o] Error 1
make[1]: *** [CMakeFiles/cmake_example2.dir/all] Error 2
make: *** [all] Error 2
I'm puzzled because I'm comparing to the `custom_unique_ptr in the tests and don't really see what I'm missing here...
Okay so if I get rid of std::enable_shared_from_this it works... is that expected or a bug?
Got it to work with this implementation:
template <typename T> class py_shared_ptr {
private:
std::shared_ptr<T> _impl;
public:
using element_type = T;
py_shared_ptr(T *ptr) {
py::object pyobj = py::cast(ptr);
PyObject* pyptr = pyobj.ptr();
Py_INCREF(pyptr);
std::shared_ptr<PyObject> vec_py_ptr(
pyptr, [](PyObject *ob) { Py_DECREF(ob); });
_impl = std::shared_ptr<T>(vec_py_ptr, ptr);
}
py_shared_ptr(std::shared_ptr<T> r): _impl(r){}
operator std::shared_ptr<T>() { return _impl; }
T* get() const {return _impl.get();}
};
PYBIND11_DECLARE_HOLDER_TYPE(T, py_shared_ptr<T>);
Does that make sense or might this cause issues somewhere internally in pybind?
I think I had run into this issue in #1145 (which may also be related to #1333); I made an overly complicated fix for it #1146 which handles shared_ptr and unique_ptr: it effectively does what your solution does, but in a more conservative fashion - allowing for the object to be destroyed if owned by Python, whereas I believe your solution may keep the object alive if it is owned in pure Python.
Does that make sense or might this cause issues somewhere internally in pybind?
This looks like it makes good sense for your applications. It may not release memory as quickly as you want it to, but I figure that'd be a small price to pay.
We are using a variant of #1146 in a fork of pybind11 for drake, and it seems to have been serving our purposes; if you want, I can put more effort in attempting to upstream a simpler portion of it:
https://github.com/RobotLocomotion/pybind11/commit/c2144ab0c037d3064b07eaf5e4c7eeeb46a88552
We are using a variant of #1146 in a fork of pybind11 for drake, and it seems to have been serving our purposes; if you want, I can put more effort in attempting to upstream a simpler portion of it:
RobotLocomotion/pybind11@c2144ab
That would be great! I'm sure other people have ran into similar issues and will find this useful as well.
I've stumbled across this and have my own variant of basically the same idea #1546. I use the aliasing constructor to keep the Python object alive. It doesn't require a new holder type, just a tweak to the casting machinery.
@cdyson37 Do you happen to have code available that has your change to the casting machinery?
@pvallet's solution made me realize that I should've looked at what you said earlier - sorry!
https://github.com/pybind/pybind11/issues/1120#issuecomment-645905422
Ah, I see #1566 - sweet!
Closing this in lieu of #1333 - please feel free to reopen if you think this is a unique case
Most helpful comment
I've stumbled across this and have my own variant of basically the same idea #1546. I use the aliasing constructor to keep the Python object alive. It doesn't require a new holder type, just a tweak to the casting machinery.