Pybind11: Problems passing a std::vector by reference through virtual functions using pybind11

Created on 16 Dec 2019 · 5 Comments · Source: pybind/pybind11

Issue description

Passing a std::vector that was allocated in C++ by reference through a virtual method wrapped in pybind11 loses the reference and causes a copy, even though the std::vector has been declared opaque. The issue came up in a Stack Overflow discussion, which convinced me it is something we should report for pybind11.

Reproducible example code

The following code demonstrates the problem. We have a pure virtual function on a class A that accepts a reference to a std::vector, which we have declared as an opaque type. If that std::vector is allocated in Python, things behave as expected and changes to the vector argument are preserved outside the function. However, if the std::vector is allocated in C++ then overriding A::func in Python acts as though the argument is passed by value rather than reference. The two functions "consumer" and "another_consumer" below demonstrate these two patterns.

#include "pybind11/pybind11.h"
#include "pybind11/stl_bind.h"
#include "pybind11/stl.h"
#include "pybind11/functional.h"
#include "pybind11/operators.h"

#include <iostream>
#include <vector>

namespace py = pybind11;
using namespace pybind11::literals;

PYBIND11_MAKE_OPAQUE(std::vector<int>)

//------------------------------------------------------------------------------
// The C++ types and functions we're binding
//------------------------------------------------------------------------------
struct A {
  virtual void func(std::vector<int>& vec) const = 0;
};

// In this function the std::vector "x" is not modified by the call to a.func(x)
// The difference seems to be that in this function we create the std::vector
// in C++
void consumer(A& a) {
  std::vector<int> x;
  a.func(x);
  std::cerr << "consumer final size: " << x.size() << std::endl;
}

// Whereas here, with the std::vector<int> created in Python and passed in, the
// std::vector "x" is modified by the call to a.func(x).
// The only difference is we create the std::vector in Python and pass it in here.
void another_consumer(A& a, std::vector<int>& x) {
  std::cerr << "another_consumer initial size: " << x.size() << std::endl;
  a.func(x);
  std::cerr << "another_consumer final size  : " << x.size() << std::endl;
}

//------------------------------------------------------------------------------
// Trampoline class for A
//------------------------------------------------------------------------------
class PYB11TrampolineA: public A {
public:
  using A::A;
  virtual void func(std::vector<int>& vec) const override { PYBIND11_OVERLOAD_PURE(void, A, func, vec); }
};

//------------------------------------------------------------------------------
// Make the module
//------------------------------------------------------------------------------
PYBIND11_MODULE(example, m) {
  py::bind_vector<std::vector<int>>(m, "vector_of_int");

  {
    py::class_<A, PYB11TrampolineA> obj(m, "A");
    obj.def(py::init<>());
    obj.def("func", (void (A::*)(std::vector<int>&) const) &A::func, "vec"_a);
  }

  m.def("consumer", (void (*)(A&)) &consumer, "a"_a);
  m.def("another_consumer", (void (*)(A&, std::vector<int>&)) &another_consumer, "a"_a, "x"_a);
}

The following Python code exercises this pattern and demonstrates the problem:

from example import *

class B(A):
    def __init__(self):
        A.__init__(self)
        return
    def func(self, vec):
        # Python 3 print calls; the output is the same as with the Python 2 print statement
        print("B.func START: ", vec)
        vec.append(-1)
        print("B.func STOP : ", vec)
        return

b = B()
print("--------------------------------------------------------------------------------")
print("consumer(b) -- This one seems to fail to pass back the modified std::vector<int> from B.func")
consumer(b)
print("--------------------------------------------------------------------------------")
print("another_consumer(b, x) -- This one works as expected")
x = vector_of_int()
another_consumer(b, x)
print("x : ", x)

which yields the following output when executed:

consumer(b) -- This one seems to fail to pass back the modified std::vector<int> from B.func
B.func START:  vector_of_int[]
B.func STOP :  vector_of_int[-1]
consumer final size: 0        
--------------------------------------------------------------------------------
another_consumer(b, x) -- This one works as expected
another_consumer initial size: 0
B.func START:  vector_of_int[]
B.func STOP :  vector_of_int[-1]
another_consumer final size  : 1
x :  vector_of_int[-1]     

So in the above result the std::vector in consumer is not modified by the virtual call to B.func, whereas in another_consumer it is.

The discussion of this issue on stackoverflow speculates as to the cause of the problem.

All 5 comments

The same is true for custom C++ classes. For example, the following code defines two classes, A and B. B has a virtual method that takes an instance of A by reference. C is a Python subclass of B that tries to modify the A instance, but the instance stays untouched.

test.cpp:

#include <iostream>
#include <pybind11/pybind11.h>

namespace py = pybind11;

class A {
public:
    bool changed;

    A() : changed(false) {}
    void set_changed() { changed = true; }
};

class B {
public:
    virtual ~B() {}
    virtual void change(A& a) const = 0;
};

void test(B& b) {
    A a;
    std::cout << "a.changed: " << a.changed << std::endl;
    b.change(a);
    std::cout << "a.changed: " << a.changed << std::endl;
}

class PyB : public B {
public:
    using B::B;

    void change(A& a) const override {
        PYBIND11_OVERLOAD_PURE(void, B, change, a);
    }
};

PYBIND11_MODULE(test, m) {

    py::class_<A>(m, "A")
        .def("set_changed", &A::set_changed);

    py::class_<B, PyB>(m, "B")
        .def(py::init<>())
        .def("change", &B::change);

    m.def("test", &test);
}

test.py:

from test import *

class C(B):

    def __init__(self):
        B.__init__(self)

    def change(self, a):
        a.set_changed()

if __name__ == '__main__':
    test(C())

test.sh:

c++ -O3 -Wall -shared -std=c++11 -fPIC \
    `python3 -m pybind11 --includes` \
    test.cpp \
    -o test`python3-config --extension-suffix`
python3 test.py

Outputs:

a.changed: 0
a.changed: 0

I recently debugged the same (or a very similar) issue on Gitter. It turns out that the derived method is called with the default py::return_value_policy::automatic_reference, which copies an object that is passed by reference when it goes from C++ to Python (whereas pointers are treated like py::return_value_policy::reference).

See also #2516.
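
The copy-versus-reference behaviour described above can be mimicked in plain C++, without pybind11. This is only a rough model under stated assumptions: Wrapper, Policy, wrap, call_override and final_size are hypothetical names invented for this illustration, not pybind11 APIs, and the real conversion machinery is considerably more involved.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the Python-side object that wraps a C++ vector.
struct Wrapper {
    std::unique_ptr<std::vector<int>> owned;  // non-null only when the wrapper holds its own copy
    std::vector<int>* target = nullptr;       // the vector the override actually mutates
};

// The two conversion behaviours discussed in the thread, reduced to an enum (assumption).
enum class Policy { copy, reference };

// Convert a C++ vector into a wrapper under the given policy.
Wrapper wrap(std::vector<int>& vec, Policy policy) {
    Wrapper w;
    if (policy == Policy::copy) {
        w.owned.reset(new std::vector<int>(vec));  // the override will see a private copy
        w.target = w.owned.get();
    } else {
        w.target = &vec;                           // the override will see the caller's vector
    }
    return w;
}

// Plays the role of the trampoline: wrap the argument, then run the override on the wrapper.
void call_override(std::vector<int>& vec, Policy policy,
                   const std::function<void(std::vector<int>&)>& override_fn) {
    Wrapper w = wrap(vec, policy);
    override_fn(*w.target);
}

// Mirrors consumer(): create the vector in C++, call the override, report the final size.
std::size_t final_size(Policy policy) {
    std::vector<int> x;
    call_override(x, policy, [](std::vector<int>& v) { v.push_back(-1); });
    return x.size();
}
```

In this model final_size(Policy::copy) returns 0, which matches the "consumer final size: 0" output reported in the issue, while final_size(Policy::reference) returns 1.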

A workaround is to pass a pointer instead of a reference (since a pointer won't be copied). If you can't do that, write your own version of PYBIND11_OVERRIDE (PYBIND11_OVERLOAD was renamed to PYBIND11_OVERRIDE in 2.6.0):

https://github.com/pybind/pybind11/blob/e8ad33bb308276149cf9a56f80c477318b64999a/include/pybind11/pybind11.h#L2231-L2243

If you use auto o = override.operator()<py::return_value_policy::reference>(__VA_ARGS__); instead of auto o = override(__VA_ARGS__);, things should work, I hope.
Yes, this is ugly and there ought to be a better way.

I'll close this in favor of #2516, but please reopen if I got this wrong, and this is yet another issue!

For what it's worth, I was able to work around this issue with a different
ugly hack described in this stackoverflow conversation:
https://stackoverflow.com/questions/59330279/problems-passing-a-stdvector-by-reference-though-virtual-functions-using-pybin/59331026?noredirect=1#comment104861677_59331026

Basically I just add an extra statement in the trampoline function
definition before calling the PYBIND11_OVERLOAD macro forcing a dummy
argument cast along the lines of
py::object dummy = py::cast(&vec); // force re-use in the following call
PYBIND11_OVERLOAD_PURE(void, A, func, vec);

I've encoded this hack in my own pybind11 code generator (PYB11Generator,
https://pyb11generator.readthedocs.io/en/latest/) so I don't have to do
this explicitly myself every time, but it's still clunky.
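
One plausible reading of why the dummy cast helps, in line with the "force re-use" comment above, is that a C++ object which already has a live Python wrapper gets that wrapper reused instead of being copied. The sketch below models that idea in plain C++. It is an assumption-laden illustration, not pybind11 code: RegisteredWrapper, Registry, cast_pointer, cast_reference and final_size_with_registry are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Python-side wrapper object.
struct RegisteredWrapper {
    std::unique_ptr<std::vector<int>> owned;  // non-null only when the wrapper holds a copy
    std::vector<int>* target = nullptr;       // the vector the override mutates
};

// Hypothetical registry: address of a C++ object -> its live wrapper.
using Registry = std::map<const void*, std::shared_ptr<RegisteredWrapper>>;

// Models casting a pointer: wrap without copying and remember the wrapper.
std::shared_ptr<RegisteredWrapper> cast_pointer(Registry& reg, std::vector<int>* vec) {
    const void* key = vec;
    auto it = reg.find(key);
    if (it != reg.end()) return it->second;
    auto w = std::make_shared<RegisteredWrapper>();
    w->target = vec;
    reg[key] = w;
    return w;
}

// Models casting a reference under the default policy:
// reuse an existing wrapper if there is one, otherwise copy.
std::shared_ptr<RegisteredWrapper> cast_reference(Registry& reg, std::vector<int>& vec) {
    const void* key = &vec;
    auto it = reg.find(key);
    if (it != reg.end()) return it->second;
    auto w = std::make_shared<RegisteredWrapper>();
    w->owned.reset(new std::vector<int>(vec));
    w->target = w->owned.get();
    return w;
}

// Mirrors the trampoline with and without the dummy cast.
std::size_t final_size_with_registry(bool with_dummy_cast) {
    Registry reg;
    std::vector<int> x;
    std::shared_ptr<RegisteredWrapper> dummy;  // kept alive across the call, like "py::object dummy"
    if (with_dummy_cast) dummy = cast_pointer(reg, &x);
    auto arg = cast_reference(reg, x);         // what the overload macro would do
    arg->target->push_back(-1);                // the Python override appending -1
    return x.size();
}
```

Here final_size_with_registry(false) returns 0 and final_size_with_registry(true) returns 1. The same model would also be consistent with another_consumer working as expected, since a vector created in Python already has a wrapper by the time it is passed back in.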

On Sat, Oct 3, 2020 at 3:06 PM Yannick Jadoul notifications@github.com
wrote:

Closed #2033 https://github.com/pybind/pybind11/issues/2033.

@jmikeowen Oh, yes, now that you mention it, I believe that's the actual workaround I had advised on Gitter!

If you want to be slightly clearer, py::object dummy = py::cast(vec, py::return_value_policy::reference); is equivalent to py::object dummy = py::cast(&vec);. It remains an ugly hack, though.

Apologies that you didn't receive a reply sooner!

Thanks @YannickJadoul, the workaround in comment https://github.com/pybind/pybind11/issues/2033#issuecomment-703170432 works.
