Pybind11: How to correctly set default argument "none" for pybind11::array_t ?

Created on 15 Oct 2019  路  8Comments  路  Source: pybind/pybind11

Dear Pybind,

I am working with tag v2.4.3. I would like to bind a structure containing an array of int and set the array's default value to "None" in the constructor.

My simple unit test is
~~~

include "pybind11_tests.h"

include

include

using array_i = pybind11::array_t;

struct myObj_array {
array_i ids;
};

TEST_SUBMODULE(defaultnone, m) {
py::class_(m, "myObj_array_int")
.def(py::init( {
std::cout << "ndim " << py_ids.ndim() << std::endl;
if (py_ids.ndim())
return myObj_array{py_ids};
else
return myObj_array{array_i(0)};
}), py::arg("py_ids") = (array_i*) nullptr
);
~
And my python script is
~

import pytest
from pybind11_tests import defaultnone as m

def test_default_array(doc):
array_int = m.myObj_array_int()
~
for which I got the error message
~

  array_int = m.myObj_array_int()

E TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
E 1. pybind11_tests.defaultnone.myObj_array_int(py_ids: numpy.ndarray[int32] = None)
E
E Invoked with:
~~~
However, if I change the c++ code to bind an array of float or double and set the default to nullptr, the script above is running fine without error.

Is None argument allowed for pybind11::array, and why is there difference between type int and float/double ?

Many thanks in advance!
Regards

Most helpful comment

I shall now perform an honorary closing of this issue.

All 8 comments

I have the same problem for setting default argument "none" for pybind11::list and pybind11::dict.

@WeinaJi That is a strange inconsistency indeed.

However, the nullptr seems to be converted to nan, for an array with float/double, so that seems wrong to me.

If you want to accept None, I would suggest to have an argument that's std::optional<array_i>, because pybind11 will not consider None to be a valid value for a array, I believe.

The same goes for list and dict, @ervinYZ. isinstance(None, list) evaluates as false, so you need to accept std::optional<py::list>.

Reopening to have the conversion from (py::array_t<double>*) nullptr or None to nan investigated, though:

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

using array = py::array_t<double, pybind11::array::c_style | pybind11::array::forcecast>;

PYBIND11_MODULE(example, m) {
        m.def("f", [](array a) { py::print(a); }, py::arg("a") = py::none());
}
>>> import example
>>> example.f([])
[]
>>> example.f()
nan

You're taking the following constructor:

    array_t(const object &o) : array(raw_array_t(o.ptr()), stolen_t{}) {
        if (!m_ptr) throw error_already_set();
    }

The raw_array_t() is what produces nan. Then you just propagate nan through all the base constructors, until you initialize the py::handle base with it.

    static PyObject *raw_array_t(PyObject *ptr) {
        if (ptr == nullptr) {
            PyErr_SetString(PyExc_ValueError, "cannot create a pybind11::array_t from a nullptr");
            return nullptr;
        }
        return detail::npy_api::get().PyArray_FromAny_(
            ptr, dtype::of<T>().release().ptr(), 0, 0,
            detail::npy_api::NPY_ARRAY_ENSUREARRAY_ | ExtraFlags, nullptr);
    }

Since we know we just took a reference from PyNone(), ptr can't be NULL. Also ExtraFlags is py::array::forcecast. Therefore:

    static PyObject *raw_array_t(PyObject *ptr) {
        return py::detail::npy_api::get().PyArray_FromAny_(
            ptr, py::dtype::of<double>().release().ptr(), 0, 0,
            py::detail::npy_api::NPY_ARRAY_ENSUREARRAY_ | py::array::forcecast, nullptr);
    }

Now... PyArray_FromAny_() is this monstrocity, ptr is py::none().ptr(), py::dtype::of().release().ptr()` is... Wait a moment.

 

Two hours later...

 

Wow, that was a fun rabbit hole. I'm pretty sure this None -> nan is not pybind, but numpy itself. Bear with me.

After a lot of inlining and manual evaluation, here's how the above looks like:

static PyObject *raw_array_t(PyObject *ptr) {
    auto& npy_api_ref = py::detail::npy_api::get();
    return npy_api_ref.PyArray_FromAny_(
            ptr,
            npy_api_ref.PyArray_DescrFromType_(12), // 12 == NPY_DOUBLE_
            0,
            0,
            0x50,
            nullptr);
}
int main() {
    Py_Initialize();
        PyObject* h2 = ::raw_array_t(Py_None);
        PyObject* sys = PyImport_ImportModule("sys");
        PyObject* out = PyObject_GetAttrString(sys, "stdout");
    PyObject* write = PyObject_GetAttrString(out, "write");
    PyObject* t = PyTuple_New(1);
    PyTuple_SetItem(t, 0, PyObject_Str(h2));
    PyObject_Call(write, t, NULL);
    Py_Finalize();
}

At this point:

  1. This isn't even C++ any more. The only thing left of C++ is that one & in the npy_apr_ref and a call to py::detail::npy_api::get(). Everything else is good old C.
  2. From the above, you can see that the only thing left to pybind11 to do here is... fetch a reference to the NumPy API.

We can do better!

 

Another 40 minutes later...

 

#include <Python.h>

enum functions {
    API_PyArray_DescrFromType = 45,
    API_PyArray_FromAny = 69,
};
struct npy_api {
    PyObject *(*PyArray_FromAny_) (PyObject *, PyObject *, int, int, int, PyObject *);
    PyObject *(*PyArray_DescrFromType_)(int);
};
static struct npy_api lookup() {
    PyObject* multiarray = PyImport_ImportModule("numpy.core.multiarray");
    PyObject* array_api = PyObject_GetAttrString(multiarray, "_ARRAY_API");
    void **api_ptr = (void **) PyCapsule_GetPointer(array_api, NULL);
    struct npy_api api;
    api.PyArray_DescrFromType_ = api_ptr[API_PyArray_DescrFromType];
    api.PyArray_FromAny_ = api_ptr[API_PyArray_FromAny];
    return api;
}
static PyObject *raw_array_t(PyObject *ptr) {
    const struct npy_api npy_api_ref = lookup();
    return npy_api_ref.PyArray_FromAny_(
            ptr,
            // 12 == NPY_DOUBLE_
            npy_api_ref.PyArray_DescrFromType_(12),
            0,
            0,
            0x50,
            NULL);
}
int main() {
    Py_Initialize();
    PyObject* h2 = raw_array_t(Py_None);
    PyObject* sys = PyImport_ImportModule("sys");
    PyObject* out = PyObject_GetAttrString(sys, "stdout");
    PyObject* write = PyObject_GetAttrString(out, "write");
    PyObject* t = PyTuple_New(1);
    PyTuple_SetItem(t, 0, PyObject_Str(h2));
    PyObject_Call(write, t, NULL);
    Py_Finalize();
}

This time, it really is pure C. Now is this a bug? I don't know. Perhaps NumPy considers this expected behaviour.
In case someone is worried about magic numbers being wrong:

  • API_PyArray_DescrFromType == 45

    • https://github.com/numpy/numpy/blob/36e017194c32a53cf2965a513243cb3c348095df/numpy/core/code_generators/numpy_api.py#L90

    • In this case, with double, this thing returns &DOUBLE_Descr.

  • API_PyArray_FromAny == 69

    • https://github.com/numpy/numpy/blob/36e017194c32a53cf2965a513243cb3c348095df/numpy/core/code_generators/numpy_api.py#L114

  • 0x50 == forcecast | NPY_API_ENSUREARRAY_

    • https://github.com/numpy/numpy/blob/18a6e3e505ee416ddfc617f3e9afdff5a031c2c2/numpy/core/include/numpy/ndarraytypes.h#L816

    • https://github.com/numpy/numpy/blob/18a6e3e505ee416ddfc617f3e9afdff5a031c2c2/numpy/core/include/numpy/ndarraytypes.h#L831

  • NPY_DOUBLE_ == 12

    • https://github.com/numpy/numpy/blob/18a6e3e505ee416ddfc617f3e9afdff5a031c2c2/numpy/core/include/numpy/ndarraytypes.h#L71

 

A much faster way of confirming this is regular numpy behaviour is python -c 'print(__import__("numpy").array(None, dtype = float))'.

I shall now perform an honorary closing of this issue.

@WeinaJi That is a strange inconsistency indeed.

However, the nullptr seems to be converted to nan, for an array with float/double, so that seems wrong to me.

If you want to accept None, I would suggest to have an argument that's std::optional<array_i>, because pybind11 will not consider None to be a valid value for a array, I believe.

The same goes for list and dict, @ervinYZ. isinstance(None, list) evaluates as false, so you need to accept std::optional<py::list>.

Thanks for the prompt reply. I found the same suggestion that I should use std::optional in the pybind doc. It works perfect.

But I have to say the current way to allow/prohibiting none argument is confusing, since I can set and compile the py::arg("list").none(true) or py::arg("list") = py::none() but it does nothing.

But I have to say the current way to allow/prohibiting none argument is confusing, since I can set and compile the py::arg("list").none(true) or py::arg("list") = py::none() but it does nothing.

Yes, that's kind of true. The use of none is documented, though: https://pybind11.readthedocs.io/en/stable/advanced/functions.html#allow-prohibiting-none-arguments

With C++17's std::optional, here's an example that should work. For earlier versions of C++, you might want a backported optional.h and implement your own optional_caster similar to the one in pybind11/stl.h.

Say you want this function:

def add(a, b=None):
    # Assuming a, b are int.
    if b is None:
        return a
    else:
        return a + b

Here's the equivalent C++ implementation:

m.def("add",
    [](int a, std::optional<int> b) {
        if (!b.has_value()) {
            return a;
        } else {
            return a + b.value();
        }
    },
    py::arg("a"), py::arg("b") = py::none()
);

In python, this function can be called with:

add(1)
add(1, 2)
add(1, b=2)
add(1, b=None)
Was this page helpful?
0 / 5 - 0 ratings