Skip to content

[QUESTION] Enforce array_t type and ordering #2455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
PierreMarchand20 opened this issue Sep 2, 2020 · 16 comments · Fixed by #2484
Closed

[QUESTION] Enforce array_t type and ordering #2455

PierreMarchand20 opened this issue Sep 2, 2020 · 16 comments · Fixed by #2484

Comments

@PierreMarchand20
Copy link

I have a naive question about array_t. I have a c++ function that takes a vector and modify it so that I can use it in python thereafter. Reading the documentation, I understood that using py::array_t<T, py::array::f_style | py::array::forcecast> allows converting a numpy array from Python to the type T and fortran ordering in c++.

But then, as soon as the numpy array does not match exactly what is required (so type T and fortran ordering) in my C++ code (with and without py::array::forcecast), the resulting vector in python is filled with zero.

So then, I would like to enforce the type and the ordering, is it possible ? It could throw an error in case the type or the error is not matching T and py::array::f_style.
I have seen some issues about this (#338 or #1305), but I did not find/understand a solution for my problem

@YannickJadoul
Copy link
Collaborator

But then, as soon as the numpy array does not match exactly what is required (so type T and fortran ordering) in my C++ code (with and without py::array::forcecast), the resulting vector in python is filled with zero.

Do you maybe have minimal reproducing example that I can compile, try out, and debug, @PierreMarchand20?

@PierreMarchand20
Copy link
Author

PierreMarchand20 commented Sep 4, 2020

Sorry, I thought the behavior was expected, so that the answer would be obvious. Here is a MWE:

  • a C++ function that takes three std::complex<double> array_t of dimension 2 with fortran ordering to do a matrix matrix product. The solution is stored in A.
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h> 

namespace py = pybind11;

// A=B*C
// A: N1xN2
// B: N1xN3
// C: N3xN2
void matmat_prod(py::array_t<std::complex<double>,  py::array::f_style | py::array::forcecast> A, py::array_t<std::complex<double>,  py::array::f_style | py::array::forcecast> B, py::array_t<std::complex<double>,  py::array::f_style | py::array::forcecast> C){

    if ((A.ndim()!=2 || B.ndim()!=2) || C.ndim()!=2){
        throw std::invalid_argument("Wrong dimension in input");
    }

    int N1 = A.shape()[0];
    int N2 = A.shape()[1];
    int N3 = B.shape()[1];
    if ((B.shape()[0]!=N1 || C.shape()[1]!=N2) || C.shape()[0]!=N3){
        throw std::invalid_argument("Wrong size in input");
    }
    for (int i=0;i<N1;i++){
        for (int j=0;j<N2;j++){
            for (int k=0;k<N3;k++){
                A.mutable_at(i,j)=A.at(i,j)+B.at(i,k)*C.at(k,j);
            }
        }
    }

}

PYBIND11_MODULE(MyModule, m) {
    m.def("matmat_prod",&matmat_prod);
}
  • a python test, with A defined as a default numpy array (without enforcing type and ordering), Atype defined as a numpy array enforcing type, Aorder defined as a numpy array enforcing fortran order, and Atypeorder enforcing both type and order.
import MyModule
import numpy as np

N1=10
N2=5
N3=50
A=np.zeros((N1,N2))
B=np.random.rand(N1,N3)
C=np.random.rand(N3,N2)
pybind11_openmp.matmat_prod(A,B,C)
print(A)

Atype=np.array(A,dtype="complex128")
pybind11_openmp.matmat_prod(Atype,B,C)
print(Atype)

Aorder=np.array(A,order="F")
pybind11_openmp.matmat_prod(Aorder,B,C)
print(Aorder)

Atypeorder=np.array(A,dtype="complex128",order="F")
pybind11_openmp.matmat_prod(Atypeorder,B,C)
print(Atypeorder)
  • I obtain matrices filled with zeros with all except Atypeorder.

@PierreMarchand20
Copy link
Author

So my question following the previous MWE is, why is there no issue giving a numpy array with the wrong type/ordering ? If I give the wrong argument to a function, usually an error is raised and the python module created using pybind11 shows me

  • what are the possible argument of the function I call
  • what I actually gave

Or, how can I throw an error in this case, to avoid bad surprises for people using this kind of function?

@PierreMarchand20
Copy link
Author

@YannickJadoul I hope this MWE is "minimal" enough, do not hesitate if you need more information.

@bstaletic
Copy link
Collaborator

I've just taken a look at this. I can repro, but I don't know numpy a lot. As far as I can tell, all of A, Aorder, Atype and Atypeorder have the exact same type. For the difference in type you can use A.dtype, but I don't see anything that would retrieve ordering information.

@YannickJadoul
Copy link
Collaborator

@PierreMarchand20 Sorry for the delay. I now got to testing out the code and found an obscure (probably undocumented) corner of pybind11 that I think solves your problem :-)

First of all: the problem is that by passing that array, a copy is made if you're expecting a different type or contiguous data. forcecast only allows for more casts, according to the numpy docs: https://numpy.org/doc/stable/reference/c-api/array.html?highlight=forcecast#c.PyArray_FromAny, but even disabling that flag, you still get promotion from double to complex, for example (and thus a copy, if you don't pass complex). I guess you had figured this out, already?

The undocumented secret to get an error is hidden here:

PYBIND11_NAMESPACE_BEGIN(detail)
template <typename T, int ExtraFlags>
struct pyobject_caster<array_t<T, ExtraFlags>> {
using type = array_t<T, ExtraFlags>;
bool load(handle src, bool convert) {
if (!convert && !type::check_(src))
return false;
value = type::ensure(src);
return static_cast<bool>(value);
}
static handle cast(const handle &src, return_value_policy /* policy */, handle /* parent */) {
return src.inc_ref();
}
PYBIND11_TYPE_CASTER(type, handle_type_name<type>::name);
};

I.e., when I write

PYBIND11_MODULE(example, m) {
    m.def("matmat_prod", &matmat_prod, py::arg("A").noconvert(true), py::arg("B"), py::arg("C"));
}

pybind11 gives you a nice error if you don't pass exactly the right type!
This doesn't seem to work for f_style or c_style, though, but I think that's a bug in pybind11 :-/

@YannickJadoul
Copy link
Collaborator

@PierreMarchand20 I've submitted a PR ;-)

@PierreMarchand20
Copy link
Author

Thank you both for your answers! Happy to see it was not as trivial as I thought.

Yes, I did understand that a copy was made if the type/ordering did not match, with and without the forecast option. I just checked the vector in C++ to see that I actually modified it. So once the PR is merged, noconvert(true) should solve this issue.

I am discovering pybind11 and I thought that using array_t was the easiest way to share a vector between C++ and python. But since I had this issue with this quite hidden solution, I am wondering if there is a better/simpler way? I just need to modify a vector in C++ and get it back in python, how people do that usually?

@PierreMarchand20
Copy link
Author

I let you close this issue when you want (now or when the PR is merged). Thank you for your amazing work 🙏🙏🙏🙏 I tried to interface my C++ code with ctypes before, and it feels WAY more "natural" to use pybind11 .

@YannickJadoul
Copy link
Collaborator

But since I had this issue with this quite hidden solution, I am wondering if there is a better/simpler way?

You can also just accept py::array (which should never copy, I think) and do the checks yourself.
If you want to change numpy arrays in place, numpy.h is probably the best way to go, though, yes. You could accept py::object and do some things manually, but it will be a lot slower :-)

But yes, numpy.h is not well documented. I hope to write some things down, at some point, but ... :-/

I let you close this issue when you want (now or when the PR is merged).

:-) It should be closed soon enough by the PR. Meanwhile, if you just accept py::array_t<double, 0> with noconvert, and manually check array.attr("flags").attr("F_CONTIGUOUS").cast<bool>() or so (or basically use the code I've written in #2484), though.

@PierreMarchand20
Copy link
Author

But yes, numpy.h is not well documented. I hope to write some things down, at some point, but ... :-/

I understand 😄 thankfully, it is pretty clean and well-documented so that I could just go through numpy.h and find most of the methods I needed!

@bstaletic
Copy link
Collaborator

You can also give py::bind_vector a try. That isn't as powerful as numpy, but is a stronger type by default (without noconver), since it creates a py::class_ under the hood.

@PierreMarchand20
Copy link
Author

FYI, I do get an error with the wrong ordering, but the output does not explain what is the issue. It just says it expects numpy.ndarray[numpy.complex128] let's say, and it shows an array, which is a numpy.ndarray[numpy.complex128].

@YannickJadoul
Copy link
Collaborator

@PierreMarchand20 That's true, yes, or at least what I'd expect. Not sure how that's best fixed

@PierreMarchand20
Copy link
Author

@YannickJadoul I do not know if/how it should be fixed, I just wanted you to be aware of this ^^ I do not mind personally

@YannickJadoul
Copy link
Collaborator

@PierreMarchand20 Yes, thanks! It's indeed confusing (but we'd need to somehow add the ordering to the type annotation?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants