Add unchecked array access via proxy object #746
This adds bounds-unchecked access to arrays through a `a.unchecked<Type, Dimensions>()` method. (For `array_t<T>`, the `Type` template parameter is omitted.) A readonly version (which doesn't require mutability) is available as `a.unchecked_readonly<...>()`.

Specifying the Dimensions as a template parameter allows storage of an std::array; having the strides and sizes stored that way (as opposed to storing a copy of the array's strides/shape pointers) allows the compiler to make significant optimizations of the shape() method that it can't make with a pointer; testing with nested loops of the form:

    for (size_t i0 = 0; i0 < r.shape(0); i0++)
        for (size_t i1 = 0; i1 < r.shape(1); i1++)
            ...
                r(i0, i1, ...) += 1;

over a 10 million element array gives around a 25% speedup (versus using a pointer) for the 1D case, 33% for 2D, and runs more than twice as fast with a 5D array.

Since the primary purpose of this interface is to get maximum performance, needing a compile-time dimensionality seems worthwhile given the gains it can achieve.
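To make the storage choice described above concrete, here is a minimal, self-contained sketch of a fixed-rank proxy that keeps its shape and strides in `std::array` members. It is not the code from this PR: `fixed_proxy` and `increment_2d` are hypothetical names, strides are counted in elements rather than bytes, and the buffer is assumed to be C-contiguous.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the proxy: shape and strides are copied into
// std::array members, so the loop bounds live inside the proxy object itself
// instead of behind a pointer.
template <typename T, std::size_t Dims>
class fixed_proxy {
    T *data_;
    std::array<std::size_t, Dims> shape_, strides_;  // strides in elements, not bytes

public:
    fixed_proxy(T *data, const std::array<std::size_t, Dims> &shape)
        : data_(data), shape_(shape) {
        // Row-major strides: the last dimension varies fastest.
        std::size_t step = 1;
        for (std::size_t d = Dims; d-- > 0;) {
            strides_[d] = step;
            step *= shape_[d];
        }
    }

    std::size_t shape(std::size_t dim) const { return shape_[dim]; }

    // No bounds checking: the caller guarantees every index is in range.
    template <typename... Ix>
    T &operator()(Ix... ix) const {
        static_assert(sizeof...(Ix) == Dims, "wrong number of indices");
        const std::size_t idx[Dims] = {static_cast<std::size_t>(ix)...};
        std::size_t offset = 0;
        for (std::size_t d = 0; d < Dims; ++d) offset += idx[d] * strides_[d];
        return data_[offset];
    }
};

// The benchmark loop from the description, written out for the 2D case.
inline void increment_2d(std::vector<double> &buf, std::size_t rows, std::size_t cols) {
    assert(buf.size() == rows * cols);
    fixed_proxy<double, 2> r(buf.data(), {{rows, cols}});
    for (std::size_t i0 = 0; i0 < r.shape(0); i0++)
        for (std::size_t i1 = 0; i1 < r.shape(1); i1++)
            r(i0, i1) += 1;
}
```

Calling `increment_2d(buf, 2, 3)` on a six-element buffer adds one to every element. Whether a given compiler actually vectorizes or hoists anything here is not something this sketch demonstrates; it only shows the layout being discussed.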
I think that this looks great. Regarding the dramatic performance improvements: it's possible that having the dimensions fixed at compile time makes the program structure simple enough that the compiler will add a vectorized path for your example computations (adding entries, etc.). One really minor comment is that the reference classes should perhaps be in the detail namespace since they are not part of the explicit API. (This is just C++ duck typing where we return another type that behaves like an array.) |
@wjakob - I expect that's most of it; but it appears to also need the variables to be local to be able to apply that optimization. I'm guessing that with just a pointer, the compiler can't guarantee that something else hasn't changed the value at that memory location, and has to check it each time. |
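The guess above (a bound read through a pointer may have to be re-read on every iteration) can be illustrated with a small sketch. This is an assumption-laden illustration, not a measurement: both function names are hypothetical, and the element type is deliberately the same as the shape type so that the compiler cannot use type-based alias analysis to rule out the store touching the bound.

```cpp
#include <cassert>
#include <cstddef>

// Bound re-read through a pointer: the store to data[i] has the same type as
// *shape, so the compiler cannot prove the store leaves the bound unchanged
// and may have to reload *shape on each iteration.
inline void increment_via_pointer(std::size_t *data, const std::size_t *shape) {
    assert(data != nullptr && shape != nullptr);
    for (std::size_t i = 0; i < *shape; i++) data[i] += 1;
}

// Bound copied into a local first: no store inside the loop can change `n`,
// so the trip count is known before the loop starts.
inline void increment_via_local(std::size_t *data, const std::size_t *shape) {
    assert(data != nullptr && shape != nullptr);
    const std::size_t n = *shape;
    for (std::size_t i = 0; i < n; i++) data[i] += 1;
}
```

Both functions compute the same result when `data` and `shape` do not overlap; the difference is only in what the optimizer is allowed to assume.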
This gives a compile-time failure (rather than a later runtime failure) if attempting to return the proxy object (e.g. from a bound lambda with return type deduction).
This looks great, especially the compile-time ndim on the proxies. |
Should the unchecked references also support the |
Quick note on the |
I don't think I would call that an inconsistency. When a pybind11 function is invoked from Python (i.e. the main use case of pybind11), the GIL is held, so there is a well-defined window where we have exclusive access to arrays. |
Only enforcing writability at the time of making the reference is also what numpy itself does:
>>> z = np.array([[1,2],[3,4]])
>>> y = np.transpose(z)
>>> z.flags.writeable = False
>>> z[1,1] = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only
>>> z
array([[1, 2],
       [3, 4]])
>>> y[1,1] = 10
>>> z
array([[ 1,  2],
       [ 3, 10]])
>>> y
array([[ 1,  3],
       [ 2, 10]])
Yes, you can work around it if you try, but that doesn't mean it isn't still useful. |
OK, fair enough for the mutability check. |
Regarding static dimension, what alternative did you test this against? If you were storing e.g. std::vector shapes, a difference of performance can be attributed to the stack-allocated std::array over heap-allocated std::vector. However, if you were wrapping the underlying pointer, I don't see the reason why it should make a difference. |
I compared against both std::vector and storing a copy of the |
ok, it makes sense. So the only problem would be dynamic allocation vs stack allocation. |
@jagerman: would you mind also adding Other than that, is there anything else that needs to be done here? |
What would |
|
This looks great!
Minor comment on access defaults: the existing functions are read-only by default (e.g. `data()`/`mutable_data()`, `at()`/`mutable_at()`), while the new functions flip that to read-write by default (`unchecked()`/`unchecked_readonly()`). It would be nice to make the default `unchecked()` call read-only for consistency.
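A minimal sketch of the naming convention being asked for, where the plain accessor is read-only and the `mutable_` variant checks writeability once when it is called. `toy_array` is a hypothetical stand-in rather than pybind11's `py::array`, the accessors return raw pointers instead of proxy objects, and the exception type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical array wrapper used only to illustrate the naming convention.
class toy_array {
    std::vector<double> values_;
    bool writeable_;

public:
    toy_array(std::vector<double> values, bool writeable)
        : values_(std::move(values)), writeable_(writeable) {}

    // Default accessor: read-only, so no writeability check is needed.
    const double *unchecked() const { return values_.data(); }

    // mutable_ variant: the writeable flag is checked once, up front;
    // element access through the returned pointer is unchecked afterwards.
    double *mutable_unchecked() {
        if (!writeable_) throw std::domain_error("array is not writeable");
        return values_.data();
    }
};

// Usage: a read-only array still hands out the const view but refuses the mutable one.
inline void example_readonly_default() {
    toy_array frozen({1.0, 2.0}, /*writeable=*/false);
    assert(frozen.unchecked()[1] == 2.0);
    bool rejected = false;
    try {
        frozen.mutable_unchecked();
    } catch (const std::domain_error &) {
        rejected = true;
    }
    assert(rejected);
}
```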
docs/advanced/pycpp/numpy.rst
Outdated
m.def("increment_3d", [](py::array_t<double> x) {
    auto r = x.unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
    if (x.ndim() != 3)
        throw std::runtime_error("error: 3D array required");
I could be missing something, but doesn't the `unchecked()` call already throw under the same condition?
Whoops, yeah. That's leftover from earlier versions where it didn't.
include/pybind11/numpy.h
Outdated
 * care: the array must not be destroyed or reshaped for the duration of the returned object,
 * and the caller must take care not to access invalid dimensions or dimension indices.
 */
template <typename T, size_t Dims> detail::unchecked_reference<T, Dims> unchecked() {
I don't think that both `const` and non-`const` versions of this function are needed. So far in pybind11, `const` has referred to the state of the internal pointer `PyObject *m_ptr`, but not to the pointee. E.g. `attr()` is `const` because `obj.attr("something") = value` will modify the Python object, but not the pointer stored in `obj`.
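The distinction drawn here (`const` applies to the stored pointer, not to the object it points at) can be sketched without any Python involved. `toy_object` and `toy_handle` are hypothetical names standing in for a Python object and a wrapper around `PyObject *m_ptr`.

```cpp
#include <cassert>

struct toy_object {
    int value = 0;
};

// Hypothetical handle type; only the const-ness of its methods matters here.
class toy_handle {
    toy_object *m_ptr;

public:
    explicit toy_handle(toy_object *ptr) : m_ptr(ptr) { assert(ptr != nullptr); }

    // const: the handle keeps pointing at the same object, even though the
    // object itself is modified.
    void set_value(int v) const { m_ptr->value = v; }

    // non-const: this changes which object the handle refers to.
    void reseat(toy_object *ptr) {
        assert(ptr != nullptr);
        m_ptr = ptr;
    }

    int value() const { return m_ptr->value; }
};
```

Under this convention a `const toy_handle` can still change the pointee through `set_value`, which is why a separate `const` overload that returns a read-only view adds nothing.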
Good point. Additionally with the `unchecked()` -> `mutable_unchecked()` rename it makes no sense at all to have a `mutable_unchecked()` that returns a non-mutable reference.
include/pybind11/numpy.h
Outdated
template <typename T, size_t Dim>
struct type_caster<unchecked_const_reference<T, Dim>> {
    static_assert(Dim == (size_t) -1 /* always fail */, "unchecked array proxy object is not castable");
👍
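For readers unfamiliar with the pattern in the hunk above, here is a stripped-down sketch of a specialization that refuses to compile when used. `toy_proxy` and `toy_caster` are hypothetical names; the point is only that the `static_assert` condition depends on a template parameter, so it is evaluated when the specialization is instantiated rather than when the header is parsed.

```cpp
#include <cstddef>

// Hypothetical proxy type; its contents are irrelevant to the pattern.
template <typename T, std::size_t Dim>
struct toy_proxy {};

// Primary template: ordinary types are considered castable.
template <typename T>
struct toy_caster {
    static constexpr bool castable = true;
};

// Specialization for the proxy. The condition depends on Dim, so it is only
// checked when some code actually instantiates this specialization, i.e.
// tries to cast a proxy. For any ordinary Dim the condition is false.
template <typename T, std::size_t Dim>
struct toy_caster<toy_proxy<T, Dim>> {
    static_assert(Dim == static_cast<std::size_t>(-1), "proxy object is not castable");
};

// Compiles:          toy_caster<int>::castable
// Does not compile:  toy_caster<toy_proxy<double, 2>> c;
```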
AFAICS, |
I just thought it is weird that code which uses |
(maybe I'm just missing something) |
Such code is going to require change anyway, because it's going to have to access through a new proxy object rather than the array itself. |
Makes it consistent with other numpy methods (e.g. mutable_at/at). Also delete the const version of `unchecked` that mapped to `unchecked_readonly`: it makes no sense at all with the rename for a `mutable_unchecked` to return a non-mutable object.
Adding a: template <typename... Ix> const T *data(Ix... ix) const { return &operator()(ix...); } and exactly the same for |
I think that would be nice. Ditto for the |
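The `data(ix...)` accessor proposed above just takes the address of whatever `operator()` returns. A minimal sketch on a hypothetical 2D view (`toy_view2d` is not part of pybind11, and storage is assumed row-major and contiguous):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical read-only 2D view over row-major storage.
template <typename T>
class toy_view2d {
    const T *data_;
    std::size_t cols_;

public:
    toy_view2d(const T *data, std::size_t cols) : data_(data), cols_(cols) {
        assert(data != nullptr);
    }

    // Unchecked element access.
    const T &operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    // Pointer to an element, obtained by taking the address of operator()'s result.
    template <typename... Ix>
    const T *data(Ix... ix) const {
        return &operator()(ix...);
    }
};
```

With a 2x3 buffer, `v.data(1, 2)` points at the sixth element, the same location `&v(1, 2)` would give.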
A few more observations:
|
I pushed a change implementing dynamic dimensions as the default if the size is omitted (and thus defaults to -1), so you can write:
    auto r1 = arr.unchecked<T>();
    auto r2 = arr_t.unchecked();
I also added the various utility methods ( That also makes it easier to compare performance; these are for 1-, 2-, and 5-nested loops that increment the value or calculate the sum of all matrix values, for 10M-element matrices of various shapes; the values are the
|
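One way a "-1 means dynamic" default can be arranged is to pick the storage type from the template argument. The sketch below is an illustration under assumptions, not the PR's implementation: `toy_shape` is a hypothetical name, the dynamic case copies the shape into a `std::vector` for simplicity, and the exception type is assumed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical shape holder: Dims >= 0 fixes the rank at compile time,
// the default of -1 means the rank is only known at runtime.
template <std::ptrdiff_t Dims = -1>
class toy_shape {
    static constexpr bool Dynamic = Dims < 0;
    using fixed_storage = std::array<std::size_t, Dynamic ? 0 : static_cast<std::size_t>(Dims)>;
    using storage =
        typename std::conditional<Dynamic, std::vector<std::size_t>, fixed_storage>::type;
    storage dims_;

    // Tag dispatch: only the overload matching the storage type is instantiated.
    void assign(const std::vector<std::size_t> &dims, std::true_type) { dims_ = dims; }
    void assign(const std::vector<std::size_t> &dims, std::false_type) {
        std::copy(dims.begin(), dims.end(), dims_.begin());
    }

public:
    explicit toy_shape(const std::vector<std::size_t> &dims) {
        // The rank is validated once, when the object is created.
        if (!Dynamic && dims.size() != static_cast<std::size_t>(Dims))
            throw std::domain_error("array has incorrect number of dimensions");
        assign(dims, std::integral_constant<bool, Dynamic>());
    }

    std::size_t ndim() const { return dims_.size(); }

    std::size_t shape(std::size_t dim) const {
        assert(dim < ndim());
        return dims_[dim];
    }
};
```

`toy_shape<>` accepts a shape of any rank, while `toy_shape<3>` accepts only rank-3 shapes and keeps them in a `std::array`.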
That's super-neat! To me this all looks good (including the docs) -- shall we merge it? |
Merged! I ran another test to see how much of an improvement this makes versus the current checked access. The answer: huge.
(numbers are directly comparable to the previous post). |
That's a pretty great speed up. Have you tried comparing to Side note: The git history might have benefited from a bit of a cleanup. Fixes within the PR are not useful information when looking through global history. |
Crap, you're right, I meant to squash that during merge. Can I undo a merge to fix it without messing up too much? |
Yeah, that's a lot of commits. @jagerman: do you want to rewrite history and squash this down a bit? |
You can just rebase -i on your end and then force-push. |
Okay, I'll squash directly on master. |
Looks like there is a problem with PyPy again |
Rewritten. |
PyPy 5.7 was released a few days ago; I suggest we switch the build to use the released version. (Perhaps we can add a 5.8 nightly as a failure-allowed build?). |
I'm happy to see speedups, and unchecked access is available. Unfortunately from my perspective, it isn't default and requires extra (code) steps. My typical use would be where indexes are correct by design, and checking access is a waste. It would be nice if I could write: |
There are tradeoffs: the biggest issue/objection was that it would be encoding the access mechanism by using a fundamentally different C++ type for safe versus unsafe access arrays, despite the type representing exactly the same underlying object. You'd need yet another C++ type (either under a separate name, or via a template parameter--either way still a distinct type) to distinguish between read-only and mutable input arrays, so now we'd have 3 C++ types to represent exactly the same object--and they'd need implicit conversions defined so that you could switch between them at will (with mutability checks when switching from a read-only to mutable type). There would also be a large amount of method duplication to support it all. So it's possible, but I'm not really convinced that it's worth the cost just to avoid needing to write |
My stab at addressing #599/#617.
This adds bounds-unchecked access to arrays through a `a.unchecked<Type, Dimensions>()` method which returns a proxy object (after checking dimensionality and mutability). For `array_t<T>`, the `Type` template parameter is omitted. A readonly version (which doesn't require mutability when being created) is available as `a.unchecked_readonly<...>()`.
(I chose the name `unchecked` rather than `unsafe` because it isn't exactly unsafe to use this, it's just potentially unsafe if not done carefully).
The one slightly unusual addition from @dean0x7d's simple example is specifying the `Dimensions` as a template parameter. I did this because it allows storage of the strides and shape in `std::array`s; having them stored that way (as opposed to storing a copy of the array's strides/shape pointers) allows the compiler to make significant optimizations that it can't make with a pointer; testing with nested loops of the form:
    for (size_t i0 = 0; i0 < r.shape(0); i0++)
        for (size_t i1 = 0; i1 < r.shape(1); i1++)
            ...
                r(i0, i1, ...) += 1;
over a 10 million element array gave me around a 25% speedup (versus using a pointer) for the 1D case, 33% for 2D, and saw the 5D case run about 2.5 times as fast; the latter, in particular, was a pretty remarkable speedup.
Since the primary purpose of this interface is for unfettered, high-performance access, requiring a compile-time dimensionality seems worthwhile given the gains it can achieve. As a nice side effect, it also lets us catch an invalid number of indices at compile time, and allows `operator[]` access to be enabled only for 1D arrays.
I haven't added the tests into the PR yet; I wanted to invite comments on the proposal first.
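The last side effect mentioned above, `operator[]` existing only for 1D proxies, can be sketched with `std::enable_if`. This is a hedged illustration: `toy_ref` and `has_subscript` are hypothetical names, and the PR may gate the operator differently.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical read-only proxy over contiguous storage; Dims is the rank.
template <typename T, std::size_t Dims>
class toy_ref {
    const T *data_;

public:
    explicit toy_ref(const T *data) : data_(data) { assert(data != nullptr); }

    // Only participates in overload resolution when Dims == 1.
    template <std::size_t D = Dims, typename = typename std::enable_if<D == 1>::type>
    const T &operator[](std::size_t i) const {
        return data_[i];
    }
};

// Detects whether `r[0]` is a valid expression for a const R.
template <typename R, typename = void>
struct has_subscript : std::false_type {};

template <typename R>
struct has_subscript<R, decltype(void(std::declval<const R &>()[std::size_t(0)]))>
    : std::true_type {};
```

With these definitions `has_subscript<toy_ref<double, 1>>::value` is true and `has_subscript<toy_ref<double, 2>>::value` is false, so writing `r[i]` on a 2D proxy is rejected by the compiler instead of misbehaving at runtime.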