Unsafe access #617
Just a quick comment: the alias is definitely needed for backward compatibility.
I'm not sure that I'm really convinced of this accessor policy approach. It feels a little un-C++ to me--personally I'd prefer a more stl-like approach where an object has both safe and unsafe accessors, but if the accessor policy is the way everyone else wants to go, so be it.
include/pybind11/numpy.h
Outdated
@@ -316,7 +316,64 @@ class dtype : public object {
 }
 };

-class array : public buffer {
+namespace detail {
Should be NAMESPACE_BEGIN(detail) to be consistent with the rest of pybind.
include/pybind11/numpy.h
Outdated
@@ -561,35 +595,36 @@ class array : public buffer {
 }
 };

-template <typename T, int ExtraFlags = array::forcecast> class array_t : public array {
+template <typename T, int ExtraFlags = array<>::forcecast> class array_t : public array<> {
Why is array_t left out? I'd think taking it as an extra argument then inheriting from array<access_policy> would make a lot of sense here.
This isn't related to your PR, but I also don't understand why we take ExtraFlags as a template argument here instead of just as an argument to ensure().
I thought that since array_t is the C++ class for numpy arrays, we would always want to check access to its elements. But I can propagate the policy parameter to it if that looks better.
include/pybind11/numpy.h
Outdated
@@ -316,7 +316,64 @@ class dtype : public object {
 }
 };

-class array : public buffer {
+namespace detail {
+inline void fail_dim_check(size_t dim, size_t ndim, const std::string& msg) {
I think this would be better as [[noreturn]] PYBIND11_NOINLINE inline void fail_dim_check(...
(Also, unindent this part; I seem to recall @wjakob mentioning that not indenting functions in namespaces was one of his primary motivations for using the namespace macros).
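To illustrate the suggestion above, here is a minimal, self-contained sketch of a failure helper that is marked `[[noreturn]]` and kept out of line. It is not the pybind11 code: `SKETCH_NOINLINE` is a hypothetical stand-in for `PYBIND11_NOINLINE`, `std::out_of_range` stands in for `index_error`, and `checked_dim` is an invented caller added only to show the hot path.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for PYBIND11_NOINLINE (MSVC and GCC/Clang spellings).
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Cold path: never returns, and is kept out of line so the string building
// does not bloat every call site.
[[noreturn]] SKETCH_NOINLINE inline void fail_dim_check(std::size_t dim, std::size_t ndim,
                                                        const std::string &msg) {
    // std::out_of_range stands in for pybind11's index_error in this sketch.
    throw std::out_of_range(msg + ": " + std::to_string(dim) +
                            " (ndim = " + std::to_string(ndim) + ")");
}

// Hot path (hypothetical caller): one comparison, then the out-of-line failure.
inline std::size_t checked_dim(std::size_t dim, std::size_t ndim) {
    if (dim >= ndim)
        fail_dim_check(dim, ndim, "invalid axis");
    return dim;
}
```

Because the helper is `[[noreturn]]`, the compiler knows control never comes back from the failing branch, so `checked_dim` needs no extra return statement after the call.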
include/pybind11/numpy.h
Outdated
    throw index_error(msg + ": " + std::to_string(dim) +
                      " (ndim = " + std::to_string(ndim) + ")");
}
}
NAMESPACE_END(detail)
include/pybind11/numpy.h
Outdated
};

template <class access_policy = safe_access_policy>
class array : public buffer, private access_policy {
Changing array to a template class is definitely going to break code; this needs to be somewhere new with a using array = array_base<safe_access_policy>. With that, you might as well drop the default on the template argument as well. You might also add a using array_unchecked = array_base<unsafe_access_policy>;
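A rough sketch of the layout this comment proposes, using a toy 1-D container instead of the real numpy wrapper. Only the names `array_base`, `safe_access_policy`, `unsafe_access_policy`, `array` and `array_unchecked` come from the discussion; the `check` hook, the `sketch` namespace and the `std::vector` storage are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace sketch {

// Hypothetical policies: each decides what happens before an element access.
struct safe_access_policy {
    static void check(std::size_t i, std::size_t n) {
        if (i >= n)
            throw std::out_of_range("index out of bounds");
    }
};

struct unsafe_access_policy {
    static void check(std::size_t, std::size_t) {}  // deliberately a no-op
};

// The template lives under a new name, with no default argument...
template <class access_policy>
class array_base : private access_policy {
    std::vector<double> data_;  // toy storage; the real class wraps a numpy buffer

public:
    explicit array_base(std::size_t n) : data_(n, 0.0) {}

    double &operator()(std::size_t i) {
        access_policy::check(i, data_.size());
        return data_[i];
    }

    std::size_t size() const { return data_.size(); }
};

// ...so code that spells plain `array` keeps compiling unchanged.
using array = array_base<safe_access_policy>;
using array_unchecked = array_base<unsafe_access_policy>;

}  // namespace sketch
```

With the aliases in place, `array` remains an ordinary (non-template) type name for downstream users, which is the backward-compatibility point being made.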
Indeed, it has already broken the tests :) I will revert this change.
cc @aldanor
PR updated, I reverted the changes in tests, updated |
I don't understand why your travis-ci build didn't mark the GCC7 build as an allowed failure. The PR I just pushed did so, so perhaps it was a temporary travis-ci bug?
I think so; it was marked as an allowed failure on every commit prior to squashing.
I'm inclined to agree with that for the most part.
Although I concur with @jagerman: if everyone and their grandmother agrees that this is the way to go, then whatever. I'm also wondering what a typical downstream example would be of code that benefits from this and that isn't possible on the current master?
As to:
I think they should be convertible, i.e. I'm pretty sure you'll be able to do: |
Yea, you're right. Another option is to have |
I don't think any consensus was reached, so I wouldn't call in the grandmothers quite yet. My read on the situation is the following:
At the end of the day, I don't mind it going another way, I'm just voicing my concerns. |
That sounds like an interesting approach to me. The returned object could be a pure C++ type (no |
I lean towards the second approach by @jagerman and @aldanor. Safety with respect to const-ness is tricky to achieve since numpy constness is a runtime property.
This is basically what Eigen support does in PR #610. It should have a PyObject somewhere, though, with a proper reference recorded so that numpy knows the base is referenced (and thus will also prohibit resize on the referencee). Edit: and also to guarantee that the original stays around, of course. (But "original" here could be up the chain if the |
I was thinking super stripped down:

    template <class T>
    class unsafe_reference {
        T* data_;
        const size_t* shape_;
        const size_t* strides_;
        friend class array;
        // private: only array can construct one
        unsafe_reference(T* data, const size_t* shape, const size_t* strides);
    public:
        template <class... Idx> T& operator()(Idx... idx);  // unchecked element access
        size_t shape(size_t dim) const;
    };

Usage:

    void f(py::array a) {
        auto r = a.unsafe_ref<double>();
        // any usage of `r` is unchecked
        double item = r(2, 3);
    }

If it's just going to be a reference, it doesn't need to keep anything alive (it wouldn't be a function parameter like Eigen types).
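For readers who want something compilable, here is a self-contained sketch of the same idea with a toy array in place of py::array. Everything except the overall shape of `unsafe_reference` is an assumption: `toy_array`, `sum_all`, the element-based (not byte-based) strides, and the non-template `unsafe_ref()` are hypothetical and exist only to make the example run.

```cpp
#include <cstddef>
#include <vector>

// Lightweight view: caches data/shape/strides pointers and performs no checks.
template <class T>
class unsafe_reference {
    T *data_;
    const std::size_t *shape_;
    const std::size_t *strides_;  // in elements (not bytes) for this toy
    friend class toy_array;       // only the owning array may create one

    unsafe_reference(T *data, const std::size_t *shape, const std::size_t *strides)
        : data_(data), shape_(shape), strides_(strides) {}

public:
    // Unchecked access: offset = sum over k of idx[k] * strides[k].
    template <class... Idx>
    T &operator()(Idx... idx) const {
        static_assert(sizeof...(Idx) > 0, "at least one index is required");
        const std::size_t index[] = {static_cast<std::size_t>(idx)...};
        std::size_t offset = 0;
        for (std::size_t k = 0; k < sizeof...(Idx); ++k)
            offset += index[k] * strides_[k];
        return data_[offset];
    }

    std::size_t shape(std::size_t dim) const { return shape_[dim]; }
};

// Hypothetical stand-in for py::array: a row-major 2-D buffer of doubles.
class toy_array {
    std::vector<double> buf_;
    std::size_t shape_[2];
    std::size_t strides_[2];

public:
    toy_array(std::size_t rows, std::size_t cols)
        : buf_(rows * cols, 0.0), shape_{rows, cols}, strides_{cols, 1} {}

    unsafe_reference<double> unsafe_ref() {
        return unsafe_reference<double>(buf_.data(), shape_, strides_);
    }
};

// Typical use: fetch the reference once, then loop without per-access checks.
inline double sum_all(toy_array &a) {
    auto r = a.unsafe_ref();
    double total = 0.0;
    for (std::size_t i = 0; i < r.shape(0); ++i)
        for (std::size_t j = 0; j < r.shape(1); ++j)
            total += r(i, j);
    return total;
}
```

The reference is only valid while the array it came from is alive and unresized, which matches the "it doesn't need to keep anything alive" framing: it is meant to be a short-lived local, not something to store.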
It might be simpler still to just stash a |
Stashing the data/strides/shape pointers avoids having to reload them for every access. |
Oh, right, I was thinking Also: storing a |
To be honest, I agree with @jagerman: this approach is quite unusual in C++, where both safe and unsafe accessors are generally provided. However, since there were many concerns about the unsafe-by-default approach but none about the policy approach, I thought the latter could reconcile everyone's concerns.
So far I like @dean0x7d 's solution the best; it gives you unchecked access, but hides it behind a gatekeeper method which can do the |
As far as I can see, we're down to two proposals:
I'm personally in favor of option 2. What about others? 1 or 2? |
I'm more or less ambivalent between the two. I like 1's interface more, because it means I don't need to do anything special when I already know the array sizes are valid (e.g. because I just created the array). On the other hand, it bypasses the So I think I have a very weak preference for (2) over (1). |
Both are fine with me. |
@JohanMabille - I guess that's basically pre-approval for a suitable implementation of (2) to resolve this, if you want to update the PR. |
I have a pretty strong preference for making operator() unsafe, and using at() if you want checking. This is consistent with the STL, IIRC.
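For reference, this is the STL convention being alluded to, shown with `std::vector`: `operator[]` performs no bounds check, while `at()` throws `std::out_of_range`. The helper names `at_rejects` and `third_element_unchecked` are invented for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if the checked accessor at() rejects index i.
inline bool at_rejects(const std::vector<int> &v, std::size_t i) {
    try {
        (void)v.at(i);  // bounds-checked: throws std::out_of_range when i >= size()
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}

// Unchecked access: the caller must guarantee v.size() > 2,
// otherwise the behaviour is undefined.
inline int third_element_unchecked(const std::vector<int> &v) {
    return v[2];
}
```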
@nbecker - the one thing I see in support of the intermediate class is that it enforces numpy's writeable flag. STL containers, on the other hand, get that enforcement at compile-time, which is why they don't need something like this. For numpy, however, a run-time check is required (but doing it on each access would obviate much of the point of providing unsafe access in the first place). |
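A minimal sketch of that trade-off, assuming a toy array with a run-time `writeable` flag: the flag is tested once, when the accessor object is created, and element access afterwards is unchecked. The names `toy_array`, `unchecked_view`, `mutable_unchecked` and `unchecked` are hypothetical here and are not claimed to be the API under discussion.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Thin accessor: no bounds check and no flag check per element.
template <class T>
class unchecked_view {
    T *data_;
    std::size_t size_;

public:
    unchecked_view(T *data, std::size_t size) : data_(data), size_(size) {}
    T &operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
};

// Hypothetical stand-in for a numpy array with a run-time writeable flag.
class toy_array {
    std::vector<double> buf_;
    bool writeable_;

public:
    toy_array(std::size_t n, bool writeable) : buf_(n, 0.0), writeable_(writeable) {}

    // Gatekeeper for mutation: the run-time flag is checked once, here.
    unchecked_view<double> mutable_unchecked() {
        if (!writeable_)
            throw std::domain_error("array is not writeable");
        return unchecked_view<double>(buf_.data(), buf_.size());
    }

    // Read-only access needs no flag check; constness is enforced by the type.
    unchecked_view<const double> unchecked() const {
        return unchecked_view<const double>(buf_.data(), buf_.size());
    }
};
```

The cost of the writeable test is paid once per view rather than once per element, which is what keeps the unchecked path worth having.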
(And if someone figures out a nicer way of dealing with that, perhaps through the |
Another possibility is to make checking a compile time option (e.g., NDEBUG) |
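As a sketch of what a compile-time switch could look like, the example below guards the bounds check with a preprocessor flag. It uses a dedicated, hypothetical macro `SKETCH_BOUNDS_CHECK` rather than NDEBUG itself, and the `element` helper is invented for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical switch: checks are on by default; build with
// -DSKETCH_BOUNDS_CHECK=0 to compile them out.
#ifndef SKETCH_BOUNDS_CHECK
#  define SKETCH_BOUNDS_CHECK 1
#endif

inline double &element(std::vector<double> &v, std::size_t i) {
#if SKETCH_BOUNDS_CHECK
    if (i >= v.size())
        throw std::out_of_range("index out of bounds");
#endif
    return v[i];  // unchecked when the switch is off
}
```

One consequence of this design is that the same call site changes behaviour between builds, so code cannot rely on the exception being thrown.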
I would personally strongly oppose reusing |
To all involved: input on #746 would be appreciated. |
Closing this in favor of #746 |
This PR adds unsafe access as suggested in #599. Changes have been split in two commits since you may want to keep py::array as a non-template class; in that case it would be easy to revert the last commit and make py::array an alias of a default template type, something like:
Let me know what you think.