-
Notifications
You must be signed in to change notification settings - Fork 2.2k
pybind::array and pybind::array_t universal references #497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The issue is that, although all arguments get converted to For stl functions like vector's class SomeNonCopyableClass {
// ...
operator size_t() const { return /* ... */; }
};
// ...
SomeNonCopyableClass a(2);
std::cout << myvec[a] << "\n"; With pass-by-value semantics without the type being in the function signature, this fails because I can't copy my So I think, to have an stl-like interface that takes anything convertible to a On a side note: there's an |
Can you think of a realistic case where people would be indexing with non-integral types convertible to |
Sure; perhaps I encode something else in my indices--a row title, for instance--and so my index objects are: struct myindex {
size_t i;
std::string name;
operator size_t() const { return i; }
}; Sure, there are other ways I could do this, but what is the compelling reason to not allow arbitrary types that are implicitly convertible to size_t to be used? (In other words, what does pass-by-value gain over perfect forwarding?) |
Not much except that it is the most used semantic for integral types, so universal references go against the principle of least surprise. |
In some unlikely cases it can have border effects, such as an object being moved unexpectedly. For example if you have a function foo that transform the index, but sometimes only returns it unchanged as an rvalue (effectively equivalently to std::move). Then |
Regarding your concern regarding non-copyable things, I think that it is a good assumptions / requirement for and index to have a value semantic. |
I agree: the ideal interface here is |
That would be a fault in The only downside of forwarding references is that they result in extra overhead (pointers to ints) in case the template function isn't inlined (unlikely for short templates, but it does happen). There's no difference for fully inlined templates. It would probably be best to keep things simple and optimize for the most common use case. These are pretty much always going to be simple integers so value semantics seems preferable. |
I strongly agree with @SylvainCorlay and @dean0x7d on that; passing by value is preferable, and follows the conventions of the C++ standard library.
The recursive approach is fine, it is what we use in xtensor. It is more efficient, no additional array on the stack is required, and the compiler can inline it so the assembly is just exactly the computation of the index in the underlying buffer, nothing more. And you force conversion to size_t for free. |
@wjakob what do you think of the above? |
It sounds like a fairly low level change -- either is fine with me I guess. This piece of code "belongs" to @aldanor, who has done an excellent job designing these APIs -- so I ultimately defer to him. |
There is a significant performance difference too. Regarding the dimension checks, I would like to remove them all and only check at the top-level calls to |
Significant performance difference for what? Are the checks being duplicated somewhere or do you just want to get rid of them altogether? The latter was discussed previously and went in favor of checking. |
I would assume all of this gets inlined for any decent C++ compiler (the implementations of these functions are all trivial), so I'm surprised by this statement. |
The performance issue comes from the stack array that is created in |
I want the checks to only be at the top-level functions ( |
PR coming in a minute. |
Hi all, a bit late to the party. Re: universal refs vs by-value, I'm personally fine with whichever way it is. Do we have a consensus on whether Re: checks, it's probably fine to remove the check from the lowest-level method ( |
Actually, I would suggest to do what Eigen and similar libraries do here: Simply #if !defined(NDEBUG)
....
#endif those parts. It makes a lot of sense to have checks for debugging, but they should probably be disabled for release builds. |
I like this idea even better. |
This one I'm not sure I fully agree with, tying bounds checking with a commonly used ndebug flag is very implicit; you switch for a release build and suddenly out of bounds exceptions become bad memory access. There's basically no way to re-enable bound checks in release builds then. |
|
@aldanor did you see the point about avoiding the temporary array (performance)? The following produces simpler assembly |
There's really two use cases here, I guess, hence all the bikeshedding. One is NumPy-like interface with NumPy-like behaviour which implies bound checks and no UB. The other is downstream library code like xtensor that chooses to manage bound checks and other things on its own. Hence my suggestion of providing "unchecked" variants of a few methods. |
@SylvainCorlay yes, recursive would result in a simpler assembly indeed, that's something we should probably do |
So the checks are moved out of We should still decide which top-level functions should have the checks and which not. |
Variadic template methods of
pybind::array
andpybind::array_t
used for accessing data, namelydata
,mutable_data
,offset_at
,index_at
,get_byte_offset
,at
,mutable_at
should take arguments by value instead of universal reference.
The semantic in the standard library for functions taking integral or iterator arguments is to take them by value. (For example
std::vector<T>::operator[]
takes its argument by value).The text was updated successfully, but these errors were encountered: