Read/Write a Tensor Python #2928
Basically following http://pybind11.readthedocs.io/en/stable/advanced/pycpp/numpy.html
* Use the buffer protocol to return a view of a Tensor. It can be cast to a numpy array in Python.
* Set a numpy array to a tensor.
paddle/pybind/pybind.cc
Outdated
};

template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& vec) {
It seems that this operator definition is not necessary. We can print std::vector by calling std::copy. For example:
std::copy(path.begin(), path.end(), std::ostream_iterator<T>(std::cout, " "));
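A minimal sketch of the suggestion above, assuming the goal is only to format a vector for printing. `JoinWithSpaces` and `JoinWithSpacesExample` are hypothetical names used for illustration and are not part of the PR; the sketch writes into a string stream rather than `std::cout` so the result can be checked.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): formats a vector by copying its
// elements through an ostream_iterator instead of overloading operator<<.
template <typename T>
std::string JoinWithSpaces(const std::vector<T>& vec) {
  std::ostringstream os;
  // The iterator writes the delimiter after every element, so a non-empty
  // result ends with a trailing space.
  std::copy(vec.begin(), vec.end(), std::ostream_iterator<T>(os, " "));
  return os.str();
}

// Usage example; the asserts document the expected strings.
inline void JoinWithSpacesExample() {
  assert(JoinWithSpaces(std::vector<int>{1, 2, 3}) == "1 2 3 ");
  assert(JoinWithSpaces(std::vector<int>{}).empty());
}
```

Note that, unlike the hand-written `operator<<` in the PR, this approach leaves a trailing delimiter after the last element.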
paddle/pybind/pybind.cc
Outdated
USE_OP(add_two);

struct PlaceDebugString : public boost::static_visitor<std::string> {
- It seems that `PlaceDebugString` should be in `platform/place.h`, not here.
- The usage of `PlaceDebugString` in this PR is not for debugging, so the name shouldn't contain `Debug`. `PrintPlace` seems more accurate.
- `PrintPlace` can call `std::ostream &operator<<(std::ostream &, const Place &);` in `platform/place.h`, or vice versa, to ensure a consistent representation of Places.
paddle/framework/tensor.h
Outdated
DDim dims() const { return dims_; }

platform::Place place() const { return holder_->place(); }
This method, Tensor::place, would cause a segmentation fault if a Tensor has been constructed but its mutable_data hasn't been called yet. Given such delicacy, I'd suggest not exposing it to all users.
It seems that Tensor::place is exposed only for CastToPyBuffer. It seems that what we should add instead is Tensor::CastToPyBuffer, or Tensor::PyBuffer.
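To illustrate the hazard, here is a sketch under the assumption that the crash comes from dereferencing a holder that is only set on allocation. `ToyTensor`, `Placeholder`, `Place`, and `Allocate` are hypothetical stand-ins, not the real Paddle types; the checked accessor is one possible mitigation, not what the PR does.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for the real Paddle types, for illustration only.
enum class Place { kCPU, kGPU };

struct Placeholder {
  Place place;
};

class ToyTensor {
 public:
  // Memory is attached lazily, mirroring the role of mutable_data.
  void Allocate(Place place) {
    holder_ = std::make_shared<Placeholder>(Placeholder{place});
  }

  // Checked accessor: throws instead of dereferencing a null holder.
  Place place() const {
    if (!holder_) {
      throw std::logic_error("Tensor holds no memory; call Allocate first.");
    }
    return holder_->place;
  }

 private:
  std::shared_ptr<Placeholder> holder_;  // null until Allocate is called
};

// Usage example: place() is only safe after allocation.
inline void ToyTensorExample() {
  ToyTensor t;
  t.Allocate(Place::kCPU);
  assert(t.place() == Place::kCPU);
}
```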
I am not sure, but please check whether we need all of the following facilities for the conversion from Tensor to pybind11::buffer_info:
- struct TensorToPyBuffer
- CastToPyBufferImpl
- CastToPyBuffer
Even if we need some of them, I would suggest moving the definitions out of pybind/pybind.cc to either framework/tensor_pybind.h or pybind/tensor.h.
paddle/pybind/pybind.cc
Outdated
py::module m("core", "C++ core of Paddle Paddle");

py::class_<paddle::platform::Place>(
    m, "Place", R"DOC(Device Place Class.)DOC")
It seems that a raw string literal (R"DOC(...)DOC") is not necessary for a single-line string.
Also, I don't see this operator being used in this PR. Why would we need it?
paddle/pybind/pybind.cc
Outdated
py::class_<pd::Tensor>(m, "Tensor", py::buffer_protocol())
    .def("get_place", &pd::Tensor::place)
    .def_buffer([](pd::Tensor& self) -> py::buffer_info {
      PADDLE_ENFORCE(paddle::platform::is_cpu_place(self.place()),
It seems that the PADDLE_ENFORCE(paddle::platform::is_cpu_place(self.place()), ...) check can be moved into CastToPyBuffer, so we can make CastToPyBuffer a friend or a method of class Tensor. This allows us not to expose too many low-level attributes like Tensor::place() and Tensor::size().
A friend class is a good idea, but it cannot be done because CastToPyBuffer has many template arguments.
paddle/pybind/pybind.cc
Outdated
template <size_t I, typename... ARGS>
struct CastToPyBufferImpl<false, I, ARGS...> {
  py::buffer_info operator()(pd::Tensor& tensor) {
It seems sufficient to define a function rather than a functor here.
paddle/pybind/pybind.cc
Outdated
using CUR_TYPE = typename std::tuple_element<I, std::tuple<ARGS...>>::type;
py::buffer_info operator()(pd::Tensor& tensor) {
  TensorToPyBuffer<CUR_TYPE> cast_object(tensor);
  if (cast_object.CanCast()) {
This seems to be the only use of TensorToPyBuffer, so it may not be worth defining the class TensorToPyBuffer at all.
paddle/pybind/pybind.cc
Outdated
size_t prod = 1;
for (size_t i = dim_vec.size(); i != 0; --i) {
  dims_outside[i - 1] = (size_t)dim_vec[i - 1];
  strides[i - 1] = sizeof(float) * prod;
float => T ?
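The point of this comment can be sketched as a standalone stride computation. The sketch assumes a densely packed, row-major layout; `RowMajorStrides` is a hypothetical helper name, not something defined in the PR. Using `sizeof(T)` keeps the byte strides correct for element types whose size differs from `float`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: byte strides for a densely packed, row-major array
// with element type T, in the form the buffer protocol expects.
template <typename T>
std::vector<std::size_t> RowMajorStrides(const std::vector<std::size_t>& dims) {
  std::vector<std::size_t> strides(dims.size());
  std::size_t prod = 1;  // elements spanned by one step along axis i - 1
  for (std::size_t i = dims.size(); i != 0; --i) {
    strides[i - 1] = sizeof(T) * prod;  // sizeof(T), not sizeof(float)
    prod *= dims[i - 1];
  }
  return strides;
}

// Usage example: a 2 x 3 array of doubles.
inline void RowMajorStridesExample() {
  const std::vector<std::size_t> dims = {2, 3};
  const std::vector<std::size_t> strides = RowMajorStrides<double>(dims);
  assert(strides[0] == 3 * sizeof(double));  // one row = 3 elements
  assert(strides[1] == sizeof(double));      // one column = 1 element
}
```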
paddle/pybind/pybind.cc
Outdated
  }
};

template <bool less, size_t I, typename... ARGS>
It seems that less is not needed, as we can compare I with std::tuple_size<std::tuple<ARGS...>>::value.
paddle/pybind/pybind.cc
Outdated
  }
};

template <typename T>
I tried to simplify the following code
template <typename T>
struct TensorToPyBuffer {
  pd::Tensor& self_;
  explicit TensorToPyBuffer(pd::Tensor& self) : self_(self) {}
  bool CanCast() const { return std::type_index(typeid(T)) == self_.type(); }
  py::buffer_info Cast() const {
    auto dim_vec = pd::vectorize(self_.dims());
    std::vector<size_t> dims_outside;
    std::vector<size_t> strides;
    dims_outside.resize(dim_vec.size());
    strides.resize(dim_vec.size());
    size_t prod = 1;
    for (size_t i = dim_vec.size(); i != 0; --i) {
      dims_outside[i - 1] = (size_t)dim_vec[i - 1];
      strides[i - 1] = sizeof(float) * prod;
      prod *= dims_outside[i - 1];
    }
    return py::buffer_info(self_.mutable_data<T>(self_.place()),
                           sizeof(T),
                           py::format_descriptor<T>::format(),
                           (size_t)pd::arity(self_.dims()),
                           dims_outside,
                           strides);
  }
};

template <bool less, size_t I, typename... ARGS>
struct CastToPyBufferImpl;

template <size_t I, typename... ARGS>
struct CastToPyBufferImpl<false, I, ARGS...> {
  py::buffer_info operator()(pd::Tensor& tensor) {
    PADDLE_THROW("This type of tensor cannot be expose to Python");
    return py::buffer_info();
  }
};

template <size_t I, typename... ARGS>
struct CastToPyBufferImpl<true, I, ARGS...> {
  using CUR_TYPE = typename std::tuple_element<I, std::tuple<ARGS...>>::type;
  py::buffer_info operator()(pd::Tensor& tensor) {
    TensorToPyBuffer<CUR_TYPE> cast_object(tensor);
    if (cast_object.CanCast()) {
      return cast_object.Cast();
    } else {
      constexpr bool less = I + 1 < std::tuple_size<std::tuple<ARGS...>>::value;
      return CastToPyBufferImpl<less, I + 1, ARGS...>()(tensor);
    }
  }
};

template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& vec) {
  for (size_t i = 0; i < vec.size(); ++i) {
    os << vec[i];
    if (i + 1 != vec.size()) {
      os << ", ";
    }
  }
  return os;
}

py::buffer_info CastToPyBuffer(pd::Tensor& tensor) {
  auto buffer_info = CastToPyBufferImpl<true, 0, float, int>()(tensor);
  return buffer_info;
}
into
template <size_t TypeIndex, typename... ElementTypes>
py::buffer_info TryCastToPythonBuffer(pd::Tensor& tensor) {
  using Types = std::tuple<ElementTypes...>;
  PADDLE_ENFORCE(TypeIndex < std::tuple_size<Types>::value,
                 "This tensor is not any type that can be cast to a NumPy array.");
  using T = typename std::tuple_element<TypeIndex, Types>::type;
  if (std::type_index(typeid(T)) == tensor.type()) {
    std::vector<size_t> strides(tensor.dims().arity());
    size_t prod = 1;
    for (int a = strides.size(); a > 0; --a) {
      strides[a - 1] = sizeof(T) * prod;
      prod *= get(tensor.dims(), a - 1);
    }
    return py::buffer_info(tensor.mutable_data<T>(tensor.place()),
                           sizeof(T),
                           py::format_descriptor<T>::format(),
                           (size_t)pd::arity(tensor.dims()),
                           vectorize(tensor.dims()),
                           strides);
  } else {
    return TryCastToPythonBuffer<TypeIndex + 1, ElementTypes...>(tensor);
  }
}

py::buffer_info CastToPyBuffer(pd::Tensor& tensor) {
  PADDLE_ENFORCE(paddle::platform::is_cpu_place(tensor.place()),
                 "Only CPU tensor can be cast to numpy array");
  return TryCastToPythonBuffer<0, float, int>(tensor);
}
I haven't debugged the new snippet; hopefully it can shorten the code.
Why use bool less in the template?
The bool less is useful because C++11 lacks a static if statement: all code in that block is compiled, including the part that cannot be instantiated.
std::tuple_element<TypeIndex, Types>::type; this line will cause a compile error because TypeIndex might reach the tuple size.
The template argument less is used to specialize the two cases. When less is false, we do not need to get the current type from the tuple.
Another way is to use std::enable_if to do a similar thing, but that would be harder to read.
Why use a functor, i.e., a struct with operator(), instead of a function?
Because partial specialization is not allowed for C++ functions. For example, the following code segment will never be accepted by a C++ compiler:
template <bool flag, typename T>
int foo();
template <typename T>
T foo<true, T>() {
  return 0;
}
template <typename T>
T foo<false, T>() {
  return 0;
}
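The two answers above can be sketched without any Paddle or pybind11 types. In this C++11 sketch, `FindTypeImpl` and `FindType` are hypothetical names; instead of building a buffer, the functor just returns the index of the first type in the list whose `typeid` matches a runtime `std::type_index`, or -1. The `less` flag selects a specialization that never touches `std::tuple_element`, which is what keeps the out-of-range case compilable.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <typeindex>
#include <typeinfo>

// Primary template: `less` records whether I is still a valid tuple index.
template <bool less, std::size_t I, typename... Ts>
struct FindTypeImpl;

// Out-of-range case: never names std::tuple_element, so it always compiles.
template <std::size_t I, typename... Ts>
struct FindTypeImpl<false, I, Ts...> {
  int operator()(const std::type_index&) const { return -1; }
};

// In-range case: test the I-th type, then move on to index I + 1.
template <std::size_t I, typename... Ts>
struct FindTypeImpl<true, I, Ts...> {
  using Cur = typename std::tuple_element<I, std::tuple<Ts...>>::type;
  int operator()(const std::type_index& target) const {
    if (std::type_index(typeid(Cur)) == target) return static_cast<int>(I);
    constexpr bool less = I + 1 < sizeof...(Ts);
    return FindTypeImpl<less, I + 1, Ts...>()(target);
  }
};

// Entry point: a plain function template that forwards to the functor.
template <typename... Ts>
int FindType(const std::type_index& target) {
  return FindTypeImpl<(sizeof...(Ts) > 0), 0, Ts...>()(target);
}

// Usage example: int is the second candidate, double is not a candidate.
inline void FindTypeExample() {
  const int found = FindType<float, int>(typeid(int));
  const int missing = FindType<float, int>(typeid(double));
  assert(found == 1);
  assert(missing == -1);
}
```

Class templates may be partially specialized, so the dispatch lives in a struct and only the thin entry point is a function.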
For the other comments, I will follow them:
- Combine `CastToPyBufferImpl` and `TensorToPyBuffer`.
- Move them into a header file.
Also, get_place should be a public method of Tensor, for two reasons.
- The `CastToPyBufferImpl` class is in the module `paddle::pybind`. If that class were a friend of `Tensor`:
  - It would create a cross dependency between the two modules.
  - `CastToPyBufferImpl` is a `details` class and should be private to that module.
  - `CastToPyBufferImpl` has many template arguments and cannot be declared as a single friend class.
- Tensor actually has a `place` attribute. Even though, mathematically, a tensor should not contain device information, in neural network frameworks it does hold device information.
@reyoung I don't understand what the issue is with C++11 not having static if. I built the following test program using GCC with -std=c++11
#include <iostream>
#include <tuple>
template <size_t TypeIndex, typename... ElementTypes>
void SelectType() {
  using Types = std::tuple<ElementTypes...>;
  if (TypeIndex >= std::tuple_size<Types>::value) {
    std::cout << "Larger\n";
  } else {
    std::cout << "OK\n";
  }
}
int main() {
  SelectType<0, float, int>();
  SelectType<1, float, int>();
  SelectType<2, float, int>();
  return 0;
}
and it works:
$ g++ a.cc -std=c++11 -o a && ./a
OK
OK
Larger
The following program cannot be compiled, because the if/else statement is not evaluated at compile time, i.e., it is not a static if. So CUR_TYPE is always instantiated, and the compiler gives an error like:
static_assert failed "tuple_element index out of range"
#include <iostream>
#include <tuple>
template <size_t TypeIndex, typename... ElementTypes>
void SelectType() {
  using Types = std::tuple<ElementTypes...>;
  if (TypeIndex >= std::tuple_size<Types>::value) {
    std::cout << "Larger\n";
  } else {
    using CUR_TYPE = typename std::tuple_element<TypeIndex, std::tuple<ElementTypes...>>::type;
    std::cout << "OK\n";
  }
}
int main() {
  SelectType<0, float, int>();
  SelectType<1, float, int>();
  SelectType<2, float, int>();
  return 0;
}
Static if was added in C++17, as if constexpr.
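For comparison, here is a sketch of how the same type search could look with C++17's `if constexpr`. It requires a C++17 compiler, so it would not have been an option for a C++11 code base; `IndexOfType` is a hypothetical name. The branch that names `std::tuple_element` is discarded, and therefore not instantiated, once the index runs past the end of the type list.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <typeindex>
#include <typeinfo>

// Returns the index (starting the search at I) of the first type in Ts whose
// typeid matches `target`, or -1 if there is none. Requires C++17.
template <std::size_t I, typename... Ts>
int IndexOfType(const std::type_index& target) {
  if constexpr (I < sizeof...(Ts)) {
    // Only instantiated while I is a valid index into the type list.
    using Cur = typename std::tuple_element<I, std::tuple<Ts...>>::type;
    if (std::type_index(typeid(Cur)) == target) return static_cast<int>(I);
    return IndexOfType<I + 1, Ts...>(target);
  } else {
    return -1;
  }
}

// Usage example: int is the second candidate, double is not a candidate.
inline void IndexOfTypeExample() {
  const int found = IndexOfType<0, float, int>(typeid(int));
  const int missing = IndexOfType<0, float, int>(typeid(double));
  assert(found == 1);
  assert(missing == -1);
}
```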
I see the reason! Thanks! @reyoung
* Follow review comments to separate the Tensor and NumPy interaction methods in tensor.h.
* Simplify the logic of `CastToPyBufferImpl`: make it one struct and put it in the details namespace.
* Remove the `Scope` exposure in Python, since it is currently useless.
* Remove some debug functions.
@reyoung The following program works. And it might lead to a simplified implementation that requires neither std::tuple/std::tuple_element nor a functor. But anyway, we can merge this PR before we try a simpler version.
#include <iostream>
#include <typeinfo>
#include <typeindex>
void SelectType() {}
template <class T, class... Ts>
void SelectType(T arg, Ts... args) {
  if (std::type_index(typeid(arg)) == std::type_index(typeid(int))) {
    std::cout << "OK, found int\n";
  } else {
    SelectType(args...);
  }
}
int main() {
  SelectType(0.0f, 0L, 0);
  return 0;
}
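A possible variation on the program above, sketched under the assumption that no values of the candidate types are available at the call site: the recursion is driven by an empty tag type instead of function arguments. `TypeList` and `IndexOf` are hypothetical names. Overloading on the empty list plays the role of the base case, so no partial specialization or `std::tuple_element` is needed.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>

// Hypothetical tag type carrying the candidate types; it holds no values.
template <typename... Ts>
struct TypeList {};

// Base case: the list is exhausted without a match.
inline int IndexOf(const std::type_index&, TypeList<>, int = 0) { return -1; }

// Recursive case: test the head type, then recurse on the tail. Overload
// resolution picks the base case once the tail is empty.
template <typename T, typename... Rest>
int IndexOf(const std::type_index& target, TypeList<T, Rest...>, int index = 0) {
  if (std::type_index(typeid(T)) == target) return index;
  return IndexOf(target, TypeList<Rest...>(), index + 1);
}

// Usage example: int is the third candidate, double is not a candidate.
inline void IndexOfExample() {
  const int found = IndexOf(typeid(int), TypeList<float, long, int>());
  const int missing = IndexOf(typeid(double), TypeList<float, long, int>());
  assert(found == 2);
  assert(missing == -1);
}
```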
6755eec to fa42c90
fa42c90 to 1dc53a2