@zchen0211
Contributor

No description provided.

* return: output tensor
*/
template <typename T>
void GPUTGather(const Place& place, const Tensor* src, const Tensor* index,
Member

The name GPUTGather is not clear. Can we merge GPUGather and GPUTGather?

We'd also better change the Place parameter of GPUTGather to a CUDADeviceContext. CUDADeviceContext has a CUDA stream, and we should launch the CUDA kernel on that specific stream.
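
To illustrate the point about streams, here is a minimal host-only sketch in plain C++. `FakeStream`, `FakeDeviceContext`, and `GatherOnStream` are hypothetical stand-ins invented for this illustration, not CUDA or Paddle types; the stream is modeled as a queue of tasks that run on `Synchronize()`. It only shows the shape of the design: the function takes the context and issues its work on the stream that the context carries.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CUDA stream: tasks run in enqueue order
// when the stream is synchronized. Not a real CUDA or Paddle type.
struct FakeStream {
  std::vector<std::function<void()>> pending;
  void Enqueue(std::function<void()> task) {
    pending.push_back(std::move(task));
  }
  void Synchronize() {
    for (auto& task : pending) task();
    pending.clear();
  }
};

// Hypothetical context that carries the stream the caller wants to use.
class FakeDeviceContext {
 public:
  explicit FakeDeviceContext(FakeStream* stream) : stream_(stream) {}
  FakeStream* stream() const { return stream_; }

 private:
  FakeStream* stream_;
};

// Gathers rows of a flat row-major matrix. The work is enqueued on the
// context's stream rather than on an implicit default one. src, index and
// output must stay alive until the stream has been synchronized.
void GatherOnStream(const FakeDeviceContext& ctx,
                    const std::vector<float>& src,
                    const std::vector<int>& index, int row_width,
                    std::vector<float>* output) {
  ctx.stream()->Enqueue([&src, &index, row_width, output] {
    output->resize(index.size() * row_width);
    for (std::size_t i = 0; i < index.size(); ++i) {
      for (int j = 0; j < row_width; ++j) {
        (*output)[i * row_width + j] = src[index[i] * row_width + j];
      }
    }
  });
}
```

Because the stream comes from the context, two callers holding different contexts would issue work on different streams without any change to the function itself.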

Contributor Author

Great idea. Working on it now...

@QiJune
Member

QiJune commented Sep 29, 2017

Please merge the latest develop branch first.

- void Gather(const platform::Place& place, const paddle::framework::Tensor* src,
- const paddle::framework::Tensor* index,
- paddle::framework::Tensor* output) {
+ void CPUGather(const platform::Place& place,
Member

It's better to unify the parameters of CPUGather and GPUGather; the first parameter should be a DeviceContext:

void CPUGather(const platform::DeviceContext& ctx...)

*/
template <typename T>
- void ScatterUpdate(const platform::Place& place,
+ void ScatterAssign(const platform::Place& place,
Member

The same as CPUGather.

* return: output tensor
*/
template <typename T>
void GPUGather(const platform::DeviceContext& ctx, const Tensor* src,
Member

If the parameter is input, we should take const T&;
If the parameter is output, we should take T*;
If the parameter is both input and output, we should take T*;

Please refer to https://google.github.io/styleguide/cppguide.html#Reference_Arguments
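
As a rough illustration of these three rules, the sketch below applies them to a scatter-style update. `SimpleTensor` and `ScatterAssignSketch` are hypothetical names made up for this example, not Paddle's `Tensor` or `ScatterAssign`, and the update rule `dst[index[i]] = src[i]` is an assumption about the intended semantics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal tensor: a row-major 2-D array. Not Paddle's Tensor.
struct SimpleTensor {
  int rows;
  int cols;
  std::vector<float> data;  // rows * cols elements
};

// src and index are inputs, so they are const references.
// dst is both read and written, so it is passed by pointer: rows of dst
// that index does not name keep their previous values.
// Assumed semantics: dst[index[i]] = src[i] for each i.
void ScatterAssignSketch(const SimpleTensor& src,
                         const std::vector<int>& index,
                         SimpleTensor* dst) {
  for (std::size_t i = 0; i < index.size(); ++i) {
    for (int j = 0; j < src.cols; ++j) {
      dst->data[index[i] * dst->cols + j] = src.data[i * src.cols + j];
    }
  }
}
```

A call then reads `ScatterAssignSketch(src, index, &dst);`, where the `&dst` makes it visible at the call site that `dst` may be modified.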

template <typename T>
void GPUGather(const platform::DeviceContext& ctx, const Tensor* src,
const Tensor* index, Tensor* output) {
// PADDLE_ENFORCE(platform::is_gpu_place(place));
Member

PADDLE_ENFORCE(platform::is_gpu_place(place));
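
Since the new signature receives `ctx` instead of `place`, restoring this check presumably means reading the place from the context. The sketch below shows that idea with hypothetical stand-ins only: `PlaceKind`, `DeviceContextSketch`, `IsGpuPlace`, and `EnforceSketch` are invented for illustration and do not reproduce Paddle's `Place`, `DeviceContext`, `is_gpu_place`, or `PADDLE_ENFORCE`.

```cpp
#include <stdexcept>

// Hypothetical stand-ins; Paddle's real Place and DeviceContext differ.
enum class PlaceKind { kCPU, kGPU };

class DeviceContextSketch {
 public:
  explicit DeviceContextSketch(PlaceKind place) : place_(place) {}
  PlaceKind place() const { return place_; }

 private:
  PlaceKind place_;
};

inline bool IsGpuPlace(PlaceKind place) { return place == PlaceKind::kGPU; }

// Stand-in for an enforce macro: throws when the condition does not hold.
inline void EnforceSketch(bool condition, const char* message) {
  if (!condition) throw std::invalid_argument(message);
}

// The function no longer receives a place, so the check reads it from ctx.
void GPUGatherEntrySketch(const DeviceContextSketch& ctx) {
  EnforceSketch(IsGpuPlace(ctx.place()), "GPUGather requires a GPU place");
  // ... the kernel launch would follow here ...
}
```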

* return: output tensor
*/
template <typename T>
void GPUScatterAssign(const platform::DeviceContext& ctx,
Member

The same as CPUGather: inputs should be const T&.

Member

@QiJune QiJune left a comment

LGTM

@zchen0211 zchen0211 merged commit 2817ca0 into PaddlePaddle:develop Oct 3, 2017

3 participants