Conversation

@ktlichkid commented Jun 21, 2018

The sequence expand op's GPU grad kernel is not robust when the memory optimizer is on: the GPU kernel accumulated the gradient sum directly into d_x without first checking or initializing the tensor's existing values.

In this PR, I moved the "set zero" call outside the functor to guarantee that d_x is zeroed on both CPU and GPU before accumulation.

@ktlichkid changed the title from "[WIP] Fix sequence expand op" to "Fix sequence expand op" on Jun 26, 2018
```cpp
LoDTensor* dx) {
  math::SetConstant<platform::CPUDeviceContext, T> set_zero;
  set_zero(context, dx, static_cast<T>(0));
  // math::SetConstant<platform::CPUDeviceContext, T> set_zero;
```
Contributor: Please remove these two lines.

Author (@ktlichkid): Done

@wanghaoshuang (Contributor) left a comment:

Great job! Actually, the sequence expand op may also produce a wrong gradient value even when the memory optimizer is off.

@kuke (Contributor) commented Jun 27, 2018:

@wanghaoshuang Please run a test on the attention-based OCR model to make sure that this change solves the problem.

@reyoung (Collaborator) left a comment:

Excellent! Thanks!

@ktlichkid merged commit 8630ba2 into PaddlePaddle:develop on Jun 27, 2018
@ktlichkid deleted the fix-seqexp branch on June 27, 2018 at 05:55