Conversation

@kavyasrinet kavyasrinet commented Oct 3, 2017

Adding the implementation for RMSprop:

MeanSquareOut = decay * MeanSquare + (1 - decay) * Grad * Grad
MomentOut = momentum * Moment + LearningRate * Grad / sqrt(MeanSquareOut + epsilon)
ParamOut = Param - MomentOut
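The three update equations above can be sketched directly in numpy. This is a minimal illustration of the formulas in the PR description, not the actual op kernel; the default hyperparameter values (`decay=0.9`, `momentum=0.0`, `epsilon=1e-6`) are illustrative assumptions.

```python
import numpy as np

def rmsprop_step(param, grad, moment, mean_square,
                 learning_rate, decay=0.9, momentum=0.0, epsilon=1e-6):
    # MeanSquareOut = decay * MeanSquare + (1 - decay) * Grad * Grad
    mean_square_out = decay * mean_square + (1.0 - decay) * grad * grad
    # MomentOut = momentum * Moment + LearningRate * Grad / sqrt(MeanSquareOut + epsilon)
    moment_out = momentum * moment + learning_rate * grad / np.sqrt(mean_square_out + epsilon)
    # ParamOut = Param - MomentOut
    param_out = param - moment_out
    return param_out, moment_out, mean_square_out

# One scalar step: a positive gradient should decrease the parameter.
param = np.array([1.0])
grad = np.array([0.5])
p, m, ms = rmsprop_step(param, grad, np.zeros(1), np.zeros(1), learning_rate=0.01)
```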

Contributor

@dzhwinter dzhwinter left a comment

Good job! There are a few small pieces that need fixing.

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("Param", "Input parameter");
AddInput("Grad", "Input gradient");
AddInput("Moment", "Second moment");
Contributor


There is a typo here: it is the momentum, not the second moment. Also, the comment is not helpful; we should make comments self-explanatory, in the format `(type): comment`, e.g. `(tensor): ...`.

Author


This is a good point, will fix this.

RMSprop

MomentOut = decayRate * Moment + (1 - decayRate) * Grad * Grad
ParamOut = Param - learningRate * Grad / (sqrt(MomentOut) + epsilon)
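Note that this quoted formulation places epsilon outside the square root, whereas the formulas in the PR description place it inside. A quick numpy sketch (with illustrative values, not taken from the PR) shows that both placements avoid division by zero but are not numerically identical:

```python
import numpy as np

grad = np.array([0.5, 0.5])
acc = np.array([0.0, 1e-4])   # accumulated mean-square, including a zero entry
lr, eps = 0.01, 1e-6

# epsilon inside the sqrt, as in the PR description:
step_inside = lr * grad / np.sqrt(acc + eps)
# epsilon outside the sqrt, as in the snippet quoted above:
step_outside = lr * grad / (np.sqrt(acc) + eps)
```

For the zero-valued accumulator entry the two step sizes differ by orders of magnitude, so the choice of placement matters for the first few updates.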
Contributor


The old version of Paddle, TensorFlow, and Caffe2 have all implemented the RMSprop algorithm, and they all follow the parameter names from the paper; users are accustomed to seeing the same names across different versions of our framework.
https://caffe2.ai/docs/operators-catalogue.html#rmsprop
tensorflow

class RmspropOpKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const override {
    auto param_out = ctx.Output<Tensor>("ParamOut");
Contributor


Please use `auto*` here, since `param_out` is a pointer. The `auto` keyword always hides the real type; making the pointer explicit is clearer for users who read the code.

Author

@kavyasrinet kavyasrinet Oct 4, 2017


I see. I had assumed for now that `auto` would resolve this by itself, but I see the point of making it more understandable for readers. Will fix.

param = np.random.random((123, 321)).astype("float32")
grad = np.random.random((123, 321)).astype("float32")
moment = np.zeros((123, 321)).astype("float32")
learning_rate = np.array([0.01]).astype("float32")
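Given this test setup, the expected outputs can be computed in plain numpy by mirroring the op's formulas. This is a sketch of what such a reference computation might look like; the seed and the `decay`/`momentum`/`epsilon` values are assumptions for illustration, not values from the PR.

```python
import numpy as np

np.random.seed(0)  # assumed seed, for reproducibility only
param = np.random.random((123, 321)).astype("float32")
grad = np.random.random((123, 321)).astype("float32")
moment = np.zeros((123, 321)).astype("float32")
mean_square = np.zeros((123, 321)).astype("float32")
learning_rate = np.array([0.01]).astype("float32")
decay, momentum, epsilon = 0.9, 0.0, 1e-6  # illustrative hyperparameters

# Reference outputs, following the update rules in the PR description
mean_square_out = decay * mean_square + (1 - decay) * grad * grad
moment_out = momentum * moment + learning_rate * grad / np.sqrt(mean_square_out + epsilon)
param_out = param - moment_out
```

Since the random gradients are non-negative, every parameter entry moves down or stays put after one step.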
Contributor


why is 'learning_rate' not an attribute?

Author


As Abhinav commented above, I think I will retain this as an input for now, given that we are doing the same in all other PRs too.

http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf)
does not have the epsilon attribute. It is added here for numerical stability
to avoid division by zero.
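The stability argument is easy to see with a freshly initialized (all-zero) accumulator: without epsilon the division blows up, with it the step stays finite. A small numpy illustration (values are assumptions, chosen only to demonstrate the point):

```python
import numpy as np

grad = np.array([0.3])
mean_square = np.array([0.0])  # freshly initialized accumulator
lr, eps = 0.01, 1e-6

with np.errstate(divide="ignore"):
    step_without_eps = lr * grad / np.sqrt(mean_square)  # divides by zero -> inf
step_with_eps = lr * grad / np.sqrt(mean_square + eps)   # stays finite
```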

Contributor


We'd better use the commonly used parameter names, since this is a popular optimizer; users are accustomed to the same names across frameworks.

https://github.com/tensorflow/tensorflow/blob/994226a4a992c4a0205bca9e2f394cb644775ad7/tensorflow/core/ops/training_ops.cc#L1281
https://caffe2.ai/docs/operators-catalogue.html#rmsprop

Contributor

@dzhwinter dzhwinter left a comment


LGTM!

@kavyasrinet kavyasrinet merged commit 48f98a6 into PaddlePaddle:develop Oct 6, 2017
@kavyasrinet kavyasrinet deleted the rmsprop branch October 9, 2017 23:08