Add cosine similarity operator. #3815
Conversation
dzhwinter
left a comment
LGTM++
grad_y->mutable_data<T>(context.GetPlace());
auto dims = x->dims();
int size = static_cast<int>(framework::product(dims));
Maybe the return type ssize_t is not appropriate here. Since the product of dims measures the memory reserved for the variable, int (or int64) is enough. Is there any other reason behind it? @Canpio @QiJune
It seems that many variables in dim.h never need negative values, so why not use size_t directly instead of int, to avoid such unnecessary type conversions?
Using int64_t can't solve the problem by itself, since int64_t also cannot be converted to int implicitly.
All shape-, size-, and dim-related code will use int64_t, and I will fix it later.
So the code here will be:
int64_t size = framework::product(dims);
auto new_dims = framework::make_ddim({dims[0], size / dims[0]});
Why not use size_t for all of them?
paddle/operators/cos_sim_op.cc
Outdated
auto *x_grad = ctx.Output<Tensor>(framework::GradVarName("X"));
auto *y_grad = ctx.Output<Tensor>(framework::GradVarName("Y"));
x_grad->Resize(x_dims);
y_grad->Resize(y_dims);
if (x_grad) x_grad->Resize(x_dims);
if (y_grad) y_grad->Resize(y_dims);
Done.
paddle/operators/cos_sim_op.h
Outdated
auto XY = (X * Y).sum(Eigen::array<int, 1>({1}));
XNorm.device(place) = (X * X).sum(Eigen::array<int, 1>({1})).sqrt();
YNorm.device(place) = (Y * Y).sum(Eigen::array<int, 1>({1})).sqrt();
Z.device(place) = XY / XNorm / YNorm;
The names of variables (including function parameters) and data members should be all lowercase:
https://google.github.io/styleguide/cppguide.html#Variable_Names
Done.
}
self.compare_grad(op, inputs)
self.check_grad(
    op, inputs, set(["X", "Y"]), "Out", max_relative_error=0.05)
The setUp method can be used to create the forward op and its inputs.
Done.
paddle/operators/cos_sim_op.h
Outdated
dX.device(place) =
    dZ_bcast * (Y / norm_prod_bcast - Z_bcast * X / X_snorm_bcast);
dY.device(place) =
    dZ_bcast * (X / norm_prod_bcast - Z_bcast * Y / Y_snorm_bcast);
Need to handle the case where one of the inputs has no gradient:
if (dX) {
auto X_snorm_bcast = X_norm.square().eval().broadcast(bcast);
dX.device(place) =
dZ_bcast * (Y / norm_prod_bcast - Z_bcast * X / X_snorm_bcast);
}
// ...
Done.
Resolved #3814