Add broadcasting support (e.g. matrix-vector) for cos sim operator. #3918
Conversation
pkuyym left a comment:
Almost LGTM.
paddle/operators/cos_sim_op.cc
Outdated
                  "Shape of Input(Out) must be [X.Dim(0), 1].");
auto out_grad_dims =
    ctx.Input<Tensor>(framework::GradVarName("Out"))->dims();
PADDLE_ENFORCE_EQ(out_grad_dims, framework::make_ddim({x_dims[0], 1}),
You can define framework::make_ddim({x_dims[0], 1}) as a variable with auto first, and then just use that variable in the following checks instead.
Done.
paddle/operators/cos_sim_op.h
Outdated
// compute
auto place = context.GetEigenDevice<Place>();
auto xy = (x * y).sum(Eigen::array<int, 1>({1}));
x_norm.device(place) = x.square().sum(Eigen::array<int, 1>({1})).sqrt();
You can define auto row_along = Eigen::array<int, 1>({1}); first and reuse it.
Done.
  auto xy = (x * y).sum(Eigen::array<int, 1>({1}));
  z.device(place) = xy / x_norm / y_norm;
} else {
  Eigen::DSizes<int, 2> bcast(rows_x, 1);
Better to rename bcast to bcast_dims?
Renamed to bcast_rows.
paddle/operators/cos_sim_op.h
Outdated
int rows_y = in_y->dims()[0];
int cols = framework::product(in_x->dims()) / rows_x;
auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols}));
auto y = EigenMatrix<T>::From(*in_y, framework::make_ddim({rows_y, cols}));
You can save framework::make_ddim({rows_x, cols}) and framework::make_ddim({rows_y, cols}) to variables and reuse them in the following lines.
Done.
paddle/operators/cos_sim_op.h
Outdated
Eigen::DSizes<int, 2> bcast_row(rows_x, 1);
auto y_bcast = y.broadcast(bcast_row);
auto y_snorm_bcast =
    y_norm.square().eval().broadcast(bcast_row).eval().broadcast(bcast);
Please merge the two broadcast operations into one, e.g. broadcast({rows, cols}).
Done.
JiayiFeng left a comment:
Very comprehensive unit tests! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
// shape check
auto x_dims = ctx.Input<Tensor>("X")->dims();
auto y_dims = ctx.Input<Tensor>("Y")->dims();
PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims),
DDim has a member function size(), which returns its arity.
Done.
paddle/operators/cos_sim_op.cc
Outdated
PADDLE_ENFORCE_EQ(
    framework::slice_ddim(x_dims, 1, framework::arity(x_dims)),
    framework::slice_ddim(y_dims, 1, framework::arity(y_dims)),
    "All dimensions except 1st of Input(X) and Input(Y) must be equal.");
‘... the 1st ...’ ?
Done.
paddle/operators/cos_sim_op.cc
Outdated
auto y_dims = ctx.Input<Tensor>("Y")->dims();
PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims),
                  "Ranks of Input(X) and Input(Y) must be equal.");
PADDLE_ENFORCE_GE(framework::arity(x_dims), 2,
Why can't the rank of x be 1?
The operator assumes that the first dimension of the tensor is the batch axis, so we require the rank to be larger than 1. Otherwise, it is difficult to tell whether a vector of shape 5 is actually 5 x 1 (five scalar samples in the batch) or 1 x 5 (one sample of a 5-dimensional vector).
Very reasonable! Thanks!
paddle/operators/cos_sim_op.cc
Outdated
                  "Rank of Input(X) must not be less than 2.");
PADDLE_ENFORCE_EQ(
    framework::slice_ddim(x_dims, 1, framework::arity(x_dims)),
    framework::slice_ddim(y_dims, 1, framework::arity(y_dims)),
If these two sliced dims are the same, the tensors must have the same rank, so the PADDLE_ENFORCE_EQ(framework::arity(x_dims), framework::arity(y_dims), ...) above is unnecessary.
The infershape of the backward operator has the same problem.
If we do not first make sure that X and Y have the same rank (meaning the rank of Y is also not less than 2), framework::slice_ddim(y_dims, 1, framework::arity(y_dims)) might throw an exception, because framework::arity(y_dims) might be smaller than 2.
Got it!
paddle/operators/cos_sim_op.cc
Outdated
    framework::slice_ddim(y_dims, 1, framework::arity(y_dims)),
    "All dimensions except 1st of Input(X) and Input(Y) must be equal.");
PADDLE_ENFORCE(x_dims[0] == y_dims[0] || y_dims[0] == 1,
               "1st dimension of Input(Y) must be equal to Input(X) or "
‘The 1st dimension...’
Done.
               "just 1 (which will be broadcasted to match Input(X)).");

// resize tensor
ctx.Output<Tensor>("Out")->Resize({x_dims[0], 1});
If the second dimension is always 1, why do we need it? We can squeeze the output to one dimension.
The infershape of the backward operator has the same problem.
Yes. That's a question we've discussed with @QiJune: should the output tensor be of rank 1 or of rank 2? I prefer rank 2, since when we use the output for further processing, it is very clear that it is a batch of scalar samples instead of one single vector sample. In a word, it reduces ambiguity. We can discuss this more.
paddle/operators/cos_sim_op.cc
Outdated
AddInput("X", "The 1st input of cos_sim op.");
AddInput("Y", "The 2nd input of cos_sim op.");
AddOutput("Out", "The output of cos_sim op.");
AddOutput("XNorm", "Row norm of the first input.").AsIntermediate();
What is 'row norm'? Please add more description for this output.
Done.
paddle/operators/cos_sim_op.h
Outdated
int rows_x = in_x->dims()[0];
int rows_y = in_y->dims()[0];
int cols = framework::product(in_x->dims()) / rows_x;
auto x = EigenMatrix<T>::From(*in_x, framework::make_ddim({rows_x, cols}));
EigenMatrix has a member function Reshape, which converts a Paddle tensor to an EigenMatrix:

static typename EigenMatrix::ConstType Reshape(const Tensor& tensor, int num_col_dims);

The EigenMatrix's first dimension (column length) will be the product of the tensor's first num_col_dims dimensions. In the current case, it is 1.
See https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/eigen.h#L67
Done.
JiayiFeng left a comment:
LGTM. Thank you for the excellent job!
Resolves #3917