Add CUDA kernel for prior_box_op. #9553
const T step_height, const T* min_sizes,
const T* max_sizes, const int min_num,
bool is_clip) {
int num_priors = max_sizes ? as_num * min_num + min_num : as_num * min_num;
Didn't you already compute num_priors on the host?
Also, the two places seem to compute it differently. Could you confirm?
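I wanted to pass fewer parameters, so I compute it again inside the kernel.
max_sizes and min_sizes have the same number of elements; there is a check for this at https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/prior_box_op.cc#L52. That is why the kernel simply adds min_num.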
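To make the equivalence concrete, here is a minimal host-side sketch of the count discussed above. `NumPriors` and the `has_max_sizes` flag are hypothetical names introduced for illustration, and `as_num` is assumed to be the number of aspect ratios; the only thing taken from the PR is the expression itself and the stated invariant that max_sizes has as many entries as min_sizes.

```cpp
// Host-side sketch, not code from the PR.
// Assumption: as_num is the number of aspect ratios and min_num is the
// number of min_sizes. Assumption: when max_sizes is present it has
// exactly min_num entries (the check referenced in prior_box_op.cc).
inline int NumPriors(int as_num, int min_num, bool has_max_sizes) {
  // One box per (min_size, aspect ratio) combination.
  int num_priors = as_num * min_num;
  if (has_max_sizes) {
    // One extra box per (min_size, max_size) pair. Because the two arrays
    // have equal length, adding min_num equals adding the max_sizes count.
    num_priors += min_num;
  }
  return num_priors;
}
```

With 3 aspect ratios and 2 min sizes this gives 6 priors per location without max_sizes and 8 with them, so the host and kernel formulas agree as long as the length check holds.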
int w = (i / num_priors) % width;
int p = i % num_priors;
int m = max_sizes ? p / (as_num + 1) : p / as_num;
T cx = (w + offset) * step_width;
Does even the first box need to be multiplied by step_width?
Please ignore this; I see that Caffe writes it the same way.
Yes, every box needs to be multiplied by step_width: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/gserver/layers/PriorBox.cpp#L105
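For readers following the index arithmetic, the snippet under review can be sketched on the host as below. `PriorIndex`, `DecodeIndex`, `CenterX`, and the `has_max_sizes` flag are hypothetical names, not part of the PR; the expressions for `w`, `p`, `m`, and `cx` are copied from the diff. The offset value 0.5 used in the note afterwards is an assumed example value.

```cpp
// Host-side sketch of the per-thread index decoding shown in the diff.
// All names here are illustrative; only the formulas come from the PR.
struct PriorIndex {
  int w;  // column of the feature-map cell
  int p;  // index of the prior within that cell
  int m;  // index of the min_size this prior belongs to
};

inline PriorIndex DecodeIndex(int i, int width, int num_priors, int as_num,
                              bool has_max_sizes) {
  PriorIndex idx;
  idx.w = (i / num_priors) % width;
  idx.p = i % num_priors;
  // With max_sizes, each min_size owns as_num + 1 priors; otherwise as_num.
  idx.m = has_max_sizes ? idx.p / (as_num + 1) : idx.p / as_num;
  return idx;
}

// Center x of the cell in input-image coordinates.
template <typename T>
inline T CenterX(int w, T offset, T step_width) {
  return (w + offset) * step_width;
}
```

Assuming an offset of 0.5, the first column (`w == 0`) yields `cx = 0.5 * step_width`, the middle of the first cell, which is why the multiplication applies to the first box as well.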
Excellent!
My biggest concern is its performance, and I feel that there are some points to be improved.
@@ -0,0 +1,167 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved.
2016 -> 2018
namespace operators {

template <typename T>
__device__ inline T clip(T in) {
clip -> Clip
Related to #9472