Add ROCm support. #393
Conversation
Codecov Report
@@ Coverage Diff @@
## master #393 +/- ##
==========================================
+ Coverage 62.42% 65.51% +3.09%
==========================================
Files 126 121 -5
Lines 7116 6540 -576
Branches 1255 1141 -114
==========================================
- Hits 4442 4285 -157
+ Misses 2469 2053 -416
+ Partials 205 202 -3
__syncwarp();
#endif
#ifdef __HIP_PLATFORM_HCC__
// confused !!!
__syncwarp() is there to avoid a read-write conflict on output_val, since warpReduceSum reads output_val from other threads within the same warp.
AMD HIP doesn't support __syncwarp() yet. Using __syncthreads() instead is correct, although it brings a small performance decrease.
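The workaround described above can be sketched as a small preprocessor shim, so the same kernel source builds under both nvcc and hipcc (the macro name WARP_SYNC is hypothetical, not from this PR):

```cuda
// Hedged sketch of the fallback discussed above.
// On HIP, __syncwarp() is unavailable, so fall back to a full block
// barrier: correct, but every warp in the block waits, hence the
// small performance cost mentioned in the comment.
#if defined(__HIP_PLATFORM_HCC__)
  #define WARP_SYNC() __syncthreads()
#else
  #define WARP_SYNC() __syncwarp()
#endif

// Usage inside a reduction kernel (output_val as in the diff context):
//   output_val = ...;
//   WARP_SYNC();            // make output_val visible within the warp
//   warpReduceSum(output_val);
```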
at::cuda::CUDAGuard device_guard(bboxes1.device());
cudaStream_t stream = at::cuda::getCurrentCUDAStream();
#endif
#ifdef __HIP_PLATFORM_HCC__
// at::cuda::HIPGuard device_guard(bboxes1.device());
Why comment out HIPGuard?
Because PyTorch handles ROCm devices as CUDA devices, leaving the guard in causes a device-type assertion error, so it has to be commented out.
I'm not familiar with CUDA or ROCm internals, but I hope ROCm can get official, efficient support.
It seems this should use HIPGuardImplMasqueradingAsCUDA.h and getCurrentHIPStreamMasqueradingAsCUDA instead?
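A hedged sketch of what that suggestion might look like in the launcher. The include path and namespace below are assumptions based on PyTorch's ROCm "masquerading" layer and should be checked against the PyTorch version in use; bboxes1 comes from the diff context:

```cuda
// Hedged sketch: PyTorch's ROCm build makes HIP devices masquerade as
// CUDA devices, so the MasqueradingAsCUDA wrappers satisfy the
// device-type assertion without commenting the guard out.
#ifdef __HIP_PLATFORM_HCC__
// Include path is an assumption; verify against your PyTorch checkout.
#include <c10/hip/HIPStreamMasqueradingAsCUDA.h>
at::cuda::CUDAGuard device_guard(bboxes1.device());  // hipified to the HIP guard
auto stream = c10::hip::getCurrentHIPStreamMasqueradingAsCUDA();
#else
at::cuda::CUDAGuard device_guard(bboxes1.device());
cudaStream_t stream = at::cuda::getCurrentCUDAStream();
#endif
```

The design point: the masquerading wrappers report device type CUDA while driving HIP underneath, which is exactly what sidesteps the assertion error described in the previous comment.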
Closed by duplicate PR #1022
Hi @acai66! We are grateful for your efforts in helping improve this open-source project during your personal time.
note:
I have modified some CUDA APIs to HIP APIs so that it compiles with ROCm. I have tested mmcv/tests/test_ops/test_bbox.py and test_nms.py, and both work fine.
But I haven't run all the tests, and I only have an AMD graphics card, so I cannot test it with CUDA.
It needs sufficient testing before merging to the main branch.