Use bounds_check_indices v2 on ROCm #3916
This pull request was exported from Phabricator. Differential Revision: D72334377
Summary:
X-link: facebookresearch/FBGEMM#1005
This diff forces the use of bounds_check_indices v2 on ROCm, because ROCm requires gridDim * blockDim to be smaller than 2^32. The v1 kernel can be launched with gridDim * blockDim > 2^32, whereas the v2 kernel limits gridDim to 64 * (number of SMs), so its gridDim * blockDim is guaranteed to be smaller than 2^32.
Reviewed By: q10, jianyuh, joebos
Differential Revision: D72334377
This pull request has been merged in b25dec3.
Summary:
Pull Request resolved: facebookresearch/FBGEMM#1005
X-link: pytorch#3916
This diff forces the use of bounds_check_indices v2 on ROCm, because ROCm requires gridDim * blockDim to be smaller than 2^32. The v1 kernel can be launched with gridDim * blockDim > 2^32, whereas the v2 kernel limits gridDim to 64 * (number of SMs), so its gridDim * blockDim is guaranteed to be smaller than 2^32.
Reviewed By: q10, jianyuh, joebos
Differential Revision: D72334377
fbshipit-source-id: 9c955b691e4462721d500b5b643e037d71e13e0c
This pull request has been reverted by 00690ec.
Summary:
This diff forces the use of bounds_check_indices v2 on ROCm, because
ROCm requires gridDim * blockDim to be smaller than 2^32. The v1
kernel can be launched with gridDim * blockDim > 2^32, whereas the v2
kernel limits gridDim to 64 * (number of SMs), so its
gridDim * blockDim is guaranteed to be smaller than 2^32.
Differential Revision: D72334377
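
The arithmetic behind the summary can be sketched on the host side as follows. This is only an illustration, not the FBGEMM implementation: the helper names (`uncapped_grid_size`, `capped_grid_size`, `fits_rocm_limit`) are hypothetical, and the v1-style sizing assumes a grid that grows with the number of work items, which the summary implies but does not spell out. The only facts taken from the summary are the 2^32 limit and the `64 * (number of SMs)` cap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// ROCm constraint quoted in the summary: gridDim * blockDim must be < 2^32.
constexpr std::uint64_t kThreadLimit = 1ULL << 32;

// Hypothetical v1-style sizing (an assumption): one block per `block_size`
// work items, so the grid grows without bound as the input grows.
constexpr std::uint64_t uncapped_grid_size(std::uint64_t work_items,
                                           std::uint64_t block_size) {
  return (work_items + block_size - 1) / block_size;  // ceiling division
}

// v2-style sizing as described in the summary: the grid is capped at
// 64 * (number of SMs), regardless of how large the input is.
constexpr std::uint64_t capped_grid_size(std::uint64_t work_items,
                                         std::uint64_t block_size,
                                         std::uint64_t num_sms) {
  return std::min(uncapped_grid_size(work_items, block_size), 64 * num_sms);
}

// True when a launch with this grid and block size respects the limit.
constexpr bool fits_rocm_limit(std::uint64_t grid_size,
                               std::uint64_t block_size) {
  return grid_size * block_size < kThreadLimit;
}
```

For example, with 2^33 work items and 256 threads per block, the uncapped grid has 2^25 blocks and 2^33 threads in total, which violates the limit. On a hypothetical device with 304 SMs, the capped grid has at most 64 * 304 = 19456 blocks, or 4,980,736 threads. Since a block holds at most 1024 threads, the capped launch can only reach 2^32 threads on a device with 65536 or more SMs.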