[CINN] Explicitly set block launch bound for grid reduce #72191
Merged
PR Category
CINN
PR Types
Bug fixes
Description
Explicitly set `min_blocks_per_sm` for grid reduce so that the compiler does not allocate so many registers that the kernel fails to launch.
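Note: nvrtc and nvcc behave differently here. With nvcc the kernel works even without this setting, but nvrtc reports an error, so we set it explicitly anyway.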
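TODO: avoid hard-coding constants such as 1024. Fixing this requires changes starting from the initial tile config, which touches a lot of code, so it will be addressed systematically in a later change.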
Pcard-85711
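
To illustrate why a launch bound limits register allocation, here is a rough arithmetic sketch in C++. It is not CINN code: `MaxRegsPerThread` and `kAssumedRegsPerSM` are hypothetical names, the 64K-register file per SM is an assumption that holds for many recent NVIDIA GPUs but should be checked for the target device, and the sketch assumes the 1024 mentioned in the TODO is the thread count per block. Register allocation granularity is ignored.

```cpp
#include <cstdint>

// Hypothetical helper (not part of CINN): the largest per-thread register
// count that still lets `min_blocks_per_sm` blocks of `max_threads_per_block`
// threads fit in one SM's register file.
constexpr std::uint32_t MaxRegsPerThread(std::uint32_t regs_per_sm,
                                         std::uint32_t max_threads_per_block,
                                         std::uint32_t min_blocks_per_sm) {
  return regs_per_sm / (max_threads_per_block * min_blocks_per_sm);
}

// Assumption: 64K 32-bit registers per SM; query the real device to confirm.
constexpr std::uint32_t kAssumedRegsPerSM = 64 * 1024;

// Assumed launch bound: 1024 threads per block, at least 1 block per SM.
constexpr std::uint32_t kGridReduceRegCap =
    MaxRegsPerThread(kAssumedRegsPerSM, 1024, 1);
```

Under these assumptions the cap is 64 registers per thread. A kernel compiled without a launch bound may be given more than that, in which case a 1024-thread block would exceed the register file and the launch would fail, which is consistent with the failure this PR describes.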