
[API] Fix the out-of-bounds issue and missing float16 support in paddle.frac #72815


Merged (1 commit) on May 21, 2025

Conversation


@xkkkkkk23 xkkkkkk23 commented May 20, 2025

PR Category

Execute Infrastructure

PR Types

Bug fixes

Description

Fix the CUDA out-of-bounds issue in paddle.frac and add the previously missing float16 support.
Pcard-85711


paddle-bot bot commented May 20, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot added the contributor (External developers) label on May 20, 2025
@@ -21,6 +21,6 @@
 #include "paddle/phi/backends/gpu/cuda/cuda_helper.h"
 #endif

-#define CUDA_KERNEL_LOOP(i, num) CUDA_KERNEL_LOOP_TYPE(i, num, int)
+#define CUDA_KERNEL_LOOP(i, num) CUDA_KERNEL_LOOP_TYPE(i, num, int64_t)
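The motivation for widening the index type can be sketched on the host side. The helper below is hypothetical, purely for illustration: an `int` loop index can only address INT_MAX (2^31 - 1) elements, so larger tensors overflow it, while `int64_t` raises the bound far beyond any practical tensor size.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper: can a tensor of `numel` elements be fully
// addressed by a loop index whose maximum value is `index_max`?
bool IndexTypeFits(int64_t numel, int64_t index_max) {
  return numel <= index_max;
}
```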

Hmm, this shouldn't be changed here: doing it this way effectively changes every kernel that uses CUDA_KERNEL_LOOP. The change should be made only in trunc, switching it to CUDA_KERNEL_LOOP_TYPE (that macro lets you specify the index type).
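The reviewer's suggestion, keeping the global macro intact while opting one kernel into 64-bit indexing, might look like the host-side sketch below. The macro here is a serial stand-in for the real grid-stride device loop, and `FracLoop` is a hypothetical name; only the pattern (third macro argument selects the index type) mirrors the actual CUDA_KERNEL_LOOP_TYPE.

```cpp
#include <cmath>
#include <cstdint>

// Serial analogue of CUDA_KERNEL_LOOP_TYPE: the third argument picks
// the index type, so a single kernel can use int64_t indexing without
// touching the global CUDA_KERNEL_LOOP macro.
#define KERNEL_LOOP_TYPE(i, num, index_type) \
  for (index_type i = 0; i < (num); ++i)

// frac(x) = x - trunc(x), the operation this PR fixes.
void FracLoop(const float* x, float* out, int64_t numel) {
  KERNEL_LOOP_TYPE(i, numel, int64_t) {  // 64-bit index for big tensors
    out[i] = x[i] - std::trunc(x[i]);
  }
}
```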

Comment on lines 80 to 81
grid.x = std::min(blocks, static_cast<int64_t>(UINT32_MAX));
grid.y = (blocks + UINT32_MAX - 1) / UINT32_MAX;

@lshpku commented May 20, 2025


This can't be hard-coded like this; what if new hardware changes the limit? You should use context.GetCUDAMaxGridDimSize() to get the upper bound, or just use GetGpuLaunchConfig1D to obtain the grid and block configuration directly.
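The grid-splitting arithmetic the reviewer wants can be sketched with the device limit passed in rather than hard-coded. Here `max_grid_x` stands in for whatever context.GetCUDAMaxGridDimSize() would return; the struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

struct GridDim2D { int64_t x; int64_t y; };

// Split a 1-D launch of `blocks` thread blocks across grid.x and
// grid.y when `blocks` exceeds the device's maximum grid.x dimension.
GridDim2D SplitGrid(int64_t blocks, int64_t max_grid_x) {
  GridDim2D grid;
  grid.x = std::min(blocks, max_grid_x);            // clamp to device limit
  grid.y = (blocks + max_grid_x - 1) / max_grid_x;  // ceil(blocks / max_grid_x)
  return grid;
}
```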

@lshpku lshpku merged commit d4aaad3 into PaddlePaddle:develop May 21, 2025
53 of 54 checks passed
wanghuancoder pushed a commit to wanghuancoder/Paddle that referenced this pull request May 27, 2025
wanghuancoder added a commit that referenced this pull request Jun 3, 2025
* refine forrange (#72360)

* refine forrange

* refine forrange

* reduce support big tensor (#71970)

* reduce support big tensor

* [PHI] Fix gridDim limit for reduce kernel (#72507)

* [API] isclose support bigtensor (#72516)

* isclose support bigtensor

* refine

* [API] isnan isinf isfinite support bigtensor (#72517)

* isnan isinf isfinite support bigtensor

* refine

* [PHI] Fix cum kernel for big tensor (#72562)

* [PHI] Preliminary fix for elementwise broadcast int32 shape overflow (#72584)

* [PHI] Align linalg.solve kernel with torch (#72608)

* Update strided copy kernel (#72662)

* [PHI] Fix grid sample kernel for big tensor (#72628)

* [PHI] Fix argsort big tensor bug (#72712)

* [PHI] Fixed argsort big tensor bug

* [PHI] Fixed shape mismatch problem.

* [PHI] Fix contiguous kernel for big tensor (#72705)

* [PHI] Fix flatten and split kernel for big tensor (#72634)

* [PHI] Fix out-of-bound issue of paddle.take_along_axis (#72757)

* [PHI] fix paddle.diag with big tensor (#72638)

* [API] fix paddle.cross with big tensor (#72652)

* [PHI] Fix paddle.where api for big tensor (#72717)

* [PHI] Fix bincount kernel for big tensor (#72706)

* fix bincount kernel for big tensor

* use HostAlloc to alloc memory

* add cpu test case

* [PHI] Fix full_like kernel for big tensor (#72831)

* [API] Fix int overflow and float16 support for paddle.frac (#72815)

* [PHI] Align paddle.inner with torch in matmul logic (#72843)

* [PHI] Fix paddle.var & paddle.std float16 overflow (#72650)

* [PHI] Fix logsumexp precision problem (#72681)

* [PHI] Debug for logsumexp, bug source found

* [PHI] Removed GetNumBlocks func to get correct logsumexp

* [PHI] Removed redundant debug VLOG

* [PHI] Elegant grid bounded solution

* [Accuracy diff No.55-56、76-77] Fix accuracy diff for var&std API (#72879)

* [Accuracy diff No.21] Fix accuracy diff for heaviside API (#72894)

---------

Co-authored-by: Shuhao Liang <[email protected]>
Co-authored-by: Qianyue He <[email protected]>
Co-authored-by: Lei Ding <[email protected]>
Co-authored-by: ggggxm <[email protected]>
Co-authored-by: xkkkkkk23 <[email protected]>
Co-authored-by: Zx <[email protected]>
Co-authored-by: huangjiyi <[email protected]>
Co-authored-by: ooo oo <[email protected]>