Contributor

@zhengshengning zhengshengning commented Aug 12, 2025

PR Category

Execute Infrastructure

PR Types

Improvements

Description

Accuracy problem description:

  1. Both the forward and backward passes show accuracy issues with mode="nearest" and with mode="bilinear".
[image]

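Root-cause analysis:

  1. Comparing the paddle and torch implementations shows that with mode="bilinear", torch uses std::floor() to compute the coordinate position, whereas paddle cpu calls floor() and then applies an extra round(). The second round() therefore needs to be removed.
  2. With mode="nearest", the paddle gpu implementation uses std::nearbyint() to compute the coordinate position, whereas paddle cpu uses round(), so the cpu path is aligned to std::nearbyint(). The code also rounded twice in succession, so the duplicate call is removed. In addition, round() in the gpu GridSample3DCudaKernel is changed to nearbyint().
  3. The bounds-check function IsInBound() should receive integer inputs rather than floating-point values, so static_cast is applied at the call sites.
  4. "bilinear" and "nearest" share the GetGridPointValue function, but the two modes use different rounding functions to compute positions, so GetGridPointValue_nearest and Get3DGridPointValue_nearest are added.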
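
As a rough illustration of why the choice of rounding function matters, the sketch below contrasts floor(), C-style round() (ties away from zero) and std::nearbyint() (ties to even under the default rounding mode) in plain Python. The helper names `c_round`, `nearbyint`, `is_in_bound` and `bilinear_corners` are hypothetical and exist only for this example; they are not the kernel code, and the inclusive bounds in `is_in_bound` are an assumption about what IsInBound() checks.

```python
import math


def c_round(x: float) -> int:
    """Emulate C's round(): nearest integer, ties go away from zero."""
    whole = math.floor(abs(x))
    if abs(x) - whole >= 0.5:
        whole += 1
    return -whole if x < 0 else whole


def nearbyint(x: float) -> int:
    """Emulate std::nearbyint() in the default rounding mode (ties to even).

    Python's built-in round() uses the same tie rule.
    """
    return round(x)


def is_in_bound(ix: int, iy: int, max_x: int, max_y: int) -> bool:
    """Bounds check on integer indices (assumed inclusive on both ends)."""
    return 0 <= ix <= max_x and 0 <= iy <= max_y


def bilinear_corners(x: float):
    """Bilinear mode along one axis: floor only, no further rounding.

    Returns (left index, right index, left weight, right weight).
    """
    x0 = math.floor(x)
    x1 = x0 + 1
    return x0, x1, x1 - x, x - x0


if __name__ == "__main__":
    # Half-way values are where round() and nearbyint() disagree.
    for v in (0.5, 1.5, 2.5, -0.5):
        print(v, math.floor(v), c_round(v), nearbyint(v))
    print(bilinear_corners(2.25))
    print(is_in_bound(nearbyint(2.5), 0, 2, 2), is_in_bound(c_round(2.5), 0, 2, 2))
```

The two functions only differ on exact .5 ties, but on a grid with a maximum index of 2 a coordinate of 2.5 is in bounds under nearbyint() and out of bounds under round(), which is the kind of cpu/gpu disagreement item 2 describes.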

[4D accuracy test]
[image]

[5D accuracy test]
[image]

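[float32 accuracy issue & experimental analysis -> expected behavior]
The fixes above correct the grid sample computation flow. During testing, however, we found that with the shapes in paddle.nn.functional.grid_sample(Tensor([100, 1, 176, 176],"float32"), Tensor([100, 1, 37632, 2],"float32"), align_corners=False, ), and with a grid tensor element equal to -0.2670455, the cpu backward result does not match torch. This turned out to be caused by float32 rounding during the computation, which is consistent with theory. The experimental analysis follows.

Code where the error arises: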

auto factor = static_cast<T>((max_val + 1) * 0.5);
grid_slice_t.device(place) = (grid_slice_t + static_cast<T>(1)) * factor - static_cast<T>(0.5);

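When grid_slice_t = -0.2670455 and max_val = 175, the code above should in theory produce 63.999996, but with float32 data the computed result is 64.000000. The code then uses floor() to compute the position: for this input torch gets ix = 63 while paddle cpu gets ix = 64, which produces the accuracy mismatch. The obvious idea is to promote the computation to double and only then call floor(). We tried this, and with double the case grid_slice_t = -0.2670455 does align. However, for grid_slice_t = -0.41477278 the double result is 50.9999954, which floors to 50, while torch computes 51.000000, which floors to 51, so the results still do not align. We suspect this is caused by rounding in float32 arithmetic.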
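
The two cases above can be reproduced approximately without either framework. The sketch below emulates float32 arithmetic in pure Python by rounding every intermediate result to float32 with `struct`. It assumes the kernel evaluates each operation separately in float32 (no fused multiply-add) and that the grid value is stored as float32; `f32`, `unnormalize_f32` and `unnormalize_f64` are hypothetical names for this example only.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def unnormalize_f32(g: float, max_val: int) -> float:
    """(g + 1) * factor - 0.5 with every intermediate rounded to float32."""
    factor = f32((max_val + 1) * 0.5)
    g = f32(g)
    return f32(f32(f32(g + 1.0) * factor) - 0.5)


def unnormalize_f64(g: float, max_val: int) -> float:
    """Same formula on the same float32 input, evaluated in float64."""
    factor = (max_val + 1) * 0.5
    return (f32(g) + 1.0) * factor - 0.5


if __name__ == "__main__":
    for g in (-0.2670455, -0.41477278):
        single = unnormalize_f32(g, 175)
        double = unnormalize_f64(g, 175)
        print(g, single, math.floor(single), double, math.floor(double))
```

Under these assumptions the float32 path gives 64.0 and 51.0 for the two inputs, while the float64 path stays just below 64 and 51 and therefore floors to 63 and 50. In both cases the float32 product (g + 1) * 88 falls on a tie between two adjacent float32 values and is rounded up to the .5 value, so switching to double changes the floor() result in one direction only and cannot match a float32 reference on every input.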

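To further verify that the computation flow is correct, we converted all of the data from the failing experiment to float64. With float64, the paddle and torch results match exactly.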

x_np = numpy.load('/host_home/ningzhengsheng/src/PaddleAPITest/x.npy').astype('float64')
grid_np = numpy.load('/host_home/ningzhengsheng/src/PaddleAPITest/grid.npy').astype('float64')

Backward results with float64:

// paddle
20.981019201441335 21.869245293486983 12.764171000923856 -73.98225032614089 15.466676140834522 95.2787161154327

// torch
20.981019201441335 21.869245293486983 12.764171000923856 -73.98225032614089 15.466676140834522 95.2787161154327

pcard-67164


paddle-bot bot commented Aug 12, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@zhengshengning zhengshengning changed the title [Accuracy diff No.43、44] Accuracy grid sample [PHI] Accuracy grid sample Aug 12, 2025

codecov-commenter commented Aug 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@6bd5e66). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             develop    #74555   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         3           
  Lines              ?       105           
  Branches           ?         0           
===========================================
  Hits               ?       105           
  Misses             ?         0           
  Partials           ?         0           

@zhengshengning zhengshengning changed the title [PHI] Accuracy grid sample [Accuracy diff No.43-44] Accuracy grid sample Aug 14, 2025
Contributor

@wanghuancoder wanghuancoder left a comment

LGTM

@zhengshengning
Contributor Author

/re-run all-failed

@DanielSun11 DanielSun11 merged commit a593bf2 into PaddlePaddle:develop Aug 15, 2025
71 of 77 checks passed
5 participants