【UnitTestFix No.14】fix test_matmul_v2_op.py #75909

Merged
luotao1 merged 1 commit into PaddlePaddle:develop from scyyh11:fix/test_matmul_v2_op
Oct 21, 2025

Conversation

@scyyh11 (Contributor) commented Oct 17, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

This PR fixes an issue where the matmul_v2 gradient operation produces inconsistent gradient shapes between eager mode and static compilation mode when the inputs are 1-D tensors.

Problem

  • Eager mode correctly returns 1-D gradients (n,) for 1-D inputs.
  • Static compilation mode (with check_prim_pir=True) incorrectly returns 2-D gradients (n, 1) or (1, n) for 1-D inputs.
  • This inconsistency causes test failures with shape mismatch errors like:
  AssertionError: Not equal to tolerance rtol=1e-15, atol=1e-15
  Check static comp grad out failed. Mismatch between static comp and eager on Place(gpu:0)
  static comp grad out tensor: [[-0.01653409 -0.33412735 ...]]  # shape (1, 100)
  eager grad out tensor: [-0.01653409 -0.33412735 ...]        # shape (100,)
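The mismatch can be reproduced in plain NumPy terms (a minimal sketch, independent of Paddle): for z = x·y with 1-D x and y, the analytic gradient with respect to x is y with shape (n,), but the same computation routed through unsqueezed 2-D matrices yields a (1, n) gradient.

```python
import numpy as np

n = 4
x = np.random.rand(n)      # 1-D input, shape (n,)
y = np.random.rand(n)      # 1-D input, shape (n,)

# Eager-style gradient of z = x . y w.r.t. x: dz/dx = y, shape (n,)
eager_grad = y

# Static/composite-style path: promote both operands to 2-D, compute the
# gradient in matrix form, and never squeeze back. Result keeps shape (1, n).
x2 = x.reshape(1, n)       # unsqueeze lhs to (1, n)
y2 = y.reshape(n, 1)       # unsqueeze rhs to (n, 1)
static_grad = y2.T         # dz/dx2 = y2^T, shape (1, n)

print(eager_grad.shape)    # (4,)
print(static_grad.shape)   # (1, 4) -- same values, different rank
```

The values agree element-wise; only the rank differs, which is exactly what the `assert_allclose` shape check in the unit test rejects.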

Root Cause

The shape inconsistency occurs in two places:

  1. Composite implementation (paddle/fluid/primitive/decomp_rule/decomp_vjp/details.h):
    When check_prim_pir=True, the composite matmul_grad function unsqueezes 1-D inputs to 2-D for computation but does not convert the resulting 2-D gradients back to 1-D.

  2. Kernel implementation (paddle/phi/kernels/impl/matmul_grad_kernel_impl.h):
    Internal matrix operations produce 2-D gradients that are not reshaped back to 1-D for 1-D inputs.
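The missing step in both places can be sketched in hypothetical Python pseudocode (this mirrors the structure of the composite rule; it is not Paddle's actual C++ code, and `composite_matmul_grad_buggy` is an illustrative name):

```python
import numpy as np

def composite_matmul_grad_buggy(x, y, grad_out):
    """Illustrative composite VJP for z = matmul(x, y). Both 1-D operands are
    unsqueezed to 2-D so the gradient can be computed with matrix products,
    but the results are returned without squeezing back: this is the bug."""
    x2 = x.reshape(1, -1) if x.ndim == 1 else x   # 1-D lhs -> (1, n)
    y2 = y.reshape(-1, 1) if y.ndim == 1 else y   # 1-D rhs -> (n, 1)
    g = np.atleast_2d(grad_out)                   # scalar output grad -> (1, 1)
    grad_x = g @ y2.T                             # shape (1, n) for 1-D x
    grad_y = x2.T @ g                             # shape (n, 1) for 1-D y
    return grad_x, grad_y                         # bug: still 2-D
```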

Fix

  • In composite implementation, added shape handling logic to squeeze 2-D gradients back to 1-D if the original input was 1-D.
  • In kernel implementation, applied the same logic for consistency.
  • Both paths now check whether the original input was 1-D and squeeze the appropriate dimension to ensure consistent gradient shapes.
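The fix amounts to recording whether each input was originally 1-D and squeezing the matrix-form gradients back before returning them. A hedged Python sketch of that logic (the real change lives in the C++ files named above; the function name here is hypothetical):

```python
import numpy as np

def composite_matmul_grad_fixed(x, y, grad_out):
    """Illustrative fixed VJP: compute gradients in 2-D as before, then
    squeeze them back to 1-D when the corresponding input was 1-D."""
    x_was_1d, y_was_1d = x.ndim == 1, y.ndim == 1
    x2 = x.reshape(1, -1) if x_was_1d else x
    y2 = y.reshape(-1, 1) if y_was_1d else y
    g = np.atleast_2d(grad_out)
    grad_x = g @ y2.T
    grad_y = x2.T @ g
    if x_was_1d:
        grad_x = grad_x.reshape(-1)   # (1, n) -> (n,), matching eager mode
    if y_was_1d:
        grad_y = grad_y.reshape(-1)   # (n, 1) -> (n,)
    return grad_x, grad_y
```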

Changes

  1. Modified matmul_grad in details.h to handle 1-D gradient shapes.
  2. Modified MatmulGradKernel in matmul_grad_kernel_impl.h for consistency.

This fix ensures consistent behavior between eager and static compilation modes, resolving gradient shape mismatch errors in matmul_v2 backpropagation.


[Screenshot: 2025-10-16 10:03 PM]

@luotao1 @YqGe585

…t shape for 1-D tensors in both forward and backward passes. This includes adjustments in the `details.h` and `matmul_grad_kernel_impl.h` files to handle reshaping of gradients appropriately.
@paddle-bot
paddle-bot bot commented Oct 17, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Oct 17, 2025
@luotao1 luotao1 added the HappyOpenSource 快乐开源活动issue与PR label Oct 17, 2025
@scyyh11 scyyh11 marked this pull request as ready for review October 17, 2025 08:55
@scyyh11 (Contributor, Author) commented Oct 17, 2025

/re-run all-failed

@A-nnonymous (Contributor) left a comment


LGTM in squeezing logics

@luotao1 luotao1 merged commit 8f6b9df into PaddlePaddle:develop Oct 21, 2025
69 of 71 checks passed
@scyyh11 scyyh11 deleted the fix/test_matmul_v2_op branch October 21, 2025 02:33