Contributor

ggggxm commented Jul 30, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

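The einsum op's current handling of contraction dims causes two problems: an illegal memory access and a precision issue.
The setup is as follows: A=[b, d], B=[b, 1], and the einsum equation is "bd,bd->b". Here d appears in both A and B and is eliminated in the output, so it is a contraction dim. The forward pass first has to broadcast B to [b, d] and then performs the contraction, producing an output of shape [b].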
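The forward pass described above can be sketched in NumPy. This only illustrates the shapes involved and is not Paddle's kernel code; the concrete sizes and values are assumptions chosen for readability.

```python
import numpy as np

# Illustrative sizes and values (assumptions, not taken from the PR).
b, d = 3, 4
A = np.arange(b * d, dtype=np.float64).reshape(b, d)  # shape [b, d]
B = np.array([[1.0], [2.0], [3.0]])                   # shape [b, 1]

# Step 1: broadcast B along the contraction dim d, giving shape [b, d].
B_bcast = np.broadcast_to(B, (b, d))

# Step 2: contract over d; only the batch dim b survives in the output.
C = np.einsum("bd,bd->b", A, B_bcast)

# Equivalent formulation: elementwise multiply, then sum over d.
C_ref = (A * B).sum(axis=1)
```

Both `C` and `C_ref` have shape `[b]`, matching the output shape given in the description.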

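  • Illegal memory access: after dA = B*dC is computed, dA=[b,1], so dA still needs a Tile (or similar) operation. The current logic simply Resizes A to [b,d] without allocating new memory, which causes the error.
  • Precision issue: the backward pass uses the cache by default. The cache holds the A and B matrices from the forward pass after broadcasting, transposition, and similar operations, whereas the backward computation actually needs the A and B matrices before broadcasting. The current logic directly Resizes the cached B of shape [b,d] to [b,1], which yields the first b elements of the cached B rather than the un-broadcast B=[b,1].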
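Both points can be illustrated with a small NumPy sketch. It assumes that Resize only changes the shape metadata while keeping the same underlying buffer, which is how the description above characterizes it; the variable names and values are hypothetical, and NumPy stands in for the real kernel.

```python
import numpy as np

# Illustrative sizes and values (assumptions, not taken from the PR).
b, d = 3, 4
B = np.array([[1.0], [2.0], [3.0]])   # original, un-broadcast B: [b, 1]
dC = np.array([10.0, 20.0, 30.0])     # upstream gradient: [b]

# The cache as described: B after forward-pass broadcasting, stored as [b, d].
B_cache = np.ascontiguousarray(np.broadcast_to(B, (b, d)))

# Modeling "Resize to [b, 1]" as reinterpreting the same buffer (assumption):
# it keeps the first b elements of the flat buffer, all taken from row 0.
B_resized = B_cache.ravel()[:b].reshape(b, 1)

# Gradient w.r.t. A with the correct, un-broadcast B: shape [b, 1].
dA_small = B * dC[:, None]

# Expanding to [b, d] needs a new buffer of b*d elements (Tile),
# not a reshape of the b-element buffer.
dA = np.tile(dA_small, (1, d))

# Using the resized cache instead gives a different gradient.
dA_wrong = np.tile(B_resized * dC[:, None], (1, d))
```

In this example `B_resized` is a column of ones instead of `[1, 2, 3]`, so `dA_wrong` differs from `dA` in every row after the first. Separately, `dA_small` holds only `b` elements while `dA` needs `b * d`, which is why reshaping in place without allocating cannot work.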

paddle-bot bot commented Jul 30, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Contributor Author

ggggxm commented Jul 30, 2025

/re-run all-failed

Contributor

wanghuancoder left a comment

LGTM

lshpku merged commit 6827b0d into PaddlePaddle:develop Jul 31, 2025
70 of 71 checks passed
@ooooo-create
Contributor

Was this problem caused by #74257, or does this PR address the same problem?

Contributor Author

ggggxm commented Jul 31, 2025

Was this problem caused by #74257, or does this PR address the same problem?

It fixes the same problem.
