Contributor

@GITD245 GITD245 commented Aug 15, 2025

PR Category

Auto Parallel

PR Types

Others

Description

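To work around multi-way sharding, the dynamic-graph manual-parallel ("manual") and dynamic-graph semi-auto-parallel ("semi-auto") versions of the custom operators moe_gate_dispatch and moe_combine produced quite different outputs in their original implementations. This PR adds auto versions of the operators so that the semi-auto custom operators are temporarily implemented inside the framework. The manual and semi-auto versions should later be unified into a single version.

The specific differences are as follows:

  • moe_gate_dispatch
    • The shape of output y differs: manual uses [num_experts * capacity, x_dims[1]]; semi-auto uses [num_experts, num_rows * k / num_experts, x_dims[1]]
    • Because of the difference above, the way the kernel relies on y to determine hidden_size has to change
  • moe_gate_dispatch_grad
    • Because of the first moe_gate_dispatch difference, y_grad goes from 2 dimensions to 3
    • Because of the first moe_gate_dispatch difference, the way num_experts and hidden_size are determined has to change
  • moe_combine
    • The manual and semi-auto implementations are identical
  • moe_combine_grad
    • Compared with manual, semi-auto additionally returns scatter_index_grad
    • The shape of grad_combine_weights_helper is [combine_weights_shape[0], combine_weights_shape[1], x_dim[1]] in manual and [combine_weights_shape[0], combine_weights_shape[1]] in semi-auto (this depends on the kernel's internal implementation: in manual, after this output is obtained, the network code uses sum to eliminate the last dimension by hand; this should later be unified into a two-dimensional output)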
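
The shape differences listed above can be sketched in plain Python. This is only an illustration of the shapes given in this description, not the kernel or InferMeta code from the PR: every function name below is hypothetical, the sketch assumes num_rows is x_dims[0] and that num_rows * k divides evenly by num_experts, and the way num_experts is recovered from a 2-D y_grad (dividing by capacity) is an assumption rather than something the description states.

```python
from typing import List, Optional, Sequence, Tuple


def dispatch_y_shape_manual(x_dims: Sequence[int], num_experts: int,
                            capacity: int) -> Tuple[int, int]:
    """2-D y layout described for the manual version."""
    return (num_experts * capacity, x_dims[1])


def dispatch_y_shape_semi_auto(x_dims: Sequence[int], num_experts: int,
                               k: int) -> Tuple[int, int, int]:
    """3-D y layout described for the semi-auto version."""
    num_rows = x_dims[0]  # assumption: num_rows is the first dim of x
    # assumption: num_rows * k is an exact multiple of num_experts
    return (num_experts, num_rows * k // num_experts, x_dims[1])


def hidden_size_from_y(y_shape: Sequence[int]) -> int:
    """hidden_size is the last axis in both layouts (index 1 or index 2)."""
    return y_shape[-1]


def num_experts_from_y_grad(y_grad_shape: Sequence[int],
                            capacity: Optional[int] = None) -> int:
    """Hypothetical helper: read num_experts from a 2-D or 3-D y_grad."""
    if len(y_grad_shape) == 3:
        return y_grad_shape[0]  # semi-auto: leading axis is num_experts
    if capacity is None:
        raise ValueError("capacity is needed for the 2-D layout")
    return y_grad_shape[0] // capacity  # manual: rows = num_experts * capacity


def reduce_helper_last_axis(helper: List[List[List[float]]]) -> List[List[float]]:
    """Sum a nested [A][B][H] list over H, as the manual network code does."""
    return [[sum(vec) for vec in row] for row in helper]


if __name__ == "__main__":
    x_dims, num_experts, k, capacity = (8, 16), 4, 2, 4
    print(dispatch_y_shape_manual(x_dims, num_experts, capacity))  # (16, 16)
    print(dispatch_y_shape_semi_auto(x_dims, num_experts, k))      # (4, 4, 16)
    print(reduce_helper_last_axis([[[1.0, 2.0], [3.0, 4.0]]]))     # [[3.0, 7.0]]
```

Reading hidden_size from the last axis is one way to make the same logic work for both layouts; the actual kernels may index the dimensions explicitly.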

paddle-bot bot commented Aug 15, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

codecov-commenter commented Aug 15, 2025

Codecov Report

❌ Patch coverage is 0% with 271 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@62b1f03). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ddle/phi/infermeta/spmd_rules/moe_gate_dispatch.cc | 0.00% | 111 Missing ⚠️ |
| paddle/phi/infermeta/spmd_rules/moe_combine.cc | 0.00% | 93 Missing ⚠️ |
| paddle/phi/infermeta/backward.cc | 0.00% | 34 Missing ⚠️ |
| paddle/phi/infermeta/multiary.cc | 0.00% | 29 Missing ⚠️ |
| ...ython/paddle/incubate/nn/functional/moe_combine.py | 0.00% | 2 Missing ⚠️ |
| ...paddle/incubate/nn/functional/moe_gate_dispatch.py | 0.00% | 2 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #74645   +/-   ##
==========================================
  Coverage           ?    0.00%           
==========================================
  Files              ?        6           
  Lines              ?      271           
  Branches           ?        0           
==========================================
  Hits               ?        0           
  Misses             ?      271           
  Partials           ?        0           

@GITD245 GITD245 changed the title from "[Auto Paralle] [Cherry-pick] cherry-pick for custom ops (moe_combine moe_gate_dispatch)" to "[Auto Paralle] [Cherry-pick] cherry-pick for auto parallel verison custom ops (moe_combine moe_gate_dispatch)" Aug 18, 2025
Contributor Author

GITD245 commented Aug 18, 2025

/re-run all-failed

Contributor Author

GITD245 commented Aug 19, 2025

/re-run all-failed

Contributor

@liym27 liym27 left a comment

LGTM

x: [S, H], S = b*s
gate_logits: [S, E]
outputs:
y: [E, C, H] is use_pad is true, else [S, K, H], currently only support
Contributor

is use_pad is true -> if use_pad is true

Contributor Author

Will fix this in a follow-up PR.

@liym27 liym27 merged commit 9e6d97c into PaddlePaddle:develop Aug 19, 2025
181 of 197 checks passed
@GITD245 GITD245 deleted the cherry-pick-spmd branch August 19, 2025 11:08
Luckycheng222 pushed a commit to Luckycheng222/Paddle that referenced this pull request Aug 25, 2025
5 participants