[Qwen3VLMoe] Fixed: Expected self.dtype to be equal to src.dtype - routing_weights casting #41420

danielquintas8 · 2025-10-07T16:02:14Z

What does this PR do?

Fixed: Expected self.dtype to be equal to src.dtype - routing_weights casting

Related issue

Fixes #41418
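
For context, here is a minimal sketch of how this kind of error can arise. It is an illustration under assumptions, not the code changed in this PR: the helper `scatter_routing_weights` and its `cast_to_logits_dtype` flag are hypothetical, and the routing steps are a simplified stand-in for a sparse MoE router. The point is that `scatter_` requires the destination tensor and the source tensor to share a dtype, so casting `routing_weights` to the hidden-state dtype fails whenever the router logits are held in a different dtype.

```python
import torch


def scatter_routing_weights(router_logits, hidden_dtype, top_k=2, cast_to_logits_dtype=True):
    """Hypothetical helper: build a dense (tokens, experts) routing matrix."""
    # Softmax in float32 for numerical stability, then keep the top-k experts.
    routing_weights = torch.softmax(router_logits, dim=-1, dtype=torch.float)
    routing_weights, router_indices = torch.topk(routing_weights, top_k, dim=-1)
    routing_weights = routing_weights / routing_weights.sum(dim=-1, keepdim=True)

    # Assumption: the destination of scatter_ is created from router_logits,
    # so the source must be cast to router_logits.dtype, not the hidden dtype.
    target_dtype = router_logits.dtype if cast_to_logits_dtype else hidden_dtype
    routing_weights = routing_weights.to(target_dtype)

    return torch.zeros_like(router_logits).scatter_(1, router_indices, routing_weights)


# Router logits in float32 while hidden states are bfloat16 (e.g. mixed precision).
logits = torch.randn(4, 8, dtype=torch.float32)

try:
    scatter_routing_weights(logits, torch.bfloat16, cast_to_logits_dtype=False)
    mismatch_error = None
except RuntimeError as err:
    # scatter_ rejects a bfloat16 source written into a float32 destination.
    mismatch_error = str(err)

# Casting to the logits dtype keeps source and destination consistent.
dense = scatter_routing_weights(logits, torch.bfloat16)
```

In this sketch the dtype of the destination tensor decides the cast target, which keeps the two `scatter_` operands aligned regardless of what dtype the hidden states use.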

afonsosilva91 · 2025-10-07T16:12:20Z

+1

vasqu · 2025-10-07T16:26:43Z

cc @ArthurZucker for MoE, since this might affect more models (?)

i3hz · 2025-10-08T02:28:15Z

I think you didn't run the code which generates the modeling files and that's why you're seeing the difference.

Rocketknight1 · 2025-10-08T12:16:37Z

cc @zucchini-nlp

ArthurZucker

Thanks, can you check this mistake was not propagated to other models please!

danielquintas8 · 2025-10-08T18:18:21Z

> Thanks, can you check this mistake was not propagated to other models please!

We found the same casting in a few more models (ernie4_5_moe, qwen2_moe, qwen3_moe, qwen3_next, qwen3_omni_moe) and updated them accordingly.

The hunyuan_v1_moe implementation of route_tokens_to_experts takes different arguments from the other models, so it falls outside the scope of this issue. I will look into it in a separate issue.

github-actions · 2025-10-08T18:28:43Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: ernie4_5_moe, qwen2_moe, qwen3_moe, qwen3_next, qwen3_omni_moe, qwen3_vl_moe

jaaabir · 2025-10-10T11:48:22Z

@ArthurZucker can you merge this so that we can properly train these models?

danielquintas8 · 2025-10-13T21:13:39Z

Hey @ArthurZucker, all checks are green and the PR’s been referenced a couple times.
Just wanted to check if there’s anything else needed before merge. Thanks!

ArthurZucker

Thanks and sorry for coming back late

…uting_weights casting (huggingface#41420)

* Fixed Expected self.dtype to be equal to src.dtype on eval
* Fixed Expected self.dtype to be equal to src.dtype on eval
* Fixed Expected self.dtype to be equal to src.dtype on eval
* generated modeling_qwen3_vl_moe.py file
* Fixed Ernie_4_5_MoE router casting
* Fixed routing_weights dtype casting (ernie4_5_moe, hunyuan_v1_moe, qwen2_moe, qwen3_moe, qwen3_next,qwen3_omni_moe)
* rollback hunyuan_v1_moe changes

---------

Co-authored-by: Daniel Oliveira <[email protected]>
Co-authored-by: Daniel Oliveira <[email protected]>

daniel3303 and others added 3 commits October 7, 2025 15:15

Fixed Expected self.dtype to be equal to src.dtype on eval

70b6bf1

Fixed Expected self.dtype to be equal to src.dtype on eval

7fa6bc8

Fixed Expected self.dtype to be equal to src.dtype on eval

47662b5

danielquintas8 and others added 3 commits October 8, 2025 08:58

generated modeling_qwen3_vl_moe.py file

e0d5aaa

Merge branch 'main' into qwen3-vl-moe-casting

d3973a1

Merge branch 'main' into qwen3-vl-moe-casting

f0a2b13

ArthurZucker approved these changes Oct 8, 2025

daniel3303 and others added 3 commits October 8, 2025 13:39

Merge branch 'main' into qwen3-vl-moe-casting

ae1ae0e

Fixed Ernie_4_5_MoE router casting

924829a

Fixed routing_weights dtype casting (ernie4_5_moe, hunyuan_v1_moe, qwen2_moe, qwen3_moe, qwen3_next,qwen3_omni_moe)

54f830c

danielquintas8 and others added 2 commits October 8, 2025 19:19

Merge branch 'main' into qwen3-vl-moe-casting

23ca540

rollback hunyuan_v1_moe changes

97a3cc2

This was referenced Oct 12, 2025

When enabling SP, the qwen3_vl_moe model training throws an error volcengine/verl#3721

Closed

[model] fix: qwen3vl patch volcengine/verl#3686

Merged

ArthurZucker approved these changes Oct 14, 2025

ArthurZucker merged commit c620c38 into huggingface:main Oct 14, 2025
15 checks passed

ArthurZucker added and removed the "for patch" label Oct 14, 2025

8 participants
