
Conversation

@zRzRzRzRzRzRzR
Contributor

What does this PR do?

This PR enables the GLM-4.1V model to accept 4D mrope position input, using the same input format as Qwen3 in verl.

verl PR: volcengine/verl#3291
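
As an illustration only (not part of this PR's diff), here is a rough sketch of how a training framework such as verl could pass pre-computed multimodal RoPE position ids straight to the model. The checkpoint name and the leading rope-section dimension of 3 (temporal / height / width) are assumptions for this sketch, not something the PR pins down:

import torch
from transformers import AutoProcessor, Glm4vForConditionalGeneration

model_id = "zai-org/GLM-4.1V-9B-Thinking"  # checkpoint name assumed for illustration
model = Glm4vForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

inputs = processor(text="Describe the scene.", return_tensors="pt")
seq_len = inputs["input_ids"].shape[1]

# Pre-computed mrope position ids, stacked along an assumed leading rope-section axis
# of size 3, one row of positions per section.
position_ids = torch.arange(seq_len).view(1, 1, -1).expand(3, 1, -1)

outputs = model(**inputs, position_ids=position_ids)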

zRzRzRzRzRzRzR marked this pull request as a draft on September 26, 2025 at 14:39
zRzRzRzRzRzRzR changed the title from "update for 4D mask" to "Update GLM-4.1V MMRope implementation" on September 28, 2025
zRzRzRzRzRzRzR marked this pull request as ready for review on September 28, 2025 at 08:46
@Rocketknight1
Member

cc @ArthurZucker for rope

cache_position: Optional[torch.LongTensor] = None,
**kwargs,
) -> tuple[torch.FloatTensor, Optional[tuple[torch.FloatTensor, torch.FloatTensor]]]:
"""
Member

Let's use @auto_docstring instead.
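
For context, a rough sketch of what that suggestion amounts to (the class name below is a placeholder, purely to illustrate the shape of the change; inside transformers modeling files the decorator is imported relatively): the hand-written docstring on forward is dropped and the method is decorated instead, so the docstring gets generated from the library's shared argument descriptions.

from typing import Optional

import torch
from torch import nn
from transformers.utils import auto_docstring


class SomeDecoderLayer(nn.Module):  # placeholder class name, for illustration only
    @auto_docstring
    def forward(
        self,
        hidden_states: torch.Tensor,
        attention_mask: Optional[torch.Tensor] = None,
        cache_position: Optional[torch.LongTensor] = None,
        **kwargs,
    ) -> tuple[torch.FloatTensor, Optional[tuple[torch.FloatTensor, torch.FloatTensor]]]:
        # No manual docstring here: @auto_docstring fills it in from the type hints
        # and the shared argument docs instead.
        ...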

Comment on lines 743 to 748
outputs = (hidden_states,)

if output_attentions:
outputs += (self_attn_weights,)

return outputs
Member

and keep @check_model_inputs below so we don't have to return these explicitly

Contributor Author

But this code inherits from Glm4MoeDecoderLayer, and that class doesn't seem to have @check_model_inputs. Where below should it be added?

Comment on lines 435 to 439
def forward(
self,
hidden_states: torch.Tensor,
position_embeddings: tuple[torch.Tensor, torch.Tensor],
attention_mask: Optional[torch.Tensor] = None,
Member

I guess the overriding is because of @check_model_inputs; let's bring the decorator back and remove the unnecessary changes.
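
For reference, a rough sketch of the pattern being discussed (class names here are placeholders, not the actual GLM-4.1V classes): the model-level forward carries @check_model_inputs, which standardizes kwargs and records per-layer attentions/hidden states via hooks, so a decoder layer can return just hidden_states instead of an explicit (hidden_states, self_attn_weights) tuple.

import torch
from torch import nn
from transformers.modeling_outputs import BaseModelOutputWithPast
from transformers.utils.generic import check_model_inputs


class ToyDecoderLayer(nn.Module):  # placeholder
    def forward(self, hidden_states: torch.Tensor, **kwargs) -> torch.Tensor:
        # ... attention + MLP would go here ...
        # No (hidden_states, attn_weights) tuple: attention capture is handled
        # at the model level by the decorator when output_attentions=True.
        return hidden_states


class ToyTextModel(nn.Module):  # placeholder; the real class subclasses PreTrainedModel and has a config
    @check_model_inputs
    def forward(self, inputs_embeds: torch.Tensor, **kwargs) -> BaseModelOutputWithPast:
        # The decorator pops output_attentions / output_hidden_states from kwargs and
        # captures them through hooks, so they never need to be threaded through layers.
        ...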

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

github-actions bot commented Oct 9, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v, glm4v_moe

Member

@zucchini-nlp left a comment

Some tests are failing, can you take a look?

@zRzRzRzRzRzRzR
Contributor Author

CI passes now.

Member

@zucchini-nlp left a comment

Thanks for iterating, merging

zucchini-nlp merged commit 1951f3b into huggingface:main on Oct 9, 2025
19 checks passed
AhnJoonSung pushed a commit to AhnJoonSung/transformers that referenced this pull request on Oct 12, 2025
* update for 4D mask

* update

* Update modular_glm4v.py

* 1

* Revert "1"

This reverts commit d13a763.

* update as glm4v logtic

* update

* 1

* update

* Create convert_glm4v_moe_mgt_weights_to_hf.py

* update

* update
