Fix errors when use verl to train GLM4.1v model #39199

Merged
merged 2 commits into from
Jul 8, 2025

kaln27
Contributor

@kaln27 kaln27 commented Jul 3, 2025

  • Support glm4v load from AutoModelForVision2Seq
  • Set glm4v model's _checkpoint_conversion_mapping attribute from None to empty dict {}

What does this PR do?

When using verl to train the GLM4.1v model with GRPO, several small errors occur.
Here is how to fix them:

  • Support loading glm4v through AutoModelForVision2Seq.
  • verl treats _checkpoint_conversion_mapping as a dict, but for glm4v it is currently None, which aborts the program. I also found that almost every model that doesn't need checkpoint conversion sets it to an empty dict.
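
To illustrate the two failure modes described above, here is a minimal pure-Python sketch. It does not import transformers or verl; the classes `ModelBeforeFix` and `ModelAfterFix`, the helpers `reversed_key_mapping`, `outcome`, and `resolve`, and the two registry dicts are all hypothetical stand-ins, and the way verl actually consumes `_checkpoint_conversion_mapping` is assumed rather than taken from its source.

```python
from typing import Optional


class ModelBeforeFix:
    """Hypothetical stand-in: attribute is None, as glm4v had it before this PR."""
    _checkpoint_conversion_mapping: Optional[dict] = None


class ModelAfterFix:
    """Hypothetical stand-in: attribute is an empty dict, as after this PR."""
    _checkpoint_conversion_mapping: Optional[dict] = {}


def reversed_key_mapping(model_cls) -> dict:
    """Hypothetical caller that assumes the attribute is always a dict."""
    mapping = model_cls._checkpoint_conversion_mapping
    # .items() raises AttributeError when mapping is None.
    return {new: old for old, new in mapping.items()}


def outcome(model_cls) -> str:
    """Report whether the dict-assuming caller succeeds for this class."""
    try:
        reversed_key_mapping(model_cls)
        return "ok"
    except AttributeError:
        return "AttributeError"


# Hypothetical, simplified auto-class registry keyed by model type.
# The class names are plain strings used for illustration only.
REGISTRY_BEFORE = {"qwen2_vl": "Qwen2VLForConditionalGeneration"}
REGISTRY_AFTER = {**REGISTRY_BEFORE, "glm4v": "Glm4vForConditionalGeneration"}


def resolve(model_type: str, registry: dict) -> str:
    """Look up a model type; an unregistered type cannot be loaded."""
    if model_type not in registry:
        raise ValueError(f"Unrecognized model type: {model_type}")
    return registry[model_type]


mapping_before = outcome(ModelBeforeFix)
mapping_after = outcome(ModelAfterFix)

try:
    resolve("glm4v", REGISTRY_BEFORE)
    lookup_before = "ok"
except ValueError:
    lookup_before = "ValueError"
lookup_after = resolve("glm4v", REGISTRY_AFTER)
```

An empty dict keeps dict-assuming callers working without a special case for None, which matches the convention the description notes for models that need no checkpoint conversion.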

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@kaln27
Contributor Author

kaln27 commented Jul 3, 2025

@ArthurZucker
Hi Arthur, would you mind reviewing this PR?
Thank you!

@zucchini-nlp
Member

Let's address the mapping comment and rebase on main before we merge.

kaln27 and others added 2 commits July 8, 2025 17:21
* Support glm4v load from AutoModelForVision2Seq
* Set glm4v model _checkpoint_conversion_mapping attr from None to {}
Contributor

github-actions bot commented Jul 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v

Member

@zucchini-nlp zucchini-nlp left a comment

❤️

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) July 8, 2025 09:27
@zucchini-nlp zucchini-nlp merged commit d370bc6 into huggingface:main Jul 8, 2025
19 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp zucchini-nlp added the "for patch" label Jul 9, 2025
Cyrilvallez pushed a commit that referenced this pull request Jul 11, 2025
* Fix errors when use verl to train GLM4.1v model

* Support glm4v load from AutoModelForVision2Seq
* Set glm4v model _checkpoint_conversion_mapping attr from None to {}

* Update modeling_auto.py
rjgleaton pushed a commit to rjgleaton/transformers that referenced this pull request Jul 17, 2025
* Fix errors when use verl to train GLM4.1v model

* Support glm4v load from AutoModelForVision2Seq
* Set glm4v model _checkpoint_conversion_mapping attr from None to {}

* Update modeling_auto.py