GLM-4.1V Model support #38431

Merged: 125 commits merged into huggingface:main on Jun 25, 2025
Conversation

@zRzRzRzRzRzRzR (Contributor) commented May 28, 2025

  1. This PR adds support for GLM-4.1V, the video-understanding and image-understanding models trained from GLM-4-0414.
  2. This PR has completed the refactoring of the related modules. Because of the overlapping `F` definitions (torch vs. torchvision), image_processors and videos_processors have not been placed under modular management, per @zucchini-nlp's review suggestion.
  3. This PR is ready for code review. @ArthurZucker

@Cyrilvallez (Member) left a comment

Alright, I fixed the remaining parts, and confirmed that the model is still working as expected on real checkpoints - merging now! Thanks for the work!! 🤗🚀

@Cyrilvallez merged commit af98702 into huggingface:main on Jun 25, 2025
18 checks passed
@ydshieh (Collaborator) commented Jun 26, 2025

@zRzRzRzRzRzRzR Thank you for adding this model. This model's tests are quite slow, as you can see in the list below, and they cause the CI job that runs them to take 12 minutes instead of the ~4 minutes the other jobs take.

https://app.circleci.com/pipelines/github/huggingface/transformers/135728/workflows/913523da-80bc-4021-9e85-d5ab8653d204/jobs/1800021/timing

Would you be up for making them faster? Usually that means tweaking Glm4vVisionText2TextModelTester: check the config it creates and see whether some of its values make the resulting test model large.

58.03s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_initialization
23.31s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_model_outputs_equivalence
20.84s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_attention_outputs
18.42s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_resize_tokens_embeddings
17.87s call     tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_model_can_generate
16.71s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_can_use_safetensors
16.34s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_load_save_without_tied_weights
15.13s call     tests/models/clvp/test_modeling_clvp.py::ClvpModelForConditionalGenerationTest::test_batching_equivalence
14.97s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_attn_implementation_composite_models
12.92s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_save_load
12.83s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_hidden_states_output
12.66s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_feed_forward_chunking
12.60s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_can_init_all_missing_weights
10.00s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_eager_matches_sdpa_inference_03_fp16_pad_left_no_attn_mask
9.79s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_model_weights_reload_no_missing_tied_weights
9.54s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_eager_matches_sdpa_inference_18_bf16_pad_left_no_attn_mask_sdpa_kernels
9.12s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_eager_matches_sdpa_inference_04_fp16_pad_right_sdpa_kernels
9.03s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_eager_matches_sdpa_inference_00_fp16_pad_left_sdpa_kernels
9.01s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_batching_equivalence
8.94s call     tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_eager_matches_sdpa_inference_01_fp16_pad_left
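The suggested fix above can be sketched as follows. This is a minimal illustration, not the actual tester from transformers: the function and all config values below are hypothetical, chosen only to show why shrinking `hidden_size`, `num_hidden_layers`, etc. in a ModelTester keeps unit-test models cheap to build and run.

```python
def approx_transformer_params(hidden_size, num_layers, intermediate_size, vocab_size):
    """Very rough decoder-only parameter estimate:
    token embeddings plus, per layer, the attention projections
    (~4 * h^2) and the MLP weights (~2 * h * intermediate)."""
    per_layer = 4 * hidden_size**2 + 2 * hidden_size * intermediate_size
    return vocab_size * hidden_size + num_layers * per_layer

# Full-size-ish config (illustrative numbers, not the real GLM-4.1V config):
big = approx_transformer_params(
    hidden_size=4096, num_layers=40, intermediate_size=13696, vocab_size=151552
)

# Tiny config of the kind a ModelTester should create for unit tests:
tiny = approx_transformer_params(
    hidden_size=16, num_layers=2, intermediate_size=32, vocab_size=99
)

print(f"big ~ {big:,} params, tiny ~ {tiny:,} params")
```

Under these assumed numbers the tiny model has only a few thousand parameters, so every test that instantiates, saves, or reloads it runs in a fraction of the time.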
