convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf #13209
Conversation
This reverts commit 651752f.
For visibility: I opened a discussion about the 32B bug here: #12402 (comment)

@ggerganov Since the problem with the 32B model isn't related to this PR, I will investigate it later (it's low priority because the model doesn't get much usage). Could you review this PR now? Thanks!
gguf-py/gguf/tensor_mapping.py (Outdated)

```python
MODEL_TENSOR.V_ENC_EMBD_PATCH1: (
    "visual.patch_embed.proj.weight.1",  # qwen2vl, generated
),
```
Is this going to work correctly? Usually `.weight` is suffixed to the base name, so wouldn't this try to map `visual.patch_embed.proj.weight.1.weight`?
The logic of `map_tensor_name` will first try mapping the whole name (which is the case here). If it cannot find a match, it will retry with the `.weight` and `.bias` suffixes (which is the case for all other tensors).
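Roughly, the lookup behaves like the simplified sketch below (the real logic lives in gguf-py's `TensorNameMap.get_name` and the converter's `map_tensor_name`; the flat-dict mapping here is an assumption that ignores per-block numbering):

```python
# Simplified sketch of the suffix-aware lookup described above.
def map_tensor_name(name: str, mapping: dict[str, str],
                    try_suffixes: tuple[str, ...] = (".weight", ".bias")) -> str:
    # 1. Try the whole name first -- this is how
    #    "visual.patch_embed.proj.weight.1" can be matched directly.
    if name in mapping:
        return mapping[name]
    # 2. Otherwise strip a known suffix, map the base name, re-append.
    for suffix in try_suffixes:
        if name.endswith(suffix):
            base = name[: -len(suffix)]
            if base in mapping:
                return mapping[base] + suffix
    raise ValueError(f"cannot map tensor name: {name}")
```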
Hmm, on second thought, there is a way to confine this hacky naming inside `Qwen2VLVisionModel` so it won't pollute the rest of the code. I did it in c030984.

Here is the output:
```
INFO:hf-to-gguf:v.patch_embd.weight, torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
INFO:hf-to-gguf:v.patch_embd.weight.1, torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
```
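For context, a minimal sketch of what that confinement might look like (hypothetical; the actual change is in commit c030984). The two log lines above are consistent with a Conv3d patch embedding of temporal depth 2 being split along the temporal axis into two Conv2d-shaped weights:

```python
import torch

def split_patch_embed(weight: torch.Tensor) -> list[tuple[str, torch.Tensor]]:
    # Hypothetical helper, not the literal code from c030984.
    # Assumed input: Conv3d weight of shape (out=1280, in=3, t=2, h=14, w=14).
    # Splitting along the temporal axis yields the two tensors seen in the log.
    assert weight.ndim == 5 and weight.shape[2] == 2
    first, second = weight.unbind(dim=2)  # each: (out, in, h, w)
    return [
        ("v.patch_embd.weight", first),
        ("v.patch_embd.weight.1", second),
    ]
```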
Nice, much better
@ngxson does your issue with 32B happen only on Metal, or with Accelerate too?
Pre-quantized models converted via this PR:
Test results:
NOTE: the Qwen2.5-VL-32B model is broken even with text-only input (tested via `llama-cli`); the output is `@@@@@@...` repeated.