
convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf #13209


Merged: 13 commits into ggml-org:master, May 2, 2025

Conversation

@ngxson (Collaborator) commented Apr 30, 2025

Pre-quantized models converted via this PR:

# Qwen 2 VL
llama-mtmd-cli -hf ggml-org/Qwen2-VL-2B-Instruct-GGUF
llama-mtmd-cli -hf ggml-org/Qwen2-VL-7B-Instruct-GGUF

# Qwen 2.5 VL
llama-mtmd-cli -hf ggml-org/Qwen2.5-VL-3B-Instruct-GGUF
llama-mtmd-cli -hf ggml-org/Qwen2.5-VL-7B-Instruct-GGUF
llama-mtmd-cli -hf ggml-org/Qwen2.5-VL-72B-Instruct-GGUF
# NOTE: Qwen2.5-VL-32B text-only model is currently unusable

Test results:

OK:   llama-mtmd-cli ggml-org/pixtral-12b-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/Qwen2-VL-7B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/Qwen2.5-VL-7B-Instruct-GGUF:Q4_K_M
FAIL: llama-mtmd-cli ggml-org/Qwen2.5-VL-32B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/Qwen2.5-VL-72B-Instruct-GGUF:Q4_K_M

NOTE: the Qwen2.5-VL-32B model is broken even with text-only input (tested via llama-cli); it repeatedly outputs @@@@@@...

@github-actions github-actions bot added the python python script changes label Apr 30, 2025
@ngxson ngxson marked this pull request as ready for review April 30, 2025 20:43
@ngxson ngxson requested a review from ggerganov April 30, 2025 20:43
@ngxson (Collaborator, Author) commented Apr 30, 2025

For visibility: I opened a discussion about the 32B bug here: #12402 (comment)

@ngxson (Collaborator, Author) commented May 2, 2025

@ggerganov Since the problem with the 32B model isn't related to this PR, I will investigate it later (it's low priority because the model doesn't get much usage).

Could you review this PR now? Thanks!

    ),

    MODEL_TENSOR.V_ENC_EMBD_PATCH1: (
        "visual.patch_embed.proj.weight.1",  # qwen2vl, generated
    ),
Member (reviewer) commented:

Is this going to work correctly? Usually ".weight" is suffixed to the base name, so wouldn't this try to map "visual.patch_embed.proj.weight.1.weight"?

@ngxson (Collaborator, Author) replied:

The logic of map_tensor_name first tries to map the whole name (which is the case here). If it cannot find a match, it retries with the .weight and .bias suffixes stripped and re-attached (which is the case for all other tensors).
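
A minimal sketch of that lookup order, assuming a plain dict stands in for the real TensorNameMap (the actual implementation in gguf-py is more involved):

# Sketch of the lookup order described above; `mapping` is a hypothetical
# plain dict standing in for the real TensorNameMap.
def map_tensor_name(name: str, mapping: dict,
                    try_suffixes=(".weight", ".bias")) -> str:
    # 1) try the whole name first -- this is what matches
    #    "visual.patch_embed.proj.weight.1" directly
    if name in mapping:
        return mapping[name]
    # 2) otherwise strip a known suffix, map the base name, re-attach it
    for suffix in try_suffixes:
        if name.endswith(suffix):
            base = name[: -len(suffix)]
            if base in mapping:
                return mapping[base] + suffix
    raise ValueError(f"cannot map tensor name {name!r}")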

@ngxson (Collaborator, Author) commented May 2, 2025:

Hmm, on second thought, there is a way to confine this hacky naming inside Qwen2VLVisionModel so it won't pollute the rest of the code. I did that in c030984.

Here is the output:

INFO:hf-to-gguf:v.patch_embd.weight,           torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
INFO:hf-to-gguf:v.patch_embd.weight.1,         torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
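
For context, a hypothetical sketch of how such a split can be confined to the vision model's tensor handling: slice the Conv3D patch-embed weight along its temporal axis and emit the generated ".1" name only inside that class. Names and shapes are assumed from the log above; this is not the exact c030984 code.

import torch

# Hypothetical sketch: split the Conv3D patch-embed weight along its
# temporal axis (size 2) so the generated ".weight.1" name never leaks
# outside the Qwen2VLVisionModel class.
def split_patch_embed(name: str, tensor: torch.Tensor):
    if name == "visual.patch_embed.proj.weight":
        # shape (out, in, t=2, kh, kw) -> two (out, in, kh, kw) slices
        return [
            (name, tensor[:, :, 0, ...]),
            (name + ".1", tensor[:, :, 1, ...]),  # generated name
        ]
    return [(name, tensor)]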

Member (reviewer) replied:

Nice, much better

@ngxson ngxson merged commit 074e42a into ggml-org:master May 2, 2025
5 checks passed
@LostRuins (Collaborator) commented:

@ngxson do your issues with the 32B model happen only on Metal, or with Accelerate too?

Labels: examples, python (python script changes)