convert : converting mmproj for Qwen2/2.5VL from convert_hf_to_gguf #13209
Conversation
This reverts commit 651752f.
For visibility: I opened a discussion about the 32B bug here: #12402 (comment)

@ggerganov Since the problem with the 32B model isn't related to this PR, I will investigate it later (it's low priority because the model doesn't get much usage). Could you review this PR now? Thanks!
gguf-py/gguf/tensor_mapping.py (Outdated)

```python
MODEL_TENSOR.V_ENC_EMBD_PATCH1: (
    "visual.patch_embed.proj.weight.1",  # qwen2vl, generated
),
```
Is this going to work correctly? Usually `.weight` is suffixed to the base name, so wouldn't this try to map `visual.patch_embed.proj.weight.1.weight`?
The logic of `map_tensor_name` will first try mapping the whole name (which is the case here). If it cannot find a match, it will retry with the `.weight` and `.bias` suffixes (which is the case for all other tensors).
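Roughly, the lookup behaves like the simplified sketch below (the real logic lives in gguf-py's `TensorNameMap.get_name` and the converter's `map_tensor_name`; the flat-dict mapping here is an assumption that ignores per-block numbering):

```python
# Simplified sketch of the suffix-aware lookup described above.
def map_tensor_name(name: str, mapping: dict[str, str],
                    try_suffixes: tuple[str, ...] = (".weight", ".bias")) -> str:
    # 1. Try the whole name first -- this is how
    #    "visual.patch_embed.proj.weight.1" can be matched directly.
    if name in mapping:
        return mapping[name]
    # 2. Otherwise strip a known suffix, map the base name, re-append.
    for suffix in try_suffixes:
        if name.endswith(suffix):
            base = name[: -len(suffix)]
            if base in mapping:
                return mapping[base] + suffix
    raise ValueError(f"cannot map tensor name: {name}")
```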
Hmm, on second thought, there is a way to confine this hacky naming inside `Qwen2VLVisionModel` so it won't pollute the rest of the code. I did it in c030984.

Here is the output:
```
INFO:hf-to-gguf:v.patch_embd.weight, torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
INFO:hf-to-gguf:v.patch_embd.weight.1, torch.bfloat16 --> F16, shape = {14, 14, 3, 1280}
```
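For context, a minimal sketch of what that confinement might look like (hypothetical; the actual change is in commit c030984). The two log lines above are consistent with a Conv3d patch embedding of temporal depth 2 being split along the temporal axis into two Conv2d-shaped weights:

```python
import torch

def split_patch_embed(weight: torch.Tensor) -> list[tuple[str, torch.Tensor]]:
    # Hypothetical helper, not the literal code from c030984.
    # Assumed input: Conv3d weight of shape (out=1280, in=3, t=2, h=14, w=14).
    # Splitting along the temporal axis yields the two tensors seen in the log.
    assert weight.ndim == 5 and weight.shape[2] == 2
    first, second = weight.unbind(dim=2)  # each: (out, in, h, w)
    return [
        ("v.patch_embd.weight", first),
        ("v.patch_embd.weight.1", second),
    ]
```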
Nice, much better
@ngxson does your issue with 32B happen only on Metal, or with Accelerate too?
Pre-quantized models converted via this PR:
Test results:
NOTE: the Qwen2.5-VL-32B model is broken even with text-only input (tested via `llama-cli`); the output is `@@@@@@...` repeated.