Fixes hqq by following a new path for bias parameter in pre_quantized models #37530

MekkCyber · 2025-04-15T13:16:22Z

What does this PR do?

Since HQQ overrides the load_state_dict method for HQQLinear, it directly loads both the weight and bias parameters. This differs from our approach, where we iterate through the parameters one by one and load the bias separately from the weights.

This PR updates the behavior to simply ignore the bias parameter, assuming it was already loaded alongside the weights in the case of pre-quantized models.
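To make the description above concrete, here is a minimal pure-Python sketch of the idea. It is not the transformers or hqq code: `ToyQuantLinear`, `load_checkpoint`, and the `pre_quantized` flag as used here are hypothetical stand-ins, and plain lists take the place of tensors. It only illustrates a per-parameter loader skipping `bias` when the layer's own `load_state_dict` has already consumed it together with the weight.

```python
# Toy stand-ins only: these names are hypothetical and do not mirror the
# real hqq or transformers implementations.

class ToyQuantLinear:
    """Mimics a layer whose load_state_dict consumes weight AND bias at once."""

    def __init__(self):
        self.weight = None
        self.bias = None

    def load_state_dict(self, state_dict):
        # Both entries are taken in the same call, as described for HQQLinear.
        self.weight = state_dict["weight"]
        self.bias = state_dict.get("bias")


def load_checkpoint(layers, checkpoint, pre_quantized):
    """Walk the checkpoint one entry at a time, like a generic loader would."""
    skipped = []
    for full_name, value in checkpoint.items():
        layer_name, _, param_name = full_name.rpartition(".")
        layer = layers[layer_name]
        if not pre_quantized:
            # Plain path: every parameter is set individually.
            setattr(layer, param_name, value)
        elif param_name == "bias":
            # Assumed to be loaded alongside the weight, so it is ignored here.
            skipped.append(full_name)
        elif param_name == "weight":
            # Hand the layer everything that belongs to it in a single call.
            prefix = layer_name + "."
            layer_state = {
                key[len(prefix):]: val
                for key, val in checkpoint.items()
                if key.startswith(prefix)
            }
            layer.load_state_dict(layer_state)
    return skipped


layers = {"fc": ToyQuantLinear()}
checkpoint = {"fc.weight": [1.0, 2.0], "fc.bias": [0.5]}
skipped = load_checkpoint(layers, checkpoint, pre_quantized=True)
```

In this toy run the bias ends up populated even though the loop never sets it directly, which is the assumption the PR relies on for pre-quantized models.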

github-actions · 2025-04-15T13:16:34Z

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

SunMarc

SGTM ! Let's add a small test if this isn't tested

mobicham · 2025-04-15T13:34:35Z

Thank you @MekkCyber. Can you also change this:
https://github.com/huggingface/transformers/blob/main/tests/quantization/hqq/test_hqq.py#L85
to facebook/opt-125m
The current hqq test uses a model without a bias.

MekkCyber · 2025-04-15T13:40:49Z

we already have this test : https://github.com/huggingface/transformers/blob/main/tests/quantization/hqq/test_hqq.py#L151, we just need to add the pre_quantized case

HuggingFaceDocBuilderDev · 2025-04-15T13:42:50Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

mobicham · 2025-04-15T13:43:32Z

> we already have this test : https://github.com/huggingface/transformers/blob/main/tests/quantization/hqq/test_hqq.py#L151, we just need to add the pre_quantized case

It doesn't check for serialization of a model with a bias. If it did, the tests would have failed actually

mobicham · 2025-04-15T13:45:40Z

Can you please run this script:
https://gist.github.com/mobicham/701dd564c52590203ee09631425ad797

If it doesn't throw an error, the fix works as expected.

MekkCyber · 2025-04-15T13:48:48Z

> > we already have this test : https://github.com/huggingface/transformers/blob/main/tests/quantization/hqq/test_hqq.py#L151, we just need to add the pre_quantized case
>
> It doesn't check for serialization of a model with a bias. If it did, the tests would have failed actually

yep that's what i meant, we only need to add the case of pre_quantized (serialized) models

MekkCyber · 2025-04-15T13:48:57Z

The snippet works well

mobicham · 2025-04-15T13:51:58Z

Awesome, thank you again @MekkCyber 🙏

mobicham · 2025-04-22T15:04:55Z

@MekkCyber unfortunately it seems that it's not fully resolved. For example, when I tried to load a quantized Qwen model that has a bias:

import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

model_id = "mobiuslabsgmbh/Qwen2.5-VL-3B-Instruct_4bitgs64_hqq_hf"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16, device_map="cuda:0")

File /opt/conda/lib/python3.11/site-packages/accelerate/utils/modeling.py:283, in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
    280     return
    282 if old_value.device == torch.device("meta") and device not in ["meta", torch.device("meta")] and value is None:
--> 283     raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
    285 param = module._parameters[tensor_name] if tensor_name in module._parameters else None
    286 param_cls = type(param)

ValueError: bias is on the meta device, we need a `value` to put in on cuda:0.

MekkCyber · 2025-04-22T15:06:36Z

Thanks for reporting @mobicham, will take a look

fix

7b59f2e

github-actions bot marked this pull request as draft April 15, 2025 13:16

MekkCyber requested a review from SunMarc April 15, 2025 13:16

MekkCyber marked this pull request as ready for review April 15, 2025 13:17

SunMarc approved these changes Apr 15, 2025

MekkCyber mentioned this pull request Apr 15, 2025

Loading HQQ quantized models is broken since #35926 #37263

Closed

add test

d7747ec

MekkCyber added 4 commits April 15, 2025 15:56

Merge branch 'main' into fix_hqq

17f429b

Merge branch 'main' into fix_hqq

db3c2ce

Merge branch 'main' into fix_hqq

a37f802

Merge branch 'main' into fix_hqq

79a46a7

MekkCyber merged commit 7752e74 into main Apr 16, 2025
21 checks passed

MekkCyber deleted the fix_hqq branch April 16, 2025 11:58

cyr0930 pushed a commit to cyr0930/transformers that referenced this pull request Apr 18, 2025

Fixes hqq by following a new path for bias parameter in pre_quantized…

24ce37c

… models (huggingface#37530) * fix * add test

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025

Fixes hqq by following a new path for bias parameter in pre_quantized…

1772b90

… models (huggingface#37530) * fix * add test
