Add bias support for Int8DynActInt4WeightLinear #1845
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1845
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 1379cce with merge base ffb4350.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
**Summary:** Previously, when we encountered a linear with a bias, we simply did not swap it to `Int8DynActInt4WeightLinear` and left it as is. Now we do swap it, but the bias is not quantized and is passed to `F.linear` in full precision. Fixes #1821

**Test Plan:**
python test/quantization/test_quant_api.py -k test_8da4w_quantizer_linear_bias
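To illustrate the behavior this PR enables, here is a minimal sketch of the idea (hypothetical names, not torchao's actual implementation; the dynamic int8 activation quantization is omitted for brevity): the weights are quantized groupwise to int4 range, while the bias is kept in full precision and passed straight to `F.linear`.

```python
import torch
import torch.nn.functional as F

class Int8DynActInt4WeightLinearSketch(torch.nn.Module):
    """Illustrative sketch only: groupwise int4-range weights plus an
    *unquantized* bias. Dynamic int8 activation quantization is omitted."""

    def __init__(self, weight, bias=None, group_size=32):
        super().__init__()
        out_f, in_f = weight.shape
        w = weight.reshape(out_f, in_f // group_size, group_size)
        # Per-group symmetric scales for the 4-bit range [-8, 7].
        scales = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-6) / 7.0
        q = torch.clamp(torch.round(w / scales), -8, 7).to(torch.int8)
        self.register_buffer("qweight", q)
        self.register_buffer("scales", scales)
        self.bias = bias  # kept in full precision, never quantized
        self.out_in = (out_f, in_f)

    def forward(self, x):
        # Dequantize the weight, then let F.linear apply the bias as-is.
        w = (self.qweight.float() * self.scales).reshape(self.out_in).to(x.dtype)
        return F.linear(x, w, self.bias)
```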
@andrewor14 By bias quantization, do you mean that the bias itself is quantized, or just support for a bias during quantization?
Right now this doesn't support quantized bias, just loading a quantized model where the linear has an unquantized bias. I think we can add actual bias quantization later. In general, quantizing the bias doesn't seem to be very common, since the benefits are not significant.
So with this PR, we will be able to represent unquantized bias with Int8DynActInt4WeightLinear?
Yep
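For reference, a usage sketch along the lines of the test plan. The `Int8DynActInt4WeightQuantizer` API lives in torchao's quantization package; the exact constructor arguments here should be treated as illustrative.

```python
import torch
from torchao.quantization import Int8DynActInt4WeightQuantizer

# A model with a biased linear: before this PR it was left unswapped,
# after it is swapped to Int8DynActInt4WeightLinear as well.
model = torch.nn.Sequential(torch.nn.Linear(256, 128, bias=True))
model = Int8DynActInt4WeightQuantizer(groupsize=32).quantize(model)

# The bias is applied in full precision inside the swapped module's forward.
out = model(torch.randn(2, 256))
```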