
Match QAT prepare and convert numerics exactly for bf16 and fp16 #2060


Merged
andrewor14 merged 1 commit into main on Apr 21, 2025

Conversation

@andrewor14 (Contributor) commented Apr 15, 2025:

Summary: The previous PR #1964 got this to match for fp32, but there were three additional sources of numerical discrepancies with bf16:

  1. QAT asymmetric per token choose qparams diverged from choose_qparams_affine, which had simpler logic
  2. QAT per token fake quantize cast the input to fp32 before fake quantizing it
  3. QAT symmetric per group choose qparams used a hardcoded eps value that did not match choose_qparams_affine

All three are resolved in this commit: (1) QAT now uses choose_qparams_affine instead of the custom function for asymmetric per token, which is now deleted, (2) QAT no longer casts the input to fp32, and (3) QAT now uses an eps value that corresponds to the input dtype. The result is an exact numerical match between the prepare and convert steps for fp32, bf16, and fp16.
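
For illustration, here is a minimal sketch of what fixes (2) and (3) amount to, written against plain torch ops rather than the actual torchao internals; the helper names below are hypothetical, and fix (1), delegating qparam selection to choose_qparams_affine, is only referenced in the comments:

```python
# Minimal sketch only: hypothetical helpers, not the torchao implementation.
import torch


def _per_token_eps(x: torch.Tensor) -> float:
    # Fix (3): eps follows the input dtype (fp32/bf16/fp16) instead of a
    # hardcoded fp32 value, matching the eps handed to choose_qparams_affine.
    return torch.finfo(x.dtype).eps


def _fake_quantize_sketch(
    x: torch.Tensor,
    scale: torch.Tensor,
    zero_point: torch.Tensor,
    quant_min: int = -128,
    quant_max: int = 127,
) -> torch.Tensor:
    # Fix (2): stay in x.dtype (no upcast to fp32), so the fake-quantized
    # "prepare" output reproduces the quantize -> dequantize "convert" path
    # exactly for bf16 and fp16 as well as fp32.
    q = torch.clamp(torch.round(x / scale) + zero_point, quant_min, quant_max)
    return (q - zero_point) * scale
```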

Test Plan:

python test/quantization/test_qat.py -k test_fake_quantize_per_token_vs_convert
python test/quantization/test_qat.py -k test_qat_8da4w_prepare_vs_convert

pytorch-bot (bot) commented Apr 15, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2060

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0267d18 with merge base 31f119e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Apr 15, 2025
@andrewor14 added the topic: improvement label on Apr 15, 2025
@facebook-github-bot (Contributor) commented:
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewor14 requested a review from jerryzh168 on April 15, 2025 21:56
test/quantization/test_qat.py (review thread):

```python
@unittest.skipIf(
    not TORCH_VERSION_AT_LEAST_2_4, "skipping when torch version is lower than 2.4"
)
def test_fake_quantize_per_token_vs_convert_bf16(self):
```
Review comment (Contributor):
nit: do we have float16 as well? also can probably use parametrization for these
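
For reference, a rough sketch of the parametrization suggested above, using parametrize / instantiate_parametrized_tests from torch.testing._internal.common_utils; the class name and test body below are placeholders, not the actual torchao test:

```python
import torch
from torch.testing._internal.common_utils import (
    TestCase,
    instantiate_parametrized_tests,
    parametrize,
    run_tests,
)


class TestQATNumericsSketch(TestCase):
    # One test covers both reduced-precision dtypes instead of a per-dtype copy.
    @parametrize("dtype", [torch.bfloat16, torch.float16])
    def test_fake_quantize_per_token_vs_convert(self, dtype):
        # Placeholder body: compare the fake-quantized (prepare) output with
        # the quantize/dequantize (convert) output for the given dtype.
        x = torch.randn(4, 32, dtype=dtype)
        self.assertEqual(x.dtype, dtype)


instantiate_parametrized_tests(TestQATNumericsSketch)

if __name__ == "__main__":
    run_tests()
```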


@andrewor14 changed the title from "Match QAT prepare and convert numerics exactly for bf16" to "Match QAT prepare and convert numerics exactly for bf16 and fp16" on Apr 17, 2025
@facebook-github-bot (Contributor) commented:
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewor14 merged commit 0045d88 into main on Apr 21, 2025
19 of 20 checks passed
@jerryzh168 (Contributor) commented:

Seems like some relevant tests are failing in CI? https://github.com/pytorch/ao/actions/runs/14568001515/job/40860205831

@petrex (Collaborator) commented Apr 22, 2025:

I am seeing TestQAT.test_qat_8da4w_prepare_vs_convert_* failing in CI. Shall we skip these tests until a fix lands?

@andrewor14 (Contributor, Author) commented:

Sorry, let me revert this for now
