
Match QAT prepare and convert numerics exactly for bf16 and fp16 #2060


Merged
andrewor14 merged 1 commit into main on Apr 21, 2025

Conversation

@andrewor14 (Contributor) commented Apr 15, 2025:

Summary: The previous PR #1964 got this to match for fp32, but there were three additional sources of numerical discrepancies with bf16:

  1. QAT asymmetric per token choose qparams diverged from choose_qparams_affine, which had simpler logic
  2. QAT per token fake quantize cast the input to fp32 before fake quantizing it
  3. QAT symmetric per group choose qparams used a hardcoded eps value that did not match choose_qparams_affine

All three are resolved in this commit: (1) QAT now uses choose_qparams_affine instead of the custom function for asymmetric per token, which is now deleted, (2) QAT no longer casts the input to fp32, and (3) QAT now uses an eps value that corresponds to the input dtype. The result is an exact numerical match between the prepare and convert steps for fp32, bf16, and fp16.
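
For illustration, here is a minimal sketch of what fixes (2) and (3) amount to, written against plain torch ops rather than the actual torchao internals; the helper names below are hypothetical, and fix (1), delegating qparam selection to choose_qparams_affine, is only referenced in the comments:

```python
# Minimal sketch only: hypothetical helpers, not the torchao implementation.
import torch


def _per_token_eps(x: torch.Tensor) -> float:
    # Fix (3): eps follows the input dtype (fp32/bf16/fp16) instead of a
    # hardcoded fp32 value, matching the eps handed to choose_qparams_affine.
    return torch.finfo(x.dtype).eps


def _fake_quantize_sketch(
    x: torch.Tensor,
    scale: torch.Tensor,
    zero_point: torch.Tensor,
    quant_min: int = -128,
    quant_max: int = 127,
) -> torch.Tensor:
    # Fix (2): stay in x.dtype (no upcast to fp32), so the fake-quantized
    # "prepare" output reproduces the quantize -> dequantize "convert" path
    # exactly for bf16 and fp16 as well as fp32.
    q = torch.clamp(torch.round(x / scale) + zero_point, quant_min, quant_max)
    return (q - zero_point) * scale
```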

Test Plan:

python test/quantization/test_qat.py -k test_fake_quantize_per_token_vs_convert
python test/quantization/test_qat.py -k test_qat_8da4w_prepare_vs_convert

pytorch-bot (bot) commented Apr 15, 2025:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2060

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0267d18 with merge base 31f119e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Apr 15, 2025
@andrewor14 added the topic: improvement label on Apr 15, 2025
@facebook-github-bot (Contributor) commented:
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewor14 requested a review from jerryzh168 on April 15, 2025 21:56
test/quantization/test_qat.py (review thread):

```python
@unittest.skipIf(
    not TORCH_VERSION_AT_LEAST_2_4, "skipping when torch version is lower than 2.4"
)
def test_fake_quantize_per_token_vs_convert_bf16(self):
```
Review comment (Contributor):
nit: do we have float16 as well? also can probably use parametrization for these
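
For reference, a rough sketch of the parametrization suggested above, using parametrize / instantiate_parametrized_tests from torch.testing._internal.common_utils; the class name and test body below are placeholders, not the actual torchao test:

```python
import torch
from torch.testing._internal.common_utils import (
    TestCase,
    instantiate_parametrized_tests,
    parametrize,
    run_tests,
)


class TestQATNumericsSketch(TestCase):
    # One test covers both reduced-precision dtypes instead of a per-dtype copy.
    @parametrize("dtype", [torch.bfloat16, torch.float16])
    def test_fake_quantize_per_token_vs_convert(self, dtype):
        # Placeholder body: compare the fake-quantized (prepare) output with
        # the quantize/dequantize (convert) output for the given dtype.
        x = torch.randn(4, 32, dtype=dtype)
        self.assertEqual(x.dtype, dtype)


instantiate_parametrized_tests(TestQATNumericsSketch)

if __name__ == "__main__":
    run_tests()
```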


@andrewor14 changed the title from "Match QAT prepare and convert numerics exactly for bf16" to "Match QAT prepare and convert numerics exactly for bf16 and fp16" on Apr 17, 2025
@facebook-github-bot (Contributor) commented:
@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@andrewor14 merged commit 0045d88 into main on Apr 21, 2025
19 of 20 checks passed
@jerryzh168 (Contributor) commented:

Seems like some relevant tests are failing in CI? https://github.com/pytorch/ao/actions/runs/14568001515/job/40860205831

@petrex (Collaborator) commented Apr 22, 2025:

I am seeing TestQAT.test_qat_8da4w_prepare_vs_convert_* failing in CI. Shall we skip these tests until a fix lands?

@andrewor14 (Contributor, Author) commented:

Sorry, let me revert this for now
