Conversation

@howardzhang-cv howardzhang-cv commented Dec 20, 2025

Stack from ghstack (oldest at bottom):

Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.

Tasks: Related to issue #3516

Differential Revision: D89908990

@pytorch-bot

pytorch-bot bot commented Dec 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3520

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 80d8cd1 with merge base 27c5eb9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

howardzhang-cv added a commit that referenced this pull request Dec 20, 2025
Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.

Tasks: Related to issue #3516
ghstack-source-id: 69c1877
Pull-Request: #3520
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 20, 2025
@howardzhang-cv howardzhang-cv marked this pull request as draft December 20, 2025 02:09
@jerryzh168
Contributor

jerryzh168 commented Dec 20, 2025

probably have to delete this and related tests etc. as well:

class FPXWeightOnlyConfig(AOBaseConfig):

you can search for quant_llm_linear in the code base (https://github.com/search?q=repo%3Apytorch%2Fao%20quant_llm_linear&type=code) and delete all the related code
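As a local stand-in for the GitHub code search linked above, a recursive grep finds every file that still mentions the op. This is a minimal sketch against a throwaway directory (the paths and file contents here are illustrative, not the real torchao tree):

```shell
# Build a throwaway tree with one file that still references the op,
# then list every file mentioning it. Against a real checkout you would
# just run the grep from the repo root.
mkdir -p /tmp/ao_demo/torchao
printf 'def quant_llm_linear():\n    pass\n' > /tmp/ao_demo/torchao/ops.py
grep -rln "quant_llm_linear" /tmp/ao_demo
```

`grep -rln` recurses (`-r`), prints only file names (`-l`), and would number matching lines if `-n` were used with content output; each listed file is a deletion candidate.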

@howardzhang-cv
Author

This is my first time working in the torchao repo, so I'm not sure if this is the right approach:
I deleted the entire fp6_llm folder and modified ops.py and test_ops.py to remove the calls to quant_llm_linear. Is this what we wanted? Or did we want to delete only fp6_llm, keep the quant_llm_linear entry point, and just raise an error?
Also, if we are deleting quant_llm_linear, should I keep floatx_tensor_core? I might be misunderstanding, but it seems like the point of those functions was just to create the fp6 tensors that quant_llm_linear could consume. In any case, there is still a reference to quant_llm_linear in floatx_tensor_core_layout.py and in the README in that same folder that I have not removed. I just wanted confirmation that this is what I'm supposed to be doing before continuing.
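For reference, the alternative floated above — keeping the Python entry point but making it fail loudly — could be sketched roughly like this. The signature and message are hypothetical, not what this PR actually ships (the PR deletes the function outright):

```python
def quant_llm_linear(*args, **kwargs):
    # Hypothetical deprecation stub: the fp6_llm CUDA kernel has been
    # deleted, so the op raises instead of silently dispatching to
    # missing code. Signature and wording are illustrative only.
    raise NotImplementedError(
        "quant_llm_linear has been removed from torchao (see issue #3516)."
    )
```

A stub like this trades a clean break for a clearer error message during the deprecation window; since the PR is already tagged bc-breaking, outright deletion is equally defensible.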

@jerryzh168
Contributor

jerryzh168 commented Dec 20, 2025

@howardzhang-cv I think it might be cleaner if you delete the floatx_tensor_core_layout and the FPXWeightOnlyConfig in a separate PR first, before doing this

howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue #3516
ghstack-source-id: fe8afeb
Pull-Request: #3520
@howardzhang-cv
Author

Since the two seemed pretty intertwined (floatx appears to exist only to feed quant_llm_linear), it made more sense to me to fold them into this PR as well. I removed all references to, and tests of, floatx_tensor_core_layout and FPXWeightOnlyConfig. That involved quite a few more code changes and deletions, so please check that I didn't delete anything important.

I kept, and did not delete, the float8 and cutlass_semi_sparse layouts that also live in the floatx folder. I believe a couple of the remaining references to floatx in the repo actually refer to these (from_hp_to_floatx, for example). Please let me know if this is correct.

@howardzhang-cv howardzhang-cv added the topic: bc-breaking Use this tag if this PR breaks backward compatibility label Dec 24, 2025
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: c124f6d
Pull-Request: #3520
Contributor

@jerryzh168 jerryzh168 left a comment


Looks good, thanks! We can land as long as CI passes, I think.

@howardzhang-cv howardzhang-cv added the topic: deprecation Use this tag if this PR deprecates a feature label Dec 24, 2025
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: 750701f
Pull-Request: #3520
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: c4d9c5a
Pull-Request: #3520
@howardzhang-cv howardzhang-cv marked this pull request as ready for review December 24, 2025 08:30
howardzhang-cv added a commit that referenced this pull request Dec 30, 2025, with the same summary as above.
ghstack-source-id: 991d72b
Pull-Request: #3520
@howardzhang-cv
Author

@howardzhang-cv has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Labels: CLA Signed, topic: bc-breaking, topic: deprecation