Conversation

@howardzhang-cv howardzhang-cv commented Dec 20, 2025

Stack from ghstack (oldest at bottom):

Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.

Tasks: Related to issue #3516

Differential Revision: D89908990

@pytorch-bot

pytorch-bot bot commented Dec 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3520

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 80d8cd1 with merge base 27c5eb9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

howardzhang-cv added a commit that referenced this pull request Dec 20, 2025
Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.

Tasks: Related to issue #3516
ghstack-source-id: 69c1877
Pull-Request: #3520
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 20, 2025
@howardzhang-cv howardzhang-cv marked this pull request as draft December 20, 2025 02:09
@jerryzh168
Contributor

jerryzh168 commented Dec 20, 2025

probably have to delete this and related tests etc. as well:

class FPXWeightOnlyConfig(AOBaseConfig):

you can search for quant_llm_linear in the code base (https://github.com/search?q=repo%3Apytorch%2Fao%20quant_llm_linear&type=code) and delete all the related code
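As a local stand-in for the GitHub code search linked above, a recursive grep finds every file that still mentions the op. This is a minimal sketch against a throwaway directory (the paths and file contents here are illustrative, not the real torchao tree):

```shell
# Build a throwaway tree with one file that still references the op,
# then list every file mentioning it. Against a real checkout you would
# just run the grep from the repo root.
mkdir -p /tmp/ao_demo/torchao
printf 'def quant_llm_linear():\n    pass\n' > /tmp/ao_demo/torchao/ops.py
grep -rln "quant_llm_linear" /tmp/ao_demo
```

`grep -rln` recurses (`-r`), prints only file names (`-l`), and would number matching lines if `-n` were used with content output; each listed file is a deletion candidate.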

@howardzhang-cv
Author

This is my first time working in the torchao repo, so I'm not sure if this is the right approach:
I deleted the entire fp6_llm folder and modified ops.py and test_ops.py to remove the calls to quant_llm_linear. Is this what we wanted? Or did we want to delete only fp6_llm, keep the quant_llm_linear entry point, and just raise an error?
Also, if we are deleting quant_llm_linear, should I keep floatx_tensor_core? I might be misunderstanding, but it seems like the point of those functions was just to create the fp6 tensors that quant_llm_linear could consume. In any case, there is still a reference to quant_llm_linear in floatx_tensor_core_layout.py and in the README in that same folder that I have not removed. I just wanted confirmation that this is what I'm supposed to be doing before continuing.
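For reference, the alternative floated above — keeping the Python entry point but making it fail loudly — could be sketched roughly like this. The signature and message are hypothetical, not what this PR actually ships (the PR deletes the function outright):

```python
def quant_llm_linear(*args, **kwargs):
    # Hypothetical deprecation stub: the fp6_llm CUDA kernel has been
    # deleted, so the op raises instead of silently dispatching to
    # missing code. Signature and wording are illustrative only.
    raise NotImplementedError(
        "quant_llm_linear has been removed from torchao (see issue #3516)."
    )
```

A stub like this trades a clean break for a clearer error message during the deprecation window; since the PR is already tagged bc-breaking, outright deletion is equally defensible.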

@jerryzh168
Contributor

jerryzh168 commented Dec 20, 2025

@howardzhang-cv I think it might be cleaner if you delete the floatx_tensor_core_layout and the FPXWeightOnlyConfig in a separate PR first, before doing this

howardzhang-cv added a commit that referenced this pull request Dec 24, 2025
Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified torchao/ops.py and test/test_ops.py to remove quant_llm_linear calls.
Removed all tests/references to floatx_tensor_core_layout and FloatXTensorCoreLayout
Removed all tests/references to FPXWeightOnlyConfig

Tasks: Related to issue #3516
ghstack-source-id: fe8afeb
Pull-Request: #3520
@howardzhang-cv
Author

Since the two seemed pretty intertwined (floatx appears to exist only to feed quant_llm_linear), it made more sense to me to fold them into this PR as well. I removed all references to, and tests of, floatx_tensor_core_layout and FPXWeightOnlyConfig. That involved quite a few more code changes and deletions, so please check that I didn't delete anything important.

I kept, and did not delete, the float8 and cutlass_semi_sparse layouts that also live in the floatx folder. I believe a couple of the remaining references to floatx in the repo actually refer to these (from_hp_to_floatx, for example). Please let me know if this is correct.

@howardzhang-cv howardzhang-cv added the topic: bc-breaking Use this tag if this PR breaks backward compatibility label Dec 24, 2025
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: c124f6d
Pull-Request: #3520
Contributor

@jerryzh168 jerryzh168 left a comment


Looks good, thanks! We can land as long as CI passes, I think.

@howardzhang-cv howardzhang-cv added the topic: deprecation Use this tag if this PR deprecates a feature label Dec 24, 2025
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: 750701f
Pull-Request: #3520
howardzhang-cv added a commit that referenced this pull request Dec 24, 2025, with the same summary as above.
ghstack-source-id: c4d9c5a
Pull-Request: #3520
@howardzhang-cv howardzhang-cv marked this pull request as ready for review December 24, 2025 08:30
howardzhang-cv added a commit that referenced this pull request Dec 30, 2025, with the same summary as above.
ghstack-source-id: 991d72b
Pull-Request: #3520
@howardzhang-cv
Author

@howardzhang-cv has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Labels: CLA Signed, topic: bc-breaking, topic: deprecation