Contributor

@jerryzh168 jerryzh168 commented Mar 25, 2025

Summary:
We added the new torchao API support in HF transformers in #36526. One thing that is missing: it does not account for the fact that the int4 weight-only quant config only works on CUDA. This PR adds back the workaround.
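
As an illustration of the kind of workaround described above, here is a minimal sketch that picks a device-compatible layout when the user has not set one. It does not use torchao: `Int4ConfigStandIn`, `resolve_layout`, and the `"cpu_layout"` value are hypothetical stand-ins, and the assumption that the default layout is the CUDA-only one is taken from the summary, not from the PR diff.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Int4ConfigStandIn:
    """Hypothetical stand-in for an int4 weight-only quant config."""

    group_size: int = 128
    layout: Optional[str] = None  # None means "use the library default"


def resolve_layout(config: Int4ConfigStandIn, device_type: str) -> Int4ConfigStandIn:
    """Return a config whose layout is compatible with the target device."""
    if config.layout is not None:
        return config  # respect an explicit user choice
    if device_type == "cuda":
        return config  # assumption: the default layout works on CUDA
    if device_type == "cpu":
        # assumption: a CPU-specific layout exists; the name here is made up
        return replace(config, layout="cpu_layout")
    raise ValueError(f"no int4 weight-only layout assumed for device {device_type!r}")


cpu_config = resolve_layout(Int4ConfigStandIn(), "cpu")
cuda_config = resolve_layout(Int4ConfigStandIn(), "cuda")
```

In the real integration the config would presumably be torchao's `Int4WeightOnlyConfig` with whichever CPU-compatible layout torchao provides.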

Also updated the version requirement to > 0.9 temporarily so that we can use the torchao nightly before 0.10 is released. We should change this back before landing.
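
To show why a `> 0.9` bound admits a nightly while a `>= 0.10.0` bound would not, here is a small sketch of pre-release ordering. It assumes the nightly is versioned as a `0.10.0.devN` pre-release (the exact string below is hypothetical), and `version_key` is an illustrative helper covering only a simplified subset of PEP 440; real code would more likely rely on `packaging.version`.

```python
import re


def version_key(v: str):
    """Sort key for a simplified subset of PEP 440: 'X.Y[.Z][.devN][+local]'."""
    public = v.split("+", 1)[0]  # drop a local label such as '+cu124'
    m = re.fullmatch(r"(\d+(?:\.\d+)*)(?:\.dev(\d+))?", public)
    if m is None:
        raise ValueError(f"unsupported version string: {v!r}")
    release = tuple(int(p) for p in m.group(1).split("."))
    release = release + (0,) * (3 - len(release))  # pad so '0.9' equals '0.9.0'
    # A dev build sorts before the final release with the same number.
    is_final = 0 if m.group(2) is not None else 1
    dev = int(m.group(2)) if m.group(2) is not None else 0
    return (release, is_final, dev)


nightly = "0.10.0.dev20250325"  # hypothetical nightly version string
passes_temporary_bound = version_key(nightly) > version_key("0.9")
passes_release_bound = version_key(nightly) >= version_key("0.10.0")
print(passes_temporary_bound, passes_release_bound)
```

Under these assumptions the nightly passes the temporary bound but not the release bound, which is consistent with the temporary requirement described above.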

Test Plan:
local test: https://gist.github.com/jerryzh168/0e749d0dab40e2a62a7f2e48639f77b5 (we can set up a deserialization test later, once we can quantize a small model such as TinyLlama/TinyLlama-1.1B-Chat-v1.0 and host it in a stable place)

Reviewers:

Subscribers:

Tasks:

Tags:

@github-actions
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@github-actions github-actions bot marked this pull request as draft March 25, 2025 20:54
@Rocketknight1
Member

cc @SunMarc @MekkCyber

Member

@SunMarc SunMarc left a comment

Thanks ! Just a nit

Comment on lines -239 to 240
  module._parameters[tensor_name] = torch.nn.Parameter(param_value).to(device=target_device)
- quantize_(module, self.quantization_config.get_apply_tensor_subclass(), set_inductor_config=False)
+ quantize_(module, self.quantization_config.get_apply_tensor_subclass())

Member

Why remove set_inductor_config? It was added because of #36608.

Contributor Author

We'll have the 0.10.0 branch cut next week. We can always ask people to use the latest torchao, right?

Member

Yeah, I feel like the next torchao release is quite breaking, so we will force users to use the latest torchao. When do you plan to remove support for the string (for specifying the quant scheme)?

Contributor Author

We can remove the string support after the 0.10 release, once we enforce the torchao version.

@SunMarc SunMarc marked this pull request as ready for review March 27, 2025 10:40
Member

@SunMarc SunMarc left a comment

Thanks ! Please fix the style with make style and we are good to merge

Member

SunMarc commented Mar 28, 2025

There is still a failing test on the CI: check_repository_consistency - Failed

Comment on lines 47 to 54
from torch.utils.checkpoint import checkpoint
from torchao.quantization import Int4WeightOnlyConfig

Contributor

Some tests are failing because of this
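
One plausible reading of this comment is that an unconditional top-level `from torchao.quantization import Int4WeightOnlyConfig` breaks test collection in environments where torchao is not installed. A minimal sketch of a guarded import follows; `is_package_available` is an illustrative helper written for this example, not the helper the repository actually uses.

```python
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Fall back to None so the module still imports when torchao is absent;
# tests that need the config are expected to be skipped in that case.
Int4WeightOnlyConfig = None
if is_package_available("torchao"):
    try:
        from torchao.quantization import Int4WeightOnlyConfig
    except ImportError:
        # torchao is present but too old or otherwise unusable
        pass
```

With this shape, the test module can be imported everywhere, and only the tests that reference `Int4WeightOnlyConfig` need a skip condition.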

Member

SunMarc commented Apr 1, 2025

We recently updated the ruff version, so this is probably why the code_quality test is not passing. Can you please update your version of ruff, @jerryzh168?

@SunMarc SunMarc merged commit a165458 into huggingface:main Apr 2, 2025
18 checks passed
zucchini-nlp pushed a commit to BakerBunker/transformers that referenced this pull request Apr 2, 2025
…ate (huggingface#36980)

* merge

* fix import

* format

* reformat

* reformat

---------

Co-authored-by: Mohamed Mekkouri <[email protected]>
duanjunwen pushed a commit to duanjunwen/transformers that referenced this pull request Apr 3, 2025
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
soghomon-b pushed a commit to soghomon-b/transformers that referenced this pull request Aug 24, 2025