enable torchao quantization on CPU #36146

jiqing-feng · 2025-02-12T07:15:22Z

Torchao quantization is ready for CPU

jiqing-feng · 2025-02-12T08:44:33Z

Saving the int4 quantized model fails with TypeError: Object of type Int4CPULayout is not JSON serializable. It seems that the torchao API is not friendly in this respect. We'd like to figure out how to make it work.
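The failure mode can be reproduced without torchao. The following is a minimal sketch, assuming a hypothetical `FakeLayout` dataclass standing in for `Int4CPULayout`, a hypothetical `to_jsonable` helper, and an illustrative config dict. It is not the fix applied in this PR, only one common way to make such a config serializable.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class FakeLayout:
    """Hypothetical stand-in for a layout object such as Int4CPULayout."""


def to_jsonable(value):
    """Recursively replace dataclass instances with tagged plain dicts."""
    if is_dataclass(value) and not isinstance(value, type):
        return {"_type": type(value).__name__, "_data": asdict(value)}
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


# Illustrative config shape (an assumption, not the real TorchAoConfig contents).
config = {
    "quant_type": "int4_weight_only",
    "quant_type_kwargs": {"group_size": 128, "layout": FakeLayout()},
}

# json.dumps does not know how to encode an arbitrary Python object.
try:
    json.dumps(config)
    raised = False
    error_message = ""
except TypeError as err:
    raised = True
    error_message = str(err)

# Converting the object to plain data first makes the dump succeed.
serialized = json.dumps(to_jsonable(config))
restored = json.loads(serialized)
```

Loading such a file back would additionally need a mapping from the stored `_type` name to the real class, which is left out here.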

Rocketknight1 · 2025-02-12T13:35:49Z

cc @SunMarc @MekkCyber

SunMarc

Thanks, left a few comments

src/transformers/utils/quantization_config.py

tests/quantization/torchao_integration/test_torchao.py

SunMarc · 2025-02-12T14:31:57Z

Also, do you know if it is possible to serialize on one device (CPU) and load/infer the model on CUDA with int4wo?

jiqing-feng · 2025-02-13T01:59:02Z

Also, do you know if it is possible to serialize on one device (CPU) and load/infer the model on CUDA with int4wo?

I am afraid not, because CUDA and CPU use different data layouts for int4_weight_only. It does work for int8, so I will enable this test for int8.

jiqing-feng · 2025-02-13T02:58:15Z

Fixed all test issues; only #36147 is left. Do you have any ideas, @SunMarc?

jiqing-feng · 2025-02-13T06:13:08Z

New update: I fixed the JSON save error in this commit.

All tests pass now: pytest tests/quantization/torchao_integration/test_torchao.py

SunMarc · 2025-02-13T12:21:55Z

It would be nice if we could repack the tensors depending on the hardware we are loading them on. cc @jerryzh168

jiqing-feng · 2025-02-14T01:24:09Z

It would be nice if we could repack the tensors depending on the hardware we are loading them on. cc @jerryzh168

Hi @SunMarc. Yes, that could be our next step. We need to sync with the torchao developers on it. Before that, are any changes needed to merge this PR?

MekkCyber · 2025-02-14T07:58:06Z

Thanks for the PR @jiqing-feng, I left some nits

tests/quantization/torchao_integration/test_torchao.py

docs/source/en/quantization/torchao.md

jiqing-feng · 2025-02-14T08:22:32Z

Hi @MekkCyber. Thanks for your review. I have addressed these comments.

HuggingFaceDocBuilderDev · 2025-02-14T08:24:45Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

jiqing-feng · 2025-02-17T05:05:54Z

I will rebase this PR after #36206 is merged.

SunMarc

Thanks for adding this. Just a few nits and we are good to merge

src/transformers/utils/quantization_config.py

tests/quantization/torchao_integration/test_torchao.py

jerryzh168 · 2025-02-20T01:39:33Z

It would be nice if we could repack the tensors depending on the hardware we are loading them on. cc @jerryzh168

We could probably build some functionality for .to(), so that when we change the device we also change the layout, but that would need to be an in-place change, which may not be easily doable. We can also have a util to change both device and layout, I think, which would require some custom code on the Hugging Face side to call it explicitly.

SunMarc · 2025-02-20T09:47:20Z

I'm fine with the modification you made, but I feel everything would be easier to read if we just required torchao 0.8 for the tests: #36146 (comment)

SunMarc · 2025-02-20T09:49:17Z

we can also have a util to change between both device and layout I think, which would require some custom code in huggingface side to call it explicitly

Having this could be useful as a temporary solution.

MekkCyber · 2025-02-20T15:00:12Z

@jerryzh168, just FMI, what is the technical reason the layout differs between CPU and GPU? Thanks!

jerryzh168 · 2025-02-20T23:13:45Z

@jerryzh168, just FMI, what is the technical reason the layout differs between CPU and GPU? Thanks!

Weights are packed in different formats for performance optimization; different hardware (CPU vs. GPU tensor cores) prefers different formats for the kernels to work efficiently, I think.
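As a toy illustration of why bytes packed for one kernel cannot simply be read by another, the sketch below packs the same 4-bit values in two different orders. Both layouts and all function names are hypothetical and chosen for clarity; neither is the actual format used by torchao on CPU or CUDA.

```python
def pack_adjacent(values):
    """Hypothetical layout A: neighbours share a byte (v[0] low nibble, v[1] high)."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    return bytes(values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2))


def unpack_adjacent(packed):
    """Inverse of pack_adjacent."""
    out = []
    for byte in packed:
        out.extend((byte & 0x0F, byte >> 4))
    return out


def pack_split_halves(values):
    """Hypothetical layout B: v[i] shares a byte with v[i + n/2]."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    half = len(values) // 2
    return bytes(values[i] | (values[i + half] << 4) for i in range(half))


def unpack_split_halves(packed):
    """Inverse of pack_split_halves."""
    return [b & 0x0F for b in packed] + [b >> 4 for b in packed]


def repack_a_to_b(packed):
    """Explicit conversion utility: decode layout A, re-encode as layout B."""
    return pack_split_halves(unpack_adjacent(packed))


weights = [1, 2, 3, 4, 5, 6, 7, 8]
packed_a = pack_adjacent(weights)
packed_b = pack_split_halves(weights)

# Same logical weights, different bytes on disk.
layouts_differ = packed_a != packed_b
# Reading layout B with layout A's decoder silently scrambles the weights.
misread = unpack_adjacent(packed_b)
```

A checkpoint saved in one layout therefore needs an explicit repacking step, like `repack_a_to_b` here, before a kernel expecting the other layout can use it.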

jiqing-feng · 2025-02-21T08:03:04Z

Hi @SunMarc. I have updated the PR per your comments, but the CI failure is strange. Is it because torchao is not available in the tests?

SunMarc

Thanks ! Just a nit

src/transformers/testing_utils.py

SunMarc · 2025-02-21T13:24:18Z

Failing tests are not related. I reran the tests

jiqing-feng · 2025-02-24T01:40:27Z

Failing tests are not related. I reran the tests

Hi @SunMarc. The test failure persists: 2 tests errored out with NameError: name 'Int4CPULayout' is not defined

SunMarc · 2025-02-24T12:58:09Z

Hi @SunMarc. The test failure persists: 2 tests errored out with NameError: name 'Int4CPULayout' is not defined

You can put the code that triggers this error in a setUp method (it will be triggered before each test). That should fix the issue.

def setUp(self):
    ...

jiqing-feng · 2025-02-25T04:45:59Z

You can put the code that triggers this error in a setUp method (it will be triggered before each test). That should fix the issue.
def setUp(self):
    ...

Using setUp would change too much code and complicate the tests. Just checking the quant_scheme_kwargs at initialization will be fine. We can reformat it when torchao is fully ready.
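The trade-off discussed here can be sketched without torchao. In the example below, `optional_lib_xyz`, `FancyLayout`, and `is_optional_lib_available` are made-up names standing in for an optional dependency; the sketch shows why a class-body reference fails at collection time and two ways around it, and it is not the code from this PR.

```python
import importlib.util
import unittest


def is_optional_lib_available() -> bool:
    # Hypothetical availability check for a made-up optional package.
    return importlib.util.find_spec("optional_lib_xyz") is not None


if is_optional_lib_available():
    from optional_lib_xyz import FancyLayout  # hypothetical symbol

# Problem: a class body runs at import/collection time, before any skip
# decorator is applied, so a missing name raises immediately.
try:
    class BrokenTest(unittest.TestCase):
        kwargs = {"layout": FancyLayout()}
    class_body_error = None
except NameError as err:
    class_body_error = str(err)


# Option 1: defer the lookup to setUp, which is not run for skipped tests.
@unittest.skipUnless(is_optional_lib_available(), "optional_lib_xyz is required")
class DeferredTest(unittest.TestCase):
    def setUp(self):
        self.kwargs = {"layout": FancyLayout()}

    def test_layout(self):
        self.assertIn("layout", self.kwargs)


# Option 2: keep the class attribute but guard it, so the name is only
# evaluated when the dependency is present.
@unittest.skipUnless(is_optional_lib_available(), "optional_lib_xyz is required")
class GuardedTest(unittest.TestCase):
    kwargs = {"layout": FancyLayout()} if is_optional_lib_available() else {}

    def test_layout(self):
        self.assertIn("layout", self.kwargs)
```

Option 2 touches only the attribute definition, which is why it tends to produce a smaller diff than moving the setup into setUp.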

jiqing-feng · 2025-02-25T04:53:10Z

Hi @SunMarc . I used a more straightforward method to fix the tests and also avoid changing too much code. Please review the new changes and let me know if any change is needed before merging. Thanks!

SunMarc

Thanks !

SunMarc reviewed Feb 12, 2025

src/transformers/utils/quantization_config.py (outdated)

tests/quantization/torchao_integration/test_torchao.py

tests/quantization/torchao_integration/test_torchao.py

jiqing-feng added 6 commits February 12, 2025 15:04

enable torchao quantization on CPU

936b206

Signed-off-by: jiqing-feng <[email protected]>

fix int4

4759045

Signed-off-by: jiqing-feng <[email protected]>

fix format

5e51a1c

Signed-off-by: jiqing-feng <[email protected]>

enable CPU torchao tests

6b3c076

Signed-off-by: jiqing-feng <[email protected]>

fix cuda tests

2bf0ba2

Signed-off-by: jiqing-feng <[email protected]>

fix cpu tests

36c6534

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng added 4 commits February 13, 2025 10:17

update tests

872c778

Signed-off-by: jiqing-feng <[email protected]>

fix style

76badb1

Signed-off-by: jiqing-feng <[email protected]>

fix cuda tests

c964c6f

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into torchao

92b3ff1

jiqing-feng added 4 commits February 13, 2025 13:27

fix torchao available

fcf3e9e

Signed-off-by: jiqing-feng <[email protected]>

fix torchao available

a871b35

Signed-off-by: jiqing-feng <[email protected]>

fix torchao config cannot convert to json

65b7de3

Merge branch 'main' into torchao

6847b7c

MekkCyber reviewed Feb 14, 2025

tests/quantization/torchao_integration/test_torchao.py

docs/source/en/quantization/torchao.md (outdated)

jiqing-feng added 2 commits February 14, 2025 16:03

fix docs

33da778

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into torchao

8b9b6b1

SunMarc reviewed Feb 19, 2025

jiqing-feng added 3 commits February 20, 2025 10:51

limited torchao version for CPU

49015bf

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into torchao

81897c4

fix format

135bbab

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into torchao

443b1cf

SunMarc approved these changes Feb 21, 2025

src/transformers/testing_utils.py (outdated)

jiqing-feng and others added 5 commits February 21, 2025 15:17

fix skip

248e065

Signed-off-by: jiqing-feng <[email protected]>

fix format

a71d8b9

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into torchao

9b3053a

Update src/transformers/testing_utils.py

e2fef70

Co-authored-by: Marc Sun <[email protected]>

Merge branch 'main' into torchao

66b5751

jiqing-feng force-pushed the torchao branch from 3c6a138 to 66b5751 Compare February 25, 2025 04:38

Merge branch 'main' into torchao

d356bf6

SunMarc approved these changes Feb 25, 2025

SunMarc merged commit 9d6abf9 into huggingface:main Feb 25, 2025
21 checks passed

jiqing-feng added 2 commits February 25, 2025 12:32

fix cpu test

9d529ca

Signed-off-by: jiqing-feng <[email protected]>

fix format

a633f27

Signed-off-by: jiqing-feng <[email protected]>

jerryzh168 mentioned this pull request Feb 27, 2025

Initial stab at string based config parser pytorch/ao#1774

Closed

jiqing-feng deleted the torchao branch March 27, 2025 08:10
