enable torchao quantization on CPU #36146
The error is |
SunMarc
left a comment
Thanks, left a few comments
Also, do you know whether it is possible to serialize on one device (CPU) and then load the model and run inference on CUDA with int4 weight-only (int4wo)?
Signed-off-by: jiqing-feng <[email protected]>
I am afraid not, because CUDA and CPU use different data layouts for int4_weight_only. It does work with int8, so I will enable this test for int8.
New update: I fixed the JSON save error in this commit. All tests pass now:
It would be nice if we could repack the tensors depending on the hardware we are loading them on. cc @jerryzh168
Hi @SunMarc. Yes, that could be our next step. We need to sync with the torchao developers first. In the meantime, are any changes needed before this PR can be merged?
Thanks for the PR @jiqing-feng, I left some nits.
Hi @MekkCyber. Thanks for your review. I have addressed these comments.
I will rebase this PR after #36206 is merged. |
SunMarc
left a comment
Thanks for adding this. Just a few nits and we are good to merge
we could build some functionality for |
I'm fine with the modification you made, but I feel everything would be easier to read if we just required torchao 0.8 for the tests: #36146 (comment)
Having this could be useful as a temporary solution.
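For reference, here is a minimal sketch of what "just require torchao 0.8 for the tests" could look like. It uses only the standard library. The helper names `parse_version`, `at_least`, and `require_at_least` are hypothetical and are not the helpers used in this PR, and the simplified parser treats pre-releases such as `0.8rc1` as `0.8`.

```python
import unittest
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Keep only the leading numeric components, e.g. '0.8.0+cpu' -> (0, 8, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'dev2025'
        parts.append(int(digits))
    return tuple(parts)


def at_least(installed: str, minimum: str) -> bool:
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def require_at_least(package: str, minimum: str):
    """Hypothetical decorator: skip a test unless `package` >= `minimum` is installed."""
    try:
        ok = at_least(metadata.version(package), minimum)
    except metadata.PackageNotFoundError:
        ok = False  # package missing entirely, so the test cannot run
    return unittest.skipUnless(ok, f"test requires {package}>={minimum}")


class CpuQuantizationTest(unittest.TestCase):
    @require_at_least("torchao", "0.8.0")
    def test_int4wo_on_cpu(self):
        # The real test body would build the quantized model here.
        pass
```

A single decorator on the test keeps the version condition out of the test body, which is the readability gain suggested above.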
@jerryzh168, just for my information, what is the technical reason the layout differs between CPU and GPU? Thanks!
Weights are packed in different formats for performance reasons. Different hardware (CPU vs. GPU tensor cores) prefers different formats for its kernels to work efficiently, I think.
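To illustrate the point about packing formats, here is a toy sketch in plain Python. The two layouts are invented for illustration and are not the actual torchao CPU or CUDA int4 layouts. It only shows that the same 4-bit values can be stored in different byte orders, so a checkpoint saved in one layout cannot be read directly as the other, and that repacking means unpacking to logical values and packing again.

```python
from typing import List


def pack_pairs(values: List[int]) -> bytes:
    """Hypothetical layout A: neighbours share a byte, low nibble first."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    return bytes(values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2))


def unpack_pairs(packed: bytes) -> List[int]:
    """Inverse of pack_pairs."""
    out = []
    for byte in packed:
        out.extend((byte & 0x0F, byte >> 4))
    return out


def pack_halves(values: List[int]) -> bytes:
    """Hypothetical layout B: element i shares a byte with element i + n/2."""
    assert len(values) % 2 == 0 and all(0 <= v < 16 for v in values)
    half = len(values) // 2
    return bytes(values[i] | (values[i + half] << 4) for i in range(half))


def unpack_halves(packed: bytes) -> List[int]:
    """Inverse of pack_halves."""
    return [b & 0x0F for b in packed] + [b >> 4 for b in packed]


def repack_pairs_to_halves(packed: bytes) -> bytes:
    """Repack by going through the logical 4-bit values."""
    return pack_halves(unpack_pairs(packed))


weights = [1, 2, 3, 4, 5, 6, 7, 8]
layout_a = pack_pairs(weights)
layout_b = pack_halves(weights)
print(layout_a.hex(), layout_b.hex())
```

Both byte strings decode to the same weights with their own unpack function, but decoding one with the other's function gives wrong values, which is why loading across devices would need a repacking step like `repack_pairs_to_halves`.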
Hi @SunMarc. I have updated the PR per your comments, but the CI failure looks strange. Is it because torchao is not available in the tests?
SunMarc
left a comment
Thanks ! Just a nit
The failing tests are unrelated. I reran them.
Co-authored-by: Marc Sun <[email protected]>
Hi @SunMarc. The test is still failing:
You can put the code that triggers this error in a setup method (it will be run before each test). That should fix the issue.
Use |
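As a rough sketch of the setup-method suggestion, assuming a standard `unittest.TestCase`: anything created in `setUp` is rebuilt before every test, so one test cannot leak state into the next. The class name and the dictionary standing in for the quantization config are placeholders, not code from this PR.

```python
import unittest


class QuantizationTestSketch(unittest.TestCase):
    setup_calls = 0  # counts how often setUp ran, for illustration only

    def setUp(self):
        # Runs before each test method, so every test starts from a fresh object.
        type(self).setup_calls += 1
        self.config = {"quant_type": "int8_weight_only", "device": "cpu"}  # placeholder config

    def test_mutates_config(self):
        self.config["device"] = "cuda"
        self.assertEqual(self.config["device"], "cuda")

    def test_sees_fresh_config(self):
        # Unaffected by the mutation in the other test.
        self.assertEqual(self.config["device"], "cpu")


def run_sketch() -> unittest.TestResult:
    """Run the two tests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(QuantizationTestSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result
```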
Hi @SunMarc. I used a more straightforward method to fix the tests, which also avoids changing too much code. Please review the new changes and let me know if anything else is needed before merging. Thanks!
SunMarc
left a comment
Thanks !
Torchao quantization is ready for CPU