This is confusing. Let's just converge on one place, torchao/prototype. This task is to move the code over without breaking things:
A. chat with @andrewor14 and confirm if it's ok to move torchao/quantization/prototype to torchao/prototype/quantization. If yes, move it.
B. chat with @jcaip and confirm if it's ok to move torchao/sparsity/prototype to torchao/prototype/sparsity. If yes, move it.
When moving the code, it would be good to verify that scripts/features/tests still work, CI is green, and update any hardcoded locations as needed.
Note: feel free to ignore torchao/experimental for now.
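One way to approach the "update any hardcoded locations" step is to scan the repository for leftover references to the old paths after the move. The sketch below is only an illustration, not part of torchao: the function name `find_hardcoded_references`, the `MOVES` table, and the list of file suffixes are assumptions made for this example. It looks for both the slash form (`torchao/quantization/prototype`) and the dotted import form (`torchao.quantization.prototype`).

```python
from pathlib import Path

# Old -> new locations, taken from items A and B above.
MOVES = {
    "torchao/quantization/prototype": "torchao/prototype/quantization",
    "torchao/sparsity/prototype": "torchao/prototype/sparsity",
}

# File types to scan; this list is an assumption, adjust it for the repo.
DEFAULT_SUFFIXES = (".py", ".md", ".sh", ".yml", ".yaml", ".toml")


def find_hardcoded_references(root, suffixes=DEFAULT_SUFFIXES):
    """Return (file, line number, matched text) for every old-path reference."""
    # Search for the filesystem form and the dotted import form of each old path.
    needles = [n for old in MOVES for n in (old, old.replace("/", "."))]
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or non-text files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                if needle in line:
                    hits.append((str(path), lineno, needle))
    return hits
```

Calling `find_hardcoded_references(".")` from the repository root returns a list that can be printed as `file:line: match`. An empty list after the move suggests no stale references remain in the scanned file types, though it does not replace running the tests and checking CI.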
Summary: This improves average tokens/sec from 33.43 to 72.63 on A100 for AOTI.
```
python3 torchchat.py export llama3 --quantize '{"precision": {"dtype":"bfloat16"}, "executor":{"accelerator":"cuda"}}' --output-dso-path /tmp/model16.so && python3 torchchat.py generate llama3 --dso-path /tmp/model16.so --prompt "Once upon a time," --max-new-tokens 256 --device cuda --num-samples 3
```
Context: Today, torchao has various prototype folders.