ensure that there is only one "prototype" folder in torchao #1013


Closed
vkuzo opened this issue Oct 4, 2024 · 1 comment · Fixed by #1145 or #1187
@vkuzo
Contributor

vkuzo commented Oct 4, 2024

context

Today torchao has various prototype folders:

  1. https://github.com/pytorch/ao/tree/main/torchao/prototype
  2. https://github.com/pytorch/ao/tree/main/torchao/quantization/prototype
  3. https://github.com/pytorch/ao/tree/main/torchao/sparsity/prototype

the task

This is confusing. Let's just converge on one place, torchao/prototype. This task is to move the code over without breaking things:
A. chat with @andrewor14 and confirm if it's ok to move torchao/quantization/prototype to torchao/prototype/quantization. If yes, move it.
B. chat with @jcaip and confirm if it's ok to move torchao/sparsity/prototype to torchao/prototype/sparsity. If yes, move it.

When moving the code, verify that scripts, features, and tests still work and that CI stays green, and update any hardcoded paths as needed.

Note: feel free to ignore torchao/experimental for now.

@vkuzo
Contributor Author

vkuzo commented Oct 4, 2024

cc @drisspg

@jainapurva jainapurva reopened this Oct 28, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
Summary: This improves average tokens/sec from 33.43 to 72.63 on A100 for AOTI.

```
python3 torchchat.py export llama3 --quantize '{"precision": {"dtype":"bfloat16"}, "executor":{"accelerator":"cuda"}}' --output-dso-path /tmp/model16.so && python3 torchchat.py generate llama3 --dso-path /tmp/model16.so --prompt "Once upon a time," --max-new-tokens 256 --device cuda --num-samples 3
```