Qualcomm AI Engine Direct - Add 4-bit Embedding Quantization Option
Summary:
- Introduce 4-bit embedding quantization for prefill, kv, and hybrid modes
- Fix an assertion condition bug in the annotate_and_quant_scalar pass
- Refactor passes in capture_program
- Add topological sorting for passes in capture_program
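
The topological sorting of passes mentioned above can be illustrated with a minimal sketch. It assumes each pass declares the passes that must run before it. The pass names and the `order_passes` helper are hypothetical and are not taken from capture_program; the sketch uses Python's standard `graphlib` module rather than whatever mechanism the commit actually implements.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pass names and dependencies, for illustration only.
# Each key maps to the set of passes that must run before it.
PASS_DEPENDENCIES = {
    "layout": {"fold"},
    "fold": {"annotate"},
    "annotate": {"decompose"},
    "decompose": set(),
}


def order_passes(dependencies):
    """Return pass names ordered so each pass follows its prerequisites.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


ordered = order_passes(PASS_DEPENDENCIES)
print(ordered)
```

Sorting from declared dependencies means the order in which passes are registered no longer matters, and a circular dependency surfaces as a `CycleError` instead of a silently wrong ordering.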