Owner

@cccclai commented Nov 14, 2025

No description provided.

kaleid-liner and others added 10 commits May 7, 2025 08:00
… support GPTQ and BitNet models

- Integrate OSS runner into LlamaDemo
- Fix several tman_linear & tman_bitnet_linear accuracy issues
- Fix BitNet
- Add BitNet support
- Fix tiling parameters
- Optimize performance with DMA & Tiling
- Merge SHAs; support torch.split_with_sizes; work around a custom op registration bug when compiling multiple graphs
- Integrate into LlamaDemo
- Add support for symmetric quantization
- Clean debug code
- Fix uint16 correctness issue under certain inputs
- Support inference with fp16 and TMANOpPackage
- Define custom_op, annotators and NodeVisitor in Python side
- Add custom op package TMANOpPackage
Use different prompts according to mode
This is a mis-imported commit. The released model and libQnnTMANOpPackage.so should already be built with these updates.
@cccclai cccclai marked this pull request as draft November 14, 2025 01:35

4 participants