
BF16 Support #9004

Open
cbilgin opened this issue Mar 6, 2025 · 1 comment
Labels
module: xnnpack (Issues related to xnnpack delegation and the code under backends/xnnpack/)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments


cbilgin commented Mar 6, 2025

cc @digantdesai @mcr229

cbilgin added the module: xnnpack label Mar 6, 2025
cbilgin moved this to Backlog in ExecuTorch - CPU Mar 6, 2025
cbilgin added this to the 0.6.0 milestone Mar 6, 2025
iseeyuan added the triaged label Mar 7, 2025
cbilgin moved this from Backlog to Ready in ExecuTorch - CPU Mar 10, 2025
mcr229 self-assigned this Mar 14, 2025
@digantdesai
Contributor

We should break this down further into subtasks, such as op support measured via e2e model enablement; each may involve perf work and upstreaming. Start with Llama 3.2 bf16 (w/o quantization, and with SpinQuant with q*8-bf16-qb4w support). This may involve other work, such as SDPA (lowered to XNNPACK) and KV-cache bf16 support.
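
For readers unfamiliar with the format under discussion, here is a minimal sketch of what bf16 storage means numerically: bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 float32. The helper names `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical and are not part of ExecuTorch or XNNPACK; the sketch uses only the Python standard library and assumes round-to-nearest-even, which is a common choice but not something this issue specifies.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round a value to bfloat16 and return its 16-bit pattern (illustrative helper)."""
    # Reinterpret the value as float32 bits. struct raises OverflowError
    # for finite inputs outside the float32 range.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # NaN: set a mantissa bit that survives truncation so the result stays NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even: bias depends on the lowest kept bit.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a bfloat16 bit pattern back to a float by zero-filling the low 16 bits."""
    (value,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return value


if __name__ == "__main__":
    for v in (1.0, -2.0, 3.14159):
        b = float_to_bf16_bits(v)
        print(f"{v!r} -> 0x{b:04X} -> {bf16_bits_to_float(b)!r}")
```

Because bf16 shares float32's exponent range, the conversion only loses mantissa precision, which is why op-level support and accuracy are typically checked end to end on a model rather than per value.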

Projects
Status: In progress
Development

No branches or pull requests

4 participants