[Android] qwen2_5-0_5b runtime failure #9965


Open
kirklandsign opened this issue Apr 8, 2025 · 0 comments
Assignees
Labels
module: android (Issues related to Android code, build, and execution), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@kirklandsign
Contributor

kirklandsign commented Apr 8, 2025

🐛 Describe the bug

Posting on behalf of discord user Norbert Klockiewicz:

I'm trying to run the qwen2_5-0_5b model exported using the script provided at:
https://github.com/pytorch/executorch/tree/main/examples/models/qwen2_5 .

To reduce the model size, I modified the dtype to bfloat16 (bf16). When validating the model locally using LLaMARunner (built as described in step 3 here: https://github.com/pytorch/executorch/tree/main/examples/models/llama#step-3-run-on-your-computer-to-validate ), I encountered the following error:

E 00:00:00.788349 executorch:op_linear.cpp:32] Check failed (!bias.has_value()): bias not supported yet in linear
E 00:00:00.788352 executorch:method.cpp:1311] KernelCall failed at instruction 0:15 in operator aten::linear.out: 0x12
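For context on the dtype change above: bfloat16 is float32 with the low 16 bits dropped (1 sign bit, 8 exponent bits, 7 mantissa bits), which halves weight storage while keeping float32's dynamic range at reduced precision. A minimal pure-Python sketch of the rounding (illustrative only; NaN/inf are not handled):

```python
import struct

def to_bfloat16(x: float) -> float:
    """Round a float32 value to bfloat16 by keeping only the top 16 bits
    (sign + 8-bit exponent + 7-bit mantissa), round-to-nearest-even.
    Sketch only: NaN/inf inputs are not handled."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round-to-nearest-even on the 16 bits being discarded.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return struct.unpack("<f", struct.pack("<I", (bits >> 16) << 16))[0]

print(to_bfloat16(1.0))  # exactly representable in bfloat16
print(to_bfloat16(0.1))  # rounded: only 7 mantissa bits survive
```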

I resolved this by incorporating the changes proposed in #9527, which adds support for bias in aten::linear.
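The check at op_linear.cpp:32 means the portable aten::linear.out kernel rejected the model's bias tensors, which is the path #9527 adds. The operator itself computes y = x @ W.T + b. A pure-Python sketch of that semantics for reference (illustrative only, not the ExecuTorch kernel):

```python
from typing import List, Optional

Matrix = List[List[float]]

def linear(x: Matrix, weight: Matrix,
           bias: Optional[List[float]] = None) -> Matrix:
    """aten::linear semantics: y = x @ weight.T (+ bias).

    x:      [N, in_features]
    weight: [out_features, in_features]  (transposed in the matmul)
    bias:   [out_features] or None
    """
    in_features = len(weight[0])
    y = []
    for row in x:
        assert len(row) == in_features
        y.append([
            sum(a * b for a, b in zip(row, w))
            + (bias[j] if bias is not None else 0.0)
            for j, w in enumerate(weight)
        ])
    return y

print(linear([[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]], bias=[0.5, -0.5]))
# [[11.5, 16.5]]
```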

After rebuilding the runner with these changes, the model runs correctly on my computer.

However, when I attempt to run it on Android (using a runner built following step 4: https://github.com/pytorch/executorch/tree/main/examples/models/llama#step-4-run-benchmark-on-android-phone ), the app crashes with a segmentation fault during the warmup run. Here's the log output:

I 00:00:00.011720 executorch:runner.cpp:67] Creating LLaMa runner: model_path=qwen2_5-0_5b.pte, tokenizer_path=tokenizer.bin
...
I 00:00:01.026996 executorch:runner.cpp:100] Reading metadata from model
I 00:00:01.027108 executorch:runner.cpp:180] Doing a warmup run...
Segmentation fault

I am working from the ExecuTorch "release/0.6" branch.
Do you have any insight into what could be causing this segmentation fault on Android, and how it could be fixed?

Versions

release/0.6

cc @cbilgin

@kirklandsign kirklandsign added the module: android and triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) labels Apr 8, 2025
@kirklandsign kirklandsign self-assigned this Apr 8, 2025