🐛 Describe the bug
Posting on behalf of Discord user Norbert Klockiewicz:
I'm trying to run the qwen2_5-0_5b model exported using the script provided at:
https://github.com/pytorch/executorch/tree/main/examples/models/qwen2_5 .
To reduce the model size, I modified the dtype to bfloat16 (bf16). When validating the model locally using LLaMARunner (built as described in step 3 here: https://github.com/pytorch/executorch/tree/main/examples/models/llama#step-3-run-on-your-computer-to-validate ), I encountered the following error:
E 00:00:00.788349 executorch:op_linear.cpp:32] Check failed (!bias.has_value()): bias not supported yet in linear
E 00:00:00.788352 executorch:method.cpp:1311] KernelCall failed at instruction 0:15 in operator aten::linear.out: 0x12
I resolved this by incorporating the changes proposed in #9527, which adds bias support to aten::linear.
After rebuilding the runner with these changes, the model runs correctly on my computer.
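For context on what the check failure means, here is a minimal standalone sketch (not the actual ExecuTorch kernel, and the function name and signature are illustrative only): a linear op computes y = x·Wᵀ + b, and the unpatched kernel aborted whenever a bias tensor was present, which is exactly the check that fires because Qwen2.5's linear layers carry biases. The patched behavior simply folds the bias into the accumulation:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Illustrative sketch of aten::linear semantics: y = x * W^T + b,
// with x of shape [rows, in_dim] and w of shape [out_dim, in_dim],
// both row-major. The unpatched kernel rejected any present bias
// ("bias not supported yet in linear"); with bias support, each
// output element is seeded with the corresponding bias entry.
std::vector<float> linear(
    const std::vector<float>& x, size_t rows, size_t in_dim,
    const std::vector<float>& w, size_t out_dim,
    const std::optional<std::vector<float>>& bias) {
  std::vector<float> y(rows * out_dim, 0.0f);
  for (size_t r = 0; r < rows; ++r) {
    for (size_t o = 0; o < out_dim; ++o) {
      // Bias path: this is the behavior the PR enables.
      float acc = bias ? (*bias)[o] : 0.0f;
      for (size_t i = 0; i < in_dim; ++i) {
        acc += x[r * in_dim + i] * w[o * in_dim + i];
      }
      y[r * out_dim + o] = acc;
    }
  }
  return y;
}
```

For example, with x = [1, 2], W = identity, and b = [10, 20], the result is [11, 22]; with no bias it is just [1, 2].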
However, when I attempt to run it on Android (using a runner built following step 4: https://github.com/pytorch/executorch/tree/main/examples/models/llama#step-4-run-benchmark-on-android-phone ), the app crashes with a segmentation fault during the warmup run. Here's the log output:
I 00:00:00.011720 executorch:runner.cpp:67] Creating LLaMa runner: model_path=qwen2_5-0_5b.pte, tokenizer_path=tokenizer.bin
...
I 00:00:01.026996 executorch:runner.cpp:100] Reading metadata from model
I 00:00:01.027108 executorch:runner.cpp:180] Doing a warmup run...
Segmentation fault
I am working from the "release/0.6" branch of ExecuTorch.
Do you have any insight into what could be causing this segmentation fault on Android and how it could be fixed?
Versions
release/0.6
cc @cbilgin