
Segmentation Fault when implementing llama/stories110M Android phone deployment #4237


Closed
BESTTOOLBOX opened this issue Jul 12, 2024 · 4 comments
Assignees
Labels
module: android (Issues related to Android code, build, and execution) · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@BESTTOOLBOX

🐛 Describe the bug

I encountered a segmentation fault when deploying llama/stories110M to an Android phone following https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md.
I put xnnpack_stories110M.pte, tokenizer.bin, and llama_main (built with the XNNPACK backend) into the same directory on the device and executed llama_main; the on-device commands are sketched below.
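Roughly, the on-device steps were (paths are from my setup; the llama_main flag names follow the runner's main.cpp at this commit, so treat the exact names as approximate):

adb push cmake-out-android/examples/models/llama2/llama_main /data/local/tmp/jiaxgeng/stories/
adb push xnnpack_stories110M.pte /data/local/tmp/jiaxgeng/stories/
adb push tokenizer.bin /data/local/tmp/jiaxgeng/stories/
adb shell "cd /data/local/tmp/jiaxgeng/stories && ./llama_main --model_path=xnnpack_stories110M.pte --tokenizer_path=tokenizer.bin --prompt='Once upon a time'"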
The following is the error message:
[screenshot of the crash output]
Here is the logcat output:
[screenshot of the logcat crash dump]
The following is the stack trace symbolized with addr2line (the invocation I used is sketched after the trace). It seems the crash happens inside XNNPACK's weight-packing code (packing.c) while the fully-connected operator is being created, i.e. something goes wrong when accessing the quantized-weight memory.
#00 pc 0000000002ff1afc /data/local/tmp/jiaxgeng/stories/llama_main (xnn_pack_qs8_qc4w_gemm_bl_goi_w_nr8_kr4+2696) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/packing.c:548
07-12 10:11:23.542 11569 11569 F DEBUG : #1 pc 0000000002ff3604 /data/local/tmp/jiaxgeng/stories/llama_main (xnn_pack_qs8_qc4w_gemm_bl_goi_w+1512) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/packing.c:753
07-12 10:11:23.542 11569 11569 F DEBUG : #2 pc 00000000030e79bc /data/local/tmp/jiaxgeng/stories/llama_main (xnn_create_fully_connected_nc_qd8_f32_qb4w+3240) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/operators/fully-connected-nc.c:737
07-12 10:11:23.542 11569 11569 F DEBUG : #3 pc 00000000030e1acc /data/local/tmp/jiaxgeng/stories/llama_main (create_fully_connected_operator+3156) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/subgraph/fully-connected.c:237
07-12 10:11:23.542 11569 11569 F DEBUG : #4 pc 0000000002d4791c /data/local/tmp/jiaxgeng/stories/llama_main (xnn_create_runtime_v4+1644) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/runtime.c:575
07-12 10:11:23.542 11569 11569 F DEBUG : #5 pc 0000000002d47260 /data/local/tmp/jiaxgeng/stories/llama_main (xnn_create_runtime_v3+104) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/runtime.c:208
07-12 10:11:23.542 11569 11569 F DEBUG : #6 pc 0000000002d471e8 /data/local/tmp/jiaxgeng/stories/llama_main (xnn_create_runtime_v2+48) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/third-party/XNNPACK/src/runtime.c:193
07-12 10:11:23.542 11569 11569 F DEBUG : #7 pc 0000000000855d4c /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::xnnpack::delegate::XNNCompiler::compileModel(void const*, unsigned long, torch::executor::xnnpack::delegate::XNNExecutor*, torch::executor::MemoryAllocator*)+1760) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/runtime/XNNCompiler.cpp:1681
07-12 10:11:23.542 11569 11569 F DEBUG : #8 pc 0000000000861c88 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::XnnpackBackend::init(torch::executor::BackendInitContext&, torch::executor::FreeableBuffer*, torch::executor::ArrayRef<torch::executor::CompileSpec>) const+184) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/backends/xnnpack/runtime/XNNPACKBackend.cpp:42
07-12 10:11:23.542 11569 11569 F DEBUG : #9 pc 000000000312bbd0 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::BackendDelegate::Init(executorch_flatbuffer::BackendDelegate const&, torch::executor::Program const*, torch::executor::BackendInitContext&, torch::executor::BackendDelegate*)+992) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/runtime/executor/method.cpp:97
07-12 10:11:23.542 11569 11569 F DEBUG : #10 pc 000000000312a74c /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Method::init(executorch_flatbuffer::ExecutionPlan*)+728) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/runtime/executor/method.cpp:596
07-12 10:11:23.542 11569 11569 F DEBUG : #11 pc 000000000312a298 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Method::load(executorch_flatbuffer::ExecutionPlan*, torch::executor::Program const*, torch::executor::MemoryManager*, torch::executor::EventTracer*)+84) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/runtime/executor/method.cpp:547
07-12 10:11:23.542 11569 11569 F DEBUG : #12 pc 00000000031345c4 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Program::load_method(char const*, torch::executor::MemoryManager*, torch::executor::EventTracer*) const+380) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/runtime/executor/program.cpp:246
07-12 10:11:23.542 11569 11569 F DEBUG : #13 pc 0000000003118424 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Module::load_method(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)+888) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/extension/module/module.cpp:131
07-12 10:11:23.542 11569 11569 F DEBUG : #14 pc 000000000086cd78 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Runner::load()+312) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/examples/models/llama2/runner/runner.cpp:73
07-12 10:11:23.542 11569 11569 F DEBUG : #15 pc 000000000086fc94 /data/local/tmp/jiaxgeng/stories/llama_main (torch::executor::Runner::generate(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, int, std::__ndk1::function<void (std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)>, std::__ndk1::function<void (torch::executor::Runner::Stats const&)>)+236) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/examples/models/llama2/runner/runner.cpp:351
07-12 10:11:23.542 11569 11569 F DEBUG : #16 pc 0000000000863fe8 /data/local/tmp/jiaxgeng/stories/llama_main (main+480) (BuildId: 76fcd4977dfd60ce58d60c76d5f70bf61f77b471)
/local/mnt/workspace/executorch/examples/models/llama2/main.cpp:75
07-12 10:11:23.542 11569 11569 F DEBUG : #17 pc 0000000000053e48 /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+108) (BuildId: 50118287324a156bc7be11d3d940c7be)
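
For reference, the source lines above were recovered with llvm-addr2line from NDK r26d against the unstripped llama_main, roughly like this (the pc offsets are the ones printed in the tombstone; only the first few frames are shown):

/local/mnt/workspace/android-ndk-r26d/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-addr2line \
    -C -f -e cmake-out-android/examples/models/llama2/llama_main \
    0x2ff1afc 0x2ff3604 0x30e79bc 0x30e1acc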

Here are my build commands for llama_main on Android. To get debug information, I set the build type to Debug.
sudo cmake -DCMAKE_TOOLCHAIN_FILE=/local/mnt/workspace/android-ndk-r26d/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-33 \
    -DCMAKE_INSTALL_PREFIX=cmake-out-android \
    -DCMAKE_BUILD_TYPE=Debug \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_ENABLE_LOGGING=1 \
    -DPYTHON_EXECUTABLE=python \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -Bcmake-out-android .

cmake --build cmake-out-android -j16 --target install --config Debug

sudo cmake -DCMAKE_TOOLCHAIN_FILE=/local/mnt/workspace/android-ndk-r26d/build/cmake/android.toolchain.cmake \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-33 \
    -DCMAKE_INSTALL_PREFIX=cmake-out-android \
    -DCMAKE_BUILD_TYPE=Debug \
    -DPYTHON_EXECUTABLE=python \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -Bcmake-out-android/examples/models/llama2 \
    examples/models/llama2

sudo cmake --build cmake-out-android/examples/models/llama2 -j16 --config Debug
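
For completeness, xnnpack_stories110M.pte itself was exported per the README's 4-bit groupwise quantization instructions; from memory the command was roughly the following (the exact flag names may differ between ExecuTorch versions, so treat this as a sketch):

python -m examples.models.llama2.export_llama \
    --checkpoint stories110M.pt --params params.json \
    -kv -X -qmode 8da4w --group_size 128 -d fp32 \
    --output_name xnnpack_stories110M.pte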

Versions

[pip3] executorch==0.4.0a0+8740c69
[pip3] numpy==2.0.0
[pip3] torch==2.5.0.dev20240618+cpu
[pip3] torchao==0.1
[pip3] torchaudio==2.4.0.dev20240618+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0.dev20240618+cpu
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.2.2 pypi_0 pypi
[conda] torchaudio 2.2.2 pypi_0 pypi
[conda] torchvision 0.17.2 pypi_0 pypi

@lucylq lucylq added the module: android Issues related to Android code, build, and execution label Jul 13, 2024
@lucylq
Contributor

lucylq commented Jul 13, 2024

cc @kirklandsign for Android

@lucylq lucylq added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label Jul 13, 2024
@kimishpatel
Contributor

Wow, you did a good chunk of work to narrow it down. Feels like the same issue we saw earlier with out-of-bounds access for weights? cc: @digantdesai

@kirklandsign
Contributor

Hi @BESTTOOLBOX, thank you for reporting! This is a known issue and will be fixed by digantdesai/XNNPACK#12 when we update the XNNPACK commit in ExecuTorch. You can also see https://github.com/pytorch/executorch/pull/4304/files for the fix in progress.
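
If you want to try it before the XNNPACK pin is updated, one option is to check out the PR branch locally and rebuild; a rough sketch (the local branch name is arbitrary, and this assumes pull/4304 still applies cleanly to your checkout):

cd /local/mnt/workspace/executorch
git fetch https://github.com/pytorch/executorch.git pull/4304/head:xnnpack-qb4w-fix
git checkout xnnpack-qb4w-fix
git submodule update --init backends/xnnpack/third-party/XNNPACK
# then rebuild llama_main with the same cmake commands as above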

@BESTTOOLBOX
Author

Thank you very much. I have been confused by this issue for a long time. I will go check out the fix.

