
Missing operator: [3] cadence::quantized_relu.per_tensor_out when following build-run-xtensa tutorial #8900


Closed
ELR11C opened this issue Mar 3, 2025 · 3 comments · Fixed by #8907
Labels
module: cadence (Issues related to the Cadence/Xtensa backend), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments


ELR11C commented Mar 3, 2025

🐛 Describe the bug

I have installed ExecuTorch under WSL (Ubuntu 22.04) and am running into issues when following the build-run-xtensa tutorial. Specifically, I am running python3 -m examples.cadence.models.rnnt_predictor from the executorch folder inside my venv (exact steps sketched below), and the run fails after the CXX executable cadence_runner is built.
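For reference, a minimal sketch of the commands I ran (the venv and repository locations reflect my setup and will differ on other machines):

    source ~/executorch_venv/bin/activate
    cd ~/executorch
    python3 -m examples.cadence.models.rnnt_predictor

With that setup, the run produces the following error log: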

[100%] Linking CXX executable cadence_runner
[100%] Built target cadence_runner
Built cmake-out/backends/cadence/cadence_runner
[INFO 2025-03-03 17:19:19,965 executor.py:129] ./cmake-out/backends/cadence/cadence_runner --bundled_program_path=/tmp/tmp5meia7b1/CadenceDemoModel.bpte --etdump_path=/tmp/tmp5meia7b1/etdump.etdp --debug_output_path=/tmp/tmp5meia7b1/debug_output.bin --dump_outputs=true
I 00:00:00.001455 executorch:example_runner.cpp:145] Model file /tmp/tmp5meia7b1/CadenceDemoModel.bpte is loaded.
I 00:00:00.001471 executorch:example_runner.cpp:154] Running method forward
I 00:00:00.001473 executorch:example_runner.cpp:201] Setting up planned buffer 0, size 20496.
E 00:00:00.001498 executorch:operator_registry.cpp:252] kernel 'cadence::quantized_relu.per_tensor_out' not found.
E 00:00:00.001500 executorch:operator_registry.cpp:253] dtype: 1 | dim order: [
E 00:00:00.001501 executorch:operator_registry.cpp:253] 0,
E 00:00:00.001503 executorch:operator_registry.cpp:253] 1,
E 00:00:00.001504 executorch:operator_registry.cpp:253] 2,
E 00:00:00.001505 executorch:operator_registry.cpp:253] ]
E 00:00:00.001506 executorch:operator_registry.cpp:253] dtype: 0 | dim order: [
E 00:00:00.001507 executorch:operator_registry.cpp:253] 0,
E 00:00:00.001508 executorch:operator_registry.cpp:253] 1,
E 00:00:00.001510 executorch:operator_registry.cpp:253] 2,
E 00:00:00.001511 executorch:operator_registry.cpp:253] ]
E 00:00:00.001512 executorch:operator_registry.cpp:253] dtype: 0 | dim order: [
E 00:00:00.001513 executorch:operator_registry.cpp:253] 0,
E 00:00:00.001514 executorch:operator_registry.cpp:253] 1,
E 00:00:00.001515 executorch:operator_registry.cpp:253] 2,
E 00:00:00.001516 executorch:operator_registry.cpp:253] ]
E 00:00:00.001518 executorch:method.cpp:724] Missing operator: [3] cadence::quantized_relu.per_tensor_out
E 00:00:00.001521 executorch:method.cpp:944] There are 1 instructions don't have corresponding operator registered. See logs for details
F 00:00:00.001527 executorch:example_runner.cpp:220] In function main(), assert failed (method.ok()): Loading of method forward failed with status 0x14
Traceback (most recent call last):
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/executor.py", line 94, in execute
return _execute_subprocess(args)
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/executor.py", line 78, in _execute_subprocess
raise subprocess.CalledProcessError(p.returncode, p.args, stdout, stderr)
subprocess.CalledProcessError: Command '['./cmake-out/backends/cadence/cadence_runner', '--bundled_program_path=/tmp/tmp5meia7b1/CadenceDemoModel.bpte', '--etdump_path=/tmp/tmp5meia7b1/etdump.etdp', '--debug_output_path=/tmp/tmp5meia7b1/debug_output.bin', '--dump_outputs=true']' died with <Signals.SIGABRT: 6>.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/elriic/executorch/examples/cadence/models/rnnt_predictor.py", line 69, in
export_model(model, example_inputs)
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/aot/export_example.py", line 117, in export_model
runtime.run_and_compare(
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/runtime.py", line 220, in run_and_compare
outputs = run(executorch_prog, inputs, ref_outputs, working_dir)
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/runtime.py", line 158, in run
executor()
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/executor.py", line 130, in call
execute(args)
File "/home/elriic/executorch_venv/lib/python3.10/site-packages/executorch/backends/cadence/runtime/executor.py", line 97, in execute
raise RuntimeError(
RuntimeError: Failed to execute. Use the following to debug:
fdb ./cmake-out/backends/cadence/cadence_runner --bundled_program_path=/tmp/tmp5meia7b1/CadenceDemoModel.bpte --etdump_path=/tmp/tmp5meia7b1/etdump.etdp --debug_output_path=/tmp/tmp5meia7b1/debug_output.bin --dump_outputs=true

Versions

PyTorch version: 2.7.0.dev20250131+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0
Clang version: Could not collect
CMake version: version 3.31.4
Libc version: glibc-2.35

Python version: 3.10.12 (main, Feb 4 2025, 14:57:36) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
CPU family: 6
Model: 142
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 10
BogoMIPS: 3791.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi ept vpid ept_ad fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves md_clear flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 128 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 1 MiB (4 instances)
L3 cache: 6 MiB (1 instance)
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Mitigation; IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; IBRS; IBPB conditional; STIBP conditional; RSB filling; PBRSB-eIBRS Not affected; BHI SW loop, KVM SW loop
Vulnerability Srbds: Unknown: Dependent on hypervisor status
Vulnerability Tsx async abort: Mitigation; Clear CPU buffers; SMT Host state unknown

Versions of relevant libraries:
[pip3] executorch==0.5.0a0+1bc0699
[pip3] numpy==2.0.0
[pip3] torch==2.7.0.dev20250131+cpu
[pip3] torchao==0.10.0+git7d879462
[pip3] torchaudio==2.6.0.dev20250131+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.22.0.dev20250131+cpu
[conda] Could not collect

cc @mcremon-meta

iseeyuan added the triaged and module: cadence labels on Mar 3, 2025
The github-project-automation bot moved this to "To triage" in ExecuTorch Core on Mar 3, 2025
Contributor

iseeyuan commented Mar 3, 2025

Since Matthias is on leave, @zonglinpeng and @tarun292 could you help on this issue?

Contributor

zonglinpeng commented:

The codegen YAML may be misaligned due to a recent change. Let me check and update.

Contributor

zonglinpeng commented Mar 3, 2025

quantized_relu_per_tensor_out is missing in the CPU flow. Fixed in #8907.
@ELR11C Thanks for capturing this
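For context, whether a kernel is available to the CPU (reference) flow is controlled by its entry in the codegen YAML. A registration entry looks roughly like the sketch below; the file path, the elided schema string, and the kernel_name namespace are assumptions for illustration, not the exact contents of #8907:

    # e.g. in backends/cadence/aot/functions.yaml (path assumed for illustration)
    - func: cadence::quantized_relu.per_tensor_out(...)   # full schema elided
      kernels:
        - arg_meta: null
          kernel_name: impl::reference::quantized_relu_per_tensor_out   # namespace assumed

Without such an entry the kernel is never generated into the runtime's registration table, which is why the runner reports "Missing operator: [3] cadence::quantized_relu.per_tensor_out" when loading the method.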
