OpenVINO backend for ExecuTorch to enable inference on Intel CPUs, GPUs, NPUs #8573
Conversation
…orch into openvino_backend
Handling multiple inputs/outputs with zero-copy
Added fallback with portable kernels
Enhancements to openvino example
Initial unit tests for OpenVINO backend
Added license headers to the openvino files
Pybind merge fix
build update for pybind
This is what I have
It is not Ubuntu or one of the other supported versions.
Fixed MyPy TypeChecker Issues
@ynimmaga thanks for the lint patches. A few still remaining, it looks like.
Hey @mergennachin, the remaining issues are because the nncf and openvino packages are not installed while running the linter. When the packages are not installed, mypy is not able to find them. Should we add an ignore for import-not-found errors?
Co-authored-by: Daniil Lyakhov <[email protected]>
Fixed typechecker issues
There is also an sklearn-related lint error.
Has this been fixed?
Hi @kimishpatel, thanks for approving the PR. Yes, this has been fixed with ynimmaga#45.
OK, landed.
FYI, we have a Discord channel and have highlighted this PR there too: https://discordapp.com/channels/1334270993966825602/1334274132182827069/1355286663701725356 cc: @ilya-lavrenov, @alexsu52, @AlexKoff88, @suryasidd, @cavusmustafa, @daniil-lyakhov. Thank you for this great contribution!
…Us, NPUs (#8573)
Co-authored-by: Cavus Mustafa <[email protected]>
Co-authored-by: Aleksandr Suslov <[email protected]>
Co-authored-by: dlyakhov <[email protected]>
Co-authored-by: Kimish Patel <[email protected]>
Co-authored-by: suryasidd <[email protected]>
Summary
This PR introduces support for the OpenVINO backend in ExecuTorch, enabling accelerated inference on Intel hardware, including CPU, GPU, and NPU devices. OpenVINO improves deep learning inference performance through hardware-specific optimizations. The PR also introduces an OpenVINO quantizer built on NNCF (Neural Network Compression Framework) for model optimization. The functionality has been tested on several torchvision and timm models, with plans to test and enable support for additional model types in the future.
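For orientation, below is a minimal lowering sketch in the spirit of this PR. The partitioner import path, its constructor arguments, and the `"device"` compile-spec key are assumptions made for illustration; the authoritative usage is documented in backends/openvino/README.md.

```python
import torch
import torchvision.models as models

from executorch.exir import to_edge_transform_and_lower
from executorch.exir.backend.compile_spec_schema import CompileSpec
# Assumed import path and class name for the partitioner added by this PR;
# see backends/openvino/README.md for the actual API and options.
from executorch.backends.openvino.partitioner import OpenvinoPartitioner

model = models.mobilenet_v2(weights="DEFAULT").eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# Export to an ATen dialect graph, then lower OpenVINO-supported subgraphs.
exported_program = torch.export.export(model, example_inputs)
edge = to_edge_transform_and_lower(
    exported_program,
    # The "device" compile-spec key is an assumption; CPU/GPU/NPU selection
    # may be exposed differently in the merged backend.
    partitioner=[OpenvinoPartitioner([CompileSpec("device", b"CPU")])],
)
et_program = edge.to_executorch()

with open("mobilenet_v2_openvino.pte", "wb") as f:
    f.write(et_program.buffer)
```

Operators that the partitioner does not claim fall back to the portable kernels, in line with the "Added fallback with portable kernels" commit above.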
Below is a description of the features:
OpenVINO Backend Integration: The backends/openvino directory includes build scripts, AOT components (partitioner, preprocessor), the OpenVINO quantizer, and runtime backend files that register the OpenVINO backend and manage interactions with OpenVINO's inference engine, including model execution, device-specific optimizations, and backend initialization. It also contains tests for layers and models. See backends/openvino/README.md for usage.
OpenVINO Examples: The examples/openvino directory provides scripts for AOT optimization, quantization, and a C++ executor example. It includes instructions for optimizing models, quantizing them (see the sketch below this list), and exporting ExecuTorch programs with OpenVINO optimizations. Refer to examples/openvino/README.md for details.
E2E Tutorial: Added an end-to-end tutorial in docs/source/build-run-openvino.md.
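As a rough illustration of the quantization flow referenced above, the following sketch uses PyTorch's PT2E quantization entry points with the quantizer introduced by this PR. The OpenvinoQuantizer class name and import path are assumptions; see examples/openvino/README.md for the actual API and calibration recommendations.

```python
import torch
import torchvision.models as models

from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
# Assumed import path and class name for the quantizer added by this PR.
from executorch.backends.openvino.quantizer import OpenvinoQuantizer

model = models.mobilenet_v2(weights="DEFAULT").eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# Capture the model for PT2E quantization.
captured = torch.export.export_for_training(model, example_inputs).module()

quantizer = OpenvinoQuantizer()
prepared = prepare_pt2e(captured, quantizer)
prepared(*example_inputs)  # calibration; use a representative dataset in practice
quantized = convert_pt2e(prepared)

# The quantized module can then go through the same export/lowering flow
# sketched earlier (torch.export.export -> to_edge_transform_and_lower -> .pte).
```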
Test plan
This PR is tested with the OpenVINO backend on Intel Core Ultra 7 processors for CPU, GPU, and NPU devices. To run the layer tests and model tests, please refer to backends/openvino/tests/README.md.
cc: @yury-gorbachev @alexsu52 @cavusmustafa @daniil-lyakhov @suryasidd @AlexKoff88 @MaximProshin @AlexanderDokuchaev