[ET-VK][ez] Fix IndexError in Vulkan partitioner DtypeSetList/TensorRepSetList#18264
Merged
SS-JIA merged 13 commits into gh/SS-JIA/490/orig on Mar 18, 2026
…epSetList Pull Request resolved: #18048 The `__getitem__` methods of `DtypeSetList` and `TensorRepSetList` in `utils.py` could raise an `IndexError` when the index is greater than or equal to the length of the list. This can happen when partitioning ops whose number of inputs or outputs exceeds the number of entries in the dtype/tensor-rep specification list. Fix by returning an empty set in this case, matching the intent of the existing broadcasting logic. ghstack-source-id: 353546684 @exported-using-ghexport Differential Revision: [D95970163](https://our.internmc.facebook.com/intern/diff/D95970163/)
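A minimal sketch of the fix described above, not the actual code from `utils.py`. The class name `SetListSketch`, its `entries` attribute, and the single-entry broadcasting rule are assumptions made for illustration; only the "return an empty set when the index is past the end" behavior comes from the commit message.

```python
from typing import Iterable, List, Set


class SetListSketch:
    """Hypothetical stand-in for DtypeSetList / TensorRepSetList."""

    def __init__(self, entries: Iterable[Iterable[str]]) -> None:
        self.entries: List[Set[str]] = [set(e) for e in entries]

    def __len__(self) -> int:
        return len(self.entries)

    def __getitem__(self, idx: int) -> Set[str]:
        # Assumed broadcasting rule: a single entry applies to every position.
        if len(self.entries) == 1:
            return self.entries[0]
        # The fix described in the commit message: an index past the end
        # yields an empty set instead of raising IndexError.
        if idx >= len(self.entries):
            return set()
        return self.entries[idx]


# An op with three inputs checked against a spec that lists only two entries.
spec = SetListSketch([{"float16", "float32"}, {"int32"}])
third_input = spec[2]  # empty set rather than IndexError
```

Returning an empty set lets the caller treat "no entry for this argument" as "no constraint recorded" without special-casing the list length at every call site.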
added 12 commits
March 17, 2026 21:54
Pull Request resolved: #18049 Add Vulkan build support for the Parakeet runner: llm-debug-vulkan preset in root CMakePresets.json, parakeet-vulkan presets in the Parakeet CMakePresets.json, vulkan_backend linkage in CMakeLists.txt, and a `make parakeet-vulkan` Makefile target. Add _create_vulkan_partitioners() and wire it into lower_to_executorch() so that `--backend vulkan` is accepted by export_parakeet_tdt.py. ghstack-source-id: 353546680 @exported-using-ghexport Differential Revision: [D95970157](https://our.internmc.facebook.com/intern/diff/D95970157/)
…teGraph

Fix output argument indexing in VulkanBackend::execute() and extend ComputeGraph to transparently handle symint values.

The output loop previously computed the args index as `i + num_inputs`, which breaks when non-tensor arguments (e.g. symints) sit between the tensor inputs and outputs in the args array. Fix by computing the offset from the end: `args.size() - num_outputs`.

ComputeGraph changes add symint support so that operators can read symint values uniformly:
- `extract_scalar<T>()` now handles SymInt values, allowing operators to call extract_scalar on arguments that may be either plain ints or symints without special-casing.
- `read_symint()` falls back to reading plain Int values, so values stored as Int (rather than SymInt objects) can be read uniformly.

Pull Request resolved: #18050 ghstack-source-id: 353546683 @exported-using-ghexport Differential Revision: [D95970167](https://our.internmc.facebook.com/intern/diff/D95970167/)
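The indexing change can be illustrated with a small Python sketch. The real code lives in the C++ `VulkanBackend::execute()`; the helper name `output_arg_indices` and the example `args` layout below are hypothetical and only mirror the arithmetic quoted in the commit message.

```python
from typing import List


def output_arg_indices(num_args: int, num_outputs: int) -> List[int]:
    """Index outputs from the end of the args array (the described fix)."""
    start = num_args - num_outputs
    return [start + i for i in range(num_outputs)]


# Hypothetical layout: two tensor inputs, one symint, then two tensor outputs.
args = ["in0", "in1", "symint", "out0", "out1"]
num_inputs = 2   # counts tensor inputs only
num_outputs = 2

# Old computation: i + num_inputs lands on the symint slot first.
old = [i + num_inputs for i in range(num_outputs)]

# New computation: offset measured from the end of args.
new = output_arg_indices(len(args), num_outputs)
```

Counting from the end works as long as the outputs are the last entries of the args array, which is the layout the commit message implies.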
Modernize constant_pad_nd to support ANY_STORAGE (both buffer and texture). Migrate shaders to BufferMetadata/TextureMetadata with indexing.glslh and unify dispatch into a single add_constant_pad_nd_node function using DynamicDispatchNode. Pull Request resolved: #18051 ghstack-source-id: 353546682 @exported-using-ghexport Differential Revision: [D95970168](https://our.internmc.facebook.com/intern/diff/D95970168/)
Modernize arange and full operators to support ANY_STORAGE. Add separate buffer and texture shader variants using BufferMetadata/TextureMetadata with indexing.glslh. Unify dispatch with add_storage_type_suffix and DynamicDispatchNode. Add symint support via read_symint_list for dynamic output sizes. Pull Request resolved: #18052 ghstack-source-id: 353546693 @exported-using-ghexport Differential Revision: [D95970169](https://our.internmc.facebook.com/intern/diff/D95970169/)
Modernize expand_copy to support ANY_STORAGE. Add buffer shader variant using BufferMetadata with indexing.glslh. Unify dispatch with add_storage_type_suffix and DynamicDispatchNode. Add resize function and symint support for dynamic target sizes. Pull Request resolved: #18053 ghstack-source-id: 353546690 @exported-using-ghexport Differential Revision: [D95970162](https://our.internmc.facebook.com/intern/diff/D95970162/)
Modernize softmax and log_softmax to support ANY_STORAGE. Migrate both buffer and texture shaders from indexing_utils.h to indexing.glslh with BufferMetadata/TextureMetadata UBOs. Merge separate texture and buffer dispatch functions into a unified add_softmax_node using add_storage_type_suffix and graph.meta_ubo(). Pull Request resolved: #18054 ghstack-source-id: 353546688 @exported-using-ghexport Differential Revision: [D95970171](https://our.internmc.facebook.com/intern/diff/D95970171/)
Modernize native_layer_norm to support ANY_STORAGE. Migrate texture shader from indexing_utils.h to indexing.glslh with TextureMetadata UBOs. Merge separate texture and buffer dispatch functions into a unified add_native_layer_norm_node using graph.meta_ubo(). Buffer path retains custom workgroup sizing for cooperative shared-memory reduction. Pull Request resolved: #18055 ghstack-source-id: 353546686 @exported-using-ghexport Differential Revision: [D95970158](https://our.internmc.facebook.com/intern/diff/D95970158/)
Modernize repeat to support ANY_STORAGE. Rewrite texture shader to use TextureMetadata with indexing.glslh helpers for coordinate conversion. Add buffer shader variant using BufferMetadata. Unify dispatch to use graph.meta_ubo() for both paths. Add symint support for dynamic repeat counts. Pull Request resolved: #18056 ghstack-source-id: 353546685 @exported-using-ghexport Differential Revision: [D95970170](https://our.internmc.facebook.com/intern/diff/D95970170/)
Modernize embedding to support ANY_STORAGE. Add buffer and texture shader variants using BufferMetadata/TextureMetadata with indexing.glslh. Unify new dispatch path with add_storage_type_suffix and graph.meta_ubo(). Legacy channels-packed texture path retained for backward compatibility. Pull Request resolved: #18057 ghstack-source-id: 353546689 @exported-using-ghexport Differential Revision: [D95970161](https://our.internmc.facebook.com/intern/diff/D95970161/)
Modernize argmax and argmin to support ANY_STORAGE via the add_reduce_per_row_node dispatch path. Buffer shader uses BufferMetadata with indexing.glslh. Custom workgroup sizing retained for cooperative row-reduction algorithm with shared memory. Pull Request resolved: #18058 ghstack-source-id: 353546687 @exported-using-ghexport Differential Revision: [D95970165](https://our.internmc.facebook.com/intern/diff/D95970165/)
Pull Request resolved: #18059

Add missing operators needed for Parakeet TDT model support:
- New symint ops: sym_sub, sym_floordiv, sym_mul in SymIntOps.cpp; register operator.floordiv and operator.mul as ephemeral ops in op_registry.py
- New tensor ops: bitwise_not (via unary_op shader with uint8 DTYPE), logical_and (alias for bitwise_and dispatch)
- Improve _to_copy: expand dtype support to FP_INT_BOOL_T and use pick_io_storage_fn to restrict to CONTIGUOUS_BUFFER for non-fp conversions
- Fix where resize: compute output shape via broadcast across all tensor inputs instead of always using the second input's shape
- Add symint support to split: use extract_int_or_symint_list instead of get_int_list in resize_split_node and split_with_sizes_copy_default
- Mark scalar_tensor as supporting resize

ghstack-source-id: 353546692 @exported-using-ghexport Differential Revision: [D95970159](https://our.internmc.facebook.com/intern/diff/D95970159/)
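The `where` resize fix listed above amounts to broadcasting the shapes of all tensor inputs. A rough Python sketch of that shape computation follows; the function name `broadcast_shapes` and the example shapes are illustrative assumptions, and the actual resize logic is implemented in the C++ backend.

```python
from itertools import zip_longest
from typing import List, Sequence


def broadcast_shapes(*shapes: Sequence[int]) -> List[int]:
    """Standard right-aligned broadcasting across any number of shapes."""
    out: List[int] = []
    # Walk dimensions from the innermost outwards, padding short shapes with 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        non_one = {d for d in dims if d != 1}
        if len(non_one) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        out.append(non_one.pop() if non_one else 1)
    return out[::-1]


# where(condition, self, other) with three differently shaped inputs.
cond_shape = [2, 1, 3]
self_shape = [1]
other_shape = [5, 1]
out_shape = broadcast_shapes(cond_shape, self_shape, other_shape)
```

In this example the second input's shape is `[1]`, so using it alone as the output shape would be wrong whenever another input is larger.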
…linear ops Pull Request resolved: #18061 Wire bias through the q4gsw and dq8ca_q4gsw quantized linear operators. Add add_bias_to_out_tile() helper in the output tile computation header and call it from all three shader variants (tiled, coop, dq8ca_tiled). Remove the bias guard in the pattern matcher to allow biased linear layers. ghstack-source-id: 353546681 @exported-using-ghexport Differential Revision: [D95970172](https://our.internmc.facebook.com/intern/diff/D95970172/)
SS-JIA
approved these changes
Mar 18, 2026
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #18048 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/465/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/465/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/490/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/465/orig
Differential Revision: D95970163
@diff-train-skip-merge
cc @SS-JIA @manuelcandales @digantdesai @cbilgin