[FixBug] online serving fails for high-resolution videos #198
Merged
Gaohan123 merged 3 commits into vllm-project:main on Dec 5, 2025
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Collaborator, Author:
@SamitHuang I have already reverted the YAML change. Could you help merge it? Thank you! 😊
Collaborator:
Nice catch!
LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request on Dec 12, 2025
…#198) Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Signed-off-by: Prajwal A <prajwalanagani@gmail.com>
faaany pushed a commit to faaany/vllm-omni that referenced this pull request on Dec 19, 2025
…#198) Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com> Signed-off-by: Fanli Lin <fanli.lin@intel.com>
princepride added a commit to princepride/vllm-omni that referenced this pull request on Jan 10, 2026
…#198) Signed-off-by: princepride <wangzhipeng628@gmail.com> Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Purpose
Fix #128
This PR resolves a critical bug in the Qwen3-Omni model's deepstack feature that caused crashes when processing high-resolution videos. The root cause was incorrect tensor dimension indexing when retrieving the sequence length from `input_ids`, which led to shape mismatches between visual embeddings and hidden states during deepstack processing.

Root Cause:
- `input_ids` has shape `[batch_size, seq_len]`, e.g., `[1, 8192]`.
- The code called `input_ids.size(0)` to get the sequence length, which returned the batch size (1) instead of the actual sequence length.

Fix:
Changed `input_ids.size(0)` to `input_ids.size(1)` in two locations so that the sequence length (second dimension) is retrieved instead of the batch size (first dimension).

BTW, I tested on two H200s and hit an OOM error, so I also adjusted the Qwen3-Omni deployment file.
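The dimension mix-up described above can be reproduced in miniature. The following is only an illustrative sketch, not the vllm-omni code: plain Python lists stand in for torch tensors, and `shape_of`, `visual_positions`, and `VIDEO_TOKEN` are hypothetical names introduced here for the example.

```python
# Illustrative sketch only: plain Python lists stand in for torch tensors.
# `shape_of`, `visual_positions`, and VIDEO_TOKEN are hypothetical,
# not part of vllm-omni or Qwen3-Omni.

def shape_of(nested):
    """Return the (rows, cols) shape of a rectangular 2-D nested list."""
    return (len(nested), len(nested[0]))


def visual_positions(input_ids, video_token_id, dim):
    """Build a per-token mask whose length comes from the chosen dimension."""
    length = shape_of(input_ids)[dim]
    tokens = input_ids[0]
    # Mark which of the first `length` tokens are video placeholders.
    return [tokens[i] == video_token_id for i in range(length)]


VIDEO_TOKEN = 9                # hypothetical placeholder token id
input_ids = [[1, 9, 9, 9, 2]]  # shape [batch_size=1, seq_len=5]
num_visual_embeddings = 3      # one embedding per placeholder token

buggy_mask = visual_positions(input_ids, VIDEO_TOKEN, dim=0)  # like size(0)
fixed_mask = visual_positions(input_ids, VIDEO_TOKEN, dim=1)  # like size(1)

# With dim=0 the mask covers a single position (the batch size), so it
# cannot line up with the three visual embeddings.
print(len(buggy_mask), sum(buggy_mask))  # 1 0
# With dim=1 the mask spans the whole sequence and selects three positions.
print(len(fixed_mask), sum(fixed_mask))  # 5 3
```

In this toy version the mismatch is just a wrong mask length; in the real model the same kind of mismatch would presumably surface as a shape error when visual embeddings are merged into the hidden states.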
Test Plan
Setup: Start vLLM-Omni server with Qwen3-Omni-30B-A3B-Instruct model
Test with high-resolution video: Run the multimodal generation client with a large video file
    python examples/online_serving/qwen3_omni/openai_chat_completion_client_for_multimodal_generation.py \
        --query-type use_video \
        --video-path sample_demo_2.mp4 \
        --prompt "explain this video" \
        --model /path/to/Qwen3-Omni-30B-A3B-Instruct

sample_demo_2.mp4
Test Result
audio_0.wav
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.