
[FixBug] online serving fails for high-resolution videos #198

Merged
Gaohan123 merged 3 commits into vllm-project:main from princepride:fix-qwen3-omni-high-resolution
Dec 5, 2025
Conversation

@princepride (Collaborator) commented Dec 4, 2025

Purpose

Fix #128

This PR resolves a critical bug in the Qwen3-Omni model's deepstack feature that caused crashes when processing high-resolution videos. The root cause was incorrect tensor dimension indexing when retrieving the sequence length from input_ids, which led to shape mismatches between visual embeddings and hidden states during the deepstack processing.

Root Cause:

  • input_ids has shape [batch_size, seq_len], e.g., [1, 8192]
  • The code incorrectly used input_ids.size(0) to get sequence length, which returned the batch size (1) instead of the actual sequence length
  • This caused only 1 token's worth of deepstack embeddings to be retrieved from the buffer, while the model expected the full sequence length (e.g., 8192 or 3701 tokens)

Fix:
Changed input_ids.size(0) to input_ids.size(1) in two locations so the code retrieves the sequence length (second dimension) instead of the batch size (first dimension).
Additionally, testing on two H200 GPUs hit an OOM error, so I adjusted the Qwen3-Omni deployment file.
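The indexing bug can be illustrated with a minimal, self-contained PyTorch sketch (hypothetical variable names; this is not the actual vLLM-Omni code, just a demonstration of why `size(0)` vs `size(1)` matters for a `[batch_size, seq_len]` tensor):

```python
import torch

# input_ids has shape [batch_size, seq_len], e.g. [1, 8192].
batch_size, seq_len = 1, 8192
input_ids = torch.zeros(batch_size, seq_len, dtype=torch.long)

# Buggy: size(0) is the batch size (1), not the sequence length.
wrong_len = input_ids.size(0)
# Fixed: size(1) is the actual sequence length (8192).
right_len = input_ids.size(1)

# With the buggy length, only one token's worth of deepstack embeddings
# would be sliced from the buffer, while the model expects embeddings
# matching the full sequence length -- a shape mismatch at merge time.
hidden_dim = 4  # illustrative; real hidden sizes are much larger
deepstack_buffer = torch.randn(seq_len, hidden_dim)
buggy_slice = deepstack_buffer[:wrong_len]    # shape [1, hidden_dim]
fixed_slice = deepstack_buffer[:right_len]    # shape [8192, hidden_dim]
```

The fix in this PR is exactly the `size(0)` → `size(1)` change, applied in the two places where the deepstack code derives the sequence length from `input_ids`.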

Test Plan

  1. Setup: Start vLLM-Omni server with Qwen3-Omni-30B-A3B-Instruct model

    vllm serve /path/to/Qwen3-Omni-30B-A3B-Instruct --omni --port 8091
  2. Test with high-resolution video: Run the multimodal generation client with a large video file

    python examples/online_serving/qwen3_omni/openai_chat_completion_client_for_multimodal_generation.py \
        --query-type use_video \
        --video-path sample_demo_2.mp4 \
        --prompt "explain this video" \
        --model /path/to/Qwen3-Omni-30B-A3B-Instruct

Test Result

(Attached: screenshot of the serving output and the generated audio_0.wav)

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

Signed-off-by: princepride <wangzhipeng628@gmail.com>

@SamitHuang (Collaborator) left a comment

nice bugfix!

Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
@princepride (Collaborator, Author) commented Dec 4, 2025

@SamitHuang I have already reverted the YAML change. Can you help merge it? Thank you! 😊

@Gaohan123 (Collaborator) left a comment

LGTM. Great!

@Gaohan123 Gaohan123 enabled auto-merge (squash) December 5, 2025 02:05
@Gaohan123 Gaohan123 merged commit f3c69df into vllm-project:main Dec 5, 2025
4 checks passed
@david6666666 (Collaborator)

nice catch!

LawJarp-A pushed a commit to LawJarp-A/vllm-omni that referenced this pull request Dec 12, 2025
…#198)

Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: Prajwal A <prajwalanagani@gmail.com>
faaany pushed a commit to faaany/vllm-omni that referenced this pull request Dec 19, 2025
…#198)

Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: Fanli Lin <fanli.lin@intel.com>
princepride added a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
…#198)

Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>


Development

Successfully merging this pull request may close these issues.

[Bug][Qwen3-Omni]: online serving fails for high-resolution videos

4 participants