[Core] Supports stage abstraction in the diffusion model#391

Merged
hsliuustc0106 merged 19 commits into vllm-project:main from fake0fan:test_stage
Dec 26, 2025

Conversation

@fake0fan (Contributor)

To align with our initial vision and to enable future end-to-end optimization, we are gradually introducing Stage abstractions for diffusion.

[image]

[Core] Add Stage Abstraction Support for Diffusion Models

Overview

This PR adds stage abstraction support for the diffusion model component of vLLM-Omni, aligning its architecture with that of the LLM models. It also refactors code to unify the sampling parameter interface, improving maintainability and extensibility.

Major Changes

1. Code Refactoring

  • Refactored entry point structure: merged LLM-related code from omni_llm.py into omni.py, unifying entry point management
  • Enhanced Stage abstraction: Extended omni_stage.py to support stage configuration and management for diffusion models

2. Diffusion Stage Abstraction Support

  • New unified sampling parameter class (omni_sampling_params.py):
    • Created OmniSamplingParams class to uniformly manage sampling parameters for both LLM and diffusion models
    • Supports LLM parameters (temperature, top_p, top_k, etc.) and diffusion parameters (num_inference_steps, guidance_scale, etc.)
    • Provides conversion methods to and from vLLM's SamplingParams
  • Extended Diffusion Engine:
    • Updated diffusion_engine.py to support stage abstraction
    • Enhanced stage support in gpu_worker.py
  • Updated configuration system:
    • Added configuration file: QwenImagePipeline.yaml
    • Updated stage configuration files for multiple models
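
The unified parameter class described above can be pictured as a small dataclass. The following is a minimal sketch: the field names (temperature, top_p, top_k, num_inference_steps, guidance_scale) come from the PR description, but the class layout, defaults, and the conversion helpers are assumptions, not the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OmniSamplingParams:
    """Unified sampling parameters for LLM and diffusion stages (sketch)."""

    # LLM-style parameters
    temperature: float = 1.0
    top_p: float = 1.0
    top_k: int = -1
    max_tokens: Optional[int] = None
    # Diffusion-style parameters
    num_inference_steps: int = 50
    guidance_scale: float = 7.5

    def to_vllm_kwargs(self) -> dict:
        # Keep only the fields vLLM's SamplingParams understands
        # (hypothetical conversion; the real method may differ).
        kwargs = {
            "temperature": self.temperature,
            "top_p": self.top_p,
            "top_k": self.top_k,
        }
        if self.max_tokens is not None:
            kwargs["max_tokens"] = self.max_tokens
        return kwargs

    def to_diffusion_kwargs(self) -> dict:
        # Fields consumed by a diffusion pipeline's denoising loop.
        return {
            "num_inference_steps": self.num_inference_steps,
            "guidance_scale": self.guidance_scale,
        }


params = OmniSamplingParams(temperature=0.7, num_inference_steps=30)
print(params.to_vllm_kwargs())
print(params.to_diffusion_kwargs())
```

The point of the single class is that one object can be handed to either stage type, and each stage extracts only the parameters it understands.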

3. Example Updates

  • Updated text_to_image.py example to demonstrate how to use the new stage abstraction interface

4. Outputs

[image]

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@@ -33,25 +69,465 @@ def __init__(self, *args, **kwargs):
args[0] = model


P1: Avoid assigning to immutable args tuple in Omni init

When Omni is instantiated with the model passed positionally (e.g. Omni("Qwen/Qwen-Image")), the constructor assigns to args[0], but args is a tuple, so the assignment raises TypeError: 'tuple' object does not support item assignment before any initialization occurs. This makes the new entrypoint unusable for positional calls that previously worked with OmniLLM; callers must now pass model as a keyword or hit a hard crash.
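
The failure mode is easy to reproduce and fix: a tuple received via *args cannot be mutated, so the constructor must copy it to a list before assignment. A minimal standalone sketch (not the actual Omni code):

```python
def broken_init(*args, **kwargs):
    model = "resolved-model-name"
    # Raises TypeError: 'tuple' object does not support item assignment.
    args[0] = model


def fixed_init(*args, **kwargs):
    model = "resolved-model-name"
    args = list(args)  # copy to a mutable list before assigning
    if args:
        args[0] = model
    else:
        kwargs["model"] = model
    return args, kwargs


try:
    broken_init("Qwen/Qwen-Image")
except TypeError as exc:
    print(f"broken: {exc}")

print(fixed_init("Qwen/Qwen-Image"))
```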


@ZJY0516 (Member)

ZJY0516 commented Dec 20, 2025

looking forward to it

@hsliuustc0106 (Collaborator)

Let's get the initial version done before the 12/30 release.

@ZJY0516 ZJY0516 self-requested a review December 20, 2025 15:30
@ZJY0516 (Member)

ZJY0516 commented Dec 22, 2025

Does this PR support reusing vLLM as the text-encoding stage for diffusion models?

@fake0fan (Contributor, Author)

Does this PR support reusing vLLM as the text-encoding stage for diffusion models?

Not yet.

For now, this PR only encapsulates the entire diffusion model as a single stage.

@princepride (Collaborator)

Does this PR mean that all models under the diffusion folder can be deployed using YAML?

@fake0fan (Contributor, Author)

Does this PR mean that all models under the diffusion folder can be deployed using YAML?

Through some offline discussions, we decided that this version will not require a YAML file to be provided for the diffusion model. Instead, the system will automatically generate a YAML file for the current diffusion model.
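
For illustration, an auto-generated single-stage config might look roughly like the following. This is a hypothetical sketch: only stage_type-based engine selection and the QwenImagePipeline.yaml file are mentioned in this PR, and the other field names are invented.

```yaml
# Hypothetical auto-generated stage config for a diffusion model.
stages:
  - name: diffusion
    stage_type: diffusion   # selects OmniDiffusion / AsyncOmniDiffusion
    model: Qwen/Qwen-Image
    devices: "0"
    engine_args:
      dtype: bfloat16
```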

@princepride (Collaborator)

Which group are you discussing this in? Can you add me?

@erfgss (Contributor)

erfgss commented Dec 23, 2025

Fixes #340

@fake0fan fake0fan force-pushed the test_stage branch 2 times, most recently from 7f9aecc to d5be79d, on December 23, 2025 16:07
@fake0fan fake0fan changed the title [WIP][Core] Supports stage abstraction in the diffusion model [Core] Supports stage abstraction in the diffusion model Dec 23, 2025
@david6666666 david6666666 added the "ready label to trigger buildkite CI" label on Dec 24, 2025
@yinpeiqi yinpeiqi force-pushed the test_stage branch 2 times, most recently from e08d98e to 5ef0287, on December 26, 2025 06:01
if "dtype" in kwargs:
kwargs["dtype"] = str(kwargs["dtype"])
# TODO: hack, calculate devices based on parallel config.
devices = "0"
Collaborator:

This may cause some problems, but we can address it later.
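
The TODO in the snippet suggests deriving the device string from the parallel config instead of hard-coding "0". A hypothetical sketch of what that could look like (tensor_parallel_size and the consecutive device-index scheme are assumptions, not the project's actual logic):

```python
def compute_devices(start_device: int, tensor_parallel_size: int) -> str:
    """Return a comma-separated device-index string, e.g. "0,1,2,3".

    Assumes each stage occupies `tensor_parallel_size` consecutive devices
    starting at `start_device` (an illustrative scheme, not the real one).
    """
    return ",".join(str(start_device + i) for i in range(tensor_parallel_size))


print(compute_devices(0, 4))  # "0,1,2,3"
print(compute_devices(4, 2))  # "4,5"
```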

Each stage will create appropriate instances (AsyncOmniLLM or AsyncOmniDiffusion)
based on stage_type in YAML config.
"""
init_sleep_seconds = kwargs.get("init_sleep_seconds", 20)
Collaborator:

@tzhouam init_sleep_seconds needs to be fixed in your PR

Collaborator:

Will unify the sleep-related args in my PR later.

self.stage_configs = load_stage_configs_from_model(model, base_engine_args)
self.stage_configs = load_stage_configs_from_model(model)
if not self.stage_configs:
default_stage_cfg = [
Collaborator:

Do we have a mechanism to prevent errors if the default_stage_cfg is not suitable, e.g., OOM?

init_timeout = kwargs.get("init_timeout", 300)
worker_backend = kwargs.get("worker_backend", "multi_process")
ray_address = kwargs.get("ray_address", None)
batch_timeout = kwargs.get("batch_timeout", 10)
Collaborator:

Why do we set 10 here? Is it in ms or seconds?

"""
init_sleep_seconds = kwargs.get("init_sleep_seconds", 20)
shm_threshold_bytes = kwargs.get("shm_threshold_bytes", 65536)
init_timeout = kwargs.get("init_timeout", 300)
Collaborator:

Is it in seconds? What's the relationship between init_timeout and init_sleep_seconds? Do we have to check init_sleep_seconds * num_stages < init_timeout?
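
One way to answer this in code is a startup sanity check. A sketch under the assumptions that init_sleep_seconds is applied once per stage and both values are in seconds (neither is confirmed in this thread):

```python
def check_init_budget(init_sleep_seconds: float,
                      init_timeout: float,
                      num_stages: int) -> None:
    """Fail fast if per-stage sleeps alone would consume the init timeout."""
    total_sleep = init_sleep_seconds * num_stages
    if total_sleep >= init_timeout:
        raise ValueError(
            f"init_sleep_seconds * num_stages = {total_sleep}s "
            f"exceeds init_timeout = {init_timeout}s; "
            "raise init_timeout or lower init_sleep_seconds"
        )


# With the defaults quoted in this PR (20s sleep, 300s timeout),
# three stages fit comfortably inside the budget.
check_init_budget(init_sleep_seconds=20, init_timeout=300, num_stages=3)
```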

idx, cfg = idx_cfg
return idx, OmniStage(cfg)

with ThreadPoolExecutor(max_workers=min(len(self.stage_configs), max(1, os.cpu_count() or 1))) as executor:
Collaborator:

Will the deployment strategy affect the way we build stages? For example, if we deploy one stage per device, how do we choose the CPU workers?
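
The worker-count expression in the snippet caps the pool at one thread per stage while never exceeding the CPU count. A standalone illustration, with dummy stage configs standing in for OmniStage:

```python
import os
from concurrent.futures import ThreadPoolExecutor


def build_stage(idx_cfg):
    # Stand-in for `idx, OmniStage(cfg)` in the real code.
    idx, cfg = idx_cfg
    return idx, f"stage-{cfg['name']}"


stage_configs = [{"name": "llm"}, {"name": "diffusion"}]

# One thread per stage, bounded by the CPU count, never less than 1.
max_workers = min(len(stage_configs), max(1, os.cpu_count() or 1))

with ThreadPoolExecutor(max_workers=max_workers) as executor:
    stages = dict(executor.map(build_stage, enumerate(stage_configs)))

print(stages)  # {0: 'stage-llm', 1: 'stage-diffusion'}
```

Note this bound only reflects host CPUs; as the reviewer points out, a one-stage-per-device deployment might want a different policy.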

Each stage will create appropriate instance (OmniLLM or OmniDiffusion)
based on stage_type in YAML config (handled in omni_stage.py).
"""
init_sleep_seconds = kwargs.get("init_sleep_seconds", 20)
Collaborator:

Same question: we need to take care of init_sleep_seconds, init_timeout, and batch_timeout.

idx, cfg = idx_cfg
return idx, OmniStage(cfg)

with ThreadPoolExecutor(max_workers=min(len(self.stage_configs), max(1, os.cpu_count() or 1))) as executor:
Collaborator:

Same question.

fake0fan and others added 5 commits December 26, 2025 15:36
Signed-off-by: Chenguang ZHENG <645327136@qq.com>
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
yinpeiqi and others added 7 commits December 26, 2025 15:55
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
@hsliuustc0106 hsliuustc0106 added and removed the "ready label to trigger buildkite CI" label on Dec 26, 2025
@hsliuustc0106 hsliuustc0106 merged commit 595e7c0 into vllm-project:main Dec 26, 2025
6 of 7 checks passed
yenuo26 pushed a commit to yenuo26/vllm-omni that referenced this pull request Dec 29, 2025
…t#391)

Signed-off-by: Chenguang ZHENG <645327136@qq.com>
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Signed-off-by: wangyu31577 <wangyu31577@hundsun.com>
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026
…t#391)

Signed-off-by: Chenguang ZHENG <645327136@qq.com>
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
@fake0fan fake0fan deleted the test_stage branch January 13, 2026 07:18

Labels

ready label to trigger buildkite CI


9 participants