Commit e9405d7
[BREAKING][worker, rollout, vllm] feat: implement vLLM colocated training-inference rollout with process separation (verl-project#4280)
### What does this PR do?
Refactor vLLM co-located training-inference rollout from single-process
to multi-process architecture. This refactoring separates training and
inference into different processes, enabling better resource isolation
and paving the way for future checkpoint-engine integration (in roadmap
verl-project#3624).
**Key Changes:**
- Transform `vLLMAsyncRollout` into `ServerAdapter` - a client-side
adapter that communicates with the inference executor
- Remove `ExternalZeroMQDistributedExecutor` and use `MultiprocExecutor`
as the inference backend
- Implement CUDA IPC-based weight updates via ZeroMQ for efficient
parameter synchronization between training and inference processes
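To make the bucketed weight-update idea concrete, here is a minimal, hypothetical sketch of how named parameters might be greedily grouped into fixed-size buckets (the performance test below uses a 2 GB bucket size) before their CUDA IPC handles are shipped over ZeroMQ. The function name `bucket_params` and the greedy strategy are illustrative assumptions, not verl's actual implementation.

```python
from typing import Iterable

def bucket_params(params: Iterable[tuple[str, int]], bucket_size: int) -> list[list[str]]:
    """Greedily group (name, nbytes) parameters into buckets of at most
    bucket_size bytes; an oversized tensor gets a bucket of its own."""
    buckets: list[list[str]] = []
    current: list[str] = []
    current_bytes = 0
    for name, nbytes in params:
        if current and current_bytes + nbytes > bucket_size:
            buckets.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += nbytes
    if current:
        buckets.append(current)
    return buckets

GB = 1 << 30
# Hypothetical parameter sizes for illustration only.
sizes = [("embed", 1 * GB), ("layer0", 1 * GB), ("layer1", 1 * GB), ("head", 3 * GB)]
print(bucket_params(sizes, 2 * GB))  # → [['embed', 'layer0'], ['layer1'], ['head']]
```

Each bucket would then correspond to one IPC handle message sent from the training process to the inference process.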
### Checklist Before Starting
- [x] Search for similar PRs. Paste at least one query link here: ...
- [x] Format the PR title as `[{modules}] {type}: {description}` (This
will be checked by the CI)
- `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`,
`trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`,
`ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`,
`env`, `tool`, `ckpt`, `doc`, `data`
- If this PR involves multiple modules, separate them with `,` like
`[megatron, fsdp, doc]`
- `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
- If this PR breaks any API (CLI arguments, config, function signature,
etc.), add `[BREAKING]` to the beginning of the title.
- Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`
### Test
> For changes that can not be tested by CI (e.g., algorithm
implementation, new model support), validate by experiment(s) and show
results like training curve plots, evaluation results, etc.
### API and Usage Example
This refactoring maintains full backward compatibility with existing
vLLM rollout APIs. No changes are required to user code.
**Key API Components:**
* **ServerAdapter** (replaces `vLLMAsyncRollout`):
- Acts as client-side adapter for communicating with inference executor
- Manages CUDA IPC-based weight updates
- Provides same interface as previous `vLLMAsyncRollout` class
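As a rough mental model of the adapter role described above, here is a hedged sketch: a client object that forwards engine operations (`wake_up`, `sleep`, weight updates) as messages to the inference side. The transport is faked in-process; in verl the real `ServerAdapter` would talk to the inference executor over ZeroMQ and carry CUDA IPC handles, and all names below besides `ServerAdapter` are invented for illustration.

```python
class FakeTransport:
    """Stand-in for the ZeroMQ socket the real adapter would use."""
    def __init__(self):
        self.sent = []

    def send(self, msg: dict) -> dict:
        self.sent.append(msg)
        return {"status": "ok"}

class ServerAdapter:
    """Sketch only: forwards engine operations to the inference server."""
    def __init__(self, transport):
        self.transport = transport

    def wake_up(self):
        return self.transport.send({"method": "wake_up"})

    def sleep(self):
        return self.transport.send({"method": "sleep"})

    def update_weights(self, bucket_handles):
        # In verl this message would carry CUDA IPC handles; here, plain strings.
        return self.transport.send({"method": "update_weights", "handles": bucket_handles})

transport = FakeTransport()
adapter = ServerAdapter(transport)
adapter.wake_up()
adapter.update_weights(["handle-0"])
print([m["method"] for m in transport.sent])  # → ['wake_up', 'update_weights']
```

The point of the adapter is that the training worker keeps the same call surface as the old `vLLMAsyncRollout`, while every call now crosses a process boundary.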
### Design
#### Architecture Overview
1. Before (Single-Process Architecture):
* Single-Process Design
In the original `AsyncActorRolloutRefWorker`, the training engine and
inference engine shared the same process. The vLLM inference engine
directly received weight updates through parameter passing.

* Communication Architecture
`ExternalZeroMQDistributedExecutor` acted as a client, sending
instructions to all `AsyncActorRolloutRefWorker` inference engines via
ZMQ to execute operations such as `init_worker`, `load_model`,
`init_device`, and `generate`. Operations such as `wake_up`, `sleep`, and
weight updates were executed directly in `vLLMAsyncRollout` without
going through `ExternalZeroMQDistributedExecutor`.

2. After (Multi-Process Architecture):
* Multi-Process Design
`vLLMAsyncRollout` becomes `ServerAdapter`, a client that communicates
with the inference engine (AsyncLLM). Weight updates are shared via CUDA
IPC, with the handles delivered to the inference engine over ZeroMQ.

* Communication Architecture
Remove the original `ExternalZeroMQDistributedExecutor` class and
directly use vLLM's `MultiprocExecutor` by passing
`distributed_executor_backend = "mp"`. All inference engine operations
are uniformly broadcast to all inference workers through
`MultiprocExecutor`'s RPC broadcast MQ.
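For reference, selecting the multiprocess executor in vLLM is a single engine argument. The snippet below is a hedged illustration of that setting, not the actual verl wiring; the model name and the other arguments are placeholders.

```python
# Illustrative vLLM engine configuration selecting the MultiprocExecutor.
# Model name and tensor_parallel_size are placeholder assumptions.
engine_args = dict(
    model="facebook/opt-125m",           # placeholder model
    tensor_parallel_size=2,
    distributed_executor_backend="mp",   # use vLLM's MultiprocExecutor
    enable_sleep_mode=True,              # needed for wake_up / sleep offloading
)
print(engine_args["distributed_executor_backend"])  # → mp
```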

### Convergence test
- model: Qwen3-VL-30B-A3B-Instruct
- dataset: geo3k
- GPU: 4*8 H100 (4 nodes × 8 GPUs)
<img width="660" height="618" alt="image"
src="https://github.com/user-attachments/assets/6e3e7dbd-03f9-471a-b8d5-bc0344dba299"
/>
### Performance test: update weights
- CUDA IPC bucket_size: 2GB
- GPU: H100, ConnectX-7 400 Gbps (InfiniBand)
| Model | Parallelism | #GPU | Time |
|---|---|---|---|
| Qwen3-VL-30B-A3B-Instruct | TP2, EP8 | 4*8 | 5s |
| DeepSeek-V3.1-Terminus | TP8, PP16, EP8 | 16*8 | 120s |
| DeepSeek-V3.1-Terminus | TP16, PP16 | 32*8 | 80s |
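As a sanity check on the first row, a back-of-envelope wire-time estimate (assuming bf16 weights at 2 bytes per parameter, which is an assumption not stated in the table) gives a lower bound consistent with the measured 5 s:

```python
# Back-of-envelope: minimum time to move ~30B bf16 params over one 400 Gbps link.
params = 30e9                 # ~30B parameters (assumption: dense transfer)
bytes_total = params * 2      # bf16 = 2 bytes/param
link_gbps = 400               # ConnectX-7 InfiniBand, per the test setup
seconds_wire = bytes_total * 8 / (link_gbps * 1e9)
print(round(seconds_wire, 1))  # → 1.2
```

So the 5 s measurement sits within a small factor of the raw link lower bound, with the remainder plausibly spent on bucketing, IPC handle exchange, and copies.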
### Checklist Before Submitting
> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.
- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [ ] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [ ] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [ ] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).
(If not accessible, please try [the Feishu group
(飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)
---------
Signed-off-by: jianjunzhong <jianjunzhong@foxmail.com>
Co-authored-by: wuxibin <wuxibin@bytedance.com>
Parent: f31df34
37 files changed
Lines changed: 527 additions & 520 deletions
File tree
- .github/workflows
- stash
- examples/sglang_multiturn
- tests
- experimental/agent_loop
- special_e2e
- ppo_trainer
- trainer/config
- verl
- experimental/reward_loop/router
- trainer
- config
- rollout
- utils
- workers
- actor
- engine/megatron
- rollout
- sglang_rollout
- trtllm_rollout
- vllm_rollout