[sglang, rollout] feat: support sglang as rollout engine in fully async policy #4191

Merged
wuxibin89 merged 30 commits into verl-project:main from meituan-search:recipe/async_policy_sglang
Jan 19, 2026

Contributor

@AniZpZ AniZpZ commented Nov 19, 2025

What does this PR do?

Extend the fully async policy recipe by adding SGLang as an alternative rollout engine to vLLM when using FSDP.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

lizipao commented Nov 21, 2025

Hi, I tried your PR and attempted to replace FSDP with Megatron, but I encountered this error. Have you come across it before?
Traceback (most recent call last):
  File "verl/recipe/fully_async_policy/fully_async_main.py", line 292, in main
    run_ppo(config, task_runner_class=FullyAsyncTaskRunner)
  File "verl/verl/trainer/main_ppo.py", line 115, in run_ppo
    ray.get(runner.run.remote(config))
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2961, in get
    values, debugger_breakpoint = worker.get_objects(
  File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 1026, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::FullyAsyncTaskRunner.run() (pid=381899, ip=10.92.240.81, actor_id=3eb5061ea7fd165c6fef1fb811000000, repr=<fully_async_main.FullyAsyncTaskRunner object at 0x7f221528f430>)
  File "verl/recipe/fully_async_policy/fully_async_main.py", line 138, in run
    self._run_training_loop()
  File "verl/recipe/fully_async_policy/fully_async_main.py", line 268, in _run_training_loop
    raise e
  File "verl/recipe/fully_async_policy/fully_async_main.py", line 262, in _run_training_loop
    ray.get(future)
ray.exceptions.RayTaskError(AttributeError): ray::FullyAsyncTrainer.fit() (pid=382636, ip=10.92.240.81, actor_id=46571d273791a7f60ff7a86211000000, repr=<recipe.fully_async_policy.fully_async_trainer.FullyAsyncTrainer object at 0x7f1e26892f80>)
  File "verl/recipe/fully_async_policy/fully_async_trainer.py", line 270, in fit
    self._check_save_checkpoint(False, timing_raw)
  File "verl/recipe/fully_async_policy/ray_trainer.py", line 722, in _check_save_checkpoint
    self._save_checkpoint()
  File "verl/verl/trainer/ppo/ray_trainer.py", line 1024, in _save_checkpoint
    dataloader_state_dict = self.train_dataloader.state_dict()
AttributeError: 'FullyAsyncTrainer' object has no attribute 'train_dataloader'

@AniZpZ AniZpZ marked this pull request as ready for review November 26, 2025 13:50
Contributor Author

AniZpZ commented Nov 26, 2025

Sorry, I have not adapted it for Megatron yet, and I have not encountered this issue.

lizipao commented Nov 26, 2025

thanks, I have fixed it

@ForeverDJ-ux

How did you fix this problem? I have hit the same error. @lizipao

lizipao commented Dec 7, 2025

I added the following to recipe\fully_async_policy\fully_async_trainer.py:
from verl.trainer.main_ppo import create_rl_dataset, create_rl_sampler
from verl.utils.dataset.rl_dataset import collate_fn

    # Build the train/validation datasets and the sampler from the data config.
    train_dataset = create_rl_dataset(config.data.train_files, config.data, tokenizer, processor)
    val_dataset = create_rl_dataset(config.data.val_files, config.data, tokenizer, processor)
    train_sampler = create_rl_sampler(config.data, train_dataset)

    print(f"[FullyAsyncRollouter] Rollouter _create_dataloader...\n{train_dataset}\n{val_dataset}")

    # Create the dataloaders so that self.train_dataloader exists when checkpoints are saved.
    self._create_dataloader(train_dataset, val_dataset, collate_fn, train_sampler)
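The traceback earlier in this thread ends in `AttributeError` because `_save_checkpoint` reads `self.train_dataloader` on an instance that never created it. The following is a minimal sketch of that failure mode and of a defensive guard, not the verl code itself: `FakeLoader`, `BaseTrainerSketch`, `TrainerWithoutLoader`, and `TrainerWithLoader` are hypothetical stand-ins introduced only for illustration.

```python
class FakeLoader:
    """Hypothetical stand-in for a dataloader that can report its state."""

    def state_dict(self):
        return {"samples_seen": 0}


class BaseTrainerSketch:
    """Hypothetical stand-in for a trainer whose checkpoint code reads a dataloader."""

    def save_checkpoint(self):
        # Mirrors the failing pattern: assumes the attribute always exists.
        return self.train_dataloader.state_dict()

    def save_checkpoint_guarded(self):
        # Defensive variant: skip the dataloader state when none was built.
        loader = getattr(self, "train_dataloader", None)
        return loader.state_dict() if loader is not None else None


class TrainerWithoutLoader(BaseTrainerSketch):
    """Never creates train_dataloader, like the failing case in the traceback."""


class TrainerWithLoader(BaseTrainerSketch):
    def __init__(self):
        # Creating the dataloader up front is the route the fix in this comment takes.
        self.train_dataloader = FakeLoader()
```

Creating the dataloader keeps the base checkpoint logic unchanged, while the guarded variant would silently skip saving the dataloader state; which behaviour is preferable depends on whether resuming needs that state.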

@jsfanfanfan jsfanfanfan force-pushed the recipe/async_policy_sglang branch from 17d9bad to 815ebb2 Compare December 17, 2025 03:01
@AniZpZ AniZpZ changed the title [WIP][recipe, sglang] support sglang as rollout engine in fully async policy [recipe, sglang] support sglang as rollout engine in fully async policy Dec 17, 2025
Contributor

jsfanfanfan commented Dec 17, 2025

Test Results Supplements

We conducted tests on 64 H20 GPUs using the Qwen2.5-Math-7B model, with the dapo-math-17k dataset as the train set and aime-2024 as the test set. A total of 400 steps were trained, with testing performed every 10 steps. The experimental results are as follows:
[Three screenshots of the experimental results: Screenshot 2025-12-17 2:33:44 PM, 2:33:52 PM, 2:34:07 PM]

We tested under the same conditions by replacing the rollout backend engine with vLLM, and the comparison results are as follows:
[Three screenshots of the comparison results: Screenshot 2025-12-17 2:38:43 PM, 2:38:49 PM, 2:38:56 PM]

A more detailed comparative analysis between vLLM and SGLang is conducted on steps 100-300 (stable phase).

1. Performance Metrics Comparison

| Metric | SGLang | VLLM | Ratio (SGLang/VLLM) | Relative Difference | Note |
|---|---|---|---|---|---|
| Throughput (tokens/sec) | 703.46 | 635.85 | 1.1063 | +10.63% | SGLang is faster |
| Time per Step (sec) | 193.54 | 229.89 | 0.8419 | -15.81% | SGLang is faster |
| MFU | 0.6802 | 0.6760 | 1.0063 | +0.63% | Comparable |
| Total Tokens | 8,583,224 | 8,927,580 | 0.9614 | -3.86% | VLLM has slightly more |

Throughput (perf/throughput)
  • SGLang: Mean=703.46, Median=728.33, Min=280.39, Max=753.76, Std=71.00
  • VLLM: Mean=635.85, Median=689.51, Min=255.31, Max=730.12, Std=113.49

Time per Step (perf/time_per_step)
  • SGLang: Mean=193.54s, Median=184.78s, Min=173.91s, Max=483.87s, Std=30.78s
  • VLLM: Mean=229.89s, Median=203.10s, Min=185.95s, Max=521.57s, Std=60.92s

2. Time Metrics Comparison

| Metric | SGLang | VLLM | Ratio (SGLang/VLLM) | Relative Difference | Note |
|---|---|---|---|---|---|
| Generation Time (sec) | 40.36 | 70.53 | 0.5722 | -42.78% | SGLang is significantly faster |
| Actor Update Time (sec) | 152.05 | 158.10 | 0.9618 | -3.82% | Comparable |
| Total Step Time (sec) | 193.54 | 229.89 | 0.8419 | -15.81% | SGLang is faster |
| Validation Wait Time (sec) | 0.0024 | 0.0032 | 0.7564 | -24.36% | SGLang is faster |
| Parameter Sync Time (sec) | 18.34 | 5.18 | 3.5380 | +253.80% | ⚠️ VLLM is faster |
| Generation Time per Token (ms) | 0.0024 | 0.0042 | 0.5681 | -43.19% | SGLang is significantly faster |
| Actor Update Time per Token (ms) | 0.0169 | 0.0172 | 0.9842 | -1.58% | Comparable |

3. Validation Metrics Comparison

| Metric | SGLang | VLLM | Ratio (SGLang/VLLM) | Relative Difference | Note |
|---|---|---|---|---|---|
| Validation Reward (mean@1) | -0.3373 | -0.4299 | 0.7846 | +21.54% | SGLang is higher |
| Validation Score (mean@1) | -0.3372 | -0.4299 | 0.7844 | +21.56% | SGLang is higher |
| Validation Accuracy (mean@1) | 0.3314 | 0.2851 | 1.1625 | +16.25% | SGLang is higher |

Validation Accuracy (val-core/math_dapo/acc/mean@1)
  • SGLang: Mean=0.3314, Median=0.3323, Min=0.3000, Max=0.3510, Std=0.0141
  • VLLM: Mean=0.2851, Median=0.2854, Min=0.2521, Max=0.3167, Std=0.0177
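The Ratio and Relative Difference columns above appear to follow ratio = SGLang / vLLM and relative difference = (SGLang - vLLM) / |vLLM|. That definition is inferred from the reported numbers rather than stated in the comment, and the helper name `compare` is hypothetical. The sketch below recomputes a few rows from the rounded means, so the last digit can differ from the table where the author presumably used unrounded values (for example, parameter sync time).

```python
def compare(sglang: float, vllm: float) -> tuple[float, float]:
    """Return (ratio, relative difference in percent) of SGLang versus vLLM."""
    ratio = sglang / vllm
    # Dividing by the absolute value keeps the sign meaningful for negative
    # metrics such as the validation reward.
    rel_diff_pct = (sglang - vllm) / abs(vllm) * 100.0
    return ratio, rel_diff_pct


# Mean values copied from the tables above (steps 100-300).
rows = {
    "Throughput (tokens/sec)": (703.46, 635.85),
    "Time per Step (sec)": (193.54, 229.89),
    "Generation Time (sec)": (40.36, 70.53),
    "Validation Reward (mean@1)": (-0.3373, -0.4299),
}

for name, (sglang, vllm) in rows.items():
    ratio, rel = compare(sglang, vllm)
    print(f"{name}: ratio={ratio:.4f}, relative difference={rel:+.2f}%")
```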

@ArronHZG ArronHZG self-requested a review December 17, 2025 06:55
Contributor Author

AniZpZ commented Dec 17, 2025

@chenhaiq @zhaochenyang20 could you please trigger the CI?

Collaborator

@ArronHZG ArronHZG left a comment

await asyncio.gather(*self.active_tasks, return_exceptions=True)
self.active_tasks.clear()
print("[FullyAsyncRollouter][Public][Pause] All active tasks completed")
print("[FullyAsyncRollouter][Public][Pause] Ready to reset prefix cache")
Collaborator

Should we unify the use of clear_kv_cache as the interface here? Modifications can be made by rebasing on the main branch.

Contributor Author

fixed
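The hunk under review drains all in-flight rollout tasks before the cache is reset. Below is a minimal sketch of that pause pattern, assuming a single cache-reset hook named `clear_kv_cache` as the reviewer suggests. `RollouterSketch` and its methods are hypothetical stand-ins, and the real `clear_kv_cache` signature is not shown in this thread, so here it is only a placeholder.

```python
import asyncio


class RollouterSketch:
    """Hypothetical stand-in; not the FullyAsyncRollouter from the PR."""

    def __init__(self):
        self.active_tasks = set()
        self.cache_cleared = False

    async def _generate(self, i):
        await asyncio.sleep(0.01)
        if i == 2:
            raise RuntimeError("simulated rollout failure")
        return i

    def submit(self, i):
        # Track every in-flight task so pause() can wait for all of them.
        self.active_tasks.add(asyncio.create_task(self._generate(i)))

    async def clear_kv_cache(self):
        # Placeholder for the engine-specific cache reset (assumption).
        self.cache_cleared = True

    async def pause(self):
        # return_exceptions=True makes gather wait for every task and return
        # exceptions as results instead of raising the first one.
        results = await asyncio.gather(*self.active_tasks, return_exceptions=True)
        self.active_tasks.clear()
        await self.clear_kv_cache()
        return results


async def demo():
    rollouter = RollouterSketch()
    for i in range(4):
        rollouter.submit(i)
    results = await rollouter.pause()
    return rollouter, results
```

Running `asyncio.run(demo())` returns the rollouter with no active tasks left and a result list in which the failed task shows up as a `RuntimeError` instance, so one failing rollout does not prevent the cache reset.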

"mem_fraction_static": self.config.gpu_memory_utilization,
"disable_cuda_graph": self.config.enforce_eager,
- "enable_memory_saver": True,
+ "enable_memory_saver": False,
Collaborator

Will this affect the existing logic?

Contributor

I maintained the original logic at f6c7589

Contributor

jsfanfanfan commented Dec 18, 2025

We further reduced the parameter synchronization time in f6c7589. Experiments conducted on 32 H20 GPUs, using data from steps 20 to 120, show that the average parameter synchronization time decreased from 10.36 seconds to 1.34 seconds, a reduction of approximately 87%.
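The quoted reduction can be checked with a one-line calculation; `percent_reduction` is a hypothetical helper name used only for this check.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Average parameter synchronization time before and after f6c7589, in seconds.
print(f"{percent_reduction(10.36, 1.34):.1f}% reduction")
```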

@AniZpZ AniZpZ requested a review from ISEEKYAN as a code owner December 24, 2025 14:06
CLAassistant commented Jan 9, 2026

CLA assistant check
All committers have signed the CLA.

@jsfanfanfan jsfanfanfan force-pushed the recipe/async_policy_sglang branch from e0ba55d to f424edb Compare January 9, 2026 03:58
ray.get(dependency_ref)
print("[FullyAsyncRollouter][Public][Resume]")
async with self.lock:
    if self.config.async_training.partial_rollout:
Collaborator

Why was this `if` line removed?

Contributor

Yep, this line has been added back!

if self.vanilla_bridge:
    from verl.models.mcore.mbridge import AutoBridge

    bridge = AutoBridge.from_config(self.model_config.hf_config, dtype=self.param_dtype)
Collaborator

we need dtype params

Contributor

The new mbridge version works. Fine!


async with self.lock:
    while self.paused:
        self.idle_start_time = time.time()
Collaborator

But isn't idle_start_time only set once, when idle_start_time is None? Is this right?
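The question is about when `idle_start_time` gets assigned. One common pattern, sketched here under the assumption that the goal is to measure the whole paused interval, records the timestamp only while it is still `None` and resets it once the wait ends. `IdleTracker` and its method names are hypothetical and are not taken from the PR.

```python
import time


class IdleTracker:
    """Hypothetical helper that accumulates idle time across pause periods."""

    def __init__(self):
        self.idle_start_time = None
        self.total_idle = 0.0

    def mark_idle(self, now=None):
        # Record the start only on the first call of an idle period, so
        # repeated wake-ups inside a wait loop do not move the start forward.
        if self.idle_start_time is None:
            self.idle_start_time = time.time() if now is None else now

    def mark_active(self, now=None):
        # Close the idle period and reset the marker for the next pause.
        if self.idle_start_time is not None:
            end = time.time() if now is None else now
            self.total_idle += end - self.idle_start_time
            self.idle_start_time = None
```

Assigning the timestamp unconditionally on every loop iteration would instead measure only the time since the most recent wake-up, which undercounts the idle period.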

rollout_device_mesh["infer_tp"].get_local_rank() == 0
and rollout_device_mesh["infer_pp"].get_local_rank() == 0
)
if self.config.rollout.mode == "async" and self.config.rollout.name == "sglang":
Collaborator

Will there be any code duplication here?

Contributor

fixed

@AniZpZ AniZpZ changed the title [recipe, sglang] support sglang as rollout engine in fully async policy [sglang, rollout]support sglang as rollout engine in fully async policy Jan 14, 2026
@AniZpZ AniZpZ changed the title [sglang, rollout]support sglang as rollout engine in fully async policy [sglang, rollout] feat: support sglang as rollout engine in fully async policy Jan 14, 2026


@ray.remote(num_cpus=1)
class SGLangHttpServer:
Collaborator

You can refer to the current vLLM changes and remove @ray.remote(num_cpus=1) here:

self.server_class = ray.remote(SGLangHttpServer)

Contributor

fixed
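The suggestion replaces the class decorator with an explicit `ray.remote(...)` call at the point of use. A minimal sketch of that idea follows; `HttpServerSketch` and `make_actor_class` are hypothetical names, and the `num_cpus` default simply mirrors the value in the removed decorator rather than anything the final code is confirmed to use.

```python
class HttpServerSketch:
    """Hypothetical plain class standing in for SGLangHttpServer."""

    def __init__(self, host: str, port: int):
        self.host = host
        self.port = port

    def address(self) -> str:
        return f"{self.host}:{self.port}"


def make_actor_class(cls, num_cpus: int = 1):
    # Imported lazily so the plain class stays usable where Ray is not installed.
    import ray

    # ray.remote(...) called with options returns a decorator; applying it to
    # the class yields an actor class, the same result as the @ray.remote form.
    return ray.remote(num_cpus=num_cpus)(cls)
```

Leaving the class undecorated means it can still be subclassed or instantiated directly, for example in unit tests, while callers that need an actor wrap it themselves, as in `self.server_class = ray.remote(SGLangHttpServer)`.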

Collaborator

@ArronHZG ArronHZG left a comment

1

rollout_device_mesh["infer_tp"].get_local_rank() == 0
and rollout_device_mesh["infer_pp"].get_local_rank() == 0
)

Collaborator

remove empty line

Contributor

okay

@wuxibin89 wuxibin89 merged commit f104dfa into verl-project:main Jan 19, 2026
72 of 74 checks passed
vyomakesh0728 added a commit to vyomakesh0728/verl that referenced this pull request Jan 22, 2026
…nc policy (verl-project#4191)

Co-authored-by: jsfanfanfan <2981866535@qq.com>
Co-authored-by: jsfanfanfan <2981856535@qq.com>
Co-authored-by: jsfanfanfan <71052636+jsfanfanfan@users.noreply.github.com>
sophiayyya pushed a commit to sophiayyya/verl that referenced this pull request Jan 25, 2026
…nc policy (verl-project#4191)

meichangsu1 pushed a commit to meichangsu1/verl that referenced this pull request Jan 27, 2026
…nc policy (verl-project#4191)

meichangsu1 pushed a commit to meichangsu1/verl that referenced this pull request Jan 27, 2026
…nc policy (verl-project#4191)

8 participants