[sglang, rollout] feat: support sglang as rollout engine in fully async policy by AniZpZ · Pull Request #4191 · verl-project/verl

AniZpZ · 2025-11-19T10:00:50Z

What does this PR do?

Extend the fully async policy recipe by adding SGLang as an alternative rollout engine to vLLM when using FSDP.

Checklist Before Starting

Search for similar PRs. Paste at least one query link here: ...
Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
- {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
- If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
- {type} is in feat, fix, refactor, chore, test
- If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
- Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Read the Contribute Guide.
Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
Add / Update the documentation.
Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)

lizipao · 2025-11-21T12:13:19Z

Hi, I tried your PR and attempted to replace FSDP with Megatron, but I encountered this error. Have you come across it before?
Traceback (most recent call last):
File "verl/recipe/fully_async_policy/fully_async_main.py", line 292, in main
run_ppo(config, task_runner_class=FullyAsyncTaskRunner)
File "verl/verl/trainer/main_ppo.py", line 115, in run_ppo
ray.get(runner.run.remote(config))
File "/usr/local/lib/python3.10/dist-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 2961, in get
values, debugger_breakpoint = worker.get_objects(
File "/usr/local/lib/python3.10/dist-packages/ray/_private/worker.py", line 1026, in get_objects
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::FullyAsyncTaskRunner.run() (pid=381899, ip=10.92.240.81, actor_id=3eb5061ea7fd165c6fef1fb811000000, repr=<fully_async_main.FullyAsyncTaskRunner object at 0x7f221528f430>)
File "verl/recipe/fully_async_policy/fully_async_main.py", line 138, in run
self._run_training_loop()
File "verl/recipe/fully_async_policy/fully_async_main.py", line 268, in _run_training_loop
raise e
File "verl/recipe/fully_async_policy/fully_async_main.py", line 262, in _run_training_loop
ray.get(future)
ray.exceptions.RayTaskError(AttributeError): ray::FullyAsyncTrainer.fit() (pid=382636, ip=10.92.240.81, actor_id=46571d273791a7f60ff7a86211000000, repr=<recipe.fully_async_policy.fully_async_trainer.FullyAsyncTrainer object at 0x7f1e26892f80>)
File "verl/recipe/fully_async_policy/fully_async_trainer.py", line 270, in fit
self._check_save_checkpoint(False, timing_raw)
File "verl/recipe/fully_async_policy/ray_trainer.py", line 722, in _check_save_checkpoint
self._save_checkpoint()
File "verl/verl/trainer/ppo/ray_trainer.py", line 1024, in _save_checkpoint
dataloader_state_dict = self.train_dataloader.state_dict()
AttributeError: 'FullyAsyncTrainer' object has no attribute 'train_dataloader'

AniZpZ · 2025-11-26T13:52:10Z


Sorry, I have not adapted it for Megatron yet, so I have not encountered this issue myself.

lizipao · 2025-11-26T13:58:58Z

Thanks, I have fixed it.

ForeverDJ-ux · 2025-12-05T07:11:05Z

How did you fix this problem? I have hit the same error. @lizipao


lizipao · 2025-12-07T07:12:02Z

I added the following to recipe\fully_async_policy\fully_async_trainer.py:
from verl.trainer.main_ppo import create_rl_dataset, create_rl_sampler
from verl.utils.dataset.rl_dataset import collate_fn

    train_dataset = create_rl_dataset(config.data.train_files, config.data, tokenizer, processor)
    val_dataset = create_rl_dataset(config.data.val_files, config.data, tokenizer, processor)
    train_sampler = create_rl_sampler(config.data, train_dataset)

    print(f"[FullyAsyncRollouter] Rollouter _create_dataloader...\n{train_dataset}\n{val_dataset}")

    self._create_dataloader(train_dataset, val_dataset, collate_fn, train_sampler)
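
The failure mode in the traceback and the effect of the fix can be illustrated with a minimal, self-contained sketch. It does not use verl: `FakeDataloader`, `BaseTrainer`, `AsyncTrainerWithoutLoader`, and `AsyncTrainerWithLoader` are hypothetical stand-ins, and the only assumption carried over from the thread is that the inherited checkpoint path reads `self.train_dataloader.state_dict()`.

```python
class FakeDataloader:
    """Hypothetical stand-in for a stateful dataloader; only state_dict() matters."""

    def __init__(self, num_batches):
        self.num_batches = num_batches
        self.position = 0

    def state_dict(self):
        return {"position": self.position, "num_batches": self.num_batches}


class BaseTrainer:
    """Mimics a base trainer whose checkpoint path reads self.train_dataloader."""

    def _save_checkpoint(self):
        return {"dataloader": self.train_dataloader.state_dict()}


class AsyncTrainerWithoutLoader(BaseTrainer):
    """Never creates a dataloader, so the inherited checkpoint call fails."""


class AsyncTrainerWithLoader(BaseTrainer):
    """Creates the dataloader during construction, in the spirit of the fix above."""

    def __init__(self, num_batches):
        self.train_dataloader = FakeDataloader(num_batches)


def try_save(trainer):
    """Return the checkpoint dict, or the error text if the attribute is missing."""
    try:
        return trainer._save_checkpoint()
    except AttributeError as exc:
        return f"AttributeError: {exc}"


broken = try_save(AsyncTrainerWithoutLoader())
fixed = try_save(AsyncTrainerWithLoader(num_batches=8))
print(broken)
print(fixed)
```

The first call reproduces the same kind of AttributeError as in the traceback; the second succeeds because the attribute exists before checkpointing is attempted.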


jsfanfanfan · 2025-12-17T06:49:50Z

Test Results Supplements

We conducted tests on 64 H20 GPUs using the Qwen2.5-Math-7B model, with the dapo-math-17k dataset as the train set and aime-2024 as the test set. A total of 400 steps were trained, with testing performed every 10 steps. The experimental results are as follows:

We tested under the same conditions by replacing the rollout backend engine with vLLM, and the comparison results are as follows:

A more detailed comparative analysis between vLLM and SGLang is conducted on steps 100-300 (stable phase).

1. Performance Metrics Comparison

| Metric | SGLang | vLLM | Ratio (SGLang/vLLM) | Relative Difference | Note |
| --- | --- | --- | --- | --- | --- |
| Throughput (tokens/sec) | 703.46 | 635.85 | 1.1063 | +10.63% | SGLang is faster |
| Time per Step (sec) | 193.54 | 229.89 | 0.8419 | -15.81% | SGLang is faster |
| MFU | 0.6802 | 0.6760 | 1.0063 | +0.63% | Comparable |
| Total Tokens | 8,583,224 | 8,927,580 | 0.9614 | -3.86% | vLLM has slightly more |
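
As a reading aid, the Ratio and Relative Difference columns appear to follow the usual definitions: ratio = SGLang / vLLM, and relative difference = (SGLang - vLLM) / |vLLM|. The sketch below recomputes two rows from the rounded means in the table; the helper name `compare` is hypothetical, and recomputing from rounded means can differ from the reported values in the last digit.

```python
def compare(sglang, vllm):
    """Return (ratio, relative difference in percent) of SGLang against vLLM."""
    ratio = sglang / vllm
    # Dividing by abs(vllm) keeps the sign meaningful for negative metrics too.
    rel_diff_pct = (sglang - vllm) / abs(vllm) * 100.0
    return round(ratio, 4), round(rel_diff_pct, 2)


throughput = compare(703.46, 635.85)   # tokens/sec, higher is better
step_time = compare(193.54, 229.89)    # seconds, lower is better
print(throughput)
print(step_time)
```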

Throughput (perf/throughput)

SGLang: Mean=703.46, Median=728.33, Min=280.39, Max=753.76, Std=71.00
vLLM: Mean=635.85, Median=689.51, Min=255.31, Max=730.12, Std=113.49

Time per Step (perf/time_per_step)

SGLang: Mean=193.54s, Median=184.78s, Min=173.91s, Max=483.87s, Std=30.78s
vLLM: Mean=229.89s, Median=203.10s, Min=185.95s, Max=521.57s, Std=60.92s

2. Time Metrics Comparison

| Metric | SGLang | vLLM | Ratio (SGLang/vLLM) | Relative Difference | Note |
| --- | --- | --- | --- | --- | --- |
| Generation Time (sec) | 40.36 | 70.53 | 0.5722 | -42.78% | SGLang is significantly faster |
| Actor Update Time (sec) | 152.05 | 158.10 | 0.9618 | -3.82% | Comparable |
| Total Step Time (sec) | 193.54 | 229.89 | 0.8419 | -15.81% | SGLang is faster |
| Validation Wait Time (sec) | 0.0024 | 0.0032 | 0.7564 | -24.36% | SGLang is faster |
| Parameter Sync Time (sec) | 18.34 | 5.18 | 3.5380 | +253.80% | ⚠️ vLLM is faster |
| Generation Time per Token (ms) | 0.0024 | 0.0042 | 0.5681 | -43.19% | SGLang is significantly faster |
| Actor Update Time per Token (ms) | 0.0169 | 0.0172 | 0.9842 | -1.58% | Comparable |

3. Validation Metrics Comparison

| Metric | SGLang | vLLM | Ratio (SGLang/vLLM) | Relative Difference | Note |
| --- | --- | --- | --- | --- | --- |
| Validation Reward (mean@1) | -0.3373 | -0.4299 | 0.7846 | +21.54% | SGLang is higher |
| Validation Score (mean@1) | -0.3372 | -0.4299 | 0.7844 | +21.56% | SGLang is higher |
| Validation Accuracy (mean@1) | 0.3314 | 0.2851 | 1.1625 | +16.25% | SGLang is higher |

Validation Accuracy (val-core/math_dapo/acc/mean@1)

SGLang: Mean=0.3314, Median=0.3323, Min=0.3000, Max=0.3510, Std=0.0141
vLLM: Mean=0.2851, Median=0.2854, Min=0.2521, Max=0.3167, Std=0.0177

AniZpZ · 2025-12-17T06:56:45Z

@chenhaiq @zhaochenyang20 could you please trigger the CI?

ArronHZG

I think the README should also be updated to indicate that SGLang is now supported, along with supplementary experimental data.

There are three documents:
https://github.com/volcengine/verl/blob/main/docs/advance/fully_async.md
https://github.com/volcengine/verl/blob/main/recipe/fully_async_policy/README.md

The two above are exactly the same.

https://github.com/volcengine/verl/blob/main/recipe/fully_async_policy/README_zh.md

ArronHZG · 2025-12-17T07:05:55Z

recipe/fully_async_policy/fully_async_rollouter.py

                await asyncio.gather(*self.active_tasks, return_exceptions=True)
                self.active_tasks.clear()
                print("[FullyAsyncRollouter][Public][Pause] All active tasks completed")
+            print("[FullyAsyncRollouter][Public][Pause] Ready to reset prefix cache")


Should we unify the use of clear_kv_cache as the interface here? Modifications can be made by rebasing on the main branch.
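
The ordering discussed in this hunk (drain in-flight tasks, then reset the cache through a single interface) can be sketched with the standard library alone. Everything here is hypothetical: `FakeEngine`, `Rollouter`, and the `clear_kv_cache` method are stand-ins named after the comment above, not the actual verl or SGLang API.

```python
import asyncio


class FakeEngine:
    """Hypothetical rollout engine; clear_kv_cache is a stand-in method name."""

    def __init__(self):
        self.cache_cleared = False

    async def clear_kv_cache(self):
        self.cache_cleared = True


class Rollouter:
    """Toy rollouter that tracks in-flight generation tasks."""

    def __init__(self, engine):
        self.engine = engine
        self.active_tasks = set()
        self.paused = False
        self.results = []

    async def _generate(self, i):
        await asyncio.sleep(0.01)
        if i == 2:
            raise RuntimeError("sample failed")
        self.results.append(i)

    def submit(self, i):
        self.active_tasks.add(asyncio.create_task(self._generate(i)))

    async def pause(self):
        self.paused = True
        outcomes = []
        if self.active_tasks:
            # return_exceptions=True lets one failed task finish the drain
            # without cancelling the wait on the others.
            outcomes = await asyncio.gather(*self.active_tasks, return_exceptions=True)
            self.active_tasks.clear()
        # Only reset the cache once nothing is generating against it.
        await self.engine.clear_kv_cache()
        return outcomes


async def main():
    engine = FakeEngine()
    rollouter = Rollouter(engine)
    for i in range(4):
        rollouter.submit(i)
    outcomes = await rollouter.pause()
    failures = [o for o in outcomes if isinstance(o, Exception)]
    return engine.cache_cleared, sorted(rollouter.results), len(failures)


cache_cleared, results, num_failures = asyncio.run(main())
print(cache_cleared, results, num_failures)
```

Routing the reset through one method keeps the pause path independent of which rollout backend is in use, which seems to be the point of the question.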

verl/workers/fsdp_workers.py

ArronHZG · 2025-12-17T07:11:40Z

verl/workers/rollout/sglang_rollout/async_sglang_server.py

            "mem_fraction_static": self.config.gpu_memory_utilization,
            "disable_cuda_graph": self.config.enforce_eager,
-            "enable_memory_saver": True,
+            "enable_memory_saver": False,


Will this affect the existing logic?

I maintained the original logic at f6c7589

verl/experimental/fully_async_policy/fsdp_workers.py


jsfanfanfan · 2025-12-18T16:11:08Z

We further reduced the parameter synchronization time in f6c7589. In experiments on 32 H20 GPUs, using data from steps 20 to 120, the average parameter synchronization time decreased from 10.36 seconds to 1.34 seconds, a reduction of approximately 87%.


CLAassistant · 2026-01-09T03:48:51Z

All committers have signed the CLA.


ArronHZG · 2026-01-12T02:27:00Z

verl/experimental/fully_async_policy/fully_async_rollouter.py

            ray.get(dependency_ref)
        print("[FullyAsyncRollouter][Public][Resume]")
        async with self.lock:
-            if self.config.async_training.partial_rollout:


Why was this if-statement removed?

Yep, this line has been added back!

ArronHZG · 2026-01-12T02:27:50Z

verl/workers/engine/megatron/transformer_impl.py

        if self.vanilla_bridge:
            from verl.models.mcore.mbridge import AutoBridge

-            bridge = AutoBridge.from_config(self.model_config.hf_config, dtype=self.param_dtype)


We need the dtype parameter.

The new mbridge version works. Fine!

ArronHZG · 2026-01-12T02:30:41Z

verl/experimental/fully_async_policy/fully_async_rollouter.py


                async with self.lock:
                    while self.paused:
-                        self.idle_start_time = time.time()


But idle_start_time is only set once, when idle_start_time is None. Is this right?

ArronHZG · 2026-01-12T02:33:47Z

verl/workers/megatron_workers.py

-            rollout_device_mesh["infer_tp"].get_local_rank() == 0
-            and rollout_device_mesh["infer_pp"].get_local_rank() == 0
-        )
+        if self.config.rollout.mode == "async" and self.config.rollout.name == "sglang":


Will there be any code duplication here?

ArronHZG · 2026-01-14T11:06:59Z

verl/workers/rollout/sglang_rollout/async_sglang_server.py



-@ray.remote(num_cpus=1)
-class SGLangHttpServer:


You can refer to the current vLLM changes and remove @ray.remote(num_cpus=1) here:

self.server_class = ray.remote(SGLangHttpServer)

ArronHZG

1

ArronHZG · 2026-01-16T07:57:17Z

verl/workers/megatron_workers.py

            rollout_device_mesh["infer_tp"].get_local_rank() == 0
            and rollout_device_mesh["infer_pp"].get_local_rank() == 0
        )
+


Remove the empty line.

…nc policy (verl-project#4191)

Co-authored-by: jsfanfanfan <2981866535@qq.com>
Co-authored-by: jsfanfanfan <2981856535@qq.com>
Co-authored-by: jsfanfanfan <71052636+jsfanfanfan@users.noreply.github.com>

AniZpZ added 3 commits November 19, 2025 17:51

sglang rollout in fully async

e18ba0d

upd

0aec584

minor fix

48103dc

AniZpZ marked this pull request as ready for review November 26, 2025 13:50

AniZpZ requested review from SwordFaith, chenhaiq and zhaochenyang20 as code owners November 26, 2025 13:50

AniZpZ added 2 commits December 5, 2025 15:22

Merge remote-tracking branch 'origin/main' into recipe/async_policy_sglang

c9e7d09

Merge branch 'main' into recipe/async_policy_sglang

815ebb2

jsfanfanfan force-pushed the recipe/async_policy_sglang branch from 17d9bad to 815ebb2 Compare December 17, 2025 03:01

fix ValueError and MemoryAllocateError

814be85

AniZpZ changed the title ~~[WIP][recipe, sglang] support sglang as rollout engine in fully async policy~~ [recipe, sglang] support sglang as rollout engine in fully async policy Dec 17, 2025

Merge remote-tracking branch 'origin/main' into recipe/async_policy_sglang

ce64a14

ArronHZG self-requested a review December 17, 2025 06:55

ArronHZG reviewed Dec 17, 2025


AniZpZ and others added 2 commits December 17, 2025 15:44

fmt

a1ae8db

Keep the original logic of enable_memory_saver and reduce the parameter synchronization time.

f6c7589

AniZpZ added 4 commits December 19, 2025 17:18

Merge remote-tracking branch 'origin/main' into recipe/async_policy_sglang

0b13a30

update clear_kv_cache

d6e7ab7

upd

4492587

upda

2b0c897

AniZpZ requested a review from ISEEKYAN as a code owner December 24, 2025 14:06

jsfanfanfan requested review from FightingZhen, PeterSH6, ji-huazhong and tongyx361 as code owners January 9, 2026 03:48

jsfanfanfan force-pushed the recipe/async_policy_sglang branch from e0ba55d to f424edb Compare January 9, 2026 03:58

AniZpZ and others added 4 commits January 9, 2026 16:45

Merge remote-tracking branch 'origin/main' into recipe/async_policy_sglang

8dc3303

Minor rectification after rebase.

64aba78

fix pre-commit

03cc18d

minor fix

dbe9a45

ArronHZG reviewed Jan 12, 2026


minor fix

0121750

AniZpZ changed the title ~~[recipe, sglang] support sglang as rollout engine in fully async policy~~ [sglang, rollout]support sglang as rollout engine in fully async policy Jan 14, 2026

AniZpZ changed the title ~~[sglang, rollout]support sglang as rollout engine in fully async policy~~ [sglang, rollout] feat: support sglang as rollout engine in fully async policy Jan 14, 2026

ArronHZG reviewed Jan 14, 2026


jsfanfanfan and others added 5 commits January 15, 2026 14:02

minor fix

8257a08

Merge branch 'volcengine:main' into recipe/async_policy_sglang

5721903

fix TypeError: got an unexpected keyword argument 'base_gpu_id'

72ddbb1

Use lazy import to avoid ModuleNotFoundError

80f277f

fix base_gpu_id missing!

f9994a6

ArronHZG reviewed Jan 16, 2026


pre-commit

67096c3

ArronHZG reviewed Jan 16, 2026


remove reduntant classes and lines.

b0f905a

wuxibin89 approved these changes Jan 19, 2026


wuxibin89 merged commit f104dfa into verl-project:main Jan 19, 2026
72 of 74 checks passed
