Commit 6ec5f72

update rl script
1 parent e86568d commit 6ec5f72

1 file changed: +3 -6 lines changed

verl/experimental/fully_async_policy/shell/grpo_qwen35_35b_megatron_async.sh

Lines changed: 3 additions & 6 deletions
@@ -3,16 +3,15 @@
 #
 # Requirements:
 # pip install --upgrade transformers==5.3.0
-# mbridge: https://github.com/ISEEKYAN/mbridge
+# mbridge: make sure https://github.com/ISEEKYAN/mbridge/pull/98 this pr has merged
 #
 # MTP (Multi-Token Prediction) notes:
 # - actor_rollout_ref.model.mtp.enable=True enables MTP module
 # - actor_rollout_ref.model.mtp.enable_train=True enables MTP training loss
 # - actor_rollout_ref.model.mtp.enable_rollout=True enables speculative decoding in SGLang
 #
 # Example parallelism configs for Qwen3.5-35B-A3B:
-# 8 GPUs (1 node): train_tp=4 train_pp=2 EP=4 gen_tp=8
-# 16 GPUs (2 nodes): train_tp=4 train_pp=4 EP=4 gen_tp=8
+# 16 GPUs (2 nodes): train_tp=4 train_pp=2 EP=4 gen_tp=8
 #
 # Run:
 # NNODES_TRAIN=1 NNODES_ROLLOUT=1 bash grpo_qwen35_35b_megatron_async.sh
@@ -115,7 +114,7 @@ fi
 
 CHECKPOINT_CONTENTS=['model','hf_model','extra']
 
-python -X faulthandler -m verl.experimental.fully_async_policy.fully_async_main \
+python -m verl.experimental.fully_async_policy.fully_async_main \
 --config-path=config \
 --config-name='fully_async_ppo_megatron_trainer.yaml' \
 data.train_files="${TRAIN_FILE}" \
@@ -224,8 +223,6 @@ python -X faulthandler -m verl.experimental.fully_async_policy.fully_async_main
 actor_rollout_ref.rollout.multi_turn.max_tool_response_length=${max_prompt_length} \
 actor_rollout_ref.rollout.agent.num_workers=2 \
 actor_rollout_ref.rollout.disable_log_stats=False \
-actor_rollout_ref.rollout.prometheus.enable=True \
-actor_rollout_ref.rollout.prometheus.port=44398 \
 actor_rollout_ref.rollout.checkpoint_engine.update_weights_bucket_megabytes=1024 \
 +actor_rollout_ref.rollout.engine_kwargs.sglang.mamba_scheduler_strategy=no_buffer \
 +actor_rollout_ref.rollout.engine_kwargs.sglang.disable_radix_cache=True \
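
Note on the parallelism comment: the header now lists a single layout, 16 GPUs across 2 nodes with train_tp=4 train_pp=2 EP=4 gen_tp=8, next to a run line that sets NNODES_TRAIN=1 NNODES_ROLLOUT=1. One plausible reading, which the commit does not state, is 8 GPUs per node with one node for training and one for rollout. Under that assumption, a pre-launch check might look like the sketch below. GPUS_PER_NODE and check_layout are hypothetical names, not part of the committed script, and the EP=4 constraint is left out because its divisibility rule depends on the Megatron configuration.

```shell
#!/usr/bin/env bash
# Hypothetical pre-launch sanity check; not part of the committed script.
# Assumption: 8 GPUs per node (the commit does not state this).
GPUS_PER_NODE=${GPUS_PER_NODE:-8}
NNODES_TRAIN=${NNODES_TRAIN:-1}
NNODES_ROLLOUT=${NNODES_ROLLOUT:-1}

# Values copied from the header comment in the diff.
train_tp=4
train_pp=2
gen_tp=8

# check_layout TRAIN_GPUS ROLLOUT_GPUS: return 0 if the layout fits.
check_layout() {
    local train_gpus=$1 rollout_gpus=$2
    local model_parallel=$(( train_tp * train_pp ))
    # Training GPUs must be a whole multiple of tp * pp.
    (( train_gpus % model_parallel == 0 )) || return 1
    # Rollout GPUs must be a whole multiple of gen_tp.
    (( rollout_gpus % gen_tp == 0 )) || return 1
    return 0
}

train_gpus=$(( NNODES_TRAIN * GPUS_PER_NODE ))
rollout_gpus=$(( NNODES_ROLLOUT * GPUS_PER_NODE ))

if check_layout "$train_gpus" "$rollout_gpus"; then
    dp=$(( train_gpus / (train_tp * train_pp) ))
    echo "layout ok: train=${train_gpus} rollout=${rollout_gpus} dp=${dp}"
else
    echo "layout mismatch: train=${train_gpus} rollout=${rollout_gpus}" >&2
fi
```

With the defaults above, tp * pp equals the 8 training GPUs, so the data-parallel size works out to 1. That is consistent with the removed train_pp=4 line no longer fitting a single 8-GPU training node.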

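
Note on the dropped `-X faulthandler` flag: CPython also reads the PYTHONFAULTHANDLER environment variable, so crash tracebacks could probably be turned back on for a single run without editing the script. The sketch below is an illustration under assumptions: PYTHON_BIN is a hypothetical variable, the interpreter name python3 is a guess (the script itself calls `python`), and whether the variable reaches worker processes depends on how the cluster propagates the environment.

```shell
#!/usr/bin/env bash
# Sketch only: opt back in to faulthandler for one run via the environment.
# PYTHONFAULTHANDLER is a standard CPython variable; PYTHON_BIN is a
# hypothetical name used here so the interpreter can be swapped.
PYTHON_BIN=${PYTHON_BIN:-python3}
export PYTHONFAULTHANDLER=1

# Confirm the interpreter picked the setting up before launching the job.
fh_state=$("$PYTHON_BIN" -c 'import faulthandler; print(faulthandler.is_enabled())' 2>/dev/null) \
    || fh_state="unknown"
echo "faulthandler enabled: ${fh_state}"

# The launch would then proceed unchanged, e.g.:
# NNODES_TRAIN=1 NNODES_ROLLOUT=1 bash grpo_qwen35_35b_megatron_async.sh
```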