[BugFix] fix ep=1 etp=16 #985
Signed-off-by: ttanzhiqiang <[email protected]>
The latest branch is running smoothly, vllm-ascend commit 6eddbd2.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
#1012 does it.
What this PR does / why we need it?
Fixes the ep=1, etp=16 bug (#971); refer to PR #863.
Does this PR introduce any user-facing change?
Added an etp logic branch in deepseekv2 and fused_moe.
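As a hedged illustration (not the PR's actual code), the toy script below sketches the idea behind the etp path for a single MoE expert: the expert's MLP weights are split across the etp group and the per-shard partial outputs are summed, standing in for the all-reduce a real etp group would perform. All names and sizes here are illustrative; see the diff for the real deepseekv2/fused_moe changes.

# Toy sketch of expert tensor parallelism (etp) for one expert; illustrative only.
import torch

torch.manual_seed(0)
hidden, inter, etp_size = 16, 64, 4          # toy sizes; etp_size plays the role of etp=16
x = torch.randn(8, hidden)                   # 8 tokens routed to this expert
w_up = torch.randn(hidden, inter)            # expert up-projection
w_down = torch.randn(inter, hidden)          # expert down-projection

# Reference: the whole expert on one device (the ep=1, etp=1 path).
ref = torch.relu(x @ w_up) @ w_down

# etp path: each "rank" holds a slice of the intermediate dimension.
w_up_shards = w_up.chunk(etp_size, dim=1)
w_down_shards = w_down.chunk(etp_size, dim=0)
partials = [torch.relu(x @ wu) @ wd for wu, wd in zip(w_up_shards, w_down_shards)]
out = torch.stack(partials).sum(dim=0)       # stands in for the etp-group all-reduce

assert torch.allclose(ref, out, atol=1e-4)
print("etp-sharded expert matches the single-device expert")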
How was this patch tested?
nohup python -m vllm.entrypoints.openai.api_server --model=/mnt/deepseek/DeepSeek-R1-W8A8-VLLM \
  --trust-remote-code \
  --distributed-executor-backend=mp \
  -tp=16 \
  -dp=1 \
  --port 8006 \
  --max-num-seqs 24 \
  --max-model-len 32768 \
  --max-num-batched-tokens 32768 \
  --block-size 128 \
  --enable-expert-parallel \
  --compilation_config 0 \
  --gpu-memory-utilization 0.96 \
  --additional-config '{"expert_tensor_parallel_size":1, "ascend_scheduler_config":{}}' &> run.log &
nohup python -m vllm.entrypoints.openai.api_server --model=/mnt/deepseek/DeepSeek-R1-W8A8-VLLM \
  --trust-remote-code \
  --distributed-executor-backend=mp \
  -tp=16 \
  -dp=1 \
  --port 8006 \
  --max-num-seqs 24 \
  --max-model-len 32768 \
  --max-num-batched-tokens 32768 \
  --block-size 128 \
  --enable-expert-parallel \
  --compilation_config 0 \
  --gpu-memory-utilization 0.96 \
  --additional-config '{"expert_tensor_parallel_size":16, "ascend_scheduler_config":{}}' &> run.log &