fix pp for deepseek #6152


Closed
wants to merge 1 commit

Conversation


zhjc1124 commented May 9, 2025

Motivation

Run DeepSeek with pipeline parallelism (pp).
#5724 #5925

$ python3 -m sglang.bench_one_batch_server --model /data/modelscope/DeepSeek-Coder-V2-Lite-Instruct/ --batch-size 1 --trust-remote-code --base-gpu-id 4 --port 38884 --pp 2 --tp 2
batch size: 16
latency: 2.50 s
output throughput: 102.25 token/s
(input + output) throughput: 6645.98 token/s
[2025-05-09 17:37:07 TP0 PP0] Prefill batch. #new-seq: 1, #new-token: 1024, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-05-09 17:37:07 TP0 PP1] Prefill batch. #new-seq: 1, #new-token: 1024, #cached-token: 0, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-05-09 17:37:07 TP0 PP0] Decode batch. #running-req: 1, #token: 1027, token usage: 0.00, gen throughput (token/s): 20.20, #queue-req: 0
[2025-05-09 17:37:07 TP0 PP1] Decode batch. #running-req: 1, #token: 1027, token usage: 0.00, gen throughput (token/s): 20.19, #queue-req: 0
[2025-05-09 17:37:07] INFO: 127.0.0.1:59888 - "POST /generate HTTP/1.1" 200 OK
batch size: 1
latency: 0.17 s
output throughput: 95.08 token/s
(input + output) throughput: 6180.45 token/s
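For context, the throughput figures above are consistent with simple token-count arithmetic. A hedged sketch (the function name `bench_metrics` is illustrative, not sglang's actual code; the formulas are inferred from the printed metric names):

```python
# Hedged sketch of how bench_one_batch_server-style metrics can be
# derived from raw token counts. Formulas are assumed from the metric
# names in the output above, not read from sglang's source.

def bench_metrics(input_tokens: int, output_tokens: int, latency_s: float):
    """Return throughput metrics in tokens per second."""
    return {
        # "output throughput": only generated tokens count
        "output_throughput": output_tokens / latency_s,
        # "(input + output) throughput": all processed tokens count
        "total_throughput": (input_tokens + output_tokens) / latency_s,
    }
```

For example, 16 requests of 1024 input tokens and 16 output tokens each (16384 + 256 tokens) over 2.50 s gives 256 / 2.50 ≈ 102 tok/s output and 16640 / 2.50 ≈ 6656 tok/s total, matching the reported 102.25 and 6645.98 up to rounding of the latency.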

Modifications

Modify deepseek_v2.py to support pipeline parallelism.
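The typical change for pp support is partitioning the layer stack across stages so each rank builds and runs only its own slice. A minimal sketch of that pattern (names like `get_pp_layer_range` and `ToyStage` are hypothetical illustrations, not sglang's actual API):

```python
# Hedged sketch: the usual shape of adapting a model file for pipeline
# parallelism (pp). All names here are illustrative, not sglang's API.

def get_pp_layer_range(num_layers: int, pp_rank: int, pp_size: int):
    """Evenly split num_layers across pp_size stages; earlier stages
    absorb the remainder. Returns the [start, end) slice for this rank."""
    base, rem = divmod(num_layers, pp_size)
    start = pp_rank * base + min(pp_rank, rem)
    end = start + base + (1 if pp_rank < rem else 0)
    return start, end


class ToyStage:
    """One pipeline stage: owns only its slice of the layer stack.
    Stage 0 would also own the embedding; the last stage, the head."""

    def __init__(self, num_layers: int, pp_rank: int, pp_size: int):
        self.is_first = pp_rank == 0
        self.is_last = pp_rank == pp_size - 1
        self.start, self.end = get_pp_layer_range(num_layers, pp_rank, pp_size)
        # Only local layers are instantiated; non-local indices are
        # skipped entirely (analogous to a "missing layer" placeholder
        # so checkpoint loading can ignore them).
        self.layers = {i: f"layer_{i}" for i in range(self.start, self.end)}

    def forward(self, hidden):
        # A real model would transform `hidden` through each local layer;
        # here we just record which layers ran to show the control flow.
        for i in range(self.start, self.end):
            hidden.append(self.layers[i])
        return hidden  # handed to the next stage (or the head if last)
```

For example, a 27-layer model split over pp=2 gives stage 0 layers 0-13 and stage 1 layers 14-26, with activations forwarded between stages at the boundary.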

Checklist

Edenzzzz (Contributor) commented May 14, 2025

@zhjc1124 Can I ask why you closed it? Is it because deepseek should use EP instead?

zhjc1124 (Author) commented May 15, 2025

@zhjc1124 Can I ask why you closed it? Is it because deepseek should use EP instead?

I saw your comment too.
At first I failed to run DeepSeek-R1 across three nodes with tp=8 pp=3, so I wondered whether some compatibility issues with the MLA backend needed to be handled.
It turned out there was a problem with my machine's NCCL configuration. Now I run DeepSeek-R1 with tp=8 and pp=3 successfully, but I am still not sure whether compatibility with the MLA backend needs further handling.
