Skip to content

Misc. bug: Reasoning content is not separated when streaming #13867

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Edremon opened this issue May 28, 2025 · 2 comments · May be fixed by #13933
Open

Misc. bug: Reasoning content is not separated when streaming #13867

Edremon opened this issue May 28, 2025 · 2 comments · May be fixed by #13933
Assignees

Comments

@Edremon
Copy link

Edremon commented May 28, 2025

Name and Version

version: 5523 (aa6dff0)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server -m models/Qwen3-30B-A3B-IQ4_XS.gguf --jinja

Problem description & steps to reproduce

Thinking content should be separated when streaming too.

@ochafik in #12379 said:

Note: Ideally, we'd stream the thoughts as a reasoning_content delta (now trivial to implement), but for now we are just aiming for compatibility w/ DeepSeek's API (if --reasoning-format deepseek, which is the default).

I just tested using the official deepseek API and thoughts are separated.

Official deepseek API:
"choices":[{"index":0,"delta":{"content":null,"reasoning_content":"Okay"},"logprobs":null,"finish_reason":null}]}
llama.cpp server API:
"choices":[{"finish_reason":null,"index":0,"delta":{"content":"<think>Okay"}}]

@ochafik
Copy link
Collaborator

ochafik commented May 29, 2025

Oh, that's an interesting development! I can't confirm the DeepSeek API change (nor contribute any code atm), but it should be easy to modify the code to support this (uncomment the reasoning_content_delta bits and disable the reasoning_in_content behaviour)

@ochafik
Copy link
Collaborator

ochafik commented May 31, 2025

Fix in progress in #13933

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants