Note: Ideally, we'd stream the thoughts as a reasoning_content delta (now trivial to implement), but for now we are just aiming for compatibility with DeepSeek's API (when --reasoning-format deepseek is set, which is the default).
I just tested using the official deepseek API and thoughts are separated.
Official deepseek API: "choices":[{"index":0,"delta":{"content":null,"reasoning_content":"Okay"},"logprobs":null,"finish_reason":null}]}
llama.cpp server API: "choices":[{"finish_reason":null,"index":0,"delta":{"content":"<think>Okay"}}]
Oh, that's an interesting development! I can't confirm the DeepSeek API change (nor contribute any code at the moment), but it should be easy to modify the code to support this: uncomment the reasoning_content_delta bits and disable the reasoning_in_content behaviour.
Name and Version
version: 5523 (aa6dff0)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu
Which llama.cpp modules do you know to be affected?
llama-server
Command line
Problem description & steps to reproduce
Thinking content should be separated into reasoning_content when streaming, too.
@ochafik in #12379 said:
I just tested using the official deepseek API and thoughts are separated.
Official deepseek API:
"choices":[{"index":0,"delta":{"content":null,"reasoning_content":"Okay"},"logprobs":null,"finish_reason":null}]}
llama.cpp server API:
"choices":[{"finish_reason":null,"index":0,"delta":{"content":"<think>Okay"}}]
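Until the server emits reasoning_content deltas itself, a client can approximate DeepSeek's streamed format by splitting the inline <think>...</think> tags out of each llama.cpp content chunk. The sketch below is an illustrative client-side shim, not llama.cpp's actual implementation; it assumes the <think> and </think> tags each arrive unsplit within a single chunk.

```python
def normalize_delta(delta, state):
    """Split one streamed llama.cpp-style delta into DeepSeek-style fields.

    `delta` is the parsed `choices[0].delta` object from an SSE chunk;
    `state` carries the "are we inside <think>?" flag across chunks.
    Returns a dict with separate `reasoning_content` and `content` keys,
    mirroring the DeepSeek delta fragment shown above.
    """
    text = delta.get("content") or ""
    reasoning, answer = [], []
    while text:
        if state["in_think"]:
            end = text.find("</think>")
            if end == -1:                      # still inside the thought block
                reasoning.append(text)
                break
            reasoning.append(text[:end])
            text = text[end + len("</think>"):]
            state["in_think"] = False
        else:
            start = text.find("<think>")
            if start == -1:                    # plain answer text
                answer.append(text)
                break
            answer.append(text[:start])
            text = text[start + len("<think>"):]
            state["in_think"] = True
    return {
        "reasoning_content": "".join(reasoning) or None,
        "content": "".join(answer) or None,
    }

state = {"in_think": False}
# The llama.cpp chunk from the example above:
print(normalize_delta({"content": "<think>Okay"}, state))
# -> {'reasoning_content': 'Okay', 'content': None}
```

Keeping the in_think flag in external state is what lets the shim work chunk-by-chunk over a stream, since a single delta may contain only part of the thought.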