createthis
Contributor

@createthis createthis commented Aug 23, 2025

This PR enables DeepSeek V3.1 thinking mode as the default. Disable with --reasoning-budget 0.

It also implements tool calling support.

Addresses #15496

My understanding is that this is a continuation of #9639 for DeepSeek V3.1 specifically.

openhands-agent and others added 14 commits August 22, 2025 13:31
- Added COMMON_CHAT_FORMAT_DEEPSEEK_V3_1 enum value
- Created common_chat_params_init_deepseek_v3_1() function (currently uses R1 implementation)
- Created common_chat_parse_deepseek_v3_1() function that handles V3.1 thinking format:
  - Extracts reasoning content before '</think>' tag into reasoning_content
  - Extracts regular content after '</think>' tag into content
  - No opening '<think>' tag in V3.1 format
- Added detection logic for V3.1 templates based on pattern: 'message['prefix'] is defined and message['prefix'] and thinking'
- Added V3.1 case to parsing switch statement

This addresses the issue where V3.1 outputs reasoning content followed by '</think>' and then regular content without the opening '<think>' tag.
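
The split described in the commit message above can be sketched roughly as follows. This is only an illustration, not the code in common/chat.cpp: `ParsedMessage`, `split_v3_1_output` and the `thinking_enabled` flag are hypothetical names, and treating output with no `</think>` yet as pure reasoning is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for illustration; not llama.cpp's message struct.
struct ParsedMessage {
    std::string reasoning_content;
    std::string content;
};

// Split V3.1-style output at the first "</think>" tag. There is no opening
// "<think>" tag to look for, so everything before the closing tag is reasoning.
// `thinking_enabled` is an assumed flag: true when the prompt left the
// reasoning block open for the model.
ParsedMessage split_v3_1_output(const std::string & output, bool thinking_enabled) {
    static const std::string close_tag = "</think>";
    ParsedMessage msg;
    if (!thinking_enabled) {
        // Non-thinking mode: the whole output is regular content.
        msg.content = output;
        return msg;
    }
    const std::size_t pos = output.find(close_tag);
    if (pos == std::string::npos) {
        // Assumption: no closing tag yet means the model is still reasoning.
        msg.reasoning_content = output;
        return msg;
    }
    msg.reasoning_content = output.substr(0, pos);
    msg.content = output.substr(pos + close_tag.size());
    return msg;
}
```

For example, `split_v3_1_output("I should greet.</think>Hello!", true)` would put `I should greet.` into `reasoning_content` and `Hello!` into `content`.
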
@createthis createthis requested a review from ngxson as a code owner August 25, 2025 01:50
@createthis createthis marked this pull request as draft August 25, 2025 03:34
@createthis createthis marked this pull request as ready for review August 25, 2025 05:42
…rimental

Strip grammar down to strictly what we expect based on model card. Throw out parts we cargo culted from R1 that don't make sense.
@createthis
Contributor Author

@CISC are we ready to merge this or do I need to make more changes?

@createthis
Contributor Author

@CISC who approves the workflows?

@ggerganov
Member

> @CISC who approves the workflows?

I think all collaborators can approve them?

Just approved this one.

@createthis
Contributor Author

I still don’t see a merge button. Do we need @ngxson to review too?

@pwilkin
Collaborator

pwilkin commented Sep 5, 2025

@createthis nope, only collaborators with write access can merge, so you need either @CISC or @ggerganov to merge it :>

@createthis createthis marked this pull request as draft September 5, 2025 21:44
DeepSeek V3.1 - Add edge case where thinking is forced open, there is tool calling in the reasoning content, but then the model just stops the output without closing the </think> tag, so it's not a partial. In this case, use the tool call in the reasoning content.
@createthis
Contributor Author

createthis commented Sep 6, 2025

I added an edge case where thinking is forced open, there is tool calling in the reasoning content, but then the model just stops the output without closing the </think> tag, so it's not a partial. In this case, use the tool call in the reasoning content, because the model appears to be confused.
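
A rough sketch of that decision, for anyone skimming: this is not the code in the PR. `pick_tool_call_source`, `ToolCallSource` and the `tool_marker` parameter are hypothetical, and the real tool-call delimiters come from the model's chat template.

```cpp
#include <cstddef>
#include <string>

// Where a tool call should be taken from (hypothetical enum for illustration).
enum class ToolCallSource { None, Content, ReasoningContent };

// Decide where to look for a tool call.
// - thinking_forced_open: the prompt left the reasoning block open.
// - is_partial: the output is still streaming, so "</think>" may yet arrive.
// - tool_marker: assumed placeholder for whatever marks the start of a tool call.
ToolCallSource pick_tool_call_source(const std::string & output,
                                     const std::string & tool_marker,
                                     bool thinking_forced_open,
                                     bool is_partial) {
    static const std::string close_tag = "</think>";
    std::size_t start = 0;
    if (thinking_forced_open) {
        const std::size_t close_pos = output.find(close_tag);
        if (close_pos == std::string::npos) {
            if (is_partial) {
                // More tokens may follow; do not commit to anything yet.
                return ToolCallSource::None;
            }
            // Edge case: output ended without "</think>". Accept a tool call
            // found inside the reasoning text.
            return output.find(tool_marker) != std::string::npos
                ? ToolCallSource::ReasoningContent
                : ToolCallSource::None;
        }
        // Normal case: only look after the closed reasoning block.
        start = close_pos + close_tag.size();
    }
    return output.find(tool_marker, start) != std::string::npos
        ? ToolCallSource::Content
        : ToolCallSource::None;
}
```

The `is_partial` check is what separates this edge case from an ordinary streaming chunk that simply has not reached `</think>` yet.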

@createthis createthis marked this pull request as ready for review September 6, 2025 04:50
@createthis
Contributor Author

@CISC @ggerganov Let me know if you want any more changes, otherwise please merge. This is working well on my end.

@createthis
Contributor Author

createthis commented Sep 8, 2025

@CISC @ggerganov I simplified update_cursor with 26b02fa. I also performed an extensive analysis of the cause. You can view my notes here: https://gist.github.com/createthis/dc3098c3abb4ff809d0291c91322f512

TL;DR: After the second function_regex call fails to find a match, it also fails to reset builder.pos(), so block_close fails. update_cursor just resets the cursor to the last position before function_regex fails.
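
To make the TL;DR concrete, here is a toy reproduction of the pattern. It is a simplified model, not the llama.cpp parser: `Cursor`, `find_regex`, `consume_literal`, `parse_calls` and the `call:name;` / `<end>` syntax are all made up for illustration. The failed search deliberately leaves the cursor at the end of the input to imitate the behavior described above.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser cursor; not the llama.cpp builder API.
struct Cursor {
    std::string input;
    std::size_t pos = 0;

    // Search for `re` starting at pos. On success, store capture group 1 and
    // move pos past the match. On failure, leave pos at the end of the input,
    // imitating a search that does not restore the cursor.
    bool find_regex(const std::regex & re, std::string & capture) {
        std::smatch m;
        const auto first = input.cbegin() + static_cast<std::ptrdiff_t>(pos);
        if (std::regex_search(first, input.cend(), m, re)) {
            capture = m[1].str();
            pos += static_cast<std::size_t>(m.position(0) + m.length(0));
            return true;
        }
        pos = input.size();
        return false;
    }

    // Consume `lit` only if it appears exactly at pos.
    bool consume_literal(const std::string & lit) {
        if (input.compare(pos, lit.size(), lit) == 0) {
            pos += lit.size();
            return true;
        }
        return false;
    }
};

// Collect function names, then expect a closing "<end>" literal.
// With update_cursor set, the cursor is restored to the position after the
// last successful match before the closing literal is checked.
std::vector<std::string> parse_calls(Cursor & cur, bool update_cursor, bool & closed) {
    static const std::regex function_regex("call:(\\w+);");
    std::vector<std::string> names;
    std::string name;
    std::size_t last_good = cur.pos;
    while (cur.find_regex(function_regex, name)) {
        names.push_back(name);
        last_good = cur.pos;
    }
    if (update_cursor) {
        cur.pos = last_good;
    }
    closed = cur.consume_literal("<end>");
    return names;
}
```

On the input `call:get_time;<end>`, the second search fails and strands the cursor at the end of the string, so the closing literal is only found when `update_cursor` is true.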

@createthis createthis changed the title from "Deepseek V3.1 thinking mode is the default" to "Deepseek V3.1 tool calling support" Sep 8, 2025
@createthis createthis changed the title from "Deepseek V3.1 tool calling support" to "Deepseek V3.1 native tool calling support (OpenAI Style)" Sep 8, 2025
createthis and others added 4 commits September 8, 2025 07:11
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@CISC CISC merged commit 8802156 into ggml-org:master Sep 8, 2025
48 checks passed
@createthis
Contributor Author

🎉🎉🎉

njsyw1997 pushed a commit to aizip/llama.cpp that referenced this pull request Sep 10, 2025
ggml-org#15533)

* Add DeepSeek V3.1 thinking mode support

- Added COMMON_CHAT_FORMAT_DEEPSEEK_V3_1 enum value
- Created common_chat_params_init_deepseek_v3_1() function (currently uses R1 implementation)
- Created common_chat_parse_deepseek_v3_1() function that handles V3.1 thinking format:
  - Extracts reasoning content before '</think>' tag into reasoning_content
  - Extracts regular content after '</think>' tag into content
  - No opening '<think>' tag in V3.1 format
- Added detection logic for V3.1 templates based on pattern: 'message['prefix'] is defined and message['prefix'] and thinking'
- Added V3.1 case to parsing switch statement

This addresses the issue where V3.1 outputs reasoning content followed by '</think>' and then regular content without the opening '<think>' tag.

* Another attempt by V3.1 non-thinking

* Fix test, but it's not asserting anything.

* Ignore vim swap files in tests dir

* Update the test

* Try using try_find_literal instead of regex

* passing test

* Revert "Try using try_find_literal instead of regex"

This reverts commit c50d887.

* Remove unnecessary change

* Remove comment

* Add code to handle non-thinking mode.

* Try to set message['prefix'] when thinking is enabled.

* This fixes reasoning, but breaks normal content. We need state in the
chat parser.

* DeepSeek V3.1 thinking is now the default. Disable with `--reasoning-budget 0`.

* Simplify (DeepSeek V3.1 reasoning)

* Fix sign inversion bug

* Add some tool calling code (not working).

* Tool calls working in non-reasoning mode.

* Attempt a unit test for tool call parsing.

* Passing test

* Add tests for both happy path and broken fenced DeepSeek V3.1 tool call variants.

* Passing DeepSeek V3.1 tool call tests, but model is not working.

* Revert assistance response prefill change. Not my monkeys.

* Add fenced_thinking unit test variant. Passes, but thinking tool calling
still isn't working for some reason.

* Tests pass in reasoning mode. Also e2e tool test passes.

* Make a copy of the parse_json_tool_calls function for deepseek-v3.1 so
as to not accidentally introduce regressions.

* Fix thinking_forced_open logic. tool calling broken. Need to add another
test case.

* That's what I get for cargo culting a newline.

* Add multi tool call test for deepseek v3.1 non-reasoning

* Move test, remove .gitignore change

* Place deepseek-v3.1 reasoning test directly into existing reasoning
function per CISC's request.

* Address whitespace CI failure.

* Merge two assert_equals per CISC's request.

* Add DeepSeek-V3.1 tests to tests/test-chat.cpp per CISC's request.

* Merge deepseek V3.1 and regular parse_json_tool_calls() function
behaviors by adding optional update_cursor argument.

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* DeepSeek V3.1 fix reasoning_format none

* Strip grammar down to strictly what we expect based on model card. Throw
out parts we cargo culted from R1 that don't make sense.

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* DeepSeek V3.1 - Add edge case where thinking is forced open, there is
tool calling in the reasoning content, but then the model just stops the
output without closing the </think> tag, so it's not a partial. In this
case, use the tool call in the reasoning content.

* DeepSeek V3.1 - simplify update_cursor

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Fix indent

---------

Co-authored-by: openhands <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
Labels: examples, server, testing