Eval bug: llama-cpp-deepseek-r1.jinja template will miss the <think> tag #12107
Comments
@Sherlock-Holo Thanks for filing this! FYI, this is currently halfway between working as intended (by the DeepSeek R1 template authors - both the latest official DS template and our llama-cpp-deepseek-r1.jinja end the prompt with a trailing `<think>` tag, so the model never emits the opening tag itself) and a known limitation (streaming does not yet extract reasoning_content).
@ochafik Did you mean I should use the official template? Or that, until you add streaming reasoning_content support, I should avoid using these templates?
@Sherlock-Holo Both, actually! Luckily the GGUF from bartowski was converted before DeepSeek updated their template to have the trailing `<think>` tag.
Probably best, yes. Hope to share a draft PR for the streaming that detects and normalizes pre-fed think tags this weekend 🤞
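For illustration only (a sketch of the kind of normalization described above, not the actual PR; the function name and shape are hypothetical): once the response text is assembled client-side, a missing opening `<think>` can be tolerated by treating everything before `</think>` as reasoning.

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split a completed response into (reasoning, content).

    Tolerates the missing opening <think> tag from this issue: when the
    template pre-feeds <think> in the prompt, the output starts directly
    with the reasoning and only the closing tag appears.
    """
    open_tag, close_tag = "<think>", "</think>"
    if close_tag not in text:
        return "", text  # no reasoning block at all
    head, _, tail = text.partition(close_tag)
    if head.startswith(open_tag):
        head = head[len(open_tag):]  # normal case: the tag was emitted
    return head.strip(), tail.strip()


# Both calls print ('I should add.', '4'):
print(split_reasoning("<think>I should add.</think>4"))
print(split_reasoning("I should add.</think>4"))
```

The normalized result is the same whether or not the opening tag was emitted, which is what makes the output usable with the buggy template.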
This issue was closed because it has been inactive for 14 days since being marked as stale.
Name and Version
```
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Tesla T4, compute capability 7.5, VMM: yes
version: 4790 (438a839)
```
Operating systems
Linux
GGML backends
CUDA
Hardware
Tesla T4
Models
GGUF deepseek-r1:14b, downloaded from Ollama
Problem description & steps to reproduce
This uses llama-cpp-deepseek-r1.jinja as the template. However, when using stream mode, the output content is missing the opening `<think>` tag while the closing `</think>` tag is still present. If the flag `--chat-template-file /root/git/llama.cpp/models/templates/llama-cpp-deepseek-r1.jinja` is removed, the problem goes away.
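A minimal way to observe the symptom (my own sketch, not from the report): stream a chat completion from llama-server's OpenAI-compatible endpoint and check which tags appear. This assumes the server's default `http://localhost:8080`; the model name is a placeholder that llama-server does not use for routing.

```python
import json

import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "deepseek-r1-14b",  # placeholder name
        "messages": [{"role": "user", "content": "What is 2 + 2?"}],
        "stream": True,
    },
    stream=True,
)

chunks = []
for line in resp.iter_lines():
    # The server streams Server-Sent Events: lines of the form "data: {...}".
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    delta = json.loads(payload)["choices"][0].get("delta", {})
    chunks.append(delta.get("content") or "")

text = "".join(chunks)
# With the template flag set, the opening tag is missing from the stream:
print("has <think>: ", "<think>" in text)   # False (the bug)
print("has </think>:", "</think>" in text)  # True
```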
First Bad Commit
No response
Relevant log output