Later in the PR chat I actually copied and pasted this:
} else if (tmpl == "deepseek" || (tmpl.find("### Instruction:") != std::string::npos && tmpl.find("<|EOT|>") != std::string::npos)) {
// deepseek-ai/deepseek-coder-33b-instruct
for (auto message : chat) {
std::string role(message->role);
if (role == "system") {
ss << message->content;
} else if (role == "user") {
ss << "### Instruction:\n" << message->content << "\n";
} else if (role == "assistant") {
ss << "### Response:\n" << message->content << "\n<|EOT|>\n";
}
}
if (add_ass) {
ss << "### Response:\n";
}
}
to:
} else if (tmpl == "alpaca" || (tmpl.find("### Instruction:") != std::string::npos && tmpl.find("<|EOT|>") == std::string::npos)) {
// deepseek-ai/deepseek-coder-33b-instruct
for (auto message : chat) {
std::string role(message->role);
if (role == "system") {
ss << message->content << "\n\n";
} else if (role == "user") {
ss << "### Instruction:\n" << message->content << "\n\n";
} else if (role == "assistant") {
ss << "### Response:\n" << message->content << "</s>\n\n";
}
}
if (add_ass) {
ss << "### Response:\n";
}
}
and the space between `<s>` and `### Instruction:` persisted.

The effect of that added space is very serious; my attempt at running phind-codellama with the deepseek chat template (with completely the wrong newlines and EOS token) actually worked much better.
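For reference, here is a minimal standalone sketch (my own approximation, not llama.cpp code; `render_deepseek` and `render_alpaca` are hypothetical helpers) that renders the same sample conversation with both formats, so the differences in newlines and the EOS token are easy to compare side by side:

```cpp
// Hypothetical standalone sketch (not llama.cpp itself): it mimics the two
// branches above so the rendered prompts can be compared side by side.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct chat_message {
    std::string role;
    std::string content;
};

// deepseek-coder style: no blank lines between turns, "<|EOT|>" after each
// assistant reply.
static std::string render_deepseek(const std::vector<chat_message> & chat, bool add_ass) {
    std::ostringstream ss;
    for (const auto & m : chat) {
        if (m.role == "system") {
            ss << m.content;
        } else if (m.role == "user") {
            ss << "### Instruction:\n" << m.content << "\n";
        } else if (m.role == "assistant") {
            ss << "### Response:\n" << m.content << "\n<|EOT|>\n";
        }
    }
    if (add_ass) {
        ss << "### Response:\n";
    }
    return ss.str();
}

// alpaca style: blank line after every turn, "</s>" after each assistant reply.
static std::string render_alpaca(const std::vector<chat_message> & chat, bool add_ass) {
    std::ostringstream ss;
    for (const auto & m : chat) {
        if (m.role == "system") {
            ss << m.content << "\n\n";
        } else if (m.role == "user") {
            ss << "### Instruction:\n" << m.content << "\n\n";
        } else if (m.role == "assistant") {
            ss << "### Response:\n" << m.content << "</s>\n\n";
        }
    }
    if (add_ass) {
        ss << "### Response:\n";
    }
    return ss.str();
}

int main() {
    const std::vector<chat_message> chat = {
        {"user", "Can you write me a C++ program to calculate logistic regression using GSL?"},
    };
    std::cout << "--- deepseek ---\n" << render_deepseek(chat, true);
    std::cout << "--- alpaca ---\n"   << render_alpaca(chat, true);
    return 0;
}
```

Running it shows that the only differences are the blank lines between turns and `</s>` vs `<|EOT|>`; neither rendering starts with a leading space.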
I'm not even sure if it is the space that is causing it now, as this is the tokenization when using the deepseek chat template:
{"tid":"140097807843328","timestamp":1716135105,"level":"VERB","function":"update_slots","line":1955,"msg":"prompt tokenized","id_slot":0,"id_task":0,"n_ctx":16384,"n_keep":0,"n_prompt_tokens":43,"prompt_tokens":"<s> ### Instruction:\nCan you write me a C++ program to calculate logistic regression using GSL? Write a short driver in main to test it with hard coded values\n### Response:\n"}
You can see this in my PR where I tried to add the 'alpaca' chat template: #7383
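To narrow down where the space comes from, a quick check is to print the raw template output before it ever reaches the tokenizer. This is only a sketch: `apply_template` is a hypothetical helper of mine, and it assumes the `llama_chat_apply_template()` signature that llama.h exposed around this time (model pointer, template name, message array, `add_ass` flag, output buffer, length).

```cpp
// Sketch only: dump the raw output of llama_chat_apply_template() before any
// tokenization, to see whether the space after <s> is already present in the
// template string or only appears once the prompt is tokenized.
// Assumes the llama.h API from around this time:
//   llama_chat_apply_template(model, tmpl, chat, n_msg, add_ass, buf, length)
// which returns the number of bytes required for the formatted prompt.
#include <cstdio>
#include <string>
#include <vector>

#include "llama.h"

// Hypothetical helper, not part of llama.cpp.
static std::string apply_template(const llama_model * model, const char * tmpl,
                                  const std::vector<llama_chat_message> & chat) {
    std::string buf(4096, '\0');
    int32_t n = llama_chat_apply_template(model, tmpl, chat.data(), chat.size(),
                                          /*add_ass=*/true, buf.data(), buf.size());
    if (n < 0) {
        return "";  // template not recognized
    }
    if ((size_t) n > buf.size()) {
        buf.resize(n);
        n = llama_chat_apply_template(model, tmpl, chat.data(), chat.size(),
                                      /*add_ass=*/true, buf.data(), buf.size());
    }
    buf.resize(n);
    return buf;
}

int main() {
    const std::vector<llama_chat_message> chat = {
        {"user", "Can you write me a C++ program to calculate logistic regression using GSL?"},
    };

    // Passing nullptr for the model and an explicit name should select the
    // built-in "deepseek" template rather than one from the GGUF metadata.
    const std::string out = apply_template(nullptr, "deepseek", chat);

    printf("raw template output: [%s]\n", out.c_str());
    printf("starts with a space: %s\n", (!out.empty() && out[0] == ' ') ? "yes" : "no");
    return 0;
}
```

If the raw string comes back starting directly with `### Instruction:` but the server still logs `<s> ### Instruction:`, that would point at the tokenizer inserting the space (e.g. an SPM prefix-space rule) rather than the template itself.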