Possible (very serious) bug in chat templates that use '<s>' token having a space added after it #7390

Closed
@jukofyork

Description

You can see this in my PR where I tried to add the 'alpaca' chat template:

#7383

Later in the PR chat I actually copied and pasted the deepseek block:

    } else if (tmpl == "deepseek" || (tmpl.find("### Instruction:") != std::string::npos && tmpl.find("<|EOT|>") != std::string::npos)) {
        // deepseek-ai/deepseek-coder-33b-instruct
        for (auto message : chat) {
            std::string role(message->role);
            if (role == "system") {
                ss << message->content;
            } else if (role == "user") {
                ss << "### Instruction:\n" << message->content << "\n";
            } else if (role == "assistant") {
                ss << "### Response:\n" << message->content << "\n<|EOT|>\n";
            }
        }
        if (add_ass) {
            ss << "### Response:\n";
        }
    }

and changed it to:

    } else if (tmpl == "alpaca" || (tmpl.find("### Instruction:") != std::string::npos && tmpl.find("<|EOT|>") == std::string::npos)) {
        // alpaca (adapted from the deepseek block above)
        for (auto message : chat) {
            std::string role(message->role);
            if (role == "system") {
                ss << message->content << "\n\n";
            } else if (role == "user") {
                ss << "### Instruction:\n" << message->content << "\n\n";
            } else if (role == "assistant") {
                ss << "### Response:\n" << message->content << "</s>\n\n";
            }
        }
        if (add_ass) {
            ss << "### Response:\n";
        }
    }

and the space between <s> and ### Instruction: persisted.

The effect of that added space is very serious: running phind-codellama with the deepseek chat template (which has completely the wrong newlines and EOS token for that model) actually worked much better than the alpaca template with the stray space.
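
To check whether the template text itself contributes the space, the alpaca branch can be rendered in isolation. The following is only a minimal sketch: `chat_msg` and `format_alpaca` are hypothetical names introduced here for illustration (they are not the project's types or functions), and the formatting logic is copied from the alpaca snippet above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the chat message type used in the snippets above.
struct chat_msg {
    std::string role;
    std::string content;
};

// Same formatting logic as the alpaca branch above, lifted into a free
// function so the rendered string can be inspected on its own.
std::string format_alpaca(const std::vector<chat_msg> & chat, bool add_ass) {
    std::ostringstream ss;
    for (const auto & message : chat) {
        if (message.role == "system") {
            ss << message.content << "\n\n";
        } else if (message.role == "user") {
            ss << "### Instruction:\n" << message.content << "\n\n";
        } else if (message.role == "assistant") {
            ss << "### Response:\n" << message.content << "</s>\n\n";
        }
    }
    if (add_ass) {
        // Open the assistant turn so the model continues from here.
        ss << "### Response:\n";
    }
    return ss.str();
}
```

For a single user message, `format_alpaca({{"user", "hi"}}, true)` starts directly with `### Instruction:` and has no leading space. That would suggest the space is introduced after the template has been applied, although this sketch cannot show where.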
