-
Notifications
You must be signed in to change notification settings - Fork 11.9k
tool-call
: allow --chat-template chatml
w/ --jinja
, default to chatml upon parsing issue, avoid double bos
#11616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tool-call
: allow --chat-template chatml
when --jinja
enabledtool-call
: allow --chat-template chatml
w/ --jinja
, default to chatml upon parsing issue, avoid double bos
Sorry somehow had forgotten half of the changes when I undrafted, should look better now. |
LOG_ERR("%s: failed to parse chat template: %s\n", __func__, e.what()); | ||
return { | ||
has_explicit_template, | ||
std::make_unique<minja::chat_template>(CHATML_TEMPLATE_SRC, token_bos, token_eos), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at some point we should no longer fallback to chatml. The fallback to chatml was a temporary solution when chat templates was not a common thing.
For example, in such case, we can return an error message like: Chat template is not supported, you must specify a custom template using --chat-template ...
when user uses /chat/completions
endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either way, it's surprising all the things we can have chatml do with a few "polyfills" (in minja)
…chatml upon parsing issue, avoid double bos (ggml-org#11616) * tool-call: allow `--jinja --chat-template chatml` * fix double bos issue (drop bos/eos tokens from jinja template) * add missing try catch around jinja parsing to default to chatml * Simplify default chatml logic
…chatml upon parsing issue, avoid double bos (ggml-org#11616) * tool-call: allow `--jinja --chat-template chatml` * fix double bos issue (drop bos/eos tokens from jinja template) * add missing try catch around jinja parsing to default to chatml * Simplify default chatml logic
…chatml upon parsing issue, avoid double bos (ggml-org#11616) * tool-call: allow `--jinja --chat-template chatml` * fix double bos issue (drop bos/eos tokens from jinja template) * add missing try catch around jinja parsing to default to chatml * Simplify default chatml logic
…chatml upon parsing issue, avoid double bos (ggml-org#11616) * tool-call: allow `--jinja --chat-template chatml` * fix double bos issue (drop bos/eos tokens from jinja template) * add missing try catch around jinja parsing to default to chatml * Simplify default chatml logic
Couple of fixes in common_chat_templates_from_model
--chat-template chatml
when--jinja
enabled:model
, not anassistant
; testing other models w/ it in slow tool call server tests).b4524
breaks serving ofgranite-code
models #11500) and default to chatml