Added Apply Chat Template #127
Conversation
Great addition. I'd like to see the user able to allocate a bigger buffer if required, plus a couple of nitpicks.
Very useful feature, can't wait!
Checking in on this. Has any of this functionality been added yet? If not, I'll make the requested changes.
There's some overlap with #194; otherwise nothing comes to mind.
The Windows build failing is fine, but could you look into why the Linux CUDA one is failing? It looks like an i8 vs. u8 pointer difference.
6f9fa32 should fix it. That is actually a really interesting error: whether c_char is i8 or u8 depends on the architecture, which is why I was not seeing the error locally: https://doc.rust-lang.org/std/os/raw/type.c_char.html
awesome, thanks! |
llama.cpp added support for chat templates a few weeks ago: ggml-org/llama.cpp#5538
This PR adds a method on the model struct to apply a chat template.
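The underlying C function (`llama_chat_apply_template`) reports the number of bytes the rendered template needs, so a caller with a too-small buffer can grow it and retry, which is the pattern the review asked for. A self-contained sketch of that pattern, with `mock_apply_template` standing in for the real FFI call (the function names and starting buffer size here are illustrative, not the PR's actual API):

```rust
// Stand-in for the FFI call: returns the byte length the rendered
// template requires, and only writes into `buf` if it already fits.
fn mock_apply_template(rendered: &str, buf: &mut [u8]) -> usize {
    let needed = rendered.len();
    if buf.len() >= needed {
        buf[..needed].copy_from_slice(rendered.as_bytes());
    }
    needed
}

// Call once with an initial guess; if the reported size exceeds the
// buffer, resize and call again, then truncate to the real length.
fn apply_with_retry(rendered: &str) -> String {
    let mut buf = vec![0u8; 8]; // deliberately small initial guess
    let needed = mock_apply_template(rendered, &mut buf);
    if needed > buf.len() {
        buf.resize(needed, 0);
        mock_apply_template(rendered, &mut buf);
    }
    buf.truncate(needed);
    String::from_utf8(buf).unwrap()
}

fn main() {
    let out = apply_with_retry("<|user|>hello<|assistant|>");
    assert_eq!(out, "<|user|>hello<|assistant|>");
    println!("{out}");
}
```

This keeps the happy path to a single call while still handling prompts longer than the initial allocation.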