We have released and open-sourced the Aquila 7B series, including AquilaChat-7B (https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila/Aquila-chat) and Aquila-7B (https://github.com/FlagAI-Open/FlagAI/tree/master/examples/Aquila/Aquila-pretrain), which support both Chinese and English.
The model architecture is almost the same as LLaMA, except that it uses a GPT-2-like BPE tokenizer.
Could the llama.cpp repo add support for our Aquila 7B models, and how should it be adapted for the BPE tokenizer? Thanks very much.
llama.cpp implements llama_tokenizer as a separate class. You can implement a GPT2Tokenizer in a similar way.
The interface today doesn't support different tokenizers; however, it should be relatively easy to fix by introducing a new setting.
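To make the suggestion concrete, here is a minimal sketch of the merge loop at the core of a GPT-2-like BPE tokenizer, plus a dispatch on a tokenizer setting. It is written in Python for brevity and is not llama.cpp code: the names `bpe_merge`, `tokenize`, `tokenizer_type`, and the toy merge table are all assumptions made for illustration, and the byte-level pre-tokenization that GPT-2 applies before merging is left out.

```python
from typing import Dict, List, Tuple

# Toy merge table (hypothetical, NOT Aquila's real merges).
# A lower rank means the merge was learned earlier and is applied first.
TOY_RANKS: Dict[Tuple[str, str], int] = {
    ("l", "o"): 0,
    ("lo", "w"): 1,
    ("e", "r"): 2,
}


def bpe_merge(word: str, ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Repeatedly apply the lowest-ranked adjacent merge until none applies."""
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge is left
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


def tokenize(word: str, tokenizer_type: str,
             ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Hypothetical dispatch on a tokenizer setting; only "bpe" is sketched."""
    if tokenizer_type == "bpe":
        return bpe_merge(word, ranks)
    raise ValueError(f"unsupported tokenizer_type: {tokenizer_type}")


print(tokenize("lower", "bpe", TOY_RANKS))
```

With the toy table, "lower" is merged step by step into "low" and "er". A real implementation would load the ranks from the model's merges file and map the resulting pieces to token ids.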
#2228
I opened a PR to support the BPE tokenizer in convert.py. @howard0su, could you please review it? Thanks.
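For readers unfamiliar with what BPE support in a conversion script involves, the following is a rough sketch of one piece of it: reading a GPT-2 style vocabulary (a JSON object mapping token string to id) and emitting the tokens in id order. It is not the code from the PR. The function name `iter_bpe_vocab` and the toy vocabulary are assumptions for illustration, and encoding each token as UTF-8 is a simplification that ignores GPT-2's byte-to-unicode mapping.

```python
import json
from typing import Iterator, Tuple

# Toy vocabulary in the GPT-2 "token -> id" JSON layout (hypothetical data).
TOY_VOCAB = '{"low": 2, "l": 0, "er": 1}'


def iter_bpe_vocab(vocab_json: str) -> Iterator[Tuple[int, bytes]]:
    """Yield (id, token bytes) pairs sorted by token id."""
    vocab = json.loads(vocab_json)
    for token, token_id in sorted(vocab.items(), key=lambda kv: kv[1]):
        # Simplification: real GPT-2 vocabularies need byte-level decoding here.
        yield token_id, token.encode("utf-8")


for token_id, text in iter_bpe_vocab(TOY_VOCAB):
    print(token_id, text)
```

Sorting by id matters because a converted model file typically stores tokens positionally, so the output order has to match the ids the model was trained with.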
This issue was closed because it has been inactive for 14 days since being marked as stale.