A chatbot for Discord using Meta's LLaMA model, 4-bit quantized. The 13-billion-parameter model fits in under 9 GiB of VRAM.
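If you want to check your GPU's headroom before setting anything up, here is a minimal sketch using PyTorch (the ~9 GiB figure is from above; the threshold check is illustrative):

```python
import torch

# Report free/total VRAM on the default CUDA device.
free_b, total_b = torch.cuda.mem_get_info()
print(f"VRAM: {free_b / 2**30:.1f} GiB free of {total_b / 2**30:.1f} GiB")
if total_b < 9 * 2**30:
    print("Under 9 GiB total VRAM: the 13B 4-bit model may not fit.")
```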
Before you do any of this, you will need a bot token. If you don't have a bot token, follow this guide to make a bot and then add the bot to your server.
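One common pattern is to keep the token in an environment variable rather than pasting it into commands; a minimal sketch (the variable name matches the `$YOUR_BOT_TOKEN` placeholder used below):

```python
import os

# Read the Discord bot token from the environment; fail early if it is unset.
token = os.environ.get("YOUR_BOT_TOKEN")
if not token:
    raise SystemExit("Set YOUR_BOT_TOKEN to the token from the Discord developer portal.")
```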
At present this is Linux only, but you may be able to get it working on other OSes.
- Make sure you have Python 3.10+, virtualenv (`pip install virtualenv`), and CUDA installed.
- Clone the bot and set up the virtual environment:

```
git clone https://github.com/AmericanPresidentJimmyCarter/yal-discord-bot/
cd yal-discord-bot
python3 -m virtualenv env
source env/bin/activate
pip install -r requirements.txt
```
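With the environment activated, a one-line check that the interpreter meets the 3.10+ requirement:

```python
import sys

# The bot requires Python 3.10+; run this inside the activated virtualenv.
assert sys.version_info >= (3, 10), f"Python 3.10+ required, found {sys.version.split()[0]}"
```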
- Set up the transformers fork, and ignore any version incompatibility errors when you do this:

```
git clone https://github.com/huggingface/transformers/
cd transformers
git checkout 20e54e49fa11172a893d046f6e7364a434cbc04f
pip install -e .
cd ..
```
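To confirm the editable install resolves to the pinned fork rather than a PyPI release, a minimal check:

```python
import transformers

# __file__ should resolve into the local ./transformers clone after `pip install -e .`.
print(transformers.__version__)
print(transformers.__file__)
```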
- Build the 4-bit CUDA kernel:

```
cd bot/llama_model
python setup_cuda.py install
cd ../..
```
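A quick smoke test that the kernel compiled and is importable; the module name `quant_cuda` is an assumption based on the GPTQ-style `setup_cuda.py`:

```python
import torch
import quant_cuda  # module name assumed from the GPTQ-style setup_cuda.py

print("CUDA available:", torch.cuda.is_available())
print("4-bit kernel imported successfully")
```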
- Download the 4-bit quantized model to somewhere local. For bigger or smaller 4-bit quantized weights, refer to this link:

```
wget https://huggingface.co/Neko-Institute-of-Science/LLaMA-13B-4bit-128g/resolve/main/llama-13b-4bit-128g.safetensors
```
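After the download finishes, you can sanity-check the file without loading the weights; a minimal sketch assuming the `safetensors` package is available in the environment:

```python
from pathlib import Path
from safetensors import safe_open

# Inspect the checkpoint header only; no tensors are loaded into memory.
ckpt = Path("llama-13b-4bit-128g.safetensors")
print(f"size: {ckpt.stat().st_size / 2**30:.2f} GiB")
with safe_open(str(ckpt), framework="pt", device="cpu") as f:
    print(f"tensor count: {len(list(f.keys()))}")
```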
- Fire up the bot:

```
cd bot
python -m bot $YOUR_BOT_TOKEN --allow-queue -g $YOUR_GUILD --llama-model="Neko-Institute-of-Science/LLaMA-13B-4bit-128g" --groupsize=128 --load-checkpoint="path/to/llama/weights/llama-13b-4bit-128g.safetensors"
```

Ensure that `$YOUR_BOT_TOKEN` and `$YOUR_GUILD` are set to what they should be, that `--load-checkpoint=...` points at the correct location of the weights, and that `--llama-model=...` points at the correct Hugging Face location for the weights' configuration.
You can use any ALPACA model by setting the `--alpaca` flag, which allows you to supply an input string in addition to the instruction and automatically formats your prompt into the form expected by ALPACA.
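For reference, the standard ALPACA prompt template looks like the sketch below (this is the Stanford Alpaca format; the bot's exact formatting may differ slightly):

```python
# Standard Stanford-Alpaca instruction/input/response template (illustrative).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    instruction="Explain what 4-bit quantization does.",
    input="Context: running a 13B model on a single consumer GPU.",
)
print(prompt)
```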
Recommended 4-bit ALPACA weights are as follows:
Or GPT4 finetuned (better coding responses, more restrictive in content):
```
cd bot
python -m bot $YOUR_BOT_TOKEN --allow-queue -g $YOUR_GUILD --alpaca --groupsize=128 --llama-model="elinas/alpaca-30b-lora-int4" --load-checkpoint="path/to/alpaca/weights/alpaca-30b-4bit-128g.safetensors"
```

(c) 2023 AmericanPresidentJimmyCarter
