This repository hosts the implementation of the persona databases project.
- Clone the repository and change to its working directory:
    git clone https://github.com/dice-group/persona-db.git
    cd persona-db
- Create and activate a Conda environment:
    conda create -n .venv python=3.11.13 --no-default-packages
    conda activate .venv
- Install pip and the dependencies:
    conda install pip
    pip3 install -r requirements.txt
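Because the environment pins Python 3.11, it can help to confirm that the active interpreter is the one from the Conda environment before running anything. The following is a minimal sketch; `matches_required` is a hypothetical helper for illustration and is not part of this repository.

```python
import sys


def matches_required(actual, required=(3, 11)):
    """Return True when the major.minor version equals the required one.

    Hypothetical helper; `actual` is a version tuple such as sys.version_info.
    """
    return tuple(actual[:2]) == required


# Warn if the active interpreter is not the Python 3.11 from the environment.
if not matches_required(sys.version_info):
    print(f"Warning: expected Python 3.11, found {sys.version.split()[0]}")
```

If the warning appears, the Conda environment is probably not activated in the current shell.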
- Configure your model and dataset split:
  - Open config.py.
  - Set MODEL_NAME to the path of your quantized model.
  - Set DATASET_SPLIT to the portion of the dataset you want to run. Running one slice at a time helps avoid reprocessing data.
  - Examples:
    - DATASET_SPLIT = "train" # Runs on the entire training set
    - DATASET_SPLIT = "train[:100]" # Runs on the first 100 samples only
    - DATASET_SPLIT = "train[101:1001]" # Runs on the samples at indices 101 through 1000
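To check which samples a given DATASET_SPLIT value selects, the slice can be expanded by hand. The sketch below assumes the bracket notation follows Python slice semantics (zero-based start, exclusive end), which matches the examples above. The helper `split_indices` and the dataset size of 5000 are hypothetical and not part of this repository.

```python
import re


def split_indices(split, total):
    """Return the sample indices selected by a split string like "train[:100]".

    Hypothetical helper for illustration; assumes Python slice semantics
    and supports only non-negative integer bounds.
    """
    match = re.fullmatch(r"\w+(?:\[(\d*):(\d*)\])?", split)
    if match is None:
        raise ValueError(f"unsupported split string: {split!r}")
    start_text, stop_text = match.group(1), match.group(2)
    # A missing bound (or no brackets at all) means "from the start" / "to the end".
    start = int(start_text) if start_text else None
    stop = int(stop_text) if stop_text else None
    # slice.indices clamps the bounds to the dataset size.
    return range(*slice(start, stop).indices(total))


TOTAL = 5000  # assumed dataset size, for illustration only
for split in ("train", "train[:100]", "train[101:1001]"):
    selected = split_indices(split, TOTAL)
    print(f"{split}: {len(selected)} samples, "
          f"indices {selected[0]} to {selected[-1]}")
```

Under this assumption, "train[:100]" covers indices 0 to 99 and "train[101:1001]" starts at index 101, so index 100 falls in neither slice. A follow-up run meant to continue exactly where the first one stopped would start its slice at 100.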
- Run the main script:
    python3 main.py