(CAP 6640 Course Project)

Evaluating Open-Source LLMs for Bengali Text Classification Across Multiple Domains


This project evaluates the performance of open-source LLMs under two prompt settings (zero-shot and few-shot) across four Bengali text classification tasks: Sentiment Analysis, Emotion Recognition, Hate Speech Detection, and Fake News Detection.

Instructions

  • To run the scripts you need Python 3.10.x. For LLM inference, both vLLM and the Huggingface Pipeline are used (a minimal backend sketch follows these setup steps).

  • If you are working in an IDE, first clone the repository (git clone <url>), then create a virtual environment and activate it.

    conda create -n NLU python=3.10.12
    conda activate NLU
    
  • Install all the dependencies.

    pip install -r requirements.txt
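
The two inference backends mentioned above expose similar interfaces. The following is only a minimal, illustrative sketch of how they are typically initialized; the model id and generation settings are placeholders, not this project's exact configuration.

from vllm import LLM, SamplingParams   # vLLM backend
from transformers import pipeline      # Huggingface Pipeline backend

model_id = "meta-llama/Llama-3.2-3B-Instruct"   # any of the evaluated instruct models

# Option 1: vLLM (fast batched generation)
llm = LLM(model=model_id)
params = SamplingParams(temperature=0.0, max_tokens=32)
vllm_outputs = llm.generate(["<prompt>"], params)
print(vllm_outputs[0].outputs[0].text)

# Option 2: Huggingface text-generation pipeline
pipe = pipeline("text-generation", model=model_id, device_map="auto")
hf_output = pipe("<prompt>", max_new_tokens=32)
print(hf_output[0]["generated_text"])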
    

Evaluated LLMs

We use the instruct versions of the models, downloaded from the Huggingface Hub.

Prompts for each task are organized in the Prompts folder.
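
For illustration only, a zero-shot prompt for the sentiment task might look roughly like the template below; the exact wording, labels, and Bengali instructions come from the files in the Prompts folder.

# Hypothetical template, not the repository's actual prompt.
zero_shot_template = (
    "You are given a Bengali text. Classify its sentiment as "
    "Positive, Negative, or Neutral. Answer with the label only.\n\n"
    "Text: {text}\n"
    "Label:"
)
prompt = zero_shot_template.format(text="...")   # "..." = the Bengali input text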

LLM inference (Examples)

To get the Llama model's responses with zero-shot prompting for a task, run the following script (the cd is only needed if you are not already in the Scripts folder):

cd Scripts

# emo = emotion recognition, zero = zero-shot prompting
python zero_few_shot.py \
--llm_id meta-llama/Llama-3.2-3B-Instruct \
--llm_name llama32-3B \
--dataset_name emo \
--prompt_type zero

To get the Qwen model's responses with few-shot prompting for a task, run the following script (again, the cd is only needed if you are not already in the Scripts folder):

cd Scripts

# hate = hate speech detection, few = few-shot prompting
python zero_few_shot.py \
--llm_id Qwen/Qwen2.5-72B-Instruct-AWQ \
--llm_name qwen-72B \
--dataset_name hate \
--prompt_type few
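
Conceptually, few-shot prompting prepends a handful of labelled demonstrations to the test instance before querying the model. A rough sketch, with purely hypothetical examples and label names (the actual demonstrations live in the Prompts folder):

# Hypothetical few-shot prompt assembly for the hate speech task.
few_shot_examples = [
    ("<Bengali example 1>", "Hate"),
    ("<Bengali example 2>", "Not Hate"),
]

def build_few_shot_prompt(text):
    # Prepend the labelled demonstrations, then append the unlabelled test text.
    demos = "\n\n".join(f"Text: {t}\nLabel: {l}" for t, l in few_shot_examples)
    return f"{demos}\n\nText: {text}\nLabel:"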

Arguments

  • --llm_id: Specify the Huggingface id of the LLM you want to use.
  • --llm_name: Specify a short name for the LLM (e.g., llama32-3B).
  • --dataset_name: Specify the dataset name (options: senti, emo, hate, or fake).
  • --prompt_type: Specify the prompting technique you want to use (options: zero, few).
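
These four arguments map naturally onto an argparse interface; the sketch below shows the assumed shape (the actual zero_few_shot.py may differ in defaults and help text).

import argparse

# Assumed argument interface; check zero_few_shot.py for the authoritative version.
parser = argparse.ArgumentParser(description="Zero-/few-shot LLM inference")
parser.add_argument("--llm_id", required=True,
                    help="Huggingface model id, e.g. meta-llama/Llama-3.2-3B-Instruct")
parser.add_argument("--llm_name", required=True,
                    help="short model name, e.g. llama32-3B")
parser.add_argument("--dataset_name", required=True,
                    choices=["senti", "emo", "hate", "fake"])
parser.add_argument("--prompt_type", required=True, choices=["zero", "few"])
args = parser.parse_args()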

You will get an Excel file in the Results/ folder that stores the responses of the corresponding LLM.
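
To inspect a result file programmatically, something like the following works; the file and column names here are hypothetical, so adjust them to what zero_few_shot.py actually writes.

import pandas as pd

# Hypothetical file and column names; adjust to the actual output.
df = pd.read_excel("Results/llama32-3B_emo_zero.xlsx")
accuracy = (df["response"].str.strip().str.lower() == df["label"].str.lower()).mean()
print(f"Accuracy: {accuracy:.3f}")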

The plot-notebook.ipynb file contains the code for visualizing the results.
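
If you want a quick plot outside the notebook, a grouped bar chart of zero-shot versus few-shot scores per task is a natural starting point; the snippet below is illustrative only, with placeholder scores to be filled in from the Results/ files.

import matplotlib.pyplot as plt

# Placeholder scores; fill these in from the Results/ files.
tasks = ["Sentiment", "Emotion", "Hate Speech", "Fake News"]
zero_shot = [0.0, 0.0, 0.0, 0.0]
few_shot = [0.0, 0.0, 0.0, 0.0]

x = range(len(tasks))
plt.bar([i - 0.2 for i in x], zero_shot, width=0.4, label="Zero-shot")
plt.bar([i + 0.2 for i in x], few_shot, width=0.4, label="Few-shot")
plt.xticks(list(x), tasks)
plt.ylabel("Score")
plt.legend()
plt.show()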

