(CAP 6640 Course Project)

Evaluating Open-Source LLMs for Bengali Text Classification Across Multiple Domains


This project evaluates the performance of open-source LLMs under two prompt settings (zero-shot and few-shot) across four Bengali text classification tasks: Sentiment Analysis, Emotion Recognition, Hate Speech Detection, and Fake News Detection.

Instructions

  • To run the scripts you need Python 3.10.x. For LLM inference, both vLLM and the Huggingface Pipeline are used (a minimal backend sketch follows these setup steps).

  • If you are working in an IDE, first clone the repository (git clone <url>), then create a virtual environment and activate it.

    conda create -n NLU python=3.10.12
    conda activate NLU
    
  • Install all the dependencies.

    pip install -r requirements.txt
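
The two inference backends mentioned above expose similar interfaces. The following is only a minimal, illustrative sketch of how they are typically initialized; the model id and generation settings are placeholders, not this project's exact configuration.

from vllm import LLM, SamplingParams   # vLLM backend
from transformers import pipeline      # Huggingface Pipeline backend

model_id = "meta-llama/Llama-3.2-3B-Instruct"   # any of the evaluated instruct models

# Option 1: vLLM (fast batched generation)
llm = LLM(model=model_id)
params = SamplingParams(temperature=0.0, max_tokens=32)
vllm_outputs = llm.generate(["<prompt>"], params)
print(vllm_outputs[0].outputs[0].text)

# Option 2: Huggingface text-generation pipeline
pipe = pipeline("text-generation", model=model_id, device_map="auto")
hf_output = pipe("<prompt>", max_new_tokens=32)
print(hf_output[0]["generated_text"])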
    

Evaluated LLMs

We use the instruct versions of the models, downloaded from the Huggingface Hub.

Prompts for each task are organized in the Prompts folder.
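
For illustration only, a zero-shot prompt for the sentiment task might look roughly like the template below; the exact wording, labels, and Bengali instructions come from the files in the Prompts folder.

# Hypothetical template, not the repository's actual prompt.
zero_shot_template = (
    "You are given a Bengali text. Classify its sentiment as "
    "Positive, Negative, or Neutral. Answer with the label only.\n\n"
    "Text: {text}\n"
    "Label:"
)
prompt = zero_shot_template.format(text="...")   # "..." = the Bengali input text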

LLM inference (Examples)

To get the Llama model's responses with zero-shot prompting for a task, run the following script (the cd is only needed if you are not already in the Scripts folder):

cd Scripts

# emo = emotion recognition, zero = zero-shot prompting
python zero_few_shot.py \
--llm_id meta-llama/Llama-3.2-3B-Instruct \
--llm_name llama32-3B \
--dataset_name emo \
--prompt_type zero

To get the Qwen model's responses with few-shot prompting for a task, run the following script (again, the cd is only needed if you are not already in the Scripts folder):

cd Scripts

# hate = hate speech detection, few = few-shot prompting
python zero_few_shot.py \
--llm_id Qwen/Qwen2.5-72B-Instruct-AWQ \
--llm_name qwen-72B \
--dataset_name hate \
--prompt_type few
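
Conceptually, few-shot prompting prepends a handful of labelled demonstrations to the test instance before querying the model. A rough sketch, with purely hypothetical examples and label names (the actual demonstrations live in the Prompts folder):

# Hypothetical few-shot prompt assembly for the hate speech task.
few_shot_examples = [
    ("<Bengali example 1>", "Hate"),
    ("<Bengali example 2>", "Not Hate"),
]

def build_few_shot_prompt(text):
    # Prepend the labelled demonstrations, then append the unlabelled test text.
    demos = "\n\n".join(f"Text: {t}\nLabel: {l}" for t, l in few_shot_examples)
    return f"{demos}\n\nText: {text}\nLabel:"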

Arguments

  • --llm_id: Specify the Huggingface id of the LLM you want to use.
  • --llm_name: Specify a short name for the LLM (e.g., llama32-3B).
  • --dataset_name: Specify the dataset name (options: senti, emo, hate, or fake).
  • --prompt_type: Specify the prompting technique you want to use (options: zero, few).
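
These four arguments map naturally onto an argparse interface; the sketch below shows the assumed shape (the actual zero_few_shot.py may differ in defaults and help text).

import argparse

# Assumed argument interface; check zero_few_shot.py for the authoritative version.
parser = argparse.ArgumentParser(description="Zero-/few-shot LLM inference")
parser.add_argument("--llm_id", required=True,
                    help="Huggingface model id, e.g. meta-llama/Llama-3.2-3B-Instruct")
parser.add_argument("--llm_name", required=True,
                    help="short model name, e.g. llama32-3B")
parser.add_argument("--dataset_name", required=True,
                    choices=["senti", "emo", "hate", "fake"])
parser.add_argument("--prompt_type", required=True, choices=["zero", "few"])
args = parser.parse_args()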

You will get an Excel file in the Results/ folder that stores the responses of the corresponding LLM.
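
To inspect a result file programmatically, something like the following works; the file and column names here are hypothetical, so adjust them to what zero_few_shot.py actually writes.

import pandas as pd

# Hypothetical file and column names; adjust to the actual output.
df = pd.read_excel("Results/llama32-3B_emo_zero.xlsx")
accuracy = (df["response"].str.strip().str.lower() == df["label"].str.lower()).mean()
print(f"Accuracy: {accuracy:.3f}")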

The plot-notebook.ipynb file contains the code for visualizing the results.
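
If you want a quick plot outside the notebook, a grouped bar chart of zero-shot versus few-shot scores per task is a natural starting point; the snippet below is illustrative only, with placeholder scores to be filled in from the Results/ files.

import matplotlib.pyplot as plt

# Placeholder scores; fill these in from the Results/ files.
tasks = ["Sentiment", "Emotion", "Hate Speech", "Fake News"]
zero_shot = [0.0, 0.0, 0.0, 0.0]
few_shot = [0.0, 0.0, 0.0, 0.0]

x = range(len(tasks))
plt.bar([i - 0.2 for i in x], zero_shot, width=0.4, label="Zero-shot")
plt.bar([i + 0.2 for i in x], few_shot, width=0.4, label="Few-shot")
plt.xticks(list(x), tasks)
plt.ylabel("Score")
plt.legend()
plt.show()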

