
🎥 Sentiment Analysis of Video Data using AI

This project uses text, audio, and visual cues extracted from a video to classify sentiment as Positive, Negative, or Neutral. It encodes features from each modality with BERT (text), an LSTM over MFCCs (audio), and a CNN (visual), then fuses them to predict the sentiment.


📌 Features

  • 🔤 Text encoding with BERT (transformers)
  • 🎧 Audio encoding with MFCC + LSTM
  • 🎞️ Visual encoding with CNN (OpenCV)
  • 🤖 Multimodal fusion for final sentiment prediction
  • 🛠 Extracts and processes audio/video automatically
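
The fusion step listed above can be sketched as a small PyTorch module that concatenates the per-modality embeddings and maps them to the three sentiment classes. The embedding dimensions and layer sizes below are illustrative assumptions, not the repository's actual architecture:

```python
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    """Late-fusion head: concatenate text, audio, and visual embeddings,
    then classify into 3 sentiments (Positive, Negative, Neutral).
    Dimensions are assumptions: 768 = BERT hidden size, 128 = LSTM
    hidden size, 512 = CNN feature size."""

    def __init__(self, text_dim=768, audio_dim=128, visual_dim=512, num_classes=3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + audio_dim + visual_dim, 256),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(256, num_classes),
        )

    def forward(self, text_emb, audio_emb, visual_emb):
        # Concatenate along the feature dimension, then classify.
        fused = torch.cat([text_emb, audio_emb, visual_emb], dim=-1)
        return self.classifier(fused)


model = FusionClassifier()
logits = model(torch.randn(2, 768), torch.randn(2, 128), torch.randn(2, 512))
print(logits.shape)  # torch.Size([2, 3])
```

Concatenation is the simplest fusion strategy; attention-based or gated fusion are common alternatives when one modality should dominate.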

📁 Project Structure


🚀 Installation

  1. Clone the repository

     ```bash
     git clone https://github.com/your-username/multimodal-sentiment-analysis.git
     cd multimodal-sentiment-analysis
     ```

  2. Create and activate a virtual environment

     ```bash
     python -m venv venv
     source venv/bin/activate  # or venv\Scripts\activate on Windows
     ```

  3. Install dependencies

     ```bash
     pip install -r requirements.txt
     ```

requirements.txt

To run the code in this repository, requirements.txt should list the following packages (`subprocess` and `os` are part of the Python standard library and do not belong in requirements.txt):

  • torch
  • torchvision
  • transformers
  • librosa
  • opencv-python
  • numpy
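
The automatic audio extraction mentioned in the features can be sketched with the standard-library `subprocess` module driving ffmpeg (assumed to be installed and on `PATH`; the file names below are placeholders):

```python
import subprocess


def extract_audio_cmd(video_path, wav_path, sample_rate=16000):
    """Build the ffmpeg command that strips the audio track from a video.

    Flags: -y overwrite output, -vn drop the video stream,
    -ac 1 downmix to mono, -ar set the sample rate.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",
        "-ac", "1",
        "-ar", str(sample_rate),
        wav_path,
    ]


cmd = extract_audio_cmd("input.mp4", "audio.wav")
print(cmd)
# To actually run it (needs ffmpeg installed):
# subprocess.run(cmd, check=True)
```

The resulting mono 16 kHz WAV can then be fed to `librosa.feature.mfcc` for the MFCC features used by the audio branch.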

Output

After uploading a video, the predicted sentiment is displayed:

image

🙋‍♂️ Author

  • 👤 Vikash Kumar