GitHub repository for an Independent Work Seminar at Princeton.
This project predicts Pokemon types and generations from their stats and physical characteristics using machine learning. So far I have implemented and compared several classification algorithms on these tasks.
Data: I am using a dataset of Pokemon scraped from https://pokemondb.net/pokedex/all, with features such as HP, Attack, Defense, Special Attack, Special Defense, Speed, Height, and Weight.
Prediction tasks:
- Primary Type Prediction
- Both Types Prediction (Multi-label classification)
- Generation Prediction
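
To make the multi-label task concrete, here is a minimal sketch of how a one-or-two-type target could be encoded as a binary indicator matrix and scored with Hamming loss. It assumes scikit-learn; the four type tuples and the simulated prediction error are illustrative only and are not taken from the project's code or results.

```python
from sklearn.metrics import hamming_loss
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative labels (assumption): one tuple per Pokemon, holding one or two types.
types = [("Grass", "Poison"), ("Fire",), ("Water",), ("Fire", "Flying")]

# Encode the type tuples as a binary matrix with one column per distinct type.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(types)  # shape: (n_pokemon, n_distinct_types)

# Simulate a prediction that misses the "Flying" label on the last Pokemon.
Y_pred = Y.copy()
flying = list(mlb.classes_).index("Flying")
Y_pred[3, flying] = 0

# Hamming loss is the fraction of individual label slots that are wrong.
loss = hamming_loss(Y, Y_pred)
print(list(mlb.classes_))
print(Y)
print(loss)
```

Primary type and generation prediction, by contrast, are ordinary single-label multiclass problems, so a plain label vector is enough for them.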
I have experimented with several machine learning models so far:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Random Forests
- Support Vector Machines (SVM) - both linear and RBF kernels
- Gradient Boosting (XGBoost)
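
As a rough illustration of how these model families can be compared side by side, the sketch below scores each one with 5-fold cross-validation. Several things here are assumptions rather than the project's actual setup: the data is synthetic (generated with `make_classification`), the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the Pokemon data (assumption): 8 numeric features, 3 classes.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (linear)": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    # Stand-in for XGBoost (assumption): scikit-learn's own gradient boosting.
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, which avoids leaking statistics from the validation fold into preprocessing.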
For each model, the general implementation steps are as follows:
- Data preprocessing and splitting
- Initial model training without cross-validation
- Cross-validation and hyperparameter tuning
- Performance evaluation using accuracy, Hamming loss (for the multi-label task), and classification reports
- Feature importance analysis
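
The steps above can be sketched end to end for a single model. This is a minimal example under stated assumptions, not the project's actual code: the data is synthetic, the feature names are simply the columns listed in the data description, a random forest is used because it exposes `feature_importances_` directly, and the parameter grid is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Column names mirror the features in the data description; the values are synthetic (assumption).
feature_names = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed", "Height", "Weight"]
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, n_classes=3, random_state=0
)

# 1. Preprocessing and splitting: hold out a stratified 20% test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 2. Initial training without cross-validation, as a baseline.
baseline = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))

# 3. Cross-validation and hyperparameter tuning over a small, arbitrary grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

# 4. Performance evaluation on the held-out test set.
y_pred = search.predict(X_test)
tuned_acc = accuracy_score(y_test, y_pred)
print(f"baseline accuracy: {baseline_acc:.3f}, tuned accuracy: {tuned_acc:.3f}")
print(classification_report(y_test, y_pred))

# 5. Feature importance from the best estimator found by the search.
importances = dict(zip(feature_names, search.best_estimator_.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

The test set is touched only in steps 2 and 4, so the hyperparameter search never sees the data used for the final evaluation.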