andreeaacotos/urban-vegetation-classification-with-catboost

Urban Vegetation Classification with CatBoost

This repository presents the methodology for classifying vegetation in urban areas using CatBoost, a powerful gradient boosting algorithm.

The project is part of the Omdena Local Chapter challenge "Standardized Comparison of Urban Green Space Mapping Through Remote Sensing for Frankfurt, Germany", which addresses the detection and mapping of small urban green spaces using machine learning and deep learning techniques.

The goal is to improve the accuracy and efficiency of urban vegetation classification by combining high-resolution satellite imagery with machine learning. The resulting maps of urban greenery are meant to support sustainable urban planning and environmental preservation.

Overview

The notebook catboost_Omdena-Frankfurt.ipynb includes the following steps:

  1. Environment Setup

    • Install the necessary Python packages, including numpy, scikit-learn, CatBoost, and rasterio.
    • Set up the DagsHub client for data downloading.
  2. Data Preparation

    • Download and load remote sensing data using DagsHub.
    • Load images and masks from the downloaded data.
    • Preprocess the images and masks for the model.
  3. Exploratory Data Analysis

    • Visualize sample images and masks to understand the dataset structure.
  4. Preprocessing and Training the CatBoost Model

    • Extract relevant bands from the masks to create a binary vegetation mask.
    • Flatten the images and prepare the target labels for binary classification.
    • Handle class imbalance by calculating class weights.
    • Split the data into training, validation, and test sets.
    • Initialize and train the CatBoost model with class weights and validation set.
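The preprocessing and training steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the single-band mask with `vegetation_value=1`, the "balanced" weighting formula, the 70/15/15 split, and all function names and hyperparameter values are assumptions made for the example. `CatBoostClassifier`, its `class_weights` parameter, and the `eval_set` argument of `fit` are part of the CatBoost API.

```python
import numpy as np


def prepare_pixels(image, mask, vegetation_value=1):
    """Flatten an (H, W, bands) image and an (H, W) mask into per-pixel X, y.

    Assumes the mask has already been reduced to one band in which
    `vegetation_value` marks vegetation (a hypothetical encoding).
    """
    X = image.reshape(-1, image.shape[-1]).astype(np.float32)
    y = (mask.reshape(-1) == vegetation_value).astype(np.int64)
    return X, y


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumes both classes occur at least once in y.
    """
    counts = np.bincount(y, minlength=2)
    return len(y) / (2.0 * counts)


def split_indices(n, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle indices and cut them into train / validation / test parts."""
    idx = np.random.default_rng(seed).permutation(n)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return idx[n_test + n_val:], idx[n_test:n_test + n_val], idx[:n_test]


def train_catboost(X_train, y_train, X_val, y_val, weights):
    """Fit a CatBoost classifier with class weights and a validation set."""
    # Imported lazily so the helpers above work without catboost installed.
    from catboost import CatBoostClassifier

    model = CatBoostClassifier(
        iterations=200,  # illustrative value, not the notebook's setting
        class_weights=[float(w) for w in weights],
        eval_metric="AUC",
        verbose=False,
    )
    model.fit(X_train, y_train, eval_set=(X_val, y_val))
    return model


# Tiny synthetic example: a 4x5 image with 3 bands, 16 vegetation pixels.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))
mask = np.zeros((4, 5), dtype=np.uint8)
mask[:, :4] = 1

X, y = prepare_pixels(image, mask)
weights = balanced_class_weights(y)
train_idx, val_idx, test_idx = split_indices(len(y))
print(X.shape, weights, len(train_idx), len(val_idx), len(test_idx))
```

With real data, `train_catboost(X[train_idx], y[train_idx], X[val_idx], y[val_idx], weights)` would then fit the model. A stratified split may be preferable when one class is rare.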
  5. Model Evaluation

    • Evaluate the trained model on the test set using precision, recall, F1-score, and ROC-AUC.
    • Visualize feature importance and the ROC curve.
    • Plot training and validation metrics over boosting iterations.

    More Details:

    The model is trained on a large dataset with tuned hyperparameters. Performance is assessed with standard classification metrics (precision, recall, F1-score, and ROC-AUC), reported separately for the vegetation and non-vegetation classes.
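To make these metrics concrete, here is a small sketch that computes them from scratch with NumPy. It is not taken from the notebook; the function names and the toy labels and scores are made up for illustration, and the functions assume every class is both present and predicted at least once.

```python
import numpy as np


def binary_report(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for labels 0 and 1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    report = {}
    for label in (0, 1):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        report[label] = (float(precision), float(recall), float(f1))
    return report


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Compares every positive with every negative,
    so it is only suitable for small arrays.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


# Toy example: 1 = vegetation, 0 = non-vegetation.
y_true = np.array([1, 1, 1, 0, 0, 1])
scores = np.array([0.9, 0.8, 0.4, 0.3, 0.6, 0.7])
y_pred = (scores >= 0.5).astype(int)

report = binary_report(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(report, auc)
```

For full images with millions of pixels, the equivalent functions in `sklearn.metrics` (for example `classification_report` and `roc_auc_score`) are the practical choice.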

    Key Results

    • Overall Accuracy: 91%
    • Precision for Vegetation: 98%
    • Precision for Non-Vegetation: 68%
    • Recall for Vegetation: 91%
    • Recall for Non-Vegetation: 92%
    • F1 score for Vegetation: 94%
    • F1 score for Non-Vegetation: 78%
    • ROC-AUC Score: 0.97

    The confusion matrix shows that most vegetation areas are correctly classified, though some non-vegetation pixels are misclassified as vegetation. These results indicate that CatBoost is a strong and reliable classifier for urban vegetation.
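As a quick consistency check, the reported F1 scores can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The snippet below only uses the figures listed above; the variable names are illustrative.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from the Key Results list.
reported = {
    "vegetation": (0.98, 0.91, 0.94),
    "non-vegetation": (0.68, 0.92, 0.78),
}

for name, (p, r, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {f1(p, r):.2f}, reported F1 = {f1_reported:.2f}")
```

Both recomputed values round to the reported ones, so the listed figures are internally consistent.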

  6. Model Inference

    • Upload test images and evaluate the model's performance on new data.
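Since the model classifies individual pixels, inference on a new image amounts to flattening it, predicting a label per pixel, and reshaping the labels back into a mask. The sketch below illustrates that idea under assumptions: `predict_mask` is a hypothetical helper, and `ThresholdModel` is a stand-in for the trained classifier so the example runs without CatBoost. Any object with a `predict` method that returns one label per row, such as a fitted `CatBoostClassifier`, could be passed instead.

```python
import numpy as np


def predict_mask(model, image):
    """Predict a binary (H, W) mask for an (H, W, bands) image."""
    h, w, bands = image.shape
    pixels = image.reshape(-1, bands)
    labels = np.asarray(model.predict(pixels)).reshape(h, w)
    return labels.astype(np.uint8)


class ThresholdModel:
    """Hypothetical stand-in for a trained classifier (not CatBoost)."""

    def predict(self, X):
        # Label a pixel as vegetation when its first band exceeds 0.5.
        return (X[:, 0] > 0.5).astype(int)


# Toy 2x3 image with 4 bands; only the first row is "vegetation".
image = np.zeros((2, 3, 4))
image[0, :, 0] = 1.0

predicted = predict_mask(ThresholdModel(), image)
print(predicted)
```

The bands must be in the same order and scaling as during training, otherwise the predictions are not meaningful.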

Notebook Details

Environment Setup

# Install necessary packages
!pip install numpy==1.25.2 scikit-learn catboost pyrsgis rasterio focal-loss segmentation_models dagshub

Conclusion

CatBoost proves to be an effective and computationally efficient approach to urban vegetation mapping, reaching 91% overall accuracy and a ROC-AUC of 0.97. The approach shows potential for real-world applications in remote sensing and urban green space monitoring.
