rishika0212/Image_segmentation_using_unet

AgriTech Semantic Segmentation

Advanced semantic segmentation for precision agriculture using UNet architecture with optimized training and comprehensive evaluation.

Overview

This project implements a state-of-the-art semantic segmentation pipeline for agricultural applications, focusing on pixel-wise classification of agricultural imagery to identify crops, soil, weeds, and infrastructure. While initially trained on Cityscapes data, the model is specifically designed for transfer to agricultural datasets with minimal adaptation.

Features

Architecture & Implementation

  • UNet Architecture: Encoder-decoder with skip connections for precise boundary detection
  • Progressive Dropout: Increasing rates (0.05→0.25) in encoder, decreasing (0.2→0.05) in decoder
  • Enhanced Output Layer: Additional refinement block before final classification
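The progressive dropout schedule can be sketched as follows. The `DoubleConv` helper, the channel widths, and the exact per-level rates between the stated endpoints are illustrative assumptions, not the repository's actual module names:

```python
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convs with BatchNorm and ReLU, plus a dropout layer
    whose rate is supplied per depth level (hypothetical helper)."""
    def __init__(self, in_ch, out_ch, p_drop):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Dropout2d(p_drop),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

# Dropout rises with encoder depth (0.05 -> 0.25) and falls again
# through the decoder (0.2 -> 0.05), regularizing the bottleneck
# most heavily while keeping the output path crisp.
ENCODER_DROPOUT = [0.05, 0.10, 0.15, 0.20, 0.25]
DECODER_DROPOUT = [0.20, 0.15, 0.10, 0.05]

encoder = nn.ModuleList(
    DoubleConv(c_in, c_out, p)
    for (c_in, c_out), p in zip(
        [(3, 64), (64, 128), (128, 256), (256, 512), (512, 1024)],
        ENCODER_DROPOUT,
    )
)
```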

Training Configuration

  • Environment & Hyperparameters: CUDA-compatible with configurable batch size, learning rate, and epochs
  • Reproducibility: Comprehensive seed setting for PyTorch, NumPy, and CUDA operations
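A typical seed-setting routine covering all three RNG sources looks like the sketch below; the exact helper name and default seed are assumptions:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Seed every RNG the pipeline touches so runs are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)           # no-op on CPU-only machines
    torch.backends.cudnn.deterministic = True  # trade speed for determinism
    torch.backends.cudnn.benchmark = False
```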

Optimization Techniques

  • Composite Loss Function: Weighted combination of CE (50%), Focal (30%), and Dice (20%) losses
  • Gradient Management: Clipping (max_norm=1.0) and accumulation (steps=2) for stability and effective larger batches
  • Learning Rate Scheduling: ReduceLROnPlateau based on validation IoU
  • Data Augmentation: Conservative color jittering, horizontal flipping, and normalization
  • Test-Time Augmentation: Horizontal flip with prediction averaging during validation/testing
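The 50/30/20 weighting above can be sketched as a single composite loss; the focal `gamma` and Dice `smooth` values here are assumed hyperparameters, not taken from the repository:

```python
import torch
import torch.nn.functional as F

def composite_loss(logits, target, gamma=2.0, smooth=1.0):
    """Weighted sum of CE (50%), focal (30%), and Dice (20%) losses.
    logits: (N, C, H, W) raw scores; target: (N, H, W) class indices."""
    ce = F.cross_entropy(logits, target)

    # Focal loss: down-weight easy pixels by (1 - p_t)^gamma.
    logp = F.log_softmax(logits, dim=1)
    logp_t = logp.gather(1, target.unsqueeze(1)).squeeze(1)
    p_t = logp_t.exp()
    focal = (-(1 - p_t) ** gamma * logp_t).mean()

    # Soft Dice over one-hot targets, averaged across classes.
    probs = logp.exp()
    one_hot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2 * inter + smooth) / (union + smooth)).mean()

    return 0.5 * ce + 0.3 * focal + 0.2 * dice
```

Because all three terms are differentiable, the weighted sum can be backpropagated as one scalar, letting CE drive overall convergence while focal and Dice terms emphasize hard pixels and region overlap respectively.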

Post-Processing

  • Small Region Removal: Elimination of isolated predictions below minimum size threshold
  • Confidence-Based Refinement: Boundary enhancement through prediction confidence analysis
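Small region removal can be sketched with connected-component labeling; the `min_size` default and the choice of class 0 as the fill value are assumptions (scikit-image's `remove_small_objects` offers similar functionality):

```python
import numpy as np
from scipy import ndimage

def remove_small_regions(mask: np.ndarray, min_size: int = 64) -> np.ndarray:
    """Replace connected components smaller than min_size pixels with
    the background class (0), removing isolated speckle predictions."""
    cleaned = mask.copy()
    for cls in np.unique(mask):
        if cls == 0:
            continue
        labeled, n = ndimage.label(mask == cls)
        sizes = np.bincount(labeled.ravel())
        # Component labels start at 1; 0 is the labeling's background.
        for comp in range(1, n + 1):
            if sizes[comp] < min_size:
                cleaned[labeled == comp] = 0
    return cleaned
```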

Comprehensive Evaluation

  • Quantitative Metrics: Full suite including IoU, mAP, precision, recall, F1, pixel accuracy, SSIM, PSNR, and MSE
  • Qualitative Results: Visual prediction analysis with side-by-side comparisons
  • Confusion Matrix: Detailed class-wise performance analysis for targeted improvements
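The confusion matrix doubles as the basis for most of these metrics; a minimal NumPy sketch (function names are illustrative):

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """Pixel-level confusion matrix: rows = ground truth, cols = prediction."""
    idx = gt.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def iou_per_class(cm):
    """IoU_c = TP / (TP + FP + FN), read straight off the matrix."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        return tp / (tp + fp + fn)

def pixel_accuracy(cm):
    return np.diag(cm).sum() / cm.sum()
```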

Requirements

  • Python 3.7+
  • PyTorch 1.7+
  • torchvision 0.8+
  • NumPy 1.19+
  • scikit-learn 0.24+
  • scikit-image 0.18+
  • matplotlib 3.3+
  • seaborn 0.11+

Install all dependencies with:

pip install -r requirements.txt

Dataset Structure

dataset/
├── leftImg8bit/    # Input images
│   ├── train/
│   ├── val/
│   └── test/
└── gtFine/         # Ground truth masks
    ├── train/
    ├── val/
    └── test/
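Under the standard Cityscapes naming convention, each input image maps to its mask by a simple path substitution; a sketch (assuming `_gtFine_labelIds.png` masks, which is the usual Cityscapes layout):

```python
from pathlib import Path

def gt_path_for(image_path: str) -> str:
    """Map a Cityscapes-style input image to its ground-truth mask, e.g.
    leftImg8bit/train/aachen/x_leftImg8bit.png ->
    gtFine/train/aachen/x_gtFine_labelIds.png."""
    p = Path(image_path)
    name = p.name.replace("_leftImg8bit.png", "_gtFine_labelIds.png")
    parts = [("gtFine" if s == "leftImg8bit" else s) for s in p.parts[:-1]]
    return str(Path(*parts) / name)
```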

Usage

Training

python train.py

Evaluation

python evaluate.py

Configuration (config.py)

BATCH_SIZE = 4
NUM_CLASSES = 34        # Cityscapes label IDs (0-33)
LEARNING_RATE = 1e-4
EPOCHS = 77
IMAGE_SIZE = (256, 512)
DATASET_PATH = "dataset"

Model Evaluation Summary

Key Improvements Over Baseline

  1. Enhanced Architecture: Progressive dropout and refined output layer
  2. Advanced Training: Composite loss, gradient accumulation, and TTA
  3. Post-Processing: Small region removal for cleaner predictions

Strengths and Limitations

Strengths:

  • Robust optimization with early stopping and adaptive learning rates
  • Comprehensive evaluation framework with multiple metrics
  • Flexibility for agricultural dataset adaptation

Limitations:

  • Currently trained on urban scenes, pending agricultural fine-tuning
  • Memory-intensive architecture for edge deployment
  • Limited agricultural-specific temporal pattern handling

Architectural Justification

Why UNet?

  1. Skip Connections: Preserve spatial details critical for precise crop/weed boundaries
  2. Multi-scale Feature Extraction: Captures individual plants, row patterns, and field layouts
  3. Efficient Training: Performs well even with limited agricultural training data
  4. Flexibility: Easily adaptable to multiple agricultural class types

Alternatives Considered

  • DeepLabV3+: More complex with higher parameter count
  • PSPNet: Better global context but loses fine details
  • Lightweight Models: Faster but significant quality reduction
  • Transformer-Based: Potentially better but requires larger datasets

Results and Visualization

Training Outputs

  • Model Checkpoints: Best and final models saved automatically
  • Training Curves: Loss and IoU progression visualization
  • Detailed Logging: Per-epoch metrics with learning rate tracking

Evaluation Outputs

  • Comprehensive Metrics Report: Detailed in evaluation_results.txt
  • Visual Predictions: Sample outputs with ground truth comparison
  • Class Performance Breakdown: Per-class metrics for targeted improvement

Future Improvements

  • Custom agricultural dataset adaptation
  • Additional agriculture-specific augmentations
  • Boundary refinement for precise crop delineation
  • Edge-optimized deployment for field use
