Fire Weather Index system for Parks Canada Agency (PEI Field Unit)
This project provides:
- Automated wildfire risk assessment across Prince Edward Island National Park
- Interactive FWI dashboard displaying current conditions at weather stations
- Station redundancy analysis to identify optimal sensor placement
- Future FWI forecasts using Environment and Climate Change Canada (ECCC) GDPS model data
Note: The dashboard displays static data generated from the pipeline. For truly live/real-time data, the pipeline must be deployed to a hosted environment with scheduled runs and API credentials configured.
Fire Weather Index Dashboard — Interactive map with FWI values and 7-day forecasts (static data, updated on manual pipeline runs)
Analysis Reports:
- Network Analysis (visuals only) — Exploratory data analysis, redundancy results, FWI validation
- Network Analysis (with code) — Full analytical notebook including code
- Redundancy Module — PCA-based station overlap analysis source code
| Output | Description |
|---|---|
| FWI Dashboard | Interactive map showing FWI, DMC, DC, ISI, BUI values at each station (static data) |
| 7-Day Forecasts | GDPS-driven FWI projections for all park weather stations |
| Redundancy Report | PCA biplot showing which stations provide overlapping vs. unique coverage |
| Cleaned Data | Quality-controlled hourly and daily weather datasets |
| Uncertainty Bounds | Probabilistic risk assessment for station removal decisions (KDE-based confidence intervals) |
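The KDE-based confidence intervals mentioned above can be illustrated with a minimal sketch. The function name `kde_interval` and the synthetic "FWI error" sample are hypothetical; the real `uncertainty.py` module may differ, but the core idea of fitting a Gaussian KDE to observed errors and reading off quantiles looks like this:

```python
# Hypothetical sketch of a KDE-based confidence interval, assuming the
# uncertainty analysis works from a sample of FWI estimation errors
# (e.g. FWI computed with vs. without a candidate station).
import numpy as np
from scipy.stats import gaussian_kde

def kde_interval(errors, level=0.95, n_draws=10_000, seed=0):
    """Central confidence interval from a Gaussian KDE fitted to errors."""
    kde = gaussian_kde(errors)
    draws = kde.resample(n_draws, seed=seed).ravel()
    lo, hi = np.percentile(draws, [(1 - level) / 2 * 100,
                                   (1 + level) / 2 * 100])
    return lo, hi

rng = np.random.default_rng(42)
errors = rng.normal(0.0, 1.5, size=500)  # stand-in for real FWI deltas
lo, hi = kde_interval(errors)
print(f"95% interval for FWI error: [{lo:.2f}, {hi:.2f}]")
```

An interval that comfortably straddles zero suggests the station can be removed without biasing the network's FWI estimates.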
Weather stations across PEI National Park:
- Cavendish
- Greenwich
- North Rustico
- Stanhope (reference station, ECCC data source)
- Stanley Bridge
- Tracadie
This project implements an end-to-end OSEMN (Obtain, Scrub, Explore, Model, iNterpret) pipeline for Fire Weather Index calculation and weather-station redundancy analysis. It processes raw station data, validates against Environment and Climate Change Canada standards, and provides both interactive dashboards and programmatic access to results.
Key components:
- Data cleaning pipeline (`src/pea_met_network/cleaning.py`) — normalization, resampling, imputation
- FWI calculation engine — standard Canadian FWI chain (FFMC → DMC → DC → ISI → BUI → FWI)
- Redundancy analysis (`src/pea_met_network/redundancy.py`) — PCA-based station overlap detection
- Forecast pipeline — GDPS model ingestion with FWI chain propagation
- Interactive dashboard — Leaflet.js visualization with static data (can be made live with scheduled deployment)
- Analysis notebook (`analysis.ipynb`) — full EDA, validation, and uncertainty quantification
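For intuition about what the FWI chain computes, here is a sketch of its final links (ISI, BUI, and FWI) using the standard Van Wagner (1987) equations. The example inputs are arbitrary; the pipeline itself relies on the vendored `cffdrs_py` implementation, which should be treated as authoritative over this illustration:

```python
import math

# Illustrative final steps of the standard FWI chain (Van Wagner 1987).
# The project uses the vendored cffdrs_py code; this is a teaching sketch.

def initial_spread_index(ffmc: float, wind_kmh: float) -> float:
    m = 147.2 * (101.0 - ffmc) / (59.5 + ffmc)      # fine-fuel moisture content
    f_wind = math.exp(0.05039 * wind_kmh)
    f_moist = 91.9 * math.exp(-0.1386 * m) * (1.0 + m**5.31 / 4.93e7)
    return 0.208 * f_wind * f_moist

def buildup_index(dmc: float, dc: float) -> float:
    if dmc <= 0.4 * dc:
        return 0.8 * dmc * dc / (dmc + 0.4 * dc)
    return dmc - (1.0 - 0.8 * dc / (dmc + 0.4 * dc)) * (0.92 + (0.0114 * dmc) ** 1.7)

def fire_weather_index(isi: float, bui: float) -> float:
    if bui <= 80.0:
        f_d = 0.626 * bui**0.809 + 2.0               # duff-moisture function
    else:
        f_d = 1000.0 / (25.0 + 108.64 * math.exp(-0.023 * bui))
    b = 0.1 * isi * f_d
    return math.exp(2.72 * (0.434 * math.log(b)) ** 0.647) if b > 1.0 else b

isi = initial_spread_index(ffmc=87.7, wind_kmh=17.0)
bui = buildup_index(dmc=8.5, dc=19.0)
fwi = fire_weather_index(isi, bui)
print(f"ISI={isi:.1f} BUI={bui:.1f} FWI={fwi:.1f}")
```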
| Stage | Implementation |
|---|---|
| Obtain | Raw station CSVs inventoried and schema-audited; GDPS model data fetched via ECCC API |
| Scrub | Ingestion, timestamp normalization, hourly/daily resampling, missing-value imputation |
| Explore | EDA notebooks, QA/QC summaries, correlation analysis |
| Model | Stanhope reference data ingestion, FWI chain execution, PCA redundancy analysis |
| iNterpret | Probabilistic uncertainty quantification, station consolidation recommendations |
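The PCA redundancy idea behind the Model and iNterpret stages can be sketched as follows. The station series here are synthetic and the real `redundancy.py` module may work differently; the point is that stations whose loading vectors nearly coincide carry overlapping information:

```python
# Hypothetical sketch of PCA-based redundancy screening, assuming daily
# station series arranged as columns of a matrix (synthetic data below).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
shared = rng.normal(size=365)                        # network-wide signal
stations = {
    "cavendish": shared + 0.1 * rng.normal(size=365),
    "greenwich": shared + 0.1 * rng.normal(size=365),  # near-duplicate
    "tracadie":  rng.normal(size=365),                 # unique signal
}
X = np.column_stack(list(stations.values()))

pca = PCA(n_components=2)
pca.fit((X - X.mean(axis=0)) / X.std(axis=0))        # standardize first

# Loadings: stations pointing the same way on PC1/PC2 overlap and are
# candidates for consolidation; isolated vectors indicate unique coverage.
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
for name, (pc1, pc2) in zip(stations, loadings):
    print(f"{name:10s} PC1={pc1:+.2f} PC2={pc2:+.2f}")
```

Here the two correlated stations land almost on top of each other in the biplot, while the independent station separates cleanly, which is exactly the pattern the redundancy report looks for.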
```bash
# Clone repository
git clone https://github.com/Cstewart-HC/pei-parks-fwi.git
cd pei-parks-fwi

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt
```

Or use the Makefile shortcut:

```bash
make install
```

The main entry point is `pea_met_network.cleaning`. It processes raw station CSVs from `data/raw/`, normalizes timestamps, resamples to hourly/daily frequencies, applies imputation, and writes cleaned datasets to `data/processed/`.
```bash
python -m pea_met_network
python -m pea_met_network --stations all
python -m pea_met_network --stations cavendish,greenwich --force
python -m pea_met_network --fwi-mode compliant
python -m pea_met_network --dry-run
```

Options: `--stations` (comma-separated or `all`), `--force` (reprocess), `--dry-run` (report only), `--fwi-mode` (`hourly`|`compliant`|`extended`), `--no-fetch` (skip downloads).

Missing raw data directories trigger clear error messages.
analysis.ipynb contains the full analytical narrative:
- Exploratory data analysis
- Redundancy analysis (PCA biplots, clustering)
- FWI logic validation
- Uncertainty quantification
```bash
jupyter lab analysis.ipynb
```

```bash
make lint   # Ruff linting
make test   # pytest test suite
make check  # Type checking + linting + tests
```

| Output | Location | Description |
|---|---|---|
| Cleaned hourly data | `data/processed/` | Quality-controlled hourly and daily resampled data |
| FWI values | `data/processed/` | Full FWI chain (FFMC → DMC → DC → ISI → BUI → FWI) |
| FWI forecasts | `data/forecasts/*_fwi_forecast.csv` | 7-day GDPS-driven projections |
| GDPS cache | `data/gdps_cache/` | Raw model data (`YYYYMMDDTHH.json` format) |
| Redundancy results | `analysis.ipynb` | PCA biplot, clustering dendrograms |
| Dashboard HTML | `dashboard/` | Standalone Leaflet.js application |
```
pei-parks-fwi/
├── .github/workflows/          # CI/CD (GitHub Actions, dashboard deploy)
├── analysis.ipynb              # Analytical narrative notebook
├── dashboard/                  # FWI geospatial dashboard
│   ├── index.html              # Main dashboard page
│   ├── analysis.html           # Notebook HTML (outputs only)
│   ├── analysis_full.html      # Notebook HTML (with code)
│   ├── redundancy.html         # Redundancy module source
│   ├── css/                    # Dashboard styles
│   ├── js/                     # Dashboard JavaScript
│   └── data/                   # Static data for dashboard
├── data/
│   ├── raw/                    # Raw station data (CSV, JSON, XLSX, XLE)
│   ├── processed/              # Pipeline output (gitignored)
│   ├── forecasts/              # FWI forecast CSVs + startup_state.json
│   └── gdps_cache/             # Cached GDPS model data
├── docs/
│   ├── cleaning-config.json    # Pipeline configuration
│   ├── data-sources.md         # Data source documentation
│   ├── fwi-forecast-plan.md    # Forecast pipeline design
│   ├── licor-ingestion-plan.md # LICOR sensor ingestion design
│   ├── working-agreement.md    # Project working agreement
│   ├── pipeline/               # Architecture documentation
│   └── specs/                  # Phase specifications (01-16)
├── notebooks/                  # Historical notebooks
├── scripts/                    # Utility and build scripts
├── src/
│   └── pea_met_network/        # Pipeline source code
│       ├── cleaning.py         # Main cleaning pipeline
│       ├── fwi.py              # FWI chain (vendored cffdrs_py wrapper)
│       ├── fwi_forecast.py     # GDPS forecast pipeline
│       ├── gdps_fetcher.py     # ECCC GDPS data fetcher
│       ├── stanhope_cache.py   # Stanhope ECCC data ingestion
│       ├── redundancy.py       # PCA redundancy analysis
│       ├── uncertainty.py      # Probabilistic uncertainty quantification
│       ├── qa_qc.py            # QA/QC checks
│       ├── imputation.py       # Missing value imputation
│       ├── cross_station_impute.py # Cross-station imputation
│       ├── validation.py       # Data validation
│       ├── vapor_pressure.py   # Vapor pressure calculations
│       └── vendor/cffdrs/      # Vendored cffdrs_py (Van Wagner FWI)
├── tests/                      # Test suite
├── AGENTS.md                   # Agent workspace rules
├── Makefile
├── README.md
├── pyproject.toml              # Package metadata (source of truth for versions)
├── requirements.txt            # Pinned runtime dependencies
└── requirements-dev.txt        # Pinned dev dependencies (pytest, ruff)
```
No required environment variables for basic pipeline operation. Forecast pipeline may use optional ECCC API credentials (configured via data/forecasts/startup_state.json).
The current dashboard displays static data. To enable live/real-time updates, you would need:
| Component | Requirement |
|---|---|
| Hosted environment | Cloud server (e.g., AWS, GCP, Azure) or on-premise machine |
| Scheduler | Cron job or GitHub Actions workflow to run pipeline hourly/daily |
| API credentials | ECCC GDPS API key (or alternative weather data provider) |
| Data publishing | Workflow to push updated CSVs to dashboard/data/ and trigger GitHub Pages deploy |
| Storage | Persistent storage for forecast cache (data/gdps_cache/) |
Example: A GitHub Actions workflow scheduled every 6 hours could:
- Fetch latest GDPS data from ECCC API
- Run FWI pipeline
- Update `dashboard/data/` files
- Commit and push to `main` branch
- Trigger automatic Pages deploy
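A workflow along those lines could be sketched as below. The file name, job name, and commit identity are illustrative assumptions, not part of the repository; a real deployment would also need any ECCC credentials wired in as repository secrets:

```yaml
# Hypothetical sketch: .github/workflows/refresh-dashboard.yml
name: Refresh FWI dashboard data
on:
  schedule:
    - cron: "0 */6 * * *"   # every 6 hours
  workflow_dispatch:

jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python -m pea_met_network --stations all
      - name: Commit updated dashboard data
        run: |
          git config user.name "fwi-bot"
          git config user.email "[email protected]"
          git add dashboard/data/
          git commit -m "chore: refresh dashboard data" || echo "no changes"
          git push
```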
pyproject.toml is the source of truth for versions. requirements.txt and requirements-dev.txt provide minimum pins.
Core:
- `pandas`, `numpy` — Data manipulation
- `scipy`, `scikit-learn` — Statistical analysis, PCA, clustering
- `matplotlib`, `seaborn` — Visualization
- `requests`, `httpx` — HTTP clients for ECCC data
- `openpyxl`, `defusedxml` — Excel/XML format support
- `arrow`, `PyYAML`, `jsonschema` — Utilities
Development:
- `pytest` — Testing (e2e, slow, integration markers defined)
- `ruff` — Linting and formatting
Note: requirements.txt contains minimum version pins; pyproject.toml defines the full dependency set including jupyter, nbconvert, prometheus_client, psutil, lark, and Jinja2.
DATA-3210: Advanced Concepts in Data β Semester Project
Client: Parks Canada Agency (PEI Field Unit)
Required themes:
- Python-based data pipeline and QA/QC
- Station redundancy analysis using PCA and/or clustering
- FWI calculation and validation
- Probabilistic uncertainty quantification
License: See project repository for license information.

Contact: For dashboard issues or questions, contact Parks Canada PEI Field Unit.