A Prometheus exporter that monitors services using Atlassian StatusPage.io status pages. This exporter periodically checks status page APIs to track service health, incidents, and maintenance windows, exposing metrics for integration with Prometheus and Grafana.
- Features
- Metrics Exposed
- Metric Caching Strategy
- Configuration
- Docker Setup
- Integration with Prometheus
- Monitoring Schedule
- Requirements
- Status Monitoring: Tracks operational status of services using the Atlassian StatusPage.io format
- Incident Tracking: Monitors active incidents with detailed metadata (ID, name, impact, affected components)
- Maintenance Windows: Tracks scheduled and active maintenance events
- Response Time Metrics: Records API response times for each status check
- Prometheus Metrics: Exposes standard Prometheus metrics on configurable port
The exporter exposes the following Prometheus metrics:
- `statuspage_service_status`: Service operational status (1=operational, 0=incident/down)
  - Labels: `service_name`
- `statuspage_response_time_seconds`: API response time in seconds
  - Labels: `service_name`
- `statuspage_incident_info`: Active incident metadata
  - Labels: `service_name`, `incident_id`, `incident_name`, `impact`, `shortlink`, `started_at`, `affected_components`
- `statuspage_maintenance_info`: Active maintenance event metadata
  - Labels: `service_name`, `maintenance_id`, `maintenance_name`, `scheduled_start`, `scheduled_end`, `shortlink`, `affected_components`
- `statuspage_component_status`: Individual component status
  - Labels: `service_name`, `component_name`
- `statuspage_component_timestamp`: Last update timestamp of a component
  - Labels: `service_name`, `component_name`
  - Values: Unix timestamp in milliseconds (for better Grafana compatibility)
- `statuspage_probe_check`: Whether all queries to the status page were successful
  - Labels: `service_name`
  - Values: 1 (all successful), 0 (at least one failed)
- `statuspage_application_timestamp`: Timestamp of the last update of the overall application status
  - Labels: `service_name`
  - Values: Unix timestamp in milliseconds (for better Grafana compatibility)
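As an illustration, the core gauges above might be declared with the `prometheus_client` library roughly as follows. This is a sketch, not the exporter's actual code; the variable names and the recorded values are illustrative:

```python
from prometheus_client import Gauge

# Gauges matching a subset of the metric names and label sets listed above.
service_status = Gauge(
    "statuspage_service_status",
    "Service operational status (1=operational, 0=incident/down)",
    ["service_name"],
)
response_time = Gauge(
    "statuspage_response_time_seconds",
    "API response time in seconds",
    ["service_name"],
)
component_status = Gauge(
    "statuspage_component_status",
    "Individual component status",
    ["service_name", "component_name"],
)

# Record one hypothetical check result.
service_status.labels(service_name="Example Service").set(1)
response_time.labels(service_name="Example Service").set(0.42)
component_status.labels(service_name="Example Service", component_name="API").set(1)
```

In the real exporter, `prometheus_client.start_http_server` would expose these gauges on the configured metrics port.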
The exporter uses intelligent caching to maintain metric freshness in Prometheus while preventing duplicate alerts. Gauges are always updated to keep metrics fresh, while the cache is used to preserve labels for existing incidents/maintenance to prevent alert churn.
All metrics are updated on every check cycle to ensure Prometheus knows they're still active and they appear correctly in Grafana dashboards:
- `statuspage_service_status`: Always updated (even if the status is unchanged) to keep metrics fresh
- `statuspage_incident_info`: Always updated for active incidents
- `statuspage_maintenance_info`: Always updated for active maintenance
- `statuspage_component_status`: Always updated for all components
- `statuspage_component_timestamp`: Always updated with the current timestamp
- `statuspage_application_timestamp`: Always updated with the current timestamp
- `statuspage_response_time_seconds`: Cleared and re-set on every run (dynamic metric)
- `statuspage_probe_check`: Always updated to reflect the current probe status
Label Preservation for Duplicate Alert Prevention:
- For existing incidents/maintenance (same ID), labels are preserved from cache to prevent duplicate alerts
- For new incidents/maintenance, current labels from the API are used
- This ensures Prometheus recognizes them as the same metric series, preventing alert re-firing
The cache is used for two purposes:
- Fallback on API failures: If an API request fails, cached data is used to maintain metric continuity
- Label preservation: Existing incident/maintenance labels are preserved to prevent duplicate alerts
Cache Update Logic:
- Cache only updates when meaningful values change (status, incident IDs, maintenance IDs, component status)
- `response_time` is excluded from the cache (not used for alerts)
- Labels for existing incidents/maintenance are preserved from the cache to maintain consistent metric series
- Cache comparison is used for logging changes, but doesn't prevent gauge updates
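The label-preservation rule can be sketched in plain Python. The function and the cache's shape here are illustrative assumptions, not the exporter's actual code:

```python
def merge_incident_labels(api_incidents, cached_incidents):
    """Preserve cached labels for incidents that are already known (same ID),
    so the metric series stays identical and alerts do not re-fire.
    Incidents not in the cache use the labels reported by the API."""
    cached_by_id = {inc["incident_id"]: inc for inc in cached_incidents}
    return [
        cached_by_id.get(inc["incident_id"], inc)  # cached labels win for known IDs
        for inc in api_incidents
    ]

# Example: the known incident keeps its cached name even though the API
# reworded it mid-incident; the new incident uses the API's labels.
cached = [{"incident_id": "abc123", "incident_name": "API outage"}]
fresh = [
    {"incident_id": "abc123", "incident_name": "API outage (investigating)"},
    {"incident_id": "xyz789", "incident_name": "CDN degradation"},
]
merged = merge_incident_labels(fresh, cached)
```

Because the labels are byte-for-byte identical across check cycles, Prometheus treats the incident as one continuous series rather than firing a new alert for a reworded title.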
Benefits:
- Metrics stay fresh in Prometheus and Grafana dashboards
- Prevents duplicate alerts by preserving labels for existing incidents/maintenance
- Maintains metric continuity even when individual API requests fail (falls back to cached data)
- Reduces unnecessary cache writes by only updating when meaningful data changes
Configure the services you want to monitor in `services.json`:
```json
{
  "service_key": {
    "url": "https://status.example.com/api/v2/summary.json",
    "name": "Example Service"
  }
}
```

Each service entry requires:

- `url`: The full URL to the StatusPage.io API summary endpoint (typically `/api/v2/summary.json`)
- `name`: Display name for the service in metrics
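Loading and validating this file can be sketched as follows; `load_services` is a hypothetical helper for illustration, not part of the exporter:

```python
import json
import os
import tempfile

def load_services(path):
    """Load services.json and verify each entry has the required keys."""
    with open(path) as f:
        services = json.load(f)
    for key, cfg in services.items():
        missing = {"url", "name"} - set(cfg)
        if missing:
            raise ValueError(f"service '{key}' is missing keys: {sorted(missing)}")
    return services

# Round-trip the example configuration through a temporary file.
example = {
    "service_key": {
        "url": "https://status.example.com/api/v2/summary.json",
        "name": "Example Service",
    }
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(example, f)
    path = f.name
services = load_services(path)
os.unlink(path)
```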
- `METRICS_PORT`: Port for the Prometheus metrics server (default: `9001`)
- `SERVICES_JSON_PATH`: Custom path to the `services.json` file (default: `/app/statuspage-exporter/services.json`)
- `CHECK_INTERVAL_MINUTES`: Interval in minutes between status checks (default: `20`)
- `DEBUG`: Enable debug logging (set to `true` to enable; default: `false`, INFO level)
- `CLEAR_CACHE`: Clear all cache files on startup (set to `true` to enable; default: `false`)
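Reading these variables with their documented defaults might look like this (a sketch; the exporter's actual parsing may differ):

```python
import os

# Each variable falls back to the documented default when unset.
METRICS_PORT = int(os.environ.get("METRICS_PORT", "9001"))
SERVICES_JSON_PATH = os.environ.get(
    "SERVICES_JSON_PATH", "/app/statuspage-exporter/services.json"
)
CHECK_INTERVAL_MINUTES = int(os.environ.get("CHECK_INTERVAL_MINUTES", "20"))
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"
CLEAR_CACHE = os.environ.get("CLEAR_CACHE", "false").lower() == "true"
```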
The easiest way to use this exporter is with the published Docker image from Docker Hub:
Required: You must mount your own `services.json` file for the exporter to work. The image includes a `services.json.example` file as a template, but you must create your own configuration file with the services you want to monitor.
Optional: Environment variables can be set to customize behavior. See the Environment Variables section above for available options and their defaults.
```shell
docker run -d \
  --name statuspage-exporter \
  -p 9001:9001 \
  -v /path/to/your/services.json:/app/statuspage-exporter/services.json \
  mcarvin8/statuspage-prometheus-exporter:latest
```

With optional environment variables:

```shell
docker run -d \
  --name statuspage-exporter \
  -p 9001:9001 \
  -v /path/to/your/services.json:/app/statuspage-exporter/services.json \
  -e CHECK_INTERVAL_MINUTES=20 \
  -e DEBUG=true \
  mcarvin8/statuspage-prometheus-exporter:latest
```

Add the exporter to your Prometheus configuration (`prometheus.yml`):
```yaml
scrape_configs:
  - job_name: 'statuspage-exporter'
    scrape_interval: 30s
    static_configs:
      - targets: ['statuspage-exporter:9001']
```

The exporter provides metrics that can be used to create Prometheus alerting rules for incidents, service status changes, and component degradation.
An example PrometheusRule manifest is provided in `prometheus/prometheusrule-example.yaml` that demonstrates:
- Incident Alerts: Alert when active incidents are detected for specific services
- Service Status Alerts: Alert on service status changes (down, maintenance)
- Component Alerts: Alert when individual components are degraded
- Performance Alerts: Alert on slow API response times
- Generic Alerts: Catch incidents across all monitored services
The example includes:
- Recording rules to aggregate incident metadata for easier alerting
- Alert rules with configurable thresholds and durations
- Template annotations with incident details (ID, name, impact, status page links)
- Customizable labels for routing to notification channels
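For illustration, an incident alert in that style might look like the following rule group. The alert name, threshold, and labels here are placeholders, and the example manifest wraps such groups in a PrometheusRule resource's `spec`:

```yaml
groups:
  - name: statuspage.rules
    rules:
      - alert: StatusPageIncidentActive
        expr: statuspage_incident_info == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Active incident on {{ $labels.service_name }}"
          description: "{{ $labels.incident_name }} (impact: {{ $labels.impact }}): {{ $labels.shortlink }}"
```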
To use the example:

- Copy `prometheus/prometheusrule-example.yaml` to your Prometheus configuration
- Update service names to match your `services.json` configuration
- Customize alert thresholds, durations, and notification channels
- Adjust metadata (namespace, labels) to match your Prometheus operator setup
- Apply the manifest: `kubectl apply -f prometheus/prometheusrule-example.yaml`
The exporter performs status checks:
- Initial check: Executes immediately on startup
- Scheduled checks: Configurable interval via the `CHECK_INTERVAL_MINUTES` environment variable (default: 20 minutes)
You can customize the check interval by setting the `CHECK_INTERVAL_MINUTES` environment variable:
```shell
# Run checks every 10 minutes
export CHECK_INTERVAL_MINUTES=10

# Run checks every 30 minutes
export CHECK_INTERVAL_MINUTES=30
```

- Python 3.6+
- Dependencies: `prometheus_client`, `requests`, `apscheduler`