monitoring and alerts #84

@gubuntu

Problem

When there is a problem or the server goes down, stakeholders should be notified automatically and not have to wait for someone to discover the issue.

From the Charlotte / Rizky email thread, 15 Jan 2018:

Actually I need your input for this one. If this happens in production like in the past, I informed Ibu Dian (BNPB) and Pak Yedi + Pak Artadi (BMKG). But since this is a staging server and mainly being used by PVMBG, is Ibu Estu the right person to inform too?

I think that it would be good if you can inform Kartoza and Geoscience Australia (me, Rikki, David) and DMI (Ivan) any time either staging or production are down. I think it would be good to contact the partners; at the moment this is BNPB, BMKG and PDC for production and PVMBG and BNPB for staging. Considering the imminent migration to production - I think we could look at expanding this list to include all agencies who will be users of the sustainable system (BNPB, BMKG, PVMBG, BNPB, PetaBencana, PDC, BPBD DKI Jakarta) and until June 2018 agencies involved in the development and maintenance (Kartoza, DMI, GA).
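
The contact lists suggested in the thread could be kept as data, so that whatever alerting tool is chosen can look up who to notify per environment. The following is only a sketch in Python: the names `MAINTAINERS`, `PARTNERS` and `recipients_for` are hypothetical, the organisation names are taken from the thread, and real contact addresses would still need to be supplied.

```python
# Hypothetical routing table built from the email thread.
# Organisation names come from the thread; the structure is an assumption.
MAINTAINERS = ["Kartoza", "Geoscience Australia", "DMI"]  # notified for any outage

PARTNERS = {
    "production": ["BNPB", "BMKG", "PDC"],
    "staging": ["PVMBG", "BNPB"],
}


def recipients_for(environment: str) -> list[str]:
    """Return the organisations to notify when `environment` is down."""
    if environment not in PARTNERS:
        raise ValueError(f"unknown environment: {environment!r}")
    # Maintainers first, then partners; duplicates dropped, order preserved.
    return list(dict.fromkeys(MAINTAINERS + PARTNERS[environment]))
```

Keeping the lists in one place would make it easier to expand them after the migration to production, as the thread suggests.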

Proposed solution

  • Sentry
  • Grafana / Prometheus
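
Independently of which tool is picked, the core of "is the server down?" is a periodic HTTP probe. A minimal, tool-agnostic sketch using only the Python standard library is shown here; it is not how Sentry or Prometheus implement their checks, and the function name `is_up`, the 5 second timeout and the "below 500 means alive" rule are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if `url` answers with an HTTP status below 500."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx; a 4xx still means the server answered.
        return exc.code < 500
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection or timeout: treat the site as down.
        return False
```

The `opener` parameter exists only so the probe can be exercised without network access; in normal use it would be left at its default.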

Also (from realtime meeting minutes):

Set up a communication protocol for reporting and discussing issues:

  • Put an alert on the website.
  • Email stakeholders, e.g. if a site is down.
  • Provide a channel for reporting an outage, e.g. Zendesk.
  • Build a status log, like https://status.github.com/messages
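
One way the status log idea from the minutes might look in code is sketched below. It records an entry only when a site changes state, so stakeholders would get one message per outage and one per recovery rather than one per check. The class `StatusLog`, its fields, and the choice to treat a never-seen site as up are all assumptions for illustration; a real log would be persisted and published.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StatusLog:
    """In-memory status log (hypothetical); entries are (time, site, state)."""

    entries: list = field(default_factory=list)
    _last_state: dict = field(default_factory=dict)

    def record(self, site: str, up: bool, when: Optional[datetime] = None) -> bool:
        """Log a state change for `site`; return True if an entry was added."""
        # A site that has never been checked is assumed to be up.
        if self._last_state.get(site, True) == up:
            return False  # no change since the last check, nothing to announce
        self._last_state[site] = up
        when = when or datetime.now(timezone.utc)
        self.entries.append((when, site, "up" if up else "DOWN"))
        return True
```

A caller could send the email or update the website alert whenever `record` returns `True`.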

The solution should be built into Kartoza's standard orchestration.