Why does the checkpoint mechanism exist?

Phillip Rafail Papadakis edited this page Jan 18, 2025 · 1 revision

This wiki page explains how the checkpoint mechanism embedded in the agent is used and why it is important.

1. Purpose

  • Improves efficiency by fetching only the events recorded since the last successful data push.
  • Minimizes unnecessary data transfer and processing overhead.

2. Workflow

  • Initialization:
    • On startup, the agent checks if checkpoint.json exists:
      • If it exists, it reads the file to retrieve the stored timestamps.
      • If it does not exist, it creates a new checkpoint.json file and initializes it.
  • Data Retrieval:
    • For each bucket, the agent retrieves events using the timestamp stored in checkpoint.json as the start query parameter in the ActivityWatch REST API.
    • If no timestamp exists for a bucket, the start parameter is omitted and all of the bucket's events are fetched.
  • Data Push and Update:
    • After successfully pushing events to Prometheus, the agent updates the corresponding bucket's timestamp in checkpoint.json.
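
The three workflow steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the agent's actual code: the function names (load_or_init_checkpoint, query_params, sync_bucket) are hypothetical, the fetch and push callables stand in for the ActivityWatch REST call and the Prometheus push, and each event is assumed to carry an ISO 8601 "timestamp" field with a UTC offset.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Callable


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def load_or_init_checkpoint(path: Path) -> dict[str, str]:
    """Initialization: read stored timestamps, or create an empty checkpoint file."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    path.write_text("{}", encoding="utf-8")
    return {}


def query_params(checkpoint: dict[str, str], bucket_id: str) -> dict[str, str]:
    """Data retrieval: use the stored timestamp as 'start', or no filter at all."""
    ts = checkpoint.get(bucket_id)
    return {"start": ts} if ts else {}


def sync_bucket(bucket_id: str, checkpoint: dict[str, str], path: Path,
                fetch: Callable[[str, dict], list], push: Callable[[str, list], None]) -> int:
    """Fetch new events, push them, and only then advance the checkpoint."""
    events = fetch(bucket_id, query_params(checkpoint, bucket_id))
    if not events:
        return 0
    push(bucket_id, events)  # if this raises, the checkpoint stays unchanged
    # Assumption: every event timestamp carries a UTC offset, so they compare safely.
    newest = max(events, key=lambda e: parse_ts(e["timestamp"]))
    checkpoint[bucket_id] = newest["timestamp"]
    path.write_text(json.dumps(checkpoint, indent=4), encoding="utf-8")
    return len(events)


def demo() -> dict[str, str]:
    """Run two sync cycles against in-memory stand-ins for the API and Prometheus."""
    stored = [{"timestamp": "2024-11-25T15:30:00Z"}, {"timestamp": "2024-11-25T15:45:00Z"}]
    pushed = []

    def fake_fetch(bucket_id, params):
        # Stand-in only: returns events strictly newer than 'start'.
        start = params.get("start")
        return [e for e in stored
                if start is None or parse_ts(e["timestamp"]) > parse_ts(start)]

    def fake_push(bucket_id, events):
        pushed.extend(events)

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "checkpoint.json"
        checkpoint = load_or_init_checkpoint(path)
        sync_bucket("bucket1", checkpoint, path, fake_fetch, fake_push)  # fetches both events
        sync_bucket("bucket1", checkpoint, path, fake_fetch, fake_push)  # nothing new to fetch
        return json.loads(path.read_text(encoding="utf-8"))
```

Updating the checkpoint only after push returns means a failed push is retried from the same timestamp on the next run. Whether the real API treats start as inclusive is not stated here; if it does, the boundary event may be fetched again, and handling that is outside this sketch.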

3. Fail-safe

  • If the checkpoint mechanism is not used, the agent defaults to retrieving all data. While less efficient, this ensures functionality even in cases where checkpoint.json is unavailable or corrupted.
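
One way the fail-safe could look in code is a loader that falls back to an empty checkpoint whenever checkpoint.json is missing, unreadable, or malformed, so that every bucket is fetched in full. The sketch below is an assumption about how this might be done, not the agent's implementation; parse_checkpoint and load_checkpoint_or_empty are hypothetical names.

```python
import json
from pathlib import Path


def parse_checkpoint(text: str) -> dict[str, str]:
    """Return the stored timestamps, or {} if the text is not a usable checkpoint."""
    try:
        data = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return {}
    if not isinstance(data, dict):
        return {}
    # Drop entries whose value is not a string, since they cannot be timestamps.
    return {bucket: ts for bucket, ts in data.items() if isinstance(ts, str)}


def load_checkpoint_or_empty(path: Path) -> dict[str, str]:
    """Read checkpoint.json; an empty result makes the agent fetch all data."""
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError):  # missing, unreadable, or not valid UTF-8
        return {}
    return parse_checkpoint(text)
```

With an empty dictionary, a lookup for any bucket finds no timestamp, which is exactly the "fetch everything" path described in the workflow.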

4. JSON Structure (Example)

{
    "bucket1": "2024-11-25T15:30:00Z",
    "bucket2": "2024-11-25T15:45:00Z",
    "bucket3": "2024-11-25T16:00:00Z"
}
  • Keys: Bucket IDs.
  • Values: ISO 8601 formatted timestamps of the last successfully pushed events.
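
For illustration, the timestamp values in the example could be converted to and from Python datetime objects as shown below. The helper names are hypothetical, and the sketch assumes the second-precision UTC format ending in "Z" that the example uses.

```python
from datetime import datetime, timezone


def from_checkpoint_value(value: str) -> datetime:
    """Parse a checkpoint value such as '2024-11-25T15:30:00Z' into an aware UTC datetime."""
    # datetime.fromisoformat accepts a trailing 'Z' only from Python 3.11 on,
    # so it is rewritten as an explicit offset first.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {value!r}")
    return dt.astimezone(timezone.utc)


def to_checkpoint_value(dt: datetime) -> str:
    """Format an aware datetime in the example's style (UTC, seconds, 'Z' suffix)."""
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    # Note: sub-second precision is dropped by this format string.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Normalizing to UTC before writing keeps all values in checkpoint.json directly comparable.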

5. Advantages

  • Efficiency: Retrieves only events newer than the stored checkpoint, reducing data transfer and processing time.
  • Flexibility: The system can function without the checkpoint mechanism if needed, albeit less efficiently.