This code infers actual travel times between two KMB (Hong Kong) bus stops by watching live ETAs, detecting when the front-most bus “hands off” to the next one, and pairing arrivals at Stop A and Stop B to compute A→B trip durations. Results are appended to CSV so you can analyze rush-hour patterns, reliability, and variability (see sample chart in output.png).

Features:

- Automatic route discovery: Finds every `(route, direction, service_type)` that serves `STOP_A` then `STOP_B` in order, using KMB’s open-data endpoints, and caches the discovery to `route_configs_*.csv`.
- ETA front-runner tracking: Tracks the front-most bus (`eta_seq == 1`) at each stop and emits an arrival timestamp whenever that identity switches to the next bus (a “handoff”).
- Robust pairing: Pairs A-arrivals to later B-arrivals per `(route, dir, service_type)` to compute travel seconds/minutes, writing each observation to a long-lived CSV.
- Graceful, durable logging: Files are opened once and flushed as data arrives; press Ctrl+C to stop safely.
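
The in-order check behind route discovery can be sketched as a pure function over route-stop records. This is a minimal illustration, not the script's actual code: the function name `discover_configs` and the record keys (`route`, `bound`, `service_type`, `seq`, `stop`) are assumptions about how the route-stop data is shaped.

```python
from collections import defaultdict


def discover_configs(route_stops, stop_a, stop_b):
    """Return (route, dir, service_type, seqA, seqB) tuples where stop_a precedes stop_b.

    `route_stops` is an iterable of dicts; the key names used here are assumed.
    """
    # Collect the stop sequence numbers for each (route, direction, service_type).
    seqs = defaultdict(dict)
    for rec in route_stops:
        key = (rec["route"], rec["bound"], rec["service_type"])
        # Keep the first occurrence if a stop appears more than once on a variant.
        seqs[key].setdefault(rec["stop"], int(rec["seq"]))

    configs = []
    for key, by_stop in sorted(seqs.items()):
        # Keep only variants that visit both stops, with A strictly before B.
        if stop_a in by_stop and stop_b in by_stop and by_stop[stop_a] < by_stop[stop_b]:
            configs.append((*key, by_stop[stop_a], by_stop[stop_b]))
    return configs
```

A variant that visits B before A (for example the opposite direction of the same route) is filtered out by the `seqA < seqB` comparison.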

How it works:

- Discover configs that include both stops in order (`seqA < seqB`) for each route/direction/service type. These are saved/loaded via `route_configs_{STOP_A}_{STOP_B}.csv`.
- Create a `FrontWatcher` for Stop A and Stop B for each config. When the top ETA identity changes beyond a tolerance, we infer the previous bus has arrived and log the timestamp.
- FIFO pair: For each config, match the earliest unpaired A-arrival with the next B-arrival to compute `travel_seconds`/`travel_minutes`, then append to `travel_times_*.csv`.
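
The FIFO pairing step can be sketched with one queue of unpaired A-arrivals per config. The class name `FifoPairer` and its methods are hypothetical and only illustrate the idea; the script's own bookkeeping may differ.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone


class FifoPairer:
    """Pair each B-arrival with the earliest unpaired A-arrival of the same config."""

    def __init__(self):
        # One FIFO queue of pending A-arrival times per (route, dir, service_type).
        self.pending = defaultdict(deque)

    def on_arrival_a(self, key, arrived_at):
        self.pending[key].append(arrived_at)

    def on_arrival_b(self, key, arrived_at):
        """Return travel seconds for the matched pair, or None if nothing can be paired."""
        queue = self.pending[key]
        # Only pair when the oldest A-arrival really happened before this B-arrival.
        if queue and queue[0] < arrived_at:
            return (arrived_at - queue.popleft()).total_seconds()
        return None


# Example: two buses pass Stop A, then reach Stop B in the same order.
pairer = FifoPairer()
config = ("1A", "O", "1")
start = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
pairer.on_arrival_a(config, start)
pairer.on_arrival_a(config, start + timedelta(minutes=5))
first_trip = pairer.on_arrival_b(config, start + timedelta(minutes=12))   # 720.0 seconds
second_trip = pairer.on_arrival_b(config, start + timedelta(minutes=18))  # 780.0 seconds
```

FIFO matching assumes buses on the same config do not overtake each other between the two stops; an overtaking bus would be attributed the wrong start time.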

Requirements:

- Python 3.9+ and the `requests` library.

Install: `python -m pip install requests`

- Edit stops: In `main.py`, set `STOP_A` and `STOP_B` to your two KMB stop IDs and names (lat/long are optional metadata).
- Run the logger: `python main.py`
- Let it run through the periods you care about (e.g., weekday mornings). Stop with Ctrl+C.
- Explore the generated CSVs or open the included notebook/script to chart distributions (see `output.png` for an example visualization).

Configuration:

- `POLL_SECONDS`: how often to poll KMB ETAs. Default 20s (KMB updates roughly once per minute).
- `HANDOFF_TOL_SECS`: if the top ETA’s timestamp jumps by more than this when the front bus changes, treat it as a real swap/arrival. Default 60s.
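
To make the role of `HANDOFF_TOL_SECS` concrete, here is a minimal sketch of tolerance-based handoff detection. The class name `HandoffDetector` is hypothetical, and using the poll time as the inferred arrival time is an assumption; the script's `FrontWatcher` may choose the timestamp differently.

```python
from datetime import datetime, timedelta, timezone

HANDOFF_TOL_SECS = 60  # default quoted above


class HandoffDetector:
    """Flag an arrival when the front-most ETA jumps forward by more than a tolerance."""

    def __init__(self, tol_secs=HANDOFF_TOL_SECS):
        self.tol = timedelta(seconds=tol_secs)
        self.last_eta = None  # front-most ETA seen on the previous useful poll

    def update(self, now, front_eta):
        """Feed one poll; return the inferred arrival time, or None if no handoff."""
        if front_eta is None:
            return None  # no prediction this poll; keep the previous state
        prev, self.last_eta = self.last_eta, front_eta
        # Small drifts are ordinary ETA revisions for the same bus. A forward jump
        # larger than the tolerance suggests the front bus has been replaced.
        if prev is not None and front_eta - prev > self.tol:
            return now  # assumption: the poll time stands in for the arrival time
        return None
```

With a 20-second poll interval, an arrival inferred this way can lag the true arrival by up to one poll plus the feed's own update delay.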
The script uses the official KMB open-data APIs for routes, route stops, and stop ETAs (configured at the top of the file).

Output files:

- Route config cache: `route_configs_{STOP_A}_{STOP_B}.csv`, containing all `(route, dir, service_type, seqA, seqB)` serving A→B.
- Arrivals (per stop): `arrivals_{stop_id}.csv` with columns: `logged_at_utc, route, dir, service_type, stop_id, stop_name, seq, arrived_at_utc`
- Travel times (A→B): `travel_times_{STOP_A}_{STOP_B}.csv` with columns: `logged_at_utc, route, dir, service_type, from_stop_id, from_stop_name, to_stop_id, to_stop_name, seq_from, seq_to, arrive_from_utc, arrive_to_utc, travel_seconds, travel_minutes`
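
As a starting point for analysis, the travel-times CSV can be summarized by hour of day using only the standard library. This sketch assumes the `*_utc` columns hold ISO 8601 timestamps that `datetime.fromisoformat` can parse (the actual format written by the script is not confirmed here), and it treats naive timestamps as UTC. The function name `median_minutes_by_hour` is hypothetical.

```python
import csv
import statistics
from collections import defaultdict
from datetime import datetime, timedelta, timezone

HKT = timezone(timedelta(hours=8))  # Hong Kong Time, fixed UTC+8 offset


def median_minutes_by_hour(rows):
    """Map Hong Kong hour of day (0-23) to the median travel_minutes of trips starting then."""
    buckets = defaultdict(list)
    for row in rows:
        started = datetime.fromisoformat(row["arrive_from_utc"])
        if started.tzinfo is None:
            # Assumption: timestamps without an offset are already in UTC.
            started = started.replace(tzinfo=timezone.utc)
        buckets[started.astimezone(HKT).hour].append(float(row["travel_minutes"]))
    return {hour: statistics.median(vals) for hour, vals in sorted(buckets.items())}


def load_summary(path):
    """Read a travel-times CSV from `path` and summarize it by hour."""
    with open(path, newline="", encoding="utf-8") as f:
        return median_minutes_by_hour(csv.DictReader(f))
```

Calling `load_summary` on your own `travel_times_{STOP_A}_{STOP_B}.csv` returns a dictionary such as hour 8 mapped to the median morning-peak trip time, which is a quick way to compare rush-hour and off-peak periods before charting.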
