Description
I was reading this DoorDash blog and came across the term isochrone.
In our context, an isochrone is a polygon of the area a driver can reach in a given amount of time.
Currently, we're querying for points in nearby k-rings at fine resolutions (which is roughly a concentric-circle approach: points in smaller rings are ranked higher). These rings are really isochores ("same space"), since they consider distance rather than duration.
What if 💭
What if we could avoid the Distance Matrix call to Google Maps (127ms) or OSRM (680ms) entirely?
What if we pre-computed and cached the distance matrix such that we could say what the duration was between any two hex cells?
Importance of indexing
The set of latitude/longitude points (real numbers) is uncountably infinite. We can use geospatial indexing to transform this space into something finite and countable. That's where something like H3 comes in. Our isochrones will be stored as a list of cell indexes, not as a polygon. The cache key is the geohash of the origin.
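To illustrate the indexing idea, here is a minimal sketch using a trivial square grid. H3 does this with hexagons at fixed resolutions (real code would call something like `h3.latlng_to_cell(lat, lng, res)` from the h3-py library); the point is just that quantization maps a continuum of lat/lng points onto a finite, countable set of cell indexes, which can then serve as cache keys.

```python
def to_cell(lat: float, lng: float, res_deg: float = 0.01) -> tuple:
    """Quantize a lat/lng point to a discrete grid cell index.

    Illustration only: H3 uses hexagonal cells, not squares, but the
    principle is the same -- nearby points collapse to the same index.
    """
    return (int(lat // res_deg), int(lng // res_deg))

# Two points ~25m apart land in the same cell, so they share one cache key.
a = to_cell(40.7081, -73.9571)
b = to_cell(40.7083, -73.9569)
assert a == b
```

Because nearby drivers resolve to the same cell, one cached isochrone serves all of them.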
Williamsburg, Brooklyn consists of ~49 hex cells at Resolution 9.
Brooklyn consists of ~2500 of those cells.
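The cell counts used throughout this post fall out of two H3 facts (per the H3 docs): each hexagonal cell has 7 children at the next-finer resolution (pentagon cells, which are rare, have 6), and a filled k-ring of radius k contains 1 + 3k(k+1) cells.

```python
def descendants(parent_res: int, child_res: int) -> int:
    """Descendant cells when refining by (child_res - parent_res) levels."""
    return 7 ** (child_res - parent_res)

def k_ring_size(k: int) -> int:
    """Cells in a filled k-ring of radius k, including the center cell."""
    return 1 + 3 * k * (k + 1)

assert descendants(7, 9) == 49                    # one res-7 cell -> 49 res-9 cells
assert k_ring_size(1) == 7                        # center + 6 neighbors
assert k_ring_size(1) * descendants(7, 9) == 343  # res-9 cells per k=1 ring at res 7
assert 343 ** 2 == 117_649                        # pairwise matrix size
```

This is where the 343 and 117,000+ figures in the next section come from.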
Algorithm for Building the Matrix
For each k=1 k-ring at Resolution 7...
Get the pairwise Distance Matrix for all 343 descendant Resolution 9 cells within the k-ring...
This results in 117,649 (343²) pairwise durations.
Finally, persist these durations in a datastore (e.g., Postgres). Also persist each hex cell in the appropriate isochrone bucket by duration (1m, 2m, 3m, 4m, 5m, etc.), e.g., "the sub-3-minute isochrone from Cell A (res 9) is the list of all of these other res-9 cells".
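The bucketing step above might look like the following sketch. It assumes we already have pairwise durations in seconds (e.g., from OSRM's `/table` service); the cell ids and shapes here are placeholders, not a real schema.

```python
from collections import defaultdict

# Hypothetical input: pairwise durations in seconds between res-9 cells.
durations = {
    ("a", "b"): 55,    # under 1 minute
    ("a", "c"): 170,   # under 3 minutes
    ("a", "d"): 260,   # under 5 minutes
}

def build_isochrones(durations, buckets_min=(1, 2, 3, 4, 5)):
    """For each origin cell, collect the cells reachable within each bucket."""
    iso = defaultdict(lambda: {m: set() for m in buckets_min})
    for (origin, dest), secs in durations.items():
        for m in buckets_min:
            if secs <= m * 60:
                iso[origin][m].add(dest)
                break  # store each dest in its smallest bucket only
    return iso

iso = build_isochrones(durations)
assert iso["a"][1] == {"b"}
assert iso["a"][3] == {"c"}
assert iso["a"][5] == {"d"}
```

Storing each destination only in its smallest bucket keeps rows small; the full sub-3-minute isochrone is then the union of the 1m, 2m, and 3m buckets.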
Query time
Take the example of querying nearby trips for a driver. In this case, the driver is the origin.
Instead of going to Google Maps for a distance matrix (which costs money and time, and isn't something we can legally cache), we first check whether we've cached an isochrone for the driver's location. If we have one, we can rank based on the isochrone time (e.g., 5m, 10m, etc.). What we need for this to work is a point-in-polygon function; since our isochrones are stored as lists of cell indexes rather than polygons, that reduces to checking whether a trip's cell is in the list.
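A sketch of that query path, assuming cached isochrones are shaped like `{origin_cell: {minutes: set_of_cells}}` (an illustrative layout, not a real schema):

```python
# "Point-in-polygon" over hex-cell isochrones is just a set membership check.
cached = {"driver_cell": {5: {"x", "y"}, 10: {"z"}}}

def rank_trips(origin_cell, trip_cells, cache):
    """Return trips sorted by the smallest isochrone bucket containing them."""
    iso = cache.get(origin_cell)
    if iso is None:
        return None  # cache miss: fall back to the Distance Matrix API
    ranked = []
    for trip in trip_cells:
        bucket = min((m for m, cells in iso.items() if trip in cells),
                     default=float("inf"))  # not reachable in any bucket
        ranked.append((bucket, trip))
    return [t for _, t in sorted(ranked)]

assert rank_trips("driver_cell", ["z", "x"], cached) == ["x", "z"]
```

On a cache miss we can still call the external API, then backfill the cache for that origin cell.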
Accounting for traffic
If we want a final sorting/ranking pass that factors in traffic, that's also possible.
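One cheap way to do that (an assumption on my part, not a worked design): keep the cached free-flow durations for the coarse isochrone filter, then apply a live congestion multiplier only to the small candidate set that survives it. Names here are illustrative.

```python
def rerank_with_traffic(candidates, multipliers):
    """candidates: [(trip_id, free_flow_secs)]; returns trips by adjusted ETA.

    multipliers would come from a live traffic feed, keyed by trip or route
    segment; anything missing defaults to free-flow (1.0).
    """
    adjusted = [(secs * multipliers.get(trip, 1.0), trip)
                for trip, secs in candidates]
    return [t for _, t in sorted(adjusted)]

# Trip "b" is nominally closer, but congestion flips the order.
order = rerank_with_traffic([("a", 200), ("b", 150)], {"b": 1.8})
assert order == ["a", "b"]
```

Since the re-rank only touches a handful of candidates, the expensive live-traffic lookup stays off the hot path.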
Running an OSRM server in Docker
The demo OSRM endpoint's terms of use limit you to 1 request per second. To actually do this preprocessing, it would be ideal to have a high-resource machine with lots of memory and compute power.
It takes significantly more RAM to process the data than it does to host it. As long as you keep the operating systems/machine architectures the same, you can rent a huge box for a short period to generate the routing files, then move them to a smaller box to actually perform route calculations on.
We can run the osrm-backend container.
First, download NY OSM data from Geofabrik. (This doesn't include any parts of NJ across the river, or any parts of CT.)
wget http://download.geofabrik.de/north-america/us/new-york-latest.osm.pbf
Pre-process the extract with the car profile:
docker run --platform linux/amd64 \
-t \
-v "${PWD}:/data" \
osrm/osrm-backend osrm-extract \
-p /opt/car.lua \
/data/new-york-latest.osm.pbf
I needed the --platform flag on an M1 Mac.
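The extract step alone doesn't start a server. OSRM also needs a graph-preparation pass before `osrm-routed` can serve requests on port 5000; a sketch using the contraction-hierarchies pipeline (file names assume the extract above; the MLD pipeline with `osrm-partition`/`osrm-customize` is the alternative):

```shell
# Build the contraction hierarchy over the extracted graph
docker run --platform linux/amd64 -t -v "${PWD}:/data" \
  osrm/osrm-backend osrm-contract /data/new-york-latest.osrm

# Start the routing engine HTTP server on port 5000
docker run --platform linux/amd64 -t -i -p 5000:5000 -v "${PWD}:/data" \
  osrm/osrm-backend osrm-routed /data/new-york-latest.osrm

# Smoke test: pairwise durations (seconds) via the table service.
# Note OSRM takes coordinates as lon,lat.
curl "http://localhost:5000/table/v1/driving/-73.9571,40.7081;-73.9496,40.7170?annotations=duration"
```

The `/table` service is the local, rate-limit-free replacement for the Distance Matrix calls in the preprocessing step.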