Stop to Shape Matching
GTFS data defines a trip as a sequence of stops. There is often also shape data associated with that trip, describing the geographic path traveled by the transit vehicle during the trip. When importing data, OneBusAway attempts to figure out where along the shape of travel each stop lies. It's designed to be flexible, but sometimes the stop locations and the shape data don't totally agree.
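To make "where along the shape each stop lies" concrete, here is a minimal Python sketch of snapping a stop to a shape polyline. It is not OneBusAway's actual matching code: the function names (`to_xy`, `snap_to_shape`) are hypothetical, and the flat-earth projection is an assumption that is only reasonable over short distances. For each shape segment it reports the distance along the shape, the stop's distance from the shape, and the shape point index.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters


def to_xy(lat, lon, ref_lat):
    """Project lat/lon to local planar meters (equirectangular approximation)."""
    x = math.radians(lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y


def snap_to_shape(stop, shape):
    """Return candidate assignments of a stop onto a shape, closest first.

    stop  -- (lat, lon) of the stop
    shape -- list of (lat, lon) shape points, in travel order
    Each candidate is (distance_along_shape, distance_from_shape, shape_point_index),
    with both distances in meters.
    """
    ref_lat = stop[0]
    px, py = to_xy(stop[0], stop[1], ref_lat)
    pts = [to_xy(lat, lon, ref_lat) for lat, lon in shape]

    candidates = []
    travelled = 0.0  # length of the shape before the current segment
    for i in range(len(pts) - 1):
        (ax, ay), (bx, by) = pts[i], pts[i + 1]
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate shape points
        # Fraction along the segment of the closest point, clamped to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * dx, ay + t * dy
        candidates.append((travelled + t * seg_len, math.hypot(px - cx, py - cy), i))
        travelled += seg_len
    return sorted(candidates, key=lambda c: c[1])


# A few consecutive shape points taken from the shape data quoted later on this page.
# Distances along the shape are measured from the first point listed here,
# not from the start of the full shape.
shape_section = [
    (47.6109478971, -122.340454014),
    (47.610786218, -122.340274078),
    (47.6107186651, -122.340210924),
    (47.6102675258, -122.339781771),
]
for candidate in snap_to_shape((47.61, -122.34), shape_section):
    print(candidate)
```

A shape that loops can pass near the same stop more than once, which is why the sketch returns every per-segment candidate rather than only the closest one.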
OneBusAway attempts to find the best location along a shape for each stop. The stop-to-shape matching is smart enough to deal with routes that loop and other complex shapes. However, sometimes we just can't find a good match. In particular, if stop A comes before stop B in a trip sequence, but A comes after B when matching the stops to shape data, we have a problem.
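The ordering rule above can be sketched as a simple check that the distance along the shape never decreases from one stop to the next. This is an illustration only, assuming a single chosen assignment per stop; the helper name `find_backward_steps` is hypothetical. The sample values are the four distances from the debug output discussed later on this page.

```python
def find_backward_steps(assignments):
    """Find consecutive stops whose distance along the shape goes backwards.

    assignments -- list of (stop_id, distance_along_shape) in trip order
    Returns a list of (previous_stop_id, stop_id, backward_meters).
    """
    problems = []
    for (prev_id, prev_dist), (cur_id, cur_dist) in zip(assignments, assignments[1:]):
        if cur_dist < prev_dist:
            problems.append((prev_id, cur_id, prev_dist - cur_dist))
    return problems


# Stop ids and distances along the shape, in trip order.
trip = [
    ("1_9320", 1145.11),
    ("1_940", 3503.08),
    ("1_1920", 3102.73),
    ("1_280", 3102.73),
]

for prev_id, cur_id, meters in find_backward_steps(trip):
    print(f"{cur_id} is {meters:.2f} m behind {prev_id} along the shape")
```

Two stops at the same distance along the shape are not flagged; only a strict decrease counts as the bus moving backwards.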
When OneBusAway detects such a matching problem, it spits out an error message along with a LOT of debugging information.
This is a place where we couldn't find a good match between the stops and the shape data.

Morris: Yes, I've seen that error. One of about 2000 similar.

me: Yep, when it rains, it pours. The error includes a LOT of debug output, to help you (ok, me) figure out what's going on.

Morris: Oddly, that seems to reference a Pierce Transit trip!

me: That's... true. That's strange. Ooh! I have a theory, but first let's go through the data.

Morris: I've had that PT data in the previous bundle.

me: Right, but let me show how you would figure this out on your own.

Morris: That's the goal.

me: So the first bunch of debug output lists the lat-lon location of each stop for the trip along with its ids, and the points along the shape where that stop could potentially lie. (We call these assignments.) The first couple of entries look like:

    47.62 -122.32 0 1_9320
      47.620079966001924 -122.32915494936833 1145.11 688.00 11
    47.61 -122.33 1 1_940
      47.607038769281715 -122.33691959492059 3503.08 615.47 43
    47.61 -122.34 2 1_1920
      47.61012527189429 -122.33966533081063 3102.73 28.75 40
          ^ potential problem here ^
    47.61 -122.34 3 1_280
      47.61012527189429 -122.33966533081063 3102.73 28.75 40

I realize that the legend at the top is maybe not totally correct. It should be:

Morris: stop_lat, stop_lon, sequence, trip_id? stop_id

me:

    # stopLat stopLon stopId
    #   locationOnShapeLat locationOnShapeLon distanceAlongShape distanceFromShape shapePointIndex

It's a nested list. Main entries are the stops (their location + stop id). Nested entries are points on the shape, including their distance along the shape, the stop's distance from the shape, and the shape point index. If you imagine snapping each stop to the shape and measuring the distance from the start of the shape to the stop, you'd hope that the distance is always increasing. Aka a bus should never travel backwards. I'm going to give you a second to let that sink in. Let me know if that doesn't make sense.

Morris: Well, sometimes a bus will take a route that does bring it closer to the start point.

me: But the distance ALONG the shape should still increase.

Morris: Ah. Sure.

me: Even if the "as-the-crow-flies" distance goes down.

Morris: That's clear.

me: Ok. So looking at the debug output: we started with stop 1_9320 and found a point along the shape that's 1145 meters in. Looks good. Same for the second stop, 1_940 (3503 meters along the shape at this point). But then we look at stop 1_1920 and we see that the distance along the shape is 3102.73! Aka the bus moved backwards 400 meters. Something's not right here.

Morris: Which could happen on Queen Anne on an icy day, but not by design.

me: Right. So what's going on? It's usually one of two things:
- The shape data is wrong
- The stop data is wrong

At this point, I like to visualize what's going on, and I even wrote a simple tool to help: http://developer.onebusaway.org/maps/ This is a little widget I put together for quickly plotting things on a map.

Morris: I see that.

me: For example, you can copy the following into the big text box in the upper right:

    47.62 -122.32 0 1_9320
    47.61 -122.33 1 1_940
    47.61 -122.34 2 1_1920
    47.61 -122.34 3 1_280

And hit "Map" with "Points" selected, and it should plot the position of the four stops. It will even show the stop id in a pop-up window if you click on a marker, which is handy for figuring out what corresponds to what.

Morris: Yes

me: Let's also copy in some shape data:

    47.6157869139 -122.338751588 2242.7883149884256
    47.6151181982 -122.339600384 2340.6505025274637
    47.6144965119 -122.340422397 2433.253538211936
    47.6144070258 -122.340532851 2446.1980416758915
    47.6138332007 -122.341275393 2530.869899243129
    47.6137660214 -122.341362284 2540.7807020344662
    47.6131356528 -122.342212341 2635.5087920073993
    47.6124857742 -122.343081978 2732.8306661169718
    47.6116871139 -122.341740594 2866.985502707464
    47.6109478971 -122.340454014 2993.7063787125153
    47.610786218 -122.340274078 3016.1819026690114
    47.6107186651 -122.340210924 3025.060900783843
    47.6102675258 -122.339781771 3084.654962867508
    47.6092280261 -122.338930918 3216.673362909483
    47.6082381679 -122.338061536 3344.5895835571296
    47.6072657033 -122.337131968 3473.233312506266
    47.6065280165 -122.336441622 3570.2230287296
    47.6061766019 -122.336121383 3616.084785249523
    47.6058279268 -122.335801243 3661.683453649191
    47.6051256668 -122.335124317 3754.8138923214933
    47.6044172204 -122.334491793 3846.7619535426834

That's a section of the shape (which is printed out as the second debug message). Copy that into the maps widget thing, but this time select "polyline" and hit map, and you should get a nice black line.

Morris: true

me: So hopefully you should now have a shape line for the trip and some stops.

Morris: And the stops aren't in the same neighborhood.

me: Bingo! That's a bad thing. The next question is "what's going on here?"

Morris: And how did the cancer spread to the Pierce Transit?

me: So that was my first reaction. How can a KCM update start breaking Pierce Transit? I think the issue is Puget Sound Stop Consolidation. Recall that there are overlapping stops shared between the two agencies.

Morris: I see that in the admin guide.

Sent at 10:33 PM on Thursday

me: If you actually grep "4669656-12FEB-MVS-WKD-Weekday-03" from stop_times.txt for the most recent Pierce Transit GTFS, you'll see that it visits these PT stops:

    1133 3020 4333 4241 (first four)

Those would all typically be 3_1133, 3_3020, etc. because Pierce Transit has the 3_ prefix.

Morris: I see those stop_times records, yes.

me: But we see in the PugetSoundStopConsolidation... the following entry:

    1_9320 3_1133

Which says PT stop 3_1133 should be consolidated to KCM stop 1_9320 (the first stop in the row is the one we keep, the other stops are the ones we consolidate). This is how KCM stop 1_9320 ends up used in a Pierce Transit trip.

Sent at 10:37 PM on Thursday

me: It's also how the next stop, 1_940, is included, as it was mapped from 3_3020 in the consolidation file.

Morris: My understanding: each has their own number for the stop. We take the metadata from the master stop. In this case, the KCM stop. If that's wrong, then the Pierce Transit trip is wrong.

me: I think the PT trip is ok. I think something has happened to the location of KCM stop #940. Notice it's the first stop well off the shape data for the PT trip. The PT trip looks good to me. I know it runs down 2nd Avenue, whereas stop #940 looks suspect.

Morris: "Notice it's the first stop well off the shape data for the PT trip" How would I notice that?
me: In the map widget. We plotted stops + shape. I clicked on one of the stops that is well away from the shape and saw 1_940.

Sent at 10:41 PM on Thursday

me: My first fear was that KCM had changed their stop ids, such that stop #940 in the previous feed was different from stop #940 in the new feed. Which would break all sorts of things, because everybody bookmarks stops by stop #. But I just checked and the stop has not changed.
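The stop consolidation step described in the conversation can be sketched as follows. This is a rough illustration, not OneBusAway's implementation: the function names are hypothetical, and the only format assumed is the one stated above (the first stop id in a row is the one kept, and the remaining ids are consolidated into it). Only the two consolidation entries mentioned in the conversation are included, so the other PT stops pass through unchanged here.

```python
def parse_consolidation(lines):
    """Build a mapping from each consolidated stop id to the stop id that is kept.

    Each non-empty line lists the kept stop id first, followed by the
    stop ids that are consolidated into it.
    """
    mapping = {}
    for line in lines:
        ids = line.split()
        if not ids:
            continue  # skip blank lines
        kept, merged = ids[0], ids[1:]
        for stop_id in merged:
            mapping[stop_id] = kept
    return mapping


def consolidate_trip(stop_ids, mapping):
    """Replace consolidated stop ids in a trip's stop sequence with the kept ids."""
    return [mapping.get(stop_id, stop_id) for stop_id in stop_ids]


# The two consolidation entries mentioned in the conversation.
consolidation = parse_consolidation([
    "1_9320 3_1133",
    "1_940 3_3020",
])

# First four stops of the Pierce Transit trip, with the 3_ agency prefix.
pt_trip = ["3_1133", "3_3020", "3_4333", "3_4241"]
print(consolidate_trip(pt_trip, consolidation))
# → ['1_9320', '1_940', '3_4333', '3_4241']
```

After this substitution, the trip takes its stop locations from the KCM stop records, so a bad location for a KCM stop shows up as a mismatch against the Pierce Transit shape.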