I owe somebody what amounts to this blog post. Pardon the lack of illustrative diagrams.
I have been thinking about mass transit trip planning software for the web and for mobile devices. Between the individual efforts of agencies around the world, and Google's efforts towards open sharing of structured transit system data, we seem to be on the right track, institutionally speaking. As a user, however, I am perpetually frustrated by the focus that every transit trip planner I have ever used puts on the supposed schedule, even for services that are high frequency and/or less-than-perfectly reliable.
This general feeling, combined with two recent and exciting meetings I have had, leaves me with a few nagging questions:
- In providing transit users with such software, how useful is the schedule by which the transit provider has planned their operations?
- When are expected waiting and travel times more useful than precise trip-by-trip itineraries?
- What effect do randomness and unreliability have on those expectations?
- Should the passenger plan her trip differently if she has to be on time than if her schedule is flexible?
- Finally, does real-time information obviate the need for any or all of these other inputs?
The answer: it depends. The actual schedule (R trains leave Union St at 8:13, 8:25, 8:37, arriving at Union Square at 8:39, 8:51, 9:03, etc.) is only relevant to the degree that operations follow the plan. And even in the face of near-perfect operations, I only care about the schedule of departures when I have something to lose by ignoring it (i.e. when there's not always another train or bus within a tolerably small number of minutes).
Expectations implied by the schedule (I should wait 6 minutes on average, but never longer than 12, and the ride is expected to take 26 minutes) are meaningful even when the precise schedule isn't, but only if those expectations are reasonable. For example, a simple model shows that as soon as the service becomes even slightly variable, the expected waiting time increases, as does the maximum. Of course, many of the things that cause some passengers to wait longer are experienced by other passengers as delays along the way.
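One common version of such a simple model, sketched here in Python, assumes that passengers arrive at the stop at random (they are not timing their arrival to the schedule) and that everyone boards the first vehicle that shows up. Under those assumptions, the expected wait is the sum of squared headways divided by twice the sum of headways. The function name and the sample headways are illustrative, not taken from any real service.

```python
def waiting_time_stats(headways):
    """Return (expected_wait, max_wait), in the same units as the headways.

    Assumption: passengers arrive uniformly at random over time and board
    the first vehicle, so longer gaps catch proportionally more passengers.
    """
    total = sum(headways)
    expected = sum(h * h for h in headways) / (2 * total)
    return expected, max(headways)


# Perfectly even 12-minute service versus the same number of trains,
# but alternately running 6 minutes early and late (hypothetical values).
even = waiting_time_stats([12, 12, 12, 12])
uneven = waiting_time_stats([6, 18, 6, 18])
```

With even 12-minute headways this gives an expected wait of 6 minutes and a maximum of 12, matching the schedule. With the uneven headways, the same number of trains per hour yields an expected wait of 7.5 minutes and a maximum of 18.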
Let's now think specifically about trip planning software for relatively high frequency urban transit services with normal amounts of variability. I don't want to be bothered with exact but fairly useless times of scheduled departures and arrivals. I just want to know how long I can realistically expect to have to wait, and how long the trip is likely to take. And when I have a hard deadline, like getting to a meeting or catching an airplane, I want to know the (approximately) worst case scenario.
Current levels of unreliability in our transit systems are not something we should have to live with. More funding, saner public policy, and better management can go a long way towards fixing some problems. I am not focusing here on the sources of unreliability, but suffice it to say they are many, some debatably the provider's responsibility (e.g. missing drivers, faulty equipment) and some debatably not (e.g. on-street traffic, passenger behavior). But given that these problems are here today, would you rather think a trip will be fast and have it end up being slow, or would you prefer to have the best information possible when making your own decisions?
The copious amounts of real service data collected by transit providers from bus GPS and rail signaling systems are of great value here. They allow us to fairly easily and cheaply describe distributions of waiting and travel times, and thus estimate expectations and approximate maximums for use in trip planning software.
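As a rough sketch of what "describing distributions" could mean in practice, the Python snippet below takes a list of observed ride times for one origin-destination pair and reports the mean along with a high percentile as the approximate maximum. The 95th percentile, the nearest-rank method, and the sample values are all assumptions chosen for illustration. A real system would pick these to suit its data.

```python
from statistics import mean


def summarize(samples, worst_case_pct=95):
    """Return (expected, approximate_worst_case) for observed durations.

    The worst case is a nearest-rank percentile: the smallest observation
    such that at least worst_case_pct percent of samples are at or below it.
    """
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (worst_case_pct * len(ordered) + 99) // 100)
    return mean(ordered), ordered[rank - 1]


# Hypothetical observed ride times in minutes, e.g. derived from GPS logs.
ride_minutes = [24, 25, 25, 26, 26, 26, 27, 28, 31, 42]
expected_ride, worst_ride = summarize(ride_minutes)
```

For these made-up observations the expected ride is 28 minutes, while the approximate worst case is 42, which is the kind of gap a schedule alone would never reveal.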
Often, those systems were in fact installed to provide real-time data, with historical performance analysis a secondary or accidental purpose. The notion of an expected waiting time changes radically when real-time "next-vehicle" information is provided, assuming the real-time predictions are in fact accurate. However, even perfect real-time data doesn't prevent problems from occurring down the line, nor does it reduce the variability inherently introduced by successive transfers.
In the next generation of (open source?) web and mobile transit trip planning, please:
- Give me the option to use the schedule or to use expected values, but try to be smart about the default.
- When not using the schedule, please allow me to plan depending on how flexible my own schedule is.
- Use real performance data to generate realistic expected and worst case scenarios.
- When possible, especially when the trip is imminent, use real time data to reduce uncertainty in my trip plan, but make use of realistic expectations for forecasting the balance of the trip.
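The wish list above could be sketched, very roughly, as a single function that budgets time for one leg of a trip. Everything in this Python snippet is hypothetical: the function names, the 95th percentile as "worst case", and the sample data. Note also that adding a worst-case wait to a worst-case ride is only a crude approximation, since the percentile of a sum is not the sum of the percentiles.

```python
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile of a list of observed durations."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def trip_budget(wait_samples, ride_samples, must_be_on_time, realtime_wait=None):
    """Minutes to allow for one leg (wait plus ride).

    - Flexible travelers get expected values; travelers with a hard
      deadline get an approximate worst case (95th percentile, an assumption).
    - If a real-time wait prediction is available it replaces the historical
      wait, but the ride is still forecast from historical data.
    """
    if must_be_on_time:
        estimate = lambda samples: percentile(samples, 95)
    else:
        estimate = mean
    wait = realtime_wait if realtime_wait is not None else estimate(wait_samples)
    return wait + estimate(ride_samples)


# Hypothetical historical observations, in minutes.
waits = [2, 4, 5, 6, 6, 7, 8, 10]
rides = [23, 24, 25, 25, 26, 26, 27, 32]

flexible = trip_budget(waits, rides, must_be_on_time=False)
deadline = trip_budget(waits, rides, must_be_on_time=True)
imminent = trip_budget(waits, rides, must_be_on_time=True, realtime_wait=3)
```

With this sample data, the flexible traveler is told to allow 32 minutes, the traveler with a deadline 42, and the traveler with a deadline who can see that the next train is 3 minutes away 35.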
To implement such a trip planner, a number of open questions remain:
- Even for a perfectly reliable system, where exactly is that threshold between using the schedule and using expectations?
- How does this threshold change as a function of normal or excessive variability in operations?
- What is the best way to integrate real-time data (of varying predictive quality) with realistic expectations for trip planning on-the-go?
If you're still awake, and have comments or questions, let's talk. The fact that this post found its way onto your computer makes it highly likely you already know how to get in touch.