The Bus Stop Problems
Since we had so much fun with Bayes Theorem in a recent post, I can’t resist another.
Young Economics whippersnapper Evan Soltas posed two problems to do with Bayesian probability:
1. You arrive at a bus stop in an unfamiliar part of town. Assume that buses arrive at the stop as a Poisson process, with an unknown (to you) rate, $\lambda$. You don’t know $\lambda$, but say you have a prior probability distribution for it, $p_0(\lambda)$.
   - What’s your expected wait time, $\langle\tau\rangle$, for the next bus to arrive?
   - Say you’ve been waiting for a time $t$. What’s your posterior probability distribution, $p(\lambda)$, and what’s your new expected wait time?
2. Let’s add some more information. Say that riders arrive at the bus stop via an independent Poisson process with an (unknown to you) rate, $\mu$. Whenever a bus arrives, all those waiting at the stop get on it. Thus, the number of people waiting is the number who arrived since the last bus. Say you arrive at the stop to find $n$ people already waiting. You wait for a time, $t$, at which point there are $n+m$ other people waiting at the stop (i.e., $m$ arrived while you were waiting).
   - Given this data, what’s your posterior probability distribution, $p(\lambda,\mu)$?
   - What’s your new expected wait time, $\langle\tau\rangle$?
These questions illustrate one of my favourite points of view on Bayes Theorem, namely that it induces a flow on the (infinite-dimensional!) space of probability distributions. Understanding the nature of that flow is, I think, the key task of the subject.
Infinite dimensions are hard to get an intuition for, so one of the first tasks is to cut the problem down to a finite-dimensional one.
Say we have a finite-dimensional family of probability distributions. We will call that family natural for the problem at hand, if the Bayes flow keeps us within that finite dimensional space.
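A natural family of probability distributions for Problem 1 is the $\Gamma$-distribution. Let us choose a prior in that family
$$p_0(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, \lambda^{\alpha-1} e^{-\beta\lambda}$$
where, for reasons that will be apparent in a moment, we require $\alpha \gt 1$. For fixed $\lambda$, the expected wait time is
$$\langle\tau\rangle_\lambda = \frac{1}{\lambda}$$
So, given our prior, we expect to wait
$$\langle\tau\rangle = \int_0^\infty \frac{1}{\lambda}\, p_0(\lambda)\, d\lambda = \frac{\beta}{\alpha-1}$$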
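Applying Bayes Theorem, our posterior distribution, after waiting a time $t$ (during which, with probability $e^{-\lambda t}$, no bus arrives), is
$$p(\lambda) = \frac{(\beta+t)^\alpha}{\Gamma(\alpha)}\, \lambda^{\alpha-1} e^{-(\beta+t)\lambda}$$
which, as announced, is just a shift of parameters of the $\Gamma$-distribution, $\beta\to\beta+t$. We immediately conclude that our expected wait time has gone up, to
$$\langle\tau\rangle = \frac{\beta+t}{\alpha-1}$$
The longer we wait, the longer we expect to continue having to wait!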
You should pause to convince yourself that’s the generic behaviour, whatever prior you assumed.
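As a quick sanity check on the Problem 1 result, here is a minimal Python sketch. It assumes a $\Gamma$-distribution prior with shape $\alpha$ and rate $\beta$ for the bus rate; the function names and the sample numbers are mine, chosen only for illustration. It compares the closed form $\beta/(\alpha-1)$ with a crude numerical integral, then applies the update $\beta\to\beta+t$.

```python
import math

def expected_wait(alpha, beta):
    """Closed form <tau> = beta / (alpha - 1) for a Gamma(alpha, beta) density on the rate."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the expected wait to be finite")
    return beta / (alpha - 1)

def posterior_params(alpha, beta, t):
    """Waiting a time t with no bus multiplies the density by exp(-lambda * t)."""
    return alpha, beta + t

def expected_wait_numeric(alpha, beta, upper=50.0, steps=100_000):
    """Midpoint-rule estimate of the integral of (1/lambda) * p(lambda) over (0, upper)."""
    log_norm = alpha * math.log(beta) - math.lgamma(alpha)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += math.exp(log_norm + (alpha - 2) * math.log(lam) - beta * lam)
    return total * h

# Hypothetical prior: alpha = 3, beta = 10 (minutes), i.e. an expected 5-minute wait.
alpha, beta = 3.0, 10.0
print(expected_wait(alpha, beta))           # 5.0
print(expected_wait_numeric(alpha, beta))   # close to 5.0

# After 4 minutes with no bus, the expected remaining wait has grown.
print(expected_wait(*posterior_params(alpha, beta, 4.0)))  # 7.0
```

The numerical check uses a plain midpoint rule, which is fine for $\alpha \ge 2$ but converges slowly when $1 \lt \alpha \lt 2$, where the integrand is singular at the origin.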
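What about Problem 2? The first task is to compute, for fixed rates, $(\lambda,\mu)$, the probability that
- There are $n$ people waiting at the stop, when you arrive.
- You wait for a time, $t$, during which
  - no bus arrives, but
  - $m$ more riders arrive.
The answer is
$$P(n,m,t\,|\,\lambda,\mu) = \frac{\lambda\mu^{n}}{(\lambda+\mu)^{n+1}}\; e^{-\lambda t}\; \frac{(\mu t)^{m}}{m!}\, e^{-\mu t}$$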
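For fixed rates, this probability factorizes into three pieces: the chance that $n$ riders have accumulated since the last bus (the time since the last bus is exponentially distributed, which makes $n$ geometric), the chance that no bus comes during a wait of length $t$, and the Poisson probability of $m$ rider arrivals in that time. A minimal Python sketch of that factorization, with names of my own choosing:

```python
import math

def likelihood(n, m, t, lam, mu):
    """Probability of the Problem 2 data for fixed bus rate lam and rider rate mu."""
    # n riders arrived since the last bus: geometric in n
    p_queue = lam * mu**n / (lam + mu) ** (n + 1)
    # no bus arrives during the wait t
    p_no_bus = math.exp(-lam * t)
    # exactly m riders arrive during the wait t: Poisson with mean mu * t
    p_arrivals = (mu * t) ** m * math.exp(-mu * t) / math.factorial(m)
    return p_queue * p_no_bus * p_arrivals

# Summing over all n and m should leave only the no-bus factor exp(-lam * t).
lam, mu, t = 0.2, 1.5, 3.0   # hypothetical rates and wait
total = sum(likelihood(n, m, t, lam, mu) for n in range(300) for m in range(80))
print(total, math.exp(-lam * t))
```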
The second task is to find a natural family of probability distributions for this problem. I don’t know its name, but there is an obvious 5-parameter family, which is the natural generalization of the $\Gamma$-distribution, Note that
- For , this is just a product of independent $\Gamma$-distributions.
- For positive integer , the hypergeometric function is just a finite-order polynomial
- There’s an obvious symmetry
Applying Bayes Theorem, the posterior probability is which, again, is just a shift of parameters of the distribution. The expected wait time transforms accordingly. The dependence on is, alas, somewhat complicated. You can spend some hours convincing yourself that it is what you should expect.
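To get a feel for how the expected wait depends on the data, it can also be evaluated numerically. The sketch below is not the closed form referred to above, and its parametrization is my own assumption. It takes a prior that is a product of independent $\Gamma$-distributions, with shape `a` and rate `beta` for the bus rate and shape `b` and rate `delta` for the rider rate, and multiplies it by the likelihood, which is proportional to $\lambda\,\mu^{n+m}(\lambda+\mu)^{-(n+1)}e^{-(\lambda+\mu)t}$. The substitution $\lambda = s u$, $\mu = s(1-u)$ then reduces each two-dimensional integral to a one-dimensional one over $u$.

```python
import math

def log_integral(A, B, c, beta, delta, steps=20_000):
    """Log of the integral over lam, mu > 0 of
    lam**(A-1) * mu**(B-1) * (lam+mu)**(-c) * exp(-beta*lam - delta*mu).

    With lam = s*u and mu = s*(1-u), the s-integral is a Gamma function, leaving
    Gamma(k) * int_0^1 u**(A-1) * (1-u)**(B-1) * (beta*u + delta*(1-u))**(-k) du
    with k = A + B - c. The u-integral is estimated with the midpoint rule.
    """
    k = A + B - c
    if min(A, B, k) <= 0:
        raise ValueError("need A > 0, B > 0 and A + B - c > 0")
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        q = beta * u + delta * (1.0 - u)
        total += math.exp((A - 1) * math.log(u)
                          + (B - 1) * math.log(1.0 - u)
                          - k * math.log(q))
    return math.lgamma(k) + math.log(total * h)

def mean_inverse_lambda(A, B, c, beta, delta):
    """E[1/lam] under the density proportional to the integrand above."""
    return math.exp(log_integral(A - 1, B, c, beta, delta)
                    - log_integral(A, B, c, beta, delta))

def expected_wait_problem2(a, beta, b, delta, n, m, t):
    """Posterior expected wait, assuming independent Gamma priors (an assumption)."""
    # The likelihood shifts the exponents and rates of the prior density.
    return mean_inverse_lambda(a + 1, b + n + m, n + 1, beta + t, delta + t)

# Hypothetical numbers: 4 people at the stop on arrival, 3 more during a 4-minute wait.
print(expected_wait_problem2(a=3.0, beta=10.0, b=2.0, delta=1.0, n=4, m=3, t=4.0))
```

Two special cases make useful checks. With `c = 0` the density factorizes and `mean_inverse_lambda` reduces to `beta / (A - 1)`, the Problem 1 formula. With `beta == delta` the $u$-integral is a Beta function and the result is `beta * (A + B - 1) / ((A - 1) * (A + B - c - 1))`. The midpoint rule assumes the shape parameters are at least 1, so that the integrand is not singular at the endpoints.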
Whether you think a Poisson process is a good model for a real public transit system probably depends on your political persuasion.
Update: 12/29/2013
In return, I should pose the followup problem:
- Same setup as Problem 1 but, now, you’ve been to the bus stop $N$ times previously and had to wait for times $t_1, t_2, \dots, t_N$. What is your posterior probability distribution for $\lambda$ (hint: the $\Gamma$-distribution is natural for this problem, too), and what is your expected wait time?
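One way to work the followup, offered as a sketch rather than as the intended solution: each previous visit that ended with a bus arriving after a wait $t_i$ contributes a likelihood factor $\lambda e^{-\lambda t_i}$. Starting from a $\Gamma$-distribution prior with shape $\alpha$ and rate $\beta$, each such visit raises the shape by one and adds $t_i$ to the rate. The function names and numbers below are hypothetical.

```python
def posterior_after_history(alpha, beta, past_waits, current_wait=0.0):
    """Gamma(shape, rate) posterior for the bus rate.

    Each completed wait t_i multiplies the density by lam * exp(-lam * t_i);
    a current wait with no bus yet multiplies it by exp(-lam * current_wait).
    """
    shape = alpha + len(past_waits)
    rate = beta + sum(past_waits) + current_wait
    return shape, rate

def expected_wait(shape, rate):
    """Mean of 1/lam under Gamma(shape, rate); needs shape > 1."""
    if shape <= 1:
        raise ValueError("shape must exceed 1 for the expected wait to be finite")
    return rate / (shape - 1)

# Hypothetical prior (alpha = 3, beta = 10) and three earlier waits, in minutes.
shape, rate = posterior_after_history(3.0, 10.0, [4.0, 6.0, 5.0])
print(shape, rate)                 # 6.0 25.0
print(expected_wait(shape, rate))  # 5.0
```

As the history grows, the expected wait on arrival approaches the average of the observed waits, and the prior matters less and less.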
Re: The Bus Stop Problems
Regarding the first problem, and the fact that the expected waiting time increases, I find it very fascinating that it implies that a rational agent might decide to try waiting for the bus for a while and then give up and choose to walk when the expected wait time exceeds the time that it takes to walk.
I was forced to ponder the exact same problem during my school years and never figured this out…