Systems with randomness are inherently subject to delay. But how does that delay vary over the day? That is, if we think of a service setting with “peak” hours that in some sense resets every day, when are delays worst? Note that this would be relevant for, say, a restaurant that closes every evening or for a hospital emergency room that doesn’t officially close but typically does see a dramatic drop in volume in the overnight hours. Intuitively, one would expect that delays build over the day. Now Nate Silver at FiveThirtyEight has provided a graphical illustration of how delay builds for a stochastic system — specifically for US domestic flights (Fly Early, Arrive On-Time, Apr 19).
Here is Silver’s description of what is actually being plotted.
The Bureau of Transportation Statistics keeps on-time data on every domestic flight flown by the 14 largest U.S.-based airlines. I collected summary data for the more than 6 million flights in the database in 2013 and organized them in one-hour blocks by their scheduled hour of departure (for instance, 10:00 a.m. to 10:59 a.m.). Then I looked at how many minutes late the flights arrived on average.
A few notes on this procedure: As should be clear, I prefer to look at delays in terms of time lost rather than a binary definition of arriving late or on time. (A 20-minute delay is nothing compared to a whole afternoon spent in transit hell.) I counted diverted and cancelled flights as equivalent to 120-minute delays. Some versions of the BTS data assign a flight negative minutes if it arrives early; I used a version where the minimum delay is 0 minutes instead.
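The procedure Silver describes might be sketched roughly as follows. This is only an illustration of the stated rules (group by scheduled departure hour, floor early arrivals at 0, count diverted or cancelled flights as 120 minutes). The tuple layout and the four sample records are made up for the example and are not the actual BTS column names or data.

```python
from collections import defaultdict

# Silver's stated equivalent for a diverted or cancelled flight.
DISRUPTION_PENALTY_MIN = 120

def mean_delay_by_hour(flights):
    """Average arrival delay per scheduled departure hour.

    `flights` is an iterable of hypothetical records:
    (scheduled_departure_hour, arrival_delay_minutes, disrupted)
    where `disrupted` is True for a diverted or cancelled flight.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, delay, disrupted in flights:
        if disrupted:
            minutes = DISRUPTION_PENALTY_MIN
        else:
            minutes = max(0, delay)  # early arrivals count as 0, not negative
        totals[hour] += minutes
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(counts)}

# Made-up records, purely to exercise the function.
sample = [
    (6, -5, False),    # arrived early, counted as 0
    (6, 10, False),
    (18, 45, False),
    (18, None, True),  # cancelled, counted as 120
]
result = mean_delay_by_hour(sample)
print(result)
```

Note how a single cancelled flight dominates the average for its hour, which is one reason the choice of summary statistic matters.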
To kibitz a little with the methodology, the mean is arguably the wrong thing to plot. Consider this graph from the Bureau of Transportation Statistics.
It shows the percent of US flights that were on time. For the last two years, 80% of flights have been on time. So far this year, performance has dropped to just shy of 70%; one guesses that a crappy winter had something to do with that. Now this is on time by the government’s definition, i.e., reaching the gate within 15 minutes of its scheduled arrival. It seems that Silver is counting five minutes late as being, well, five minutes late, which is fine, but it is still likely that a large number of flights are on time by his definition. That is, it wouldn’t be surprising that a third or more of the flights in his data have zero delay, which would greatly lower the mean delay. Stated another way, looking at, say, the median delay or 75th percentile of delay might be more informative, and the upper percentiles would generally show much higher delays.
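To see how a pile of zero-delay flights interacts with these summary statistics, here is a small check on made-up numbers. The twelve delays below are invented for illustration (a third of them are zero) and have nothing to do with the actual BTS data. How the mean, median, and 75th percentile line up in practice depends on the shape of the real distribution.

```python
import statistics

# Hypothetical arrival delays in minutes; 4 of 12 flights have zero delay.
delays = [0, 0, 0, 0, 5, 10, 15, 20, 30, 45, 90, 120]

mean_delay = statistics.mean(delays)
median_delay = statistics.median(delays)
# quantiles(n=4) returns the three quartile cut points; the last is the 75th percentile.
p75_delay = statistics.quantiles(delays, n=4)[2]

print(f"mean   = {mean_delay:.1f} min")
print(f"median = {median_delay:.1f} min")
print(f"p75    = {p75_delay:.1f} min")
```

In this toy sample the 75th percentile sits well above the mean while the median sits below it, so which statistic is plotted would noticeably change the height of the curve even if its shape over the day stayed the same.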
Still, it would be unlikely to show a markedly different pattern: delays will peak in the evening, and you should travel at the crack of dawn if you live in fear of being stuck at O’Hare.
The natural question to ask next is why delays occur. Silver’s got an answer for that as well. The Feds record the causes of delays and Silver then plots average delay by cause of delay.
This suggests that if your flight is delayed because the inbound flight is late, you should settle in because things are going to take a while. Pairing this with the graph above, it seems that late aircraft are responsible for the bulk of the evening delay. Further, for fans of The Goal, it makes a lot of sense. Flights are subject to statistical variation, and later flights depend on earlier flights coming in. Bad outcomes then propagate, while the system cannot take advantage of favorable ones because flights cannot leave early.
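A toy simulation makes the asymmetry concrete. It follows one aircraft through several legs in a day, with each leg getting symmetric random noise (as likely to run fast as slow). Lateness carries over to the next leg, but earliness does not, since the next flight cannot depart before its scheduled time. Every number here (six legs, five minutes of schedule slack, 15 minutes of noise) is an assumption chosen for illustration, not something estimated from the flight data.

```python
import random

def simulate_day(rng, legs=6, slack=5.0, noise_sd=15.0):
    """Arrival lateness (minutes, floored at 0) for each leg flown by one aircraft."""
    lateness_by_leg = []
    carry = 0.0  # lateness inherited from the previous leg
    for _ in range(legs):
        # Schedule slack absorbs some inherited lateness, but never lets a flight leave early.
        departure_late = max(0.0, carry - slack)
        # Symmetric variation: favorable and unfavorable outcomes are equally likely.
        arrival_late = departure_late + rng.gauss(0.0, noise_sd)
        # An early arrival cannot be banked; the next leg starts from zero lateness at best.
        carry = max(0.0, arrival_late)
        lateness_by_leg.append(carry)
    return lateness_by_leg

rng = random.Random(0)  # fixed seed so the run is repeatable
days = [simulate_day(rng) for _ in range(5000)]
avg_by_leg = [sum(day[i] for day in days) / len(days) for i in range(6)]

for leg, avg in enumerate(avg_by_leg, start=1):
    print(f"leg {leg}: average lateness {avg:.1f} min")
```

Even though the noise on each leg averages out to zero, the average lateness climbs from the first leg to the last, because only the bad draws are passed along.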