The men’s 100m sprint final of the 2012 London Olympics is over, and Usain Bolt won the gold medal again with a winning time of 9.63s. Time to compare the result with my forecast of 9.68s, posted on 22 July.
My simple log-linear model predicted a winning time of 9.68s with a prediction interval from 9.39s to 9.97s. Well, that is of course a big interval of more than half a second, or ±3%. Yet, the winning time was only 0.05s away from my prediction. That is less than 1% difference. Not bad for such a simple model.
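The percentages quoted above can be checked with a few lines of R. This is only a sketch of the arithmetic, using the figures from this post; the variable names are made up for the illustration.

```r
# Figures quoted in the post (all in seconds)
predicted <- 9.68   # forecast winning time
lower     <- 9.39   # lower bound of the prediction interval
upper     <- 9.97   # upper bound of the prediction interval
actual    <- 9.63   # winning time in London

# Width of the prediction interval and its half-width relative to the forecast
interval.width <- upper - lower
half.width.pct <- 100 * (interval.width / 2) / predicted

# Absolute and relative forecast error
abs.error     <- abs(actual - predicted)
rel.error.pct <- 100 * abs.error / predicted

round(c(interval.width = interval.width,
        half.width.pct = half.width.pct,
        abs.error      = abs.error,
        rel.error.pct  = rel.error.pct), 2)
```

The half-width comes out at roughly 3% of the forecast and the forecast error at roughly 0.5%, in line with the numbers given above.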
Comments on my earlier post suggested looking into other parameters as well, such as track, weather and wind, or the times of the other medal winners. Others suggested focusing on the recent performance of the participants, rather than on historical times over the last 100 years.
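The second suggestion can be explored, at least roughly, with the winning times already used in this post: fit the same log-linear model to the more recent Games only. The following is a minimal sketch, not a worked-out model. The 1968 cut-off is an arbitrary choice for illustration, and the names recent and recent.fit are hypothetical.

```r
# Winning times from 1968 to 2008, taken from the table in the R code of this post
recent <- data.frame(
  Year   = seq(1968, 2008, 4),
  Result = c(9.95, 10.14, 10.06, 10.25, 9.99, 9.92,
             9.96, 9.84, 9.87, 9.85, 9.69)
)

# Same log-linear model as in the post, fitted to the recent Games only
recent.fit <- lm(log(Result) ~ Year, data = recent)

# Forecast for 2012 with a 95% prediction interval, back on the seconds scale
recent.pred <- exp(predict(recent.fit,
                           newdata  = data.frame(Year = 2012),
                           level    = 0.95,
                           interval = "prediction"))
recent.pred
```

With only eleven data points the interval should be read with caution, and this still uses winning times rather than the season form of the individual finalists, which is what the commenters really had in mind.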
Interestingly enough, The Economist published an article on the same subject (Faster, higher, no longer) in its current print edition (following my example :-?). The article also uses data going back to 1912 and considers wind and altitude as critical parameters.
The Olympic Games are not over yet, and there are more opportunities to forecast results. Rob J. Hyndman lists further interesting examples of Olympic models and predictions on his blog, and my colleague Matt Malin presents ideas for modelling the men’s 100m butterfly swimming final on his site.
R code used in this post
## 100m men's sprint historical winning times
## Sourced from:
## http://www.databaseolympics.com/sport/sportevent.htm?enum=110&sp=ATH
golddata <- read.table(sep=",", header=TRUE, text="
Year, Event, Athlete, Medal, Country, Result
1896, 100m Men, Tom Burke, GOLD, USA, 12.00
1900, 100m Men, Frank Jarvis, GOLD, USA, 11.00
1904, 100m Men, Archie Hahn, GOLD, USA, 11.00
1906, 100m Men, Archie Hahn, GOLD, USA, 11.20
1908, 100m Men, Reggie Walker, GOLD, SAF, 10.80
1912, 100m Men, Ralph Craig, GOLD, USA, 10.80
1920, 100m Men, Charles Paddock, GOLD, USA, 10.80
1924, 100m Men, Harold Abrahams, GOLD, GBR, 10.60
1928, 100m Men, Percy Williams, GOLD, CAN, 10.80
1932, 100m Men, Eddie Tolan, GOLD, USA, 10.30
1936, 100m Men, Jesse Owens, GOLD, USA, 10.30
1948, 100m Men, Harrison Dillard, GOLD, USA, 10.30
1952, 100m Men, Lindy Remigino, GOLD, USA, 10.40
1956, 100m Men, Bobby Morrow, GOLD, USA, 10.50
1960, 100m Men, Armin Hary, GOLD, GER, 10.20
1964, 100m Men, Bob Hayes, GOLD, USA, 10.00
1968, 100m Men, Jim Hines, GOLD, USA, 9.95
1972, 100m Men, Valery Borzov, GOLD, URS, 10.14
1976, 100m Men, Hasely Crawford, GOLD, TRI, 10.06
1980, 100m Men, Allan Wells, GOLD, GBR, 10.25
1984, 100m Men, Carl Lewis, GOLD, USA, 9.99
1988, 100m Men, Carl Lewis, GOLD, USA, 9.92
1992, 100m Men, Linford Christie, GOLD, GBR, 9.96
1996, 100m Men, Donovan Bailey, GOLD, CAN, 9.84
2000, 100m Men, Maurice Greene, GOLD, USA, 9.87
2004, 100m Men, Justin Gatlin, GOLD, USA, 9.85
2008, 100m Men, Usain Bolt, GOLD, JAM, 9.69
2012, 100m Men, Usain Bolt, GOLD, JAM, 9.63
")

## Fit the log-linear model on the Games from 1900 up to, but excluding, 2012
myData <- subset(golddata, Year>=1900 & Year<2012)
log.linear <- lm(log(Result)~Year, data=myData)

## Predictions with 95% prediction interval, transformed back to seconds
years <- seq(1896,2012, 4)
predictions <- exp(predict(log.linear,
                           newdata=data.frame(Year=years),
                           level=0.95, interval="prediction"))
predictions <- data.frame(predictions)

## Plot the historical winning times, the fit and the prediction interval
plot(Result ~ Year, data=golddata,
     xlim=c(1896,2012), ylim=c(9.5,12),
     xlab="Year", main="Olympic 100 metre sprint",
     ylab="Winning time for the 100m men final (s)")
lines(years, predictions$fit, col="red")
lines(years, predictions$lwr, col="black", lty=2)
lines(years, predictions$upr, col="black", lty=2)

## Prediction for London 2012 (last element of years)
London.Prediction <- predictions$fit[length(years)]
points(2012, London.Prediction, pch=19, col="red")

## 2012 London Olympics 100m men's gold winning time
winning.time <- 9.63
points(2012, winning.time, pch=21, cex=1, bg="gold")
legend(x=1960, y=12, title="London Olympics",
       legend=c(paste("Prediction: ", round(London.Prediction, 2)),
                paste("2012 Result:", winning.time)),
       col=c("red", "black"), pch=c(19, 21),
       box.col="white", pt.bg=c("red", "gold"))