The new book Analyzing Baseball Data with R by Max Marchi and Jim Albert is now available, and the authors have also launched a companion blog to share some of the analyses from the book. For example, they used the Lahman package in R to look at the strikeout rate in World Series baseball games over the last century and found (after a little nonparametric smoothing) that the rate has been on a steady rise:
Nonetheless, 2013 was not a record year for strikeouts — although given the steady rise, record-holder 2002 is likely to be eclipsed soon.
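For readers who want a feel for the computation, here is a minimal sketch of how such a series could be built. It is not the authors' code. It assumes that "strikeout rate" means strikeouts per at-bat, that the `BattingPost` table in the Lahman package (filtered to `round == "WS"`) is the data source, and that a default `loess` fit stands in for the nonparametric smoother; the names `ws`, `totals`, `so_rate`, and `smoothed` are made up for illustration.

```r
library(Lahman)  # provides the BattingPost data frame of postseason batting lines

# Keep only World Series rows (assumption: round == "WS" marks them)
ws <- subset(BattingPost, round == "WS")

# Sum strikeouts and at-bats over all batters in each year's series
totals <- aggregate(cbind(SO, AB) ~ yearID, data = ws, FUN = sum)

# Assumed definition of the rate: strikeouts per at-bat
totals$so_rate <- totals$SO / totals$AB

# Nonparametric smooth of the yearly rates (default loess settings)
fit <- loess(so_rate ~ yearID, data = totals)
smoothed <- predict(fit)

# Raw yearly rates as points, smoothed trend as a line
plot(totals$yearID, totals$so_rate,
     xlab = "Year", ylab = "Strikeouts per at-bat",
     main = "World Series strikeout rate")
lines(totals$yearID, smoothed, lwd = 2)
```

A different definition of the rate (say, strikeouts per game) or a different smoother would change the shape of the curve somewhat, so treat the result as a rough approximation of the chart rather than an exact reproduction.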
For another interesting R-based baseball analysis, take a look at the presentation on the analysis of “streakiness” in baseball, given by Jim Albert at the recent NESSIS conference:
If you want to reproduce the strikeout analysis above, just follow the link below for the R code.