(This article was first published on Bayes Ball, and kindly contributed to R-bloggers)
Or, The 2010 Mariners: How Bad Were They?
In earlier posts, I used the statistical software R to plot the trends in league average run scoring since 1901. This was the first step to answering other questions I had on my mind:
- How poor was the offensive performance of the 2010 Seattle Mariners?
- Are they showing any signs of improvement?
- And how can I use R to tabulate the data to answer these questions?
As I started into this, the first decision was where to draw the lines in the historical record. I opted to use the eras described in Bill James' "Dividing Baseball History into Eras" article (behind a paywall, but chances are if you're reading my blog, you're already a Bill James subscriber):
- Era 1 (The Pioneer Era), 1871-1892
- Era 2 (The Spitball Era), 1893-1919
- Era 3 (The Landis Era), 1920-1946
- Era 4 (The Baby Boomers Era), 1947-1968
- Era 5 (The Artificial Turf Era), 1969-1992
- Era 6 (The Camden Yards Era), 1993-2012
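One way to attach these eras to season data in R is with `cut()`. This is a minimal sketch and not necessarily the approach used in the full post; the `era_breaks` and `era_labels` vectors and the `assign_era()` helper are names made up here for illustration.

```r
# Era boundaries taken from the list above. Each break is the last year of
# the preceding era, so the intervals are (1870, 1892], (1892, 1919], etc.
era_breaks <- c(1870, 1892, 1919, 1946, 1968, 1992, 2012)
era_labels <- c("Pioneer", "Spitball", "Landis",
                "Baby Boomers", "Artificial Turf", "Camden Yards")

# Hypothetical helper: map a vector of season years to an era factor.
# Years outside 1871-2012 come back as NA.
assign_era <- function(year) {
  cut(year, breaks = era_breaks, labels = era_labels, right = TRUE)
}

# Example: the first and last seasons of a few eras
assign_era(c(1871, 1919, 1920, 1969, 2010))
```

Using right-closed intervals keeps each era's final season (1892, 1919, and so on) in the era it belongs to.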
The second step was to calculate runs per game (RPG) for each team, by year. This corrects for the longer regular season in the post-expansion period and for the strike-shortened seasons, and it gives us a common denominator for comparing the results so far in 2012.
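The RPG calculation itself is a single division of runs scored by games played. The sketch below assumes a data frame with one row per team-season and columns named `G` (games) and `R` (runs), as in the Lahman teams table; the rows shown are made-up numbers, not real results.

```r
# Toy stand-in for the team-season table; the values are invented purely
# to illustrate the arithmetic.
teams <- data.frame(
  yearID = c(2009, 2010, 2010),
  teamID = c("AAA", "AAA", "BBB"),
  G      = c(162, 162, 160),
  R      = c(810, 486, 720)
)

# Team runs per game: runs scored divided by games played that season.
teams$RPG <- teams$R / teams$G

teams
```

Because the denominator is each team's own game count, a 154-game season, a 162-game season, and a partial season all end up on the same scale.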
To do this, I accessed the 2012 edition of the Lahman database. Once I had downloaded and extracted the comma-delimited version of the files, I read the "teams" file into R.
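Reading the file could look something like the sketch below. The file name `Teams.csv` and its location are assumptions, and `read_teams()` is a hypothetical helper; so that the example runs without the download, it first writes a small stand-in file with made-up rows to a temporary path.

```r
# Hypothetical helper: read a Lahman-style teams file and keep only the
# columns needed for a runs-per-game calculation.
read_teams <- function(path) {
  teams <- read.csv(path, stringsAsFactors = FALSE)
  teams[, c("yearID", "lgID", "teamID", "G", "R")]
}

# Stand-in file with invented rows, so the sketch is self-contained.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("yearID,lgID,teamID,G,R,RA",
             "2009,XX,AAA,162,810,700",
             "2010,XX,AAA,162,486,650"),
           demo_path)

teams <- read_teams(demo_path)
str(teams)

# With the extracted database, the call would be along the lines of:
# teams <- read_teams("Teams.csv")  # file name and path are assumptions
```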