Comparing individual team run production

February 3, 2013

(This article was first published on Bayes Ball, and kindly contributed to R-bloggers)

Or, The 2010 Mariners: How Bad Were They?

In earlier posts, I used the statistical software R to plot the trends in league average run scoring since 1901. This was the first step to answering other questions I had on my mind:

  1. How poor was the offensive performance of the 2010 Seattle Mariners?
  2. Are they showing any signs of improvement?
  3. And how can I use R to tabulate the data to answer these questions?

So, to answer Question #1.  It is well-established that the 2010 Mariners were not very good, at least offensively. (For fans of the team the well-deserved Cy Young award won by Felix Hernandez is surely the highlight of the season.) But I wanted a form of relative measure that would be comparable across time, to accommodate the various fluctuations in run scoring that were the subject of that earlier post.

As I started into this, the first decision was to draw a line in the historical record. I opted to use the eras described in Bill James' "Dividing Baseball History into Eras" article (behind a pay wall – but chances are if you're reading my blog, you already a Bill James subscriber):

  • Era 1 (The Pioneer Era), 1871-1892 
  • Era 2 (The Spitball Era), 1893-1919 
  • Era 3 (The Landis Era), 1920-1946 
  • Era 4 (The Baby Boomers Era), 1947-1968 
  • Era 5 (The Artifical Turf Era), 1969-1992 
  • Era 6 (The Camden Yards Era), 1993-2012

Based on these groupings, I opted to use the range of seasons 1947-2012 inclusive. This yields 1,580 team seasons of National League and American League baseball.

The second step was to calculate a runs per game (RPG) for each team, by year. This corrects for the longer regular season in the post-expansion period, the strike-shortened seasons, and will give us a common denominator to compare the results so far in 2012.

To do this, I accessed the 2012 edition of the Lahman database. Once I had downloaded and extracted the comma-delimted version of the files, I read the "teams" file into R.

Read more »

To leave a comment for the author, please follow the link and comment on their blog: Bayes Ball. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Mango solutions

plotly webpage

dominolab webpage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training




CRC R books series

Six Sigma Online Training

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)