Of course, a few days before I leave for a much-needed vacation, USA Today released their updated NCAA coaching salary database. For sports junkies, there's no limit to the analyses and visualizations that can be done with the data.
I took a quick break from packing to condense the data to a CSV and write up a very rough R script. Note: sqldf rocks, but installing tcltk (if you have to) can be a bit of a pain. Look here for help with tcltk.
library(ggplot2)
library(sqldf)

salaries <- read.csv("2011Salary.csv", header=T, sep=",")

result <- sqldf('select a.Conference, sum(a.SchoolPay) / b.spc as avg_pay
                 from salaries as a
                 join (select Conference, count(*) as spc
                       from salaries
                       where SchoolPay > 0
                       group by Conference) as b
                   on a.Conference = b.Conference
                 group by a.Conference')

chart <- qplot(result$Conference, result$avg_pay,
               geom="bar", stat="identity", fill = I("grey50"),
               main = 'Average Coaches Salary by Conference',
               xlab = 'Conference', ylab = 'Average Pay')

chart + opts(axis.text.x=theme_text(angle=-45))
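If tcltk is giving you trouble, here's a rough sketch of the same calculation without sqldf. The query above divides each conference's total pay by the number of schools reporting SchoolPay > 0, which (assuming there are no negative pay values) is just the mean of the positive values per conference. The rows below are made-up placeholders so the snippet runs on its own; they are not numbers from the USA Today data. The plotting half is also an assumption on my part: opts() and theme_text() were later replaced by theme() and element_text() in ggplot2, so this is how the chart might be written against a newer release.

```r
# Made-up placeholder rows, NOT values from the USA Today database.
salaries <- data.frame(
  Conference = c("A", "A", "A", "B", "B"),
  SchoolPay  = c(100, 300, 0, 200, 400)
)

# Keep only schools that report a positive salary, as the SQL subquery does.
paid <- salaries[!is.na(salaries$SchoolPay) & salaries$SchoolPay > 0, ]

# Mean pay per conference; with no negative values this matches
# sum(SchoolPay) / count(SchoolPay > 0) from the query.
result <- aggregate(SchoolPay ~ Conference, data = paid, FUN = mean)
names(result)[2] <- "avg_pay"

# Same bar chart using the newer ggplot2 API (skipped if ggplot2 is missing).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  chart <- ggplot(result, aes(x = Conference, y = avg_pay)) +
    geom_col(fill = "grey50") +
    labs(title = "Average Coaches Salary by Conference",
         x = "Conference", y = "Average Pay") +
    theme(axis.text.x = element_text(angle = -45))
  # Call print(chart) to draw it.
}
```

To run it on the real file, swap the placeholder data frame for the read.csv() call from the script above.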
Most surprising result? PAC-12 coaches average roughly $400,000 less than Big East coaches.
Edited per G.'s suggestion: sqldf rocks, tcltk can be tricky.