Use R to Analyze Players for your Fantasy Hockey League

October 4, 2010
(This article was first published on Brock's Data Adventure » R, and kindly contributed to R-bloggers)

I am in a fantasy hockey league for the first time this season, and I wanted to use R to analyze players.  Since I am relatively new to R, I am quite certain this code could be improved.  The code below is functional, however, and while this isn’t my complete analysis, I think it shows how powerful R is.

NOTE: Have a wordpress.com blog and post R code? Check out this post:

http://www.r-statistics.com/2010/09/r-syntax-highlighting-for-bloggers-on-wordpress-com/

##############################################################################
# Analyze fantasy hockey skater stats using hockey-reference.com
#
# Author: @BrockTibert
# Date:   October 2010
#
# Help:
# http://stackoverflow.com/questions/3796266/change-the-class-of-many-columns-in-a-data-frame
###############################################################################

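library(RCurl)
library(XML)
library(ggplot2)  ## older releases also loaded other great packages from @HadleyWickham
library(plyr)     ## load plyr explicitly for ddply() and rename(); newer ggplot2 no longer attaches it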

URL <- "http://www.hockey-reference.com/leagues/NHL_2010_skaters.html"
tables <- readHTMLTable(URL, header=FALSE)

## for the tutorial's sake -- it's the first table in the list
data <- tables[[1]]

## colnames
cn <- c("rk",
		"player",
		"age",
		"tm",
		"pos",
		"gp",
		"g",
		"a",
		"pts",
		"plus_minus",
		"pim",
		"ev",
		"pp",
		"sh",
		"gw",
		"s",
		"s_pct",
		"toi",
		"atoi")

colnames(data) <- cn

## remove the repeated header rows that break up the table
data <- data[data$rk!='Rk',]

## holds the column indexes for the loop
## wrap ncol(data)-1 in parentheses so the sequence ends there (':' binds tighter than '-')
index <- c(1,3,6:(ncol(data)-1))

## convert to numeric; the columns are factors, so convert to character first
## to get the printed values rather than the internal level codes
for(i in index) {
	data[,i] <- as.numeric(as.character(data[,i]))
}

## check to see if the teams make sense
## some players have 'TOT' if they played for multiple teams
#table(data$tm)
data <- data[data$tm!='Tm', ]

## create basic ranks for the stats we want ranked
## goals, assists, points, plus minus, PIM, PPG, SHG, GW goals, shots on goal
index <- c(7:9,11,13:16) ## want rank of 1 to be on the largest value
index2 <- c(10)  ## want the rank of 1 to be on smallest value

for(i in index) {
	data$temp <- rank(-data[,i], ties.method="min")
	name <- paste(colnames(data[i]), "_rank", sep="")
	names(data)[ncol(data)] <- name

}

for(i in index2) {
	data$temp <- rank(data[,i], ties.method="min")
	name <- paste(colnames(data[i]), "_rank", sep="")
	names(data)[ncol(data)] <- name

}

## Very Basic Analysis
table(data$pos)

data <- data[data$pos %in% c("C", "LW", "RW", "D"),]
## one way to remove unused levels -- just specify the levels you want
## http://www.statmethods.net/input/valuelabels.html
data$pos <- factor(data$pos,
		levels = c("C", "LW", "RW", "D"))

## create summary stats
## has to be an easier way to do this
mean.goals <- function(df) mean(df$g, na.rm=TRUE)
mean.assists <- function(df) mean(df$a, na.rm=TRUE)
mean.pim <- function(df) mean(df$pim, na.rm=TRUE)
ds.g <- ddply(data, c("pos"), mean.goals)
ds.g <- rename(ds.g, c(V1 = "goals")) # rename() (reshape or plyr) renames the default V1 column
ds.a <- ddply(data, c("pos"), mean.assists)
ds.a <- rename(ds.a, c(V1 = "assists"))
ds.pim <- ddply(data, c("pos"), mean.pim)
ds.pim <- rename(ds.pim, c(V1 = "pim"))

summ <- merge(ds.g, ds.a, by="pos")
summ <- merge(summ, ds.pim, by="pos")

## print basic stats by position
summ

## Clean up and quit without saving the workspace
rm(list=ls())
q(save="no")
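
The script's own comment says there "has to be an easier way" to build the per-position summary. One possibility, sketched here with base R's aggregate() instead of the plyr calls used above, computes all three means in a single step. The `skaters` data frame and the name `summ_demo` are made up for illustration; the numbers are invented, not real stats.

```r
# Toy stand-in for the cleaned skater table (invented values, for illustration only)
skaters <- data.frame(
  pos = factor(c("C", "C", "LW", "D", "D"), levels = c("C", "LW", "RW", "D")),
  g   = c(30, 10, 20, 5, 7),
  a   = c(40, 20, 25, 15, NA),
  pim = c(20, 40, 60, 80, 100)
)

# Mean goals, assists and PIM by position in one call.
# na.action = na.pass keeps rows that contain an NA, so that na.rm = TRUE
# can drop missing values column by column instead of losing the whole row.
summ_demo <- aggregate(cbind(g, a, pim) ~ pos, data = skaters,
                       FUN = mean, na.rm = TRUE, na.action = na.pass)
names(summ_demo) <- c("pos", "goals", "assists", "pim")

## print basic stats by position
summ_demo
```

Positions with no players (RW in the toy data) simply do not appear in the result, so no merge step is needed.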

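The conversion loop in the script wraps each column in as.character() before as.numeric(). A small illustration of why, using a made-up data frame (`demo`, `num_cols` and the values are hypothetical): calling as.numeric() directly on a factor returns its internal level codes rather than the printed values. The last lines show a loop-free variant of the same conversion with lapply(), in the spirit of the Stack Overflow thread linked in the script header.

```r
# Made-up columns resembling what a scraped table can look like
demo <- data.frame(
  player = c("A", "B", "C"),
  gp     = factor(c("82", "7", "45")),
  g      = factor(c("50", "1", "12")),
  stringsAsFactors = FALSE
)

codes  <- as.numeric(demo$gp)                # internal level codes, not games played
values <- as.numeric(as.character(demo$gp))  # the numbers as printed

codes
values

# Convert several columns at once instead of looping over indexes
num_cols <- c("gp", "g")
demo[num_cols] <- lapply(demo[num_cols],
                         function(x) as.numeric(as.character(x)))
str(demo)
```

The same as.numeric(as.character(x)) call is also safe on columns that are already character, so it works whether or not the table was read in with factors.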