Use R to Analyze Players for your Fantasy Hockey League

[This article was first published on Brock's Data Adventure » R, and kindly contributed to R-bloggers.]

I am in a fantasy hockey league for the first time this season, and I wanted to use R to analyze players. Since I am relatively new to R, I am quite certain this code could be improved. The code below is functional, however, and while this isn't my complete analysis, I think it shows how powerful R is.

NOTE: Have a wordpress.com blog and post R code? Check out this post:

http://www.r-statistics.com/2010/09/r-syntax-highlighting-for-bloggers-on-wordpress-com/

##############################################################################
# Analyze fantasy hockey skater stats using hockey reference
#
# Author: @BrockTibert
# Date:   October 2010
#
# Help:
# http://stackoverflow.com/questions/3796266/change-the-class-of-many-columns-in-a-data-frame
###############################################################################

library(RCurl)
library(XML)
library(ggplot2)  ## in 2010 this also attached plyr and reshape from @HadleyWickham
library(plyr)     ## newer ggplot2 versions do not, so load plyr for ddply() and rename()

URL <- "http://www.hockey-reference.com/leagues/NHL_2010_skaters.html"
tables <- readHTMLTable(URL, header=FALSE)

## for the tutorial's sake: it's the first table in the list
data <- tables[[1]]

## colnames
cn <- c("rk",
		"player",
		"age",
		"tm",
		"pos",
		"gp",
		"g",
		"a",
		"pts",
		"plus_minus",
		"pim",
		"ev",
		"pp",
		"sh",
		"gw",
		"s",
		"s_pct",
		"toi",
		"atoi")

colnames(data) <- cn

## remove the repeated header rows that break up the table
data <- data[data$rk!='Rk',]

## column indexes to convert in the loop below
## the parentheses matter: 6:ncol(data)-1 would evaluate as (6:ncol(data))-1
index <- c(1,3,6:(ncol(data)-1))

## convert to numeric; the columns are factors, so go through character first
## (as.numeric() on a factor returns the level codes, not the values)
for(i in index) {
	data[,i] <- as.numeric(as.character(data[,i]))
}

## check to see if teams make sense
## some players have 'TOT' if they played for multiple teams
#table(data$tm)
data <- data[data$tm!='Tm', ]

## create basic ranks for stats that are ranked
## goals, assists, points, plus minus, PIM, PPG, SHG, GW goals, shots on goal
index <- c(7:9,11,13:16) ## want rank of 1 to be on the largest value
index2 <- c(10)  ## want the rank of 1 to be on smallest value

for(i in index) {
	data$temp <- rank(-data[,i], ties.method="min")
	name <- paste(colnames(data)[i], "_rank", sep="")
	names(data)[ncol(data)] <- name
}

for(i in index2) {
	data$temp <- rank(data[,i], ties.method="min")
	name <- paste(colnames(data)[i], "_rank", sep="")
	names(data)[ncol(data)] <- name
}

## Very Basic Analysis
table(data$pos)

data <- data[data$pos %in% c("C", "LW", "RW", "D"),]
## one way to remove unused levels -- just specify the levels you want
## http://www.statmethods.net/input/valuelabels.html
data$pos <- factor(data$pos,
		levels = c("C", "LW", "RW", "D"))

## create summary stats
## has to be an easier way to do this
mean.goals <- function(df) mean(df$g, na.rm=TRUE)
mean.assists <- function(df) mean(df$a, na.rm=TRUE)
mean.pim <- function(df) mean(df$pim, na.rm=TRUE)
ds.g <- ddply(data, c("pos"), mean.goals)
ds.g <- rename(ds.g, c(V1 = "goals")) # rename() takes c(old = "new"); plyr and reshape both provide it
ds.a <- ddply(data, c("pos"), mean.assists)
ds.a <- rename(ds.a, c(V1 = "assists"))
ds.pim <- ddply(data, c("pos"), mean.pim)
ds.pim <- rename(ds.pim, c(V1 = "pim"))

summ <- merge(ds.g, ds.a, by="pos")
summ <- merge(summ, ds.pim, by="pos")

## print basic stats by position
summ
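
The script notes that there "has to be an easier way" to build the per-position means. One possibility, sketched here with base R's aggregate() rather than the ddply() calls above, computes all three columns in a single call. The data frame `skaters` and its values are invented stand-ins for the scraped table, and `summ2` is a name chosen for this example; to try it on the real data, pass `data` instead.

```r
## Toy stand-in for the scraped table; the values are invented for illustration.
skaters <- data.frame(
  pos = factor(c("C", "C", "LW", "RW", "D", "D"),
               levels = c("C", "LW", "RW", "D")),
  g   = c(30, NA, 22, 14, 5, 9),
  a   = c(40, 20, 18, 25, 15, 21),
  pim = c(20, 44, 60, 12, 80, 36)
)

## One call replaces the three helper functions, three ddply() calls and two merges.
## na.action = na.pass keeps rows that contain an NA, so that na.rm = TRUE
## can drop missing values column by column instead of dropping whole rows.
summ2 <- aggregate(cbind(g, a, pim) ~ pos, data = skaters,
                   FUN = mean, na.rm = TRUE, na.action = na.pass)
names(summ2) <- c("pos", "goals", "assists", "pim")

## print basic stats by position
summ2
```

The result has one row per position and one column per statistic, which is the same shape as `summ` above.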

## Clean up and quit without saving the workspace
rm(list=ls())
q(save="no")
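
The factor-to-numeric loop and the two ranking loops in the script can also be written more compactly. The following is only a sketch under stated assumptions: `raw` is an invented toy data frame, and `add_ranks()` is a hypothetical helper written for this example, not part of any package. The conversion follows the lapply() approach from the Stack Overflow question linked in the script header.

```r
## Toy stand-in with factor columns, as the scrape produced them (values invented).
raw <- data.frame(
  player     = c("A", "B", "C", "D"),
  g          = c("30", "12", "30", "7"),
  plus_minus = c("-5", "10", "3", "-5"),
  stringsAsFactors = TRUE
)

## Convert several columns in one step. as.character() comes first so that the
## factor labels are parsed, not the internal integer level codes.
num_cols <- c("g", "plus_minus")
raw[num_cols] <- lapply(raw[num_cols], function(x) as.numeric(as.character(x)))

## Hypothetical helper: append a "<col>_rank" column for each named column.
## decreasing = TRUE puts rank 1 on the largest value.
add_ranks <- function(df, cols, decreasing = TRUE) {
  for (col in cols) {
    x <- if (decreasing) -df[[col]] else df[[col]]
    df[[paste0(col, "_rank")]] <- rank(x, ties.method = "min")
  }
  df
}

raw <- add_ranks(raw, "g")                               # rank 1 = most goals
raw <- add_ranks(raw, "plus_minus", decreasing = FALSE)  # rank 1 = smallest value, as in the script
raw
```

Referring to columns by name instead of by position keeps the ranking step working if the column order of the scraped table ever changes.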

Filed under: Fantasy Hockey, R, Tutorial Tagged: Fantasy Hockey, R
