Now that the 2011 F1 season is over, I decided to quickly scrape the Formula 1 data from the F1.com website: the list of drivers, ordered by each driver's approximate salary (the driver at the top of the list makes the most, approx. 30MM), and each driver's finishing position in each race. It took a little bit of work to put this small dataset together, but I wanted to produce a heatmap-type graph to show the distinction between the drivers with respect to their salaries, plus it's just a couple of simple steps in R. One thing to notice in this heatmap is how consistent each driver is, and who is likely to move up the chain based on this season's performance.
1.) You need R
2.) Dataset. I uploaded mine to DataCouch.com
3.) You need ggplot2. ggplot is an implementation of the grammar of graphics in R.
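The later steps call melt(), ddply() and rescale(), which do not come from base R. A minimal setup sketch, assuming current package versions in which melt() comes from reshape2, ddply() from plyr and rescale() from scales (the original post does not say which packages were loaded, so this list is an assumption):

```r
# One-time install, if needed (assumed package list, not stated in the post):
# install.packages(c("ggplot2", "reshape2", "plyr", "scales"))

library(ggplot2)   # ggplot(), geom_tile(), scale_fill_gradient()
library(reshape2)  # melt()
library(plyr)      # ddply() and .()
library(scales)    # rescale()
```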
4.) >F1_POS <- read.csv("/Users/marcinkulakowski/R/F1_POS.csv")
Load the data into the F1_POS data frame.
5.) >F1_POS$Driver <- with(F1_POS, reorder(Driver, Salary))
Driver is converted to a factor whose levels are ordered by Salary, so the drivers appear on the plot sorted by salary.
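To see what reorder() does here, a small sketch on a toy data frame may help. The driver labels and salary figures below are made up for illustration and are not taken from the dataset:

```r
# Toy data: hypothetical drivers and salaries, for illustration only.
toy <- data.frame(
  Driver = c("A", "B", "C"),
  Salary = c(5, 30, 12)
)

# reorder() returns a factor whose levels are sorted by Salary (ascending).
toy$Driver <- with(toy, reorder(Driver, Salary))

# The level order is what ggplot uses to place drivers along the axis.
levels(toy$Driver)
```

Because the levels run from lowest to highest salary and ggplot draws the first level at the bottom of the y axis, the best-paid driver should end up at the top of the heatmap.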
6.) >F1_POS.m <- melt(F1_POS)
>F1_POS.m <- ddply(F1_POS.m, .(variable), transform, rescale = rescale(value))
Melt the data into long format (one row per driver and column) and rescale the values within each column to the range 0 to 1.
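The melt and rescale step can be sketched on a toy data frame. This assumes melt() from reshape2, ddply() from plyr and rescale() from scales; the column names Race1 and Race2 and all values are hypothetical:

```r
library(reshape2)  # melt()
library(plyr)      # ddply() and .()
library(scales)    # rescale()

# Toy data: hypothetical finishing positions for three drivers in two races.
toy <- data.frame(
  Driver = c("A", "B", "C"),
  Race1  = c(1, 5, 10),
  Race2  = c(3, 2, 12)
)

# Wide to long: one row per (Driver, variable) pair, with the number in `value`.
toy.m <- melt(toy, id.vars = "Driver")

# Within each original column, map the values linearly onto [0, 1].
toy.m <- ddply(toy.m, .(variable), transform, rescale = rescale(value))

toy.m
```

Note that when melt() is called without id.vars, as in step 6, every numeric column is treated as a measured variable. If Salary is stored as a number in F1_POS, it would therefore be melted and rescaled like the race columns and show up as its own column in the heatmap.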
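7.) >(p <- ggplot(F1_POS.m, aes(variable, Driver)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "red", high = "yellow"))
Plot the data as a grid of tiles, one per driver and column, filled on a red-to-yellow gradient according to the rescaled value.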
8.) >base_size <- 9
> p + theme_grey(base_size = base_size) + labs(x = "", y = "") + scale_x_discrete(expand = c(0, 0)) + scale_y_discrete(expand = c(0, 0)) + theme(legend.position = "none", axis.ticks = element_blank(), axis.text.x = element_text(size = base_size * 0.8, angle = 330, hjust = 0, colour = "grey50"))
Source: Formula One