What I like most about the R and Python developer and user communities, is their incredible openness and generosity. One of the finest examples in the past year was the online course “Statistical Learning” taught by Stanford professors Trevor Hastie and Rob Tibshirani.
In this MOOC they explain very understandably (even for beginners) the basics of statistical modeling (or machine learning) techniques such as linear, polynomial and logistic regression, smoothing splines, Ridge regression, Lasso, Generalized Additive Models, various methods for classification (Classification Trees to random forests) and also unsupervised learning methods.
But the highlights of this course are the R labs between all units. In these sessions, the statistical theory is supplemented with many practical examples. It’s really fantastic to hear the authors explain and teach the (very essential) R packages they wrote themselves. For me, the course was also an impetus to learn even more about knitr. Especially if you’re used to IPython notebook, this combination of code and output can be very intuitive and useful. Even months after going through the course, I refer to my lab R code (see also here) when I need some quick templates for common statistical modelling tasks. I really liked the strong focus on cross validation methods – many basic courses on statistics focus only on the methods and not on how to estimate how well you’re predicting.
The course is based on the textbook “Introduction to Statistical Learning” (or short: ISL, download here) Hastie and Tibshirani wrote together with Gareth James and Daniela Witten. If you want to dive even deeper into the subject, you can also work through the more advanced work “Elements of Statistical Learning” (ESL, download).
So, if one of your New Year resolutions for 2015 is, learning how to do more with R, you should definitively take a look at this course. The next free class is starting on January 19.