Statistical learning theory offers those of us trained as social science methodologists an opportunity to look at everything we have learned from a different perspective. For example, missing value imputation can be seen as matrix completion, and recommender systems can be used to fill in questionnaire items that, by design, were shown to only a few respondents. It is not difficult to show how to run the R package softImpute that makes all this happen. But it can be overwhelming to learn the underlying mechanism in enough detail to feel confident that you know what you are doing. One does not want to spend the time necessary to become a statistician, yet we need to be aware of when and how to use specific models, what can go wrong, and what to do when it does. At least with R, one can run analyses on data sets and work through concrete examples.
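As one such concrete example, here is a minimal sketch of matrix completion with softImpute. Everything about the data is an assumption made up for illustration: the simulated respondents, the two latent factors, the design in which each respondent skips 4 of 12 items, and the settings rank.max = 3 and lambda = 1, which are untuned guesses rather than recommendations.

```r
# Assumes the softImpute package is installed: install.packages("softImpute")
library(softImpute)

set.seed(1)

# Hypothetical data: 200 respondents answering 12 items that are
# driven by 2 latent factors plus noise (all simulated, not real data).
n <- 200; p <- 12; k <- 2
scores   <- matrix(rnorm(n * k), n, k)
loadings <- matrix(rnorm(k * p), k, p)
full     <- scores %*% loadings + matrix(rnorm(n * p, sd = 0.3), n, p)

# Missing by design: each respondent is never shown 4 randomly chosen items.
observed <- full
for (i in seq_len(n)) {
  observed[i, sample(p, 4)] <- NA
}

# Fit a low-rank approximation to the observed entries.
# rank.max and lambda are illustrative guesses, not tuned values.
fit <- softImpute(observed, rank.max = 3, lambda = 1, type = "als")

# Fill the NA cells with the fitted low-rank predictions.
completed <- complete(observed, fit)

# Because the data are simulated, the imputed cells can be compared
# with the values that were deliberately withheld.
hidden <- is.na(observed)
rmse   <- sqrt(mean((completed[hidden] - full[hidden])^2))
rmse
```

With real questionnaire data there is no full matrix to compare against, so a reasonable way to choose lambda and rank.max is to hold out some of the observed cells and check how well they are recovered.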
The publication of An Introduction to Statistical Learning with Applications in R (download the book pdf) provides a gentle introduction with lots of R code. The book achieves a nice balance and is well worth a look, both for the beginner and for the more experienced reader who needs to explain these methods to others with less training. As a bonus, Stanford’s OpenEdX has scheduled a MOOC by Hastie and Tibshirani, beginning on January 21, that uses this textbook.