
R for Data Mining

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

Statistics and data mining often get bundled together, but in my opinion they're generally different practices with different goals. As a language designed for statistics, R focuses much of its core functionality on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best predictions from a big data set (without worrying too much about the model itself), much of R's statistical power can also be put to data mining purposes. A good overview can be found in Luis Torgo's book Data Mining with R, and in the functions of the associated package, DMwR.

Another good resource is Yanchang Zhao's website, rdatamining.com, which collects resources related to data mining with R. In particular, check out his R Reference Card for Data Mining, a 3-page PDF index of the R functions and packages for association rules, classification, clustering, text mining, social network analysis, and more. Find it in the "Docs" section linked below.

RDataMining.com: Documents
