R for Data Mining

June 6, 2011

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Statistics and data mining often get bundled together, but (in my opinion) they're generally different practices with different goals. As a language designed for statistics, much of R's core functionality is focused on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best predictions from a big data set (without worrying too much about the model itself), much of R's statistical power can also be put to data mining purposes. A good overview can be found in Luis Torgo's book Data Mining with R and in the functions of the associated package, DMwR.
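To make the prediction-first mindset concrete, here is a minimal sketch that judges a model purely by how well it predicts held-out data. It uses only base R and the built-in iris data set. The 70/30 split, the choice of a linear model, and the variable names are my own assumptions for illustration; none of this comes from Torgo's book or the DMwR package.

```r
# Minimal sketch: evaluate a model by hold-out prediction error only.
# Assumptions (not from the original post): iris data, 70/30 split, lm().
set.seed(42)                                  # make the random split reproducible

n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n)) # row indices for the training set
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit on the training rows; we do not inspect coefficients or p-values here.
fit <- lm(Sepal.Length ~ ., data = train)

# Score the model only on rows it has never seen.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$Sepal.Length - pred)^2))

cat("Hold-out RMSE:", rmse, "\n")
```

Swapping lm() for another learner leaves the rest of the script unchanged, which is the point: the model is treated as a replaceable component and only the hold-out error is compared.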

Another good resource is Yanchang Zhao's website, rdatamining.com, which collects resources related to data mining with R. In particular, check out his R Reference Card for Data Mining, a 3-page PDF index of the R functions and packages for association rules, classification, clustering, text mining, social network analysis, and more. Find it in the "Docs" section linked below.

RDataMining.com: Documents
