(This article was first published on Jeromy Anglim's Blog: Psychology and Statistics, and kindly contributed to R-bloggers)
This post lists a few data mining resources in R. I also provide a few observations on the distinction between data mining, data analysis, and statistics as it pertains to the analysis work that I do in psychology.
Online Resources
- The classic book The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman is available for free online. There's also an accompanying R package.
- I previously discussed David Mease's online data mining course
- Rattle - a data mining GUI for R.
- Some comments on data mining by John Maindonald
- Luis Torgo has a book currently available online providing demonstrations of data mining using R
- CRAN Task View on Machine Learning & Statistical Learning
Some Casual Observations
- Data mining seems more concerned with prediction using observed variables than with understanding the causal system of latent variables; psychology is typically more concerned with the causal system of latent variables.
- Data mining typically involves massive datasets (e.g., 10,000+ rows) collected for a purpose other than the data mining itself. Psychological datasets are typically small (e.g., fewer than 1,000 or even 100 rows) and collected explicitly to explore a research question.
- Psychological analysis typically involves testing specific, theory-driven models. Automated model development approaches tend not to be theoretically interesting.
