Data Mining and R

October 24, 2009

(This article was first published on Jeromy Anglim's Blog: Psychology and Statistics, and kindly contributed to R-bloggers)

This post lists a few data mining resources in R. I also offer a few observations on the distinction between data mining, data analysis, and statistics as they pertain to the analysis work that I do in psychology.

Online Resources

Some Casual Observations
  • Data mining seems more concerned with prediction from observed variables than with understanding the causal system of latent variables; in psychology the priority is typically reversed.
  • Data mining typically involves massive datasets (e.g., 10,000+ rows) collected for a purpose other than the data mining itself. Psychological datasets are typically small (e.g., fewer than 1,000, or even fewer than 100, rows) and collected explicitly to explore a research question.
  • Psychological analysis typically involves testing specific models. Automated model development approaches tend not to be theoretically interesting.
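
The contrast between testing a specific model and automated model development can be sketched in R. This is only an illustration under stated assumptions: it uses the built-in mtcars dataset rather than psychological data, the predictors in the "theory-driven" model are arbitrary stand-ins for a hypothesis, and backward stepwise selection with step() is just one of many automated approaches.

```r
# Built-in dataset used purely for illustration (an assumption, not data from the post).
data(mtcars)

# Theory-driven approach: fit one model specified in advance.
# The choice of wt and hp stands in for a hypothesis the researcher wants to test.
theory_fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(theory_fit)

# Automated approach: start from all predictors and let backward
# stepwise selection (by AIC) decide which ones to keep.
full_fit <- lm(mpg ~ ., data = mtcars)
auto_fit <- step(full_fit, direction = "backward", trace = 0)

# The selected formula may predict well, but nothing about the procedure
# guarantees that the retained predictors are theoretically meaningful.
formula(auto_fit)
```

The first model answers a question posed before looking at the data; the second produces a model whose form is itself an outcome of the data, which is why its coefficients are harder to interpret as tests of theory.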
