Data Mining and R

October 24, 2009
By

(This article was first published on Jeromy Anglim's Blog: Psychology and Statistics, and kindly contributed to R-bloggers)

This post lists a few data mining resources in R. I also provide a few observations on the distinction between data mining, data analysis, and statistics as it pertains to the analysis work that I do in psychology.

Online Resources
Some Casual Observations
  • Data mining seems more concerned with prediction using observed variables than with understanding the causal system of latent variables; psychology is typically more concerned with the causal system of latent variables.
  • Data mining typically involves massive datasets (e.g. 10,000 + rows) collected for a purpose other than the purpose of the data mining. Psychological datasets are typically small (e.g., less than 1,000 or 100 rows) and collected explicitly to explore a research question.
  • Psychological analysis typically involves testing specific models. Automated model development approaches tend not to be theoretically interesting.

To leave a comment for the author, please follow the link and comment on his blog: Jeromy Anglim's Blog: Psychology and Statistics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.