Statistical Analysis Functions in R

August 20, 2011

(This article was first published on Optinalysis, and kindly contributed to R-bloggers)

Lately, I've been using statistical tests on a daily basis. I've noticed that I always have to format my data the same way to get it into R (essentially as a tab-delimited flat file). Beyond that, prepping the data for any particular statistical analysis function requires only minimal changes to the data structure.


Why not kill several birds with one stone? I've just written a wrapper function around a few smaller, independent R scripts, each of which performs a statistical analysis.


Get the code here:


https://github.com/ngopal/Statistical-Analysis-Functions


The input to the wrapper is a single tab-delimited file whose first row is the header. The output is a few PDF files, each containing the plots for a different statistical analysis.
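
To make the expected input concrete, here is a minimal sketch of reading such a file in base R. It is not taken from the repository: the helper name `read_input` is hypothetical, and keeping only numeric columns and dropping incomplete rows are my assumptions about what the downstream analyses would need.

```r
# Hypothetical helper (not from the repository): read a tab-delimited
# file whose first row is the header.
read_input <- function(path) {
  dat <- read.delim(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
  # Assumption: the analyses work on numeric columns only
  num <- dat[, sapply(dat, is.numeric), drop = FALSE]
  # Assumption: rows with missing values are dropped
  na.omit(num)
}

# Usage: write the built-in iris data to a temporary tab-delimited file,
# then read it back
tmp <- tempfile(fileext = ".txt")
write.table(iris, tmp, sep = "\t", row.names = FALSE, quote = FALSE)
dat <- read_input(tmp)
str(dat)
```

The species column is dropped here because it is read back as text, leaving the four numeric measurement columns.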


The statistical analyses currently performed are:
  • Principal Component Analysis (PCA)
  • K-means Clustering
  • Hierarchical Clustering
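
As a rough illustration of how a wrapper could run these three analyses and write one PDF per analysis, here is a sketch using only base R (`prcomp`, `kmeans`, `hclust`). It is not the code from the repository: the function name `run_analyses`, the output file names, the default of three clusters, and the choice to scale the columns first are all assumptions made for the example.

```r
# Hypothetical wrapper (not the repository's code): runs PCA, k-means and
# hierarchical clustering on a numeric data frame, one PDF per analysis.
run_analyses <- function(dat, out_dir = tempdir(), k = 3) {
  # Assumption: columns are standardized first (fails on constant columns)
  scaled <- scale(dat)
  files <- file.path(out_dir, c("pca.pdf", "kmeans.pdf", "hclust.pdf"))

  # Principal Component Analysis
  pca <- prcomp(scaled)
  pdf(files[1])
  plot(pca$x[, 1:2], main = "PCA: first two components")
  screeplot(pca, main = "PCA: variance per component")
  invisible(dev.off())

  # K-means clustering, shown on the first two principal components
  km <- kmeans(scaled, centers = k, nstart = 10)
  pdf(files[2])
  plot(pca$x[, 1:2], col = km$cluster, main = paste("K-means, k =", k))
  invisible(dev.off())

  # Hierarchical clustering on Euclidean distances (complete linkage)
  hc <- hclust(dist(scaled))
  pdf(files[3])
  plot(hc, labels = FALSE, main = "Hierarchical clustering")
  invisible(dev.off())

  invisible(list(pca = pca, kmeans = km, hclust = hc, files = files))
}

# Usage with the numeric columns of the built-in iris data
res <- run_analyses(iris[, 1:4])
res$files
```

Each analysis opens and closes its own PDF device, so a failure in one step does not leave the plots of another step in a half-written file.
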
There are a few R scripts in the currently released package which require the user to download external libraries. These R scripts are turned off by default.

I am considering adding more statistical tests to the package--perhaps a t-test with a few box-and-whisker plots.
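
For what it's worth, such an addition could look roughly like the sketch below. It uses the built-in iris data rather than a real input file, and the grouping column, the output file name, and the use of R's default Welch t-test are assumptions for illustration only.

```r
# Hypothetical sketch: compare one measurement between two groups
two <- subset(iris, Species %in% c("setosa", "versicolor"))
two$Species <- droplevels(two$Species)

# t.test() with a formula runs Welch's two-sample t-test by default
tt <- t.test(Sepal.Length ~ Species, data = two)

# Box-and-whisker plot of the same two groups, written to a PDF
out <- file.path(tempdir(), "ttest_boxplot.pdf")
pdf(out)
boxplot(Sepal.Length ~ Species, data = two,
        main = sprintf("Welch t-test, p = %.3g", tt$p.value))
invisible(dev.off())
```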

Now I can run one script and output several plots in one slick shot.

