Lately, I’ve been running statistical analyses on a daily basis. I’ve noticed that I always have to format my data the same way to get it into R (essentially a tab-delimited flat file). Beyond that, preparing the data for any particular statistical analysis function requires only minimal changes to the data structure.

Why not kill several birds with the same stone? I’ve just written a wrapper function around a few smaller, independent R scripts which perform statistical analysis tests.

Get the code here:

https://github.com/ngopal/Statistical-Analysis-Functions

The input to the wrapper is a single, tab-delimited file (with the first row being the header). The output is a few PDF files, each complete with plots for different statistical analysis tests.
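To make the input and output format concrete, here is a minimal sketch in base R of reading a tab-delimited file with a header row and sending a plot to a PDF. It is not the code from the repository: the file paths are temporary files and the built-in `iris` data is a stand-in for real input, both chosen only for illustration.

```r
# Hypothetical example data written to a temporary tab-delimited file,
# so the sketch runs without any external input.
input_path <- tempfile(fileext = ".tsv")
write.table(iris[, 1:4], file = input_path, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the tab-delimited file; header = TRUE treats the first row as column names.
dat <- read.delim(input_path, header = TRUE, sep = "\t")

# Keep only the numeric columns, since PCA and clustering need numeric input.
num <- dat[, sapply(dat, is.numeric), drop = FALSE]

# Open a PDF device, draw one plot into it, then close the device.
pdf_path <- tempfile(fileext = ".pdf")
pdf(pdf_path)
plot(num[, 1], num[, 2], xlab = names(num)[1], ylab = names(num)[2])
dev.off()
```

Every plot drawn between `pdf()` and `dev.off()` lands on its own page of the same file, which is one simple way to collect several plots per analysis.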

The statistical analysis tests which are currently performed are:

- Principal Component Analysis (PCA)
- K-means Clustering
- Hierarchical Clustering
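All three analyses are available in base R's `stats` package. The sketch below shows one way they might be run and plotted into a single PDF. It is an illustration rather than the repository's implementation: the `iris` data, the choice of three clusters, the scaling step, and the average-linkage method are all assumptions.

```r
# Built-in example data (assumption); the four measurement columns are numeric.
num <- iris[, 1:4]
scaled <- scale(num)  # center and scale each column

# Principal Component Analysis on centered, scaled columns.
pca <- prcomp(num, center = TRUE, scale. = TRUE)

# K-means clustering; 3 clusters is an arbitrary choice for this example.
set.seed(1)  # k-means starts from random centers, so fix the seed
km <- kmeans(scaled, centers = 3, nstart = 10)

# Hierarchical clustering on Euclidean distances with average linkage.
hc <- hclust(dist(scaled), method = "average")

# Write one plot per analysis to a single PDF (one page per plot).
out <- tempfile(fileext = ".pdf")
pdf(out)
biplot(pca, main = "PCA biplot")
plot(pca$x[, 1:2], col = km$cluster, main = "K-means clusters on PC1 and PC2")
plot(hc, labels = FALSE, main = "Hierarchical clustering dendrogram")
dev.off()
```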

There are a few R scripts in the currently released package which require the user to download external libraries. These R scripts are turned off by default.
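One common way to keep library-dependent scripts off by default is to guard them with a switch and a check that the package is installed. The sketch below illustrates that pattern only. The `run_optional` flag, the `can_run` helper, and the use of the `cluster` package are hypothetical and are not taken from the repository.

```r
# Hypothetical switch: optional analyses stay off unless the user enables them.
run_optional <- FALSE

# Returns TRUE only when the switch is on and the named package is installed.
can_run <- function(pkg, enabled = run_optional) {
  enabled && requireNamespace(pkg, quietly = TRUE)
}

if (can_run("cluster")) {
  # Example optional analysis: partitioning around medoids from 'cluster'.
  fit <- cluster::pam(scale(iris[, 1:4]), k = 3)
  print(table(fit$clustering))
} else {
  message("Skipping optional analysis (disabled or package not installed).")
}
```

Setting `run_optional <- TRUE` after installing the needed package would switch the extra analysis on without touching the rest of the script.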

I am considering adding more statistical tests to the package, perhaps a t-test with a few box-and-whisker plots.

Now I can run one script and output several plots in one slick shot.
