Lately, I’ve been running statistical tests on a daily basis. I’ve noticed that I always have to format my data the same way to get it into R (essentially a tab-delimited flat file). Beyond that, prepping the data for any particular statistical analysis function requires only minimal changes to the data structure.
Why not kill several birds with one stone? Since the input format is always the same, I’ve written a wrapper function around a few smaller, independent R scripts, each of which performs one statistical analysis.
Get the code here:
The input to the wrapper is a single, tab-delimited file (with the first row being the header). The output is a few PDF files, each complete with plots for different statistical analysis tests.
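To make the input format concrete, here is a minimal sketch of how such a file could be loaded. The function name `read_input` is my own placeholder and not necessarily what the released code uses, and dropping non-numeric columns and incomplete rows is an assumption about what the downstream analyses need.

```r
# Hypothetical helper (not from the released package): load a tab-delimited
# file whose first row is the header.
read_input <- function(path) {
  # read.delim already defaults to sep = "\t" and header = TRUE;
  # both are spelled out here for clarity.
  df <- read.delim(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

  # Assumption: keep only numeric columns, since PCA and clustering
  # operate on numeric data.
  num <- df[, vapply(df, is.numeric, logical(1)), drop = FALSE]

  # Assumption: drop rows that contain missing values.
  num[complete.cases(num), , drop = FALSE]
}
```

Calling `read_input("mydata.tsv")` (a made-up file name) would return a data frame of the numeric columns only, ready to hand to the analysis scripts.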
The statistical analyses currently performed are:
- Principal Component Analysis (PCA)
- K-means Clustering
- Hierarchical Clustering
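The three analyses above can be sketched with base R alone. This is a rough illustration of the idea rather than the actual wrapper: the function name `run_analyses`, the output file names, and the default `k = 3` are all assumptions, and it expects a numeric table with at least two non-constant columns.

```r
# Hypothetical wrapper (a sketch, not the released code): writes one PDF
# per analysis into out_dir.
run_analyses <- function(x, out_dir = ".", k = 3) {
  x <- scale(as.matrix(x))  # center and scale each column

  # PCA: scatter plot of the first two principal components
  pca <- prcomp(x)
  pdf(file.path(out_dir, "pca.pdf"))
  plot(pca$x[, 1:2], main = "PCA")
  dev.off()

  # K-means: same scatter plot, colored by cluster (k is an assumed default)
  km <- kmeans(x, centers = k)
  pdf(file.path(out_dir, "kmeans.pdf"))
  plot(pca$x[, 1:2], col = km$cluster, main = "K-means")
  dev.off()

  # Hierarchical clustering: dendrogram built from Euclidean distances
  hc <- hclust(dist(x))
  pdf(file.path(out_dir, "hclust.pdf"))
  plot(hc, main = "Hierarchical clustering")
  dev.off()

  # Return the fitted objects invisibly for further inspection
  invisible(list(pca = pca, kmeans = km, hclust = hc))
}
```

For a quick try-out, `run_analyses(iris[, 1:4])` uses the numeric columns of R's built-in `iris` data set and writes the three PDFs to the working directory.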
A few R scripts in the current release depend on external libraries that the user has to install separately. These scripts are turned off by default.
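One way to keep such scripts off by default is a simple switch combined with a check that the library is actually installed. The sketch below is only an assumption about how this could be done; `optional_steps`, `can_run`, and the package name `somePackage` are placeholders, not names from the released code.

```r
# Hypothetical registry of optional analyses and the package each one needs.
# "somePackage" is a placeholder, not a real dependency of the package.
optional_steps <- list(
  extra_plot = list(pkg = "somePackage", enabled = FALSE)
)

# Return TRUE only when the step is switched on AND its package is installed.
# requireNamespace() returns FALSE instead of raising an error when the
# package is missing.
can_run <- function(step) {
  step$enabled && requireNamespace(step$pkg, quietly = TRUE)
}
```

With this layout, turning on an optional script means setting `enabled = TRUE` and installing the package it needs; if either condition is missing, the step is skipped.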
I am considering adding more statistical tests to the package, perhaps a t-test with a few box-and-whisker plots.
Now I can run one script and output several plots in one slick shot.