Blog Archives

Using Repeated Measures to Remove Artifacts from Longitudinal Data

June 4, 2014
By
Using Repeated Measures to Remove Artifacts from Longitudinal Data

Recently I was tasked with evaluating and most importantly removing analytical variance form a longitudinal metabolomic analysis carried out over a few years and including >2,5000 measurements for >5,000 patients. Even using state-of-the-art analytical instruments and techniques long term biological studies are plagued with unwanted trends which are unrelated to the original experimental design and stem from analytical

Read more »

Enrichment Network

May 10, 2014
By
Enrichment Network

Enrichment is beyond random occurrence within a category. Networks can represent relationships among variables. Enrichment networks display relationships among variables which are over represented compared to random chance. Next is  a tutorial for making enrichment networks for biological (metabolomic) data in R using the KEGG database.

Read more »

Choose Your Own Data Adventure

April 5, 2014
By
Choose Your Own Data Adventure

The question is: can we automate scientific discovery, and what might an interface to such a tool look like. I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data and mostly using R and LATEX. You can see some of my progress and challenges encountered in the presentation

Read more »

High Dimensional Biological Data Analysis and Visualization

February 22, 2014
By
High Dimensional Biological Data Analysis and Visualization

High dimensional biological data shares many qualities with other forms of data. Typically it is wide (samples << variables), complicated by experiential design and made up of complex relationships driven by both biological and analytical sources of variance. Luckily the powerful combination of R, Cytoscape (< v3) and the R package RCytoscape can be used

Read more »

Tutorials- Statistical and Multivariate Analysis for Metabolomics

February 17, 2014
By
Tutorials- Statistical and Multivariate Analysis for Metabolomics

I recently had the pleasure in participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using 8 hands on tutorials which focused on: data quality overview

Read more »

Classification with O-PLS-DA

September 29, 2013
By
Classification with O-PLS-DA

Partial least squares (PLS) is a versatile algorithm which can be used to predict either continuous or discrete/categorical variables. Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis.  The PLS-DA algorithm has many favorable properties for dealing with multivariate data; one of the most important of which is how variable collinearity is

Read more »

Orthogonal Partial Least Squares (OPLS) in R

July 28, 2013
By
Orthogonal Partial Least Squares (OPLS) in R

I often need to analyze and model very wide data (variables >>>samples), and because of this I gravitate to robust yet relatively simple methods. In my opinion partial least squares (PLS) is a particular useful algorithm. Simply put, PLS is an extension of principal components analysis (PCA), a non-supervised  method to maximizing  variance explained in X,

Read more »

Interactive Heatmaps (and Dendrograms) – A Shiny App

July 7, 2013
By
Interactive Heatmaps (and Dendrograms) – A Shiny App

Heatmaps are a great way to visualize data matrices. Heatmap color and organization can be used to  encode information about the data and metadata to help learn about the data at hand. An example of this could be looking at the raw data  or hierarchically clustering samples and variables based on their similarity or differences.

Read more »

Principal Components Analysis Shiny App

June 23, 2013
By
Principal Components Analysis Shiny App

I’ve recently started experimenting with making Shiny apps, and today I wanted to make a basic app for calculating and visualizing principal components analysis (PCA). Here is the basic interface I came up with. Test drive the app for yourself using the code below or  check out the the R code HERE. Above is an example of the

Read more »

Dynamic Data Visualizations in the Browser Using Shiny

June 16, 2013
By
Dynamic Data Visualizations in the Browser Using Shiny

After being busy the last two weeks teaching and attending academic conferences, I finally found some time to do what I love, program data visualizations using R. After being interested in Shiny for a while, I finally decided to pull the trigger and build my first Shiny app! I wanted to make a proof of

Read more »