Blog Archives

2014 Metabolomic Data Analysis and Visualization Workshop and Tutorials

October 11, 2014

Recently I had the pleasure of teaching statistical and multivariate data analysis and visualization at the annual Summer Sessions in Metabolomics 2014, organized by the NIH West Coast Metabolomics Center. As I did last year, I’ve posted all the content (lectures, labs and software) so that anyone can follow along at their own pace. I also

2014 UC Davis Proteomics Workshop

August 9, 2014

Recently I had the pleasure of teaching data analysis at the 2014 UC Davis Proteomics Workshop. This included a hands-on lab for making gene ontology enrichment networks. You can check out my lecture and tutorial below or download all the material. Introduction Tutorial 2014 UC Davis Proteomics Workshop Dmitry Grapov is licensed under a

PubChem 446220 = Yeyo

August 8, 2014

I just updated my R package, CTSgetR, for biological database translation using the Chemical Translation Service (CTS). While making code examples I came across some humorous chemical name synonyms for the molecule referenced in PubChem as CID = 446220. Below are a few examples; can you guess what this is? Badrock, Bazooka, Bernice, Bernies, Blast, Blizzard, Bouncing Powder, Bump, Burese,

Multivariate Data Analysis and Visualization Through Network Mapping

June 27, 2014

Recently I had the pleasure of speaking about one of my favorite topics, Network Mapping. This is a continuation of a general theme I’ve previously discussed and involves the merger of statistical and multivariate data analysis results with a network. Over the past year I’ve been working on two major tools, DeviumWeb and MetaMapR, which

Using Repeated Measures to Remove Artifacts from Longitudinal Data

June 4, 2014

Recently I was tasked with evaluating and, most importantly, removing analytical variance from a longitudinal metabolomic analysis carried out over a few years and including >2,5000 measurements for >5,000 patients. Even with state-of-the-art analytical instruments and techniques, long-term biological studies are plagued with unwanted trends that are unrelated to the original experimental design and stem from analytical

Enrichment Network

May 10, 2014

Enrichment is occurrence within a category beyond what is expected at random. Networks can represent relationships among variables. Enrichment networks display relationships among variables that are overrepresented compared to random chance. What follows is a tutorial for making enrichment networks for biological (metabolomic) data in R using the KEGG database.
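
To make "overrepresented compared to random chance" concrete, here is a minimal sketch in Python rather than R. It assumes enrichment is scored with a one-sided hypergeometric test and that categories are linked when they share variables; the excerpt does not say which test or linking rule the tutorial uses, so the function names, arguments and pathway sets below are all hypothetical.

```python
from itertools import combinations
from math import comb


def enrichment_p_value(hits_in_category, category_size, hits_total, background_size):
    """One-sided hypergeometric p-value (an assumed choice of test).

    Probability of seeing at least `hits_in_category` selected variables in a
    category of `category_size`, when `hits_total` variables are drawn at
    random from a background of `background_size`.
    """
    upper = min(category_size, hits_total)
    all_draws = comb(background_size, hits_total)
    tail = sum(
        comb(category_size, k) * comb(background_size - category_size, hits_total - k)
        for k in range(hits_in_category, upper + 1)
    )
    return tail / all_draws


def shared_member_edges(categories):
    """Connect categories that share at least one variable.

    `categories` maps a category name to a set of variable identifiers.
    Returns (name_a, name_b, number_shared) tuples, sorted by name.
    """
    edges = []
    for a, b in combinations(sorted(categories), 2):
        shared = categories[a] & categories[b]
        if shared:
            edges.append((a, b, len(shared)))
    return edges


# Hypothetical toy data: names and identifiers are made up for illustration.
toy_categories = {
    "pathway_A": {"m1", "m2", "m3"},
    "pathway_B": {"m3", "m4"},
    "pathway_C": {"m5"},
}
p = enrichment_p_value(hits_in_category=2, category_size=2, hits_total=2, background_size=4)
edges = shared_member_edges(toy_categories)
```

A small p-value suggests that a category holds more of the selected variables than random draws would typically produce; in practice the p-values would also need adjusting for the number of categories tested.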

Choose Your Own Data Adventure

April 5, 2014

The question is: can we automate scientific discovery, and what might an interface to such a tool look like? I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data, mostly using R and LaTeX. You can see some of my progress and challenges encountered in the presentation

High Dimensional Biological Data Analysis and Visualization

February 22, 2014

High dimensional biological data shares many qualities with other forms of data. Typically it is wide (samples << variables), complicated by experimental design and made up of complex relationships driven by both biological and analytical sources of variance. Luckily, the powerful combination of R, Cytoscape (< v3) and the R package RCytoscape can be used

Tutorials- Statistical and Multivariate Analysis for Metabolomics

February 17, 2014

I recently had the pleasure of participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using 8 hands-on tutorials which focused on: data quality overview

Classification with O-PLS-DA

September 29, 2013

Partial least squares (PLS) is a versatile algorithm that can be used to predict either continuous or discrete/categorical variables. Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis. The PLS-DA algorithm has many favorable properties for dealing with multivariate data; one of the most important of which is how variable collinearity is
