Articles by dgrapov

Machine Learning Powered Biological Network Analysis

June 11, 2017 | dgrapov

Metabolomic network analysis can be used to interpret experimental results within a variety of contexts including: biochemical relationships, structural and spectral similarity and empirical correlation. Machine learning is useful for modeling relationships in the context of pattern recognition, clustering, classification and regression based predictive modeling. The combination of developed metabolomic ...
[Read more...]

Push it to the limit: SOM + Clustering + Networks

May 18, 2016 | dgrapov

What is the highest dimensional visualization you can think of? Now imagine it being interactive. The following details a Frankenstein visualization packing a smorgasbord of multivariate goodness. Enter first, self-organizing maps (SOM). I first fell into a love dream with SOMs after using the kohonen package. The wines data set ...
[Read more...]

Network Visualization with Plotly and Shiny

February 28, 2016 | dgrapov

In addition to their more common uses, networks can be used as powerful multivariate data visualizations and exploration tools. Networks not only provide mathematical representations of data but are also one of the few data visualization methods capable of easily displaying multivariate variable relationships. The process of network mapping involves ...
[Read more...]

dplyr Tutorial: verbs + split-apply

May 3, 2015 | dgrapov

At a recent Saint Louis R users meeting I had the pleasure of giving a basic introduction to the awesome dplyr R package. For me, data analysis ubiquitously involves splitting the data based on a grouping variable and then applying some function to the subsets, or what is termed split-apply (typically ... [Read more...]

dplyr Tutorial: verbs + split-apply-combine

May 3, 2015 | dgrapov

At a recent Saint Louis R users meeting I had the pleasure of giving a basic introduction to the awesome dplyr R package. For me, data analysis ubiquitously involves splitting the data based on a grouping variable and then applying some function to the subsets, or what is termed split-apply-combine. Having ... [Read more...]
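
As a rough, language-neutral illustration of the split-apply-combine pattern described in the excerpt above, here is a minimal sketch in plain Python rather than dplyr. The function name `split_apply_combine`, the column names `group` and `value`, and the toy rows are all hypothetical and are not taken from the tutorial.

```python
from collections import defaultdict
from statistics import mean


def split_apply_combine(rows, key, value, func=mean):
    """Group rows by `key`, apply `func` to each group's values, combine into a dict."""
    groups = defaultdict(list)
    for row in rows:
        # split: collect the values belonging to each level of the grouping variable
        groups[row[key]].append(row[value])
    # apply + combine: summarize every group and gather the results
    return {level: func(values) for level, values in groups.items()}


# Hypothetical toy data: two groups with two measurements each
rows = [
    {"group": "control", "value": 1.0},
    {"group": "control", "value": 3.0},
    {"group": "treated", "value": 4.0},
    {"group": "treated", "value": 8.0},
]
summary = split_apply_combine(rows, "group", "value")
```

In dplyr the same idea is usually expressed by grouping a data frame and then summarizing it; the sketch only shows the underlying three steps.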

2014 UC Davis Proteomics Workshop

August 9, 2014 | dgrapov

Recently I had the pleasure of teaching data analysis at the 2014 UC Davis Proteomics Workshop. This included a hands-on lab for making gene ontology enrichment networks. You can check out my lecture and tutorial below or download all the material. Introduction Tutorial 2014 UC Davis Proteomics Workshop Dmitry Grapov is ... [Read more...]

PubChem 446220 = Yeyo

August 8, 2014 | dgrapov

I just updated my R package, CTSgetR, for biological database translation using the Chemical Translation Service (CTS). While making code examples I came across some humorous chemical name synonyms for the molecule referenced in PubChem as CID = 446220. Below are a few examples; can you guess what this is? Badrock, Bazooka, ...
[Read more...]

Using Repeated Measures to Remove Artifacts from Longitudinal Data

June 4, 2014 | dgrapov

Recently I was tasked with evaluating and, most importantly, removing analytical variance from a longitudinal metabolomic analysis carried out over a few years and including ~2,5000 measurements for ~5,000 patients. Even using state-of-the-art analytical instruments and techniques, long-term biological studies are plagued with unwanted trends which are unrelated to the original ... [Read more...]

Enrichment Network

May 10, 2014 | dgrapov

Enrichment is occurrence within a category beyond what is expected at random. Networks can represent relationships among variables. Enrichment networks display relationships among variables which are overrepresented compared to random chance. Next is a tutorial for making enrichment networks for biological (metabolomic) data in R using the KEGG database.
[Read more...]
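
One common way to quantify "overrepresented compared to random chance" is a one-sided hypergeometric test. The sketch below assumes that choice; the excerpt does not say which test the tutorial uses, and the function name `enrichment_p_value` and its parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """One-sided hypergeometric tail probability P(X >= k).

    N: size of the background (e.g. all annotated metabolites)
    K: background members that belong to the category (e.g. a pathway)
    n: size of the list being tested
    k: members of the tested list that belong to the category
    """
    upper = min(K, n)  # the overlap cannot exceed either set size
    total = comb(N, n)  # number of ways to draw the tested list
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# Hypothetical example: all 5 selected items fall in a 5-member category of a 10-item background
p = enrichment_p_value(N=10, K=5, n=5, k=5)
```

A small p-value suggests the overlap with the category is larger than expected by chance; in practice the p-values for many categories would also need multiple-testing correction.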

Choose Your Own Data Adventure

April 5, 2014 | dgrapov

The question is: can we automate scientific discovery, and what might an interface to such a tool look like? I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data, mostly using R and LaTeX. You can see some of my progress ... [Read more...]

Tutorials- Statistical and Multivariate Analysis for Metabolomics

February 17, 2014 | dgrapov

I recently had the pleasure of participating in the 2014 WCMC Statistics for Metabolomics Short Course. The course was hosted by the NIH West Coast Metabolomics Center and focused on statistical and multivariate strategies for metabolomic data analysis. A variety of topics were covered using 8 hands-on tutorials which focused on: ... [Read more...]

Classification with O-PLS-DA

September 29, 2013 | dgrapov

Partial least squares (PLS) is a versatile algorithm which can be used to predict either continuous or discrete/categorical variables. Classification with PLS is termed PLS-DA, where the DA stands for discriminant analysis. The PLS-DA algorithm has many favorable properties for dealing with multivariate data; one of the most important ... [Read more...]

Orthogonal Partial Least Squares (OPLS) in R

July 28, 2013 | dgrapov

I often need to analyze and model very wide data (variables >> samples), and because of this I gravitate to robust yet relatively simple methods. In my opinion partial least squares (PLS) is a particularly useful algorithm. Simply put, PLS is an extension of principal components analysis (PCA), a non-supervised method ... [Read more...]

Interactive Heatmaps (and Dendrograms) – A Shiny App

July 7, 2013 | dgrapov

Heatmaps are a great way to visualize data matrices. Heatmap color and organization can be used to encode information about the data and metadata to help learn about the data at hand. An example of this could be looking at the raw data or hierarchically clustering samples and variables based ... [Read more...]