# 354 search results for "pca"

## Using Biplots to Map Cluster Solutions

July 2, 2014

FactoMineR is a quick and easy R package for generating biplots, such as the following plot showing the columns as arrows with the rows to be added later as points. As you might recall from a previous post, a biplot maps a data matrix by plotting both ...

## stone flakes IV

June 29, 2014

In this post I want to try something new, a causal graphical model. The aim here is just as much to get a feel for what these things do as to understand how the stone flakes data fit together. Data: The data are the stone flakes data which I analyzed previous...

## Multivariate Data Analysis and Visualization Through Network Mapping

June 27, 2014

Recently I had the pleasure of speaking about one of my favorite topics, Network Mapping. This is a continuation of a general theme I’ve previously discussed and involves the merger of statistical and multivariate data analysis results with a network. Over the past year I’ve been working on two major tools, DeviumWeb and MetaMapR, which

## Tailoring univariate probability distributions

June 26, 2014

This post shows how to build a custom univariate distribution in R from scratch, so that you end up with the essential functions: a probability density function, cumulative distribution function, quantile function and random number generator. In the beginning all you need is an equation of the probability density function, …
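The idea in this excerpt can be sketched roughly as follows. This is a minimal illustration in Python with NumPy rather than the R code of the original post, and it is not the post's method: it assumes a density known up to a constant on a finite interval, integrates it numerically on a grid, and inverts the resulting CDF by interpolation. The names `make_distribution`, `d`, `p`, `q` and `r` are hypothetical, and the density function is assumed to be vectorised and strictly positive inside the interval.

```python
import numpy as np

def make_distribution(pdf, lower, upper, n_grid=10_001):
    """Build density, CDF, quantile and sampling functions from a
    (possibly unnormalised) density on the finite interval [lower, upper]."""
    grid = np.linspace(lower, upper, n_grid)
    dens = pdf(grid)
    # Cumulative trapezoid rule: area of each grid cell, then running sum.
    steps = np.diff(grid) * (dens[:-1] + dens[1:]) / 2.0
    cum = np.concatenate(([0.0], np.cumsum(steps)))
    total = cum[-1]       # normalising constant of the density
    cum = cum / total     # CDF values on the grid, from 0 to 1

    def d(x):
        # Normalised density; zero outside the support.
        x = np.asarray(x, dtype=float)
        inside = (x >= lower) & (x <= upper)
        return np.where(inside, pdf(np.clip(x, lower, upper)) / total, 0.0)

    def p(x):
        # CDF by linear interpolation of the gridded integral.
        return np.interp(x, grid, cum, left=0.0, right=1.0)

    def q(prob):
        # Quantile function: invert the CDF by swapping the interpolation axes.
        return np.interp(prob, cum, grid)

    def r(size, rng=None):
        # Inverse-transform sampling: push uniform draws through the quantile function.
        rng = np.random.default_rng() if rng is None else rng
        return q(rng.uniform(size=size))

    return d, p, q, r

# Example: unnormalised density x * (1 - x) on [0, 1].
d, p, q, r = make_distribution(lambda x: x * (1.0 - x), 0.0, 1.0)
sample = r(1000, np.random.default_rng(0))
```

The grid-based inversion trades accuracy for simplicity; a closed-form CDF or a root finder would be preferable when the density allows it.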

## Bedtools tutorial from 2013 CSHL course

June 24, 2014

A couple of months ago I posted about how to visualize exome coverage with bedtools and R. But if you're looking to get a basic handle on genome arithmetic, take a look at Aaron Quinlan's bedtools tutorials from the 2013 CSHL course. The tutorial uses ...

## stone flakes

June 6, 2014

I browsed through the UC Irvine Machine Learning Repository the other day and noticed a nice data set regarding stone flakes produced by our ancestors, the prehistoric men. To quote the dataset owners: 'The data set concerns the earliest history ...

## Using Repeated Measures to Remove Artifacts from Longitudinal Data

June 4, 2014

Recently I was tasked with evaluating and, most importantly, removing analytical variance from a longitudinal metabolomic analysis carried out over a few years and including >2,5000 measurements for >5,000 patients. Even using state-of-the-art analytical instruments and techniques, long-term biological studies are plagued with unwanted trends which are unrelated to the original experimental design and stem from analytical

## SMART Hackathon: Day 2: Writing Packages in RStudio

May 6, 2014

So day 2 of the #JHUSMARTHack was last week, but I figured this would be a good time to discuss what was accomplished. I created some packages that are somewhat specialized and aren't fully finished yet, so I'll hold off. What I really want to discuss though is why I like using RStudio for making

## Decision making trees and machine learning resources for R

April 30, 2014

I have recently come across Ricky Ho's blog "Pragmatic Programming Techniques", which seems to be an excellent resource for all sorts of aspects of data exploration and predictive modelling. The post "Six steps in data science" provides a nice overview of some of the topics covered in the blog. For some reason, this blog does not seem to be...

## Mythbusting – Dr. Copper

April 21, 2014

Image by Justin Reznick. “An economist is an expert who will know tomorrow why the things he predicted yesterday didn't happen today.” Laurence J. Peter (author and creator of the Peter Principle). If you were paying attention to financial sites last month, you probably noticed a number of articles on “Dr. Copper”. Here is