# Blog Archives

## Fall is the data analysis season

December 7, 2013

Dear diary, I spent a lot of my summer in the lab, and my fall has been mostly data analysis, with a little writing and a couple of courses thrown in there. Data analysis means writing code, and nowadays I do most of my work with the help of R. R has even replaced Python

## Using R: Coloured sizeplot with ggplot2

November 17, 2013

Someone asked about this and I thought the solution with ggplot2 was pretty neat. Imagine that you have a scatterplot with some points in the exact same coordinates, and to reduce overplotting you want to have the size of the dot indicating the number of data points that fall on it. At the same time
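As a rough sketch of the size part of that idea (not necessarily the solution from the post itself): count the points on each coordinate, then map the count to dot size. The data frame `points`, the count column `n` and the object `size_plot` are made up for illustration.

```r
library(ggplot2)

# Hypothetical example data: integer coordinates, so many points coincide
set.seed(1)
points <- data.frame(x = sample(1:5, 100, replace = TRUE),
                     y = sample(1:5, 100, replace = TRUE))

# Count how many data points fall on each distinct (x, y) coordinate
counts <- aggregate(n ~ x + y, data = transform(points, n = 1), FUN = sum)

# One dot per coordinate; dot size shows the number of overlapping points
size_plot <- ggplot(counts, aes(x = x, y = y, size = n)) +
  geom_point()
```

Printing `size_plot` draws the figure. Doing the counting as a separate step keeps the counts around in a data frame, in case you want to map other aesthetics to them later.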

## A slightly different introduction to R, part V: plotting and simulating linear models

November 11, 2013

In the last episode (which was quite some time ago) we looked into comparisons of means with linear models. This time, let’s visualise some linear models with ggplot2, and practice another useful R skill, namely how to simulate data from known models. While doing this, we’ll learn some more about the layered structure of a
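To give a flavour of what simulating from a known model can look like, here is a minimal sketch in base R. The parameter values (`intercept`, `slope`, `sigma`) and the sample size are arbitrary assumptions chosen for illustration, not taken from the tutorial.

```r
set.seed(42)

# Assumed "true" model: y = intercept + slope * x + normal noise
n <- 100
intercept <- 2
slope <- 0.5
sigma <- 1

# Simulate a predictor, then a response from the known model
x <- runif(n, min = 0, max = 10)
y <- intercept + slope * x + rnorm(n, mean = 0, sd = sigma)
simulated <- data.frame(x = x, y = y)

# Fit a linear model and compare the estimates with the known values
model <- lm(y ~ x, data = simulated)
coef(model)
```

Because we chose the parameters ourselves, we can check how close the fitted coefficients come to the truth, which is the whole point of the exercise.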

## R intro seminars, take 2: some slides about data frames, linear models and statistical graphics

November 7, 2013

I am doing a second installment of the lunch seminars about data analysis with R for the members of the Wright lab. It’s pretty much the same material as before — data frames, linear models and some plots with ggplot2 — but I’ve sprinkled in some more exercises during the seminars. I’ve tried emphasising scripting a

## Using R: Two plots of principal component analysis

June 26, 2013

PCA is a very common method for exploration and reduction of high-dimensional data. It works by making linear combinations of the variables that are orthogonal, and is thus a way to change basis to better see patterns in data. You either do spectral decomposition of the correlation matrix or singular value decomposition of the data
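A small base R sketch of the two routes, using the built-in `iris` measurements as stand-in data (my choice for illustration, not necessarily the data in the post): `prcomp` goes via singular value decomposition, `eigen` does the spectral decomposition of the correlation matrix, and with scaled variables the component variances should agree.

```r
# Stand-in data: the four numeric columns of the built-in iris data set
measurements <- as.matrix(iris[, 1:4])

# Route 1: prcomp uses singular value decomposition of the
# centred and scaled data matrix
pca <- prcomp(measurements, center = TRUE, scale. = TRUE)

# Route 2: spectral decomposition of the correlation matrix
eig <- eigen(cor(measurements))

# The variances of the components match the eigenvalues
variances_agree <- isTRUE(all.equal(pca$sdev^2, eig$values))

# The loadings form an orthogonal basis: t(R) %*% R is the identity
max_deviation <- max(abs(crossprod(pca$rotation) - diag(4)))
```

The signs of individual loading vectors may differ between the two routes, which is why the comparison is made on the variances rather than on the loadings themselves.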

## Using R: drawing several regression lines with ggplot2

June 2, 2013

Occasionally I find myself wanting to draw several regression lines on the same plot, and of course ggplot2 has convenient facilities for this. As usual, don’t expect anything profound from this post, just a quick tip! There are several reasons we might end up with a table of regression coefficients connecting two variables in different

## “How to draw the line” with ggplot2

May 30, 2013

In a recent tutorial in the eLife journal, Huang, Rattner, Liu & Nathans suggested that researchers who draw scatterplots should start providing not one but three regression lines. I quote, Plotting both regression lines gives a fuller picture of the data, and comparing their slopes provides a simple graphical assessment of the correlation coefficient. Plotting

## Slides and exercise from my second R intro seminar

April 28, 2013

This week I held the second introductory seminar on R, and I think it went pretty well — though I guess you really should ask my colleagues if you want to know. The first seminar was a lecture, and this seminar was a tutorial where we made some plots and calculated a few of the

## Slides from my R intro seminar

April 23, 2013

Here are my slides from a short introductory seminar on R (essentially going through part I of the R tutorial) last week. As magic lantern pictures go, they’re hideously ugly, but they were mostly there for future reference. Most of the seminar was spent showing RStudio. This Friday, we’ll practice some uses of qplot and make

## Using R: reading tables that need a little cleaning

March 24, 2013

Sometimes one needs to read tables that are a bit messy, so that read.table doesn’t immediately recognize the content as numerical. Maybe some weird characters are sprinkled in the table (ever been given a table with significance stars in otherwise numerical columns?). Some search and replace is needed. You can do this by hand, and
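For the significance star case, a minimal sketch of the search and replace step might look like this. The table contents and the column names (`gene`, `estimate`) are invented for illustration.

```r
# Hypothetical messy table: significance stars mixed into a numeric column
messy_text <- "gene estimate
a 0.51*
b 1.20
c 3.4***
"

# read.table sees the stars and reads the column as text, not numbers
messy <- read.table(text = messy_text, header = TRUE,
                    stringsAsFactors = FALSE)

# Strip every literal "*" and convert what is left to numeric
messy$estimate <- as.numeric(gsub("*", "", messy$estimate, fixed = TRUE))
```

Using `fixed = TRUE` treats the star as a literal character rather than a regular expression quantifier, so there is no need to escape it.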