# Blog Archives

## Using R: common errors in table import

March 6, 2014

I’ve written before about importing tabular text files into R, and here comes some more. This is because I believe (firmly) that importing data is the major challenge for beginners who want to analyse their data in R. What is the most important thing about using any statistics software? To get your data into it

## Using R: correlation heatmap, take 2

March 3, 2014

Apparently, this turned out to be my most popular post ever. Of course there are lots of things to say about the heatmap (or quilt, tile, guilt plot etc), but what I wrote was literally just a quick celebratory post to commemorate that I’d finally grasped how to combine reshape2 and ggplot2 to quickly make

## Books and lessons about ggplot2

February 19, 2014

I recently got an email from a person at Packt publishing, who suggested I write a book for them about ggplot2. My answer, which is perfectly true, is that I don’t have the time, nor the expertise to do that. What I didn’t say is that 1) a quick web search suggests that Packt doesn’t

## Fall is the data analysis season

December 7, 2013

Dear diary, I spent a lot of my summer in the lab, and my fall has been mostly data analysis, with a little writing and a couple of courses thrown in there. Data analysis means writing code, and nowadays I do most of my work with the help of R. R has even replaced Python

## Using R: Coloured sizeplot with ggplot2

November 17, 2013

Someone asked about this and I thought the solution with ggplot2 was pretty neat. Imagine that you have a scatterplot with some points in the exact same coordinates, and to reduce overplotting you want to have the size of the dot indicating the number of data points that fall on it. At the same time

## A slightly different introduction to R, part V: plotting and simulating linear models

November 11, 2013

In the last episode (which was quite some time ago) we looked into comparisons of means with linear models. This time, let’s visualise some linear models with ggplot2, and practice another useful R skill, namely how to simulate data from known models. While doing this, we’ll learn some more about the layered structure of a

## R intro seminars, take 2: some slides about data frames, linear models and statistical graphics

November 7, 2013

I am doing a second installment of the lunch seminars about data analysis with R for the members of the Wright lab. It’s pretty much the same material as before — data frames, linear models and some plots with ggplot2 — but I’ve sprinkled in some more exercises during the seminars. I’ve tried emphasising scripting a

## Using R: Two plots of principal component analysis

June 26, 2013

PCA is a very common method for exploration and reduction of high-dimensional data. It works by making linear combinations of the variables that are orthogonal, and is thus a way to change basis to better see patterns in data. You either do spectral decomposition of the correlation matrix or singular value decomposition of the data

## Using R: drawing several regression lines with ggplot2

June 2, 2013

Occasionally I find myself wanting to draw several regression lines on the same plot, and of course ggplot2 has convenient facilities for this. As usual, don’t expect anything profound from this post, just a quick tip! There are several reasons we might end up with a table of regression coefficients connecting two variables in different

## “How to draw the line” with ggplot2

May 30, 2013

In a recent tutorial in the eLife journal, Huang, Rattner, Liu & Nathans suggested that researchers who draw scatterplots should start providing not one but three regression lines. I quote, Plotting both regression lines gives a fuller picture of the data, and comparing their slopes provides a simple graphical assessment of the correlation coefficient. Plotting