Blog Archives

A brief introduction to “apply” in R

August 19, 2010

At any R Q&A site, you’ll frequently see an exchange like this one:

Q: How can I use a loop to ?
A: Don’t. Use one of the apply functions.

So, what are these wondrous apply functions and how do they work? I think the best way to figure out anything in

Read more »
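As a rough illustration of the loop-versus-apply exchange quoted in the excerpt above, here is a minimal sketch in base R. The matrix `m` and the variable names are invented for this example and do not come from the original post.

```r
# Hypothetical data, invented for this sketch: a 2 x 3 integer matrix.
# R fills matrices column by column, so row 1 is (1, 3, 5) and row 2 is (2, 4, 6).
m <- matrix(1:6, nrow = 2)

# Loop version: pre-allocate the result, then fill it one row at a time.
row_sums_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_sums_loop[i] <- sum(m[i, ])
}

# apply version: MARGIN = 1 applies the function to each row
# (MARGIN = 2 would apply it to each column).
row_sums_apply <- apply(m, 1, sum)

# sapply works over the elements of a vector or list and
# simplifies the result to a vector where it can.
squares <- sapply(1:3, function(x) x^2)

print(row_sums_loop)
print(row_sums_apply)
print(squares)
```

Both versions give the same row sums; the apply form just drops the bookkeeping for the index and the pre-allocated result.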

Analysing the ISMB 2010 meeting using R

July 20, 2010

The colossus of bioinformatics meetings, ISMB, convened in Boston this year from July 9–13. As in recent years, the meeting was covered online at its website, FriendFeed and Twitter. I thought it would be fun to run a quick analysis of activity at the FriendFeed room using R.

1. Fetch the data

We

Read more »

biomaRt and GenomeGraphs: a worked example

June 6, 2010

As promised a few posts ago, another demonstration of the excellent biomaRt package, this time in conjunction with GenomeGraphs. Here’s what we’re going to do: grab some public microarray data; normalise and get a list of the most differentially-expressed probesets; use biomaRt to fetch the genes associated with those probesets; plot the data using GenomeGraphs.

Read more »

Beware of rogue header files (Bioconductor installation)

May 11, 2010

Just a short note concerning a “gotcha”. As I have many times before, I opened an R console on my newly-upgraded (to Lucid 10.04) Ubuntu machine, typed source("http://bioconductor.org/biocLite.R") and began a Bioconductor install with biocLite(). Only this time, I saw this:

Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared library '/home/sau103/R/i486-pc-linux-gnu-library/2.11/affyio/libs/affyio.so':

Read more »

Experiments with igraph

April 21, 2010

Networks – social and biological – are all the rage just now. Indeed, a recent entry at Duncan’s QOTD described the “hairball” network representation as the dominant cultural icon in molecular biology. I’ve not had occasion to explore networks “professionally”, but have always been fascinated by both networks and the tools used to analyse them.

Read more »

Getting your web application and R(Apache) to talk to each other

April 19, 2010

Here’s the situation. Web applications built using a framework (e.g. Rails, Django) are great for fetching data from a database and rendering it. They’re not so great for crunching and charting the data. Conversely, R is great for crunching and charting, but doesn’t make for a great web application. The idea, then, is to let

Read more »

I’d be more than happy with the unlinked data web

April 14, 2010

Visit this URL and you’ll find a perfectly-formatted CSV file containing information about recent earthquakes. A nice feature of R is the ability to slurp such a URL straight into a data frame:

quakes <- read.csv("http://neic.usgs.gov/neis/gis/qed.asc", header = T)
colnames(quakes)
# "Date" "TimeUTC" "Latitude" "Longitude" "Magnitude" "Depth"
# number of recent quakes
nrow(quakes)
#

Read more »

Plotting “time of day” data using ggplot2

April 14, 2010

William asks: How can I make a graph that looks like this, “tweet density” style, showing time intervals? He then helpfully describes his input data: a CSV file with headers “time started, time finished, date”. Here’s a simple CSV file, tasks.csv:

task,date,start,end
task1,2010-03-05,09:00:00,13:00:00
task2,2010-03-06,10:00:00,15:00:00
task3,2010-03-06,11:00:00,18:00:00
task4,2010-03-07,08:00:00,11:00:00
task5,2010-03-08,14:00:00,17:00:00
task6,2010-03-09,12:00:00,16:00:00
task7,2010-03-10,14:00:00,19:00:00
task8,2010-03-11,09:30:00,13:30:00

Read into R, calculate the

Read more »
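The excerpt above is cut off before it says what gets calculated, so the following is only a sketch of one plausible first step, not the post’s own code: read the tasks.csv rows shown above into a data frame and turn the start and end columns into proper date-times. The duration column `hours`, the variable names and the choice of UTC as time zone are all assumptions made for this illustration. It uses base R only; no ggplot2 call is shown.

```r
# The CSV rows from the excerpt, embedded as text so the sketch is self-contained.
csv_text <- "task,date,start,end
task1,2010-03-05,09:00:00,13:00:00
task2,2010-03-06,10:00:00,15:00:00
task3,2010-03-06,11:00:00,18:00:00
task4,2010-03-07,08:00:00,11:00:00
task5,2010-03-08,14:00:00,17:00:00
task6,2010-03-09,12:00:00,16:00:00
task7,2010-03-10,14:00:00,19:00:00
task8,2010-03-11,09:30:00,13:30:00"

# read.csv accepts literal text through its 'text' argument.
tasks <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Combine date and time-of-day into POSIXct date-times.
# The time zone is an assumption; the excerpt does not state one.
tasks$start_time <- as.POSIXct(paste(tasks$date, tasks$start), tz = "UTC")
tasks$end_time   <- as.POSIXct(paste(tasks$date, tasks$end),   tz = "UTC")

# Assumed illustration: the length of each task in hours.
tasks$hours <- as.numeric(difftime(tasks$end_time, tasks$start_time, units = "hours"))

print(tasks[, c("task", "date", "hours")])
```

With date-times in hand, each task is an interval with a start and an end, which is the shape of data an interval-style plot needs.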

BioMart (and biomaRt)

March 26, 2010

I’ve been vaguely aware of BioMart for a few years. Inexplicably, I’ve only recently started to use it. It’s one of the most useful applications I’ve ever used. The concept is simple. You have a set of identifiers that describe a biological object, such as a gene. These are called filters. They have values –

Read more »

From the “blogosphere”? Hardly.

January 27, 2010

I generally skip over “From the Blogosphere”, a (mostly) weekly summary of one or two blog posts in Nature’s “Authors” section (here is the latest). Why? Well, I’ve always suspected that the title is rather misleading. Now, I have the hard numbers to prove it. My feed reader contains an archive of 128 articles, dating back

Read more »