Blog Archives

I’d be more than happy with the unlinked data web

April 14, 2010

Visit this URL and you’ll find a perfectly formatted CSV file containing information about recent earthquakes. A nice feature of R is the ability to slurp such a URL straight into a data frame:

    quakes <- read.csv("http://neic.usgs.gov/neis/gis/qed.asc", header = TRUE)
    colnames(quakes)
    # "Date" "TimeUTC" "Latitude" "Longitude" "Magnitude" "Depth"
    # number of recent quakes
    nrow(quakes)
    #

Plotting “time of day” data using ggplot2

April 14, 2010

William asks: How can I make a graph that looks like this, “tweet density” style, showing time intervals? He then helpfully describes his input data: a CSV file with headers “time started, time finished, date”. Here’s a simple CSV file, tasks.csv:

    task,date,start,end
    task1,2010-03-05,09:00:00,13:00:00
    task2,2010-03-06,10:00:00,15:00:00
    task3,2010-03-06,11:00:00,18:00:00
    task4,2010-03-07,08:00:00,11:00:00
    task5,2010-03-08,14:00:00,17:00:00
    task6,2010-03-09,12:00:00,16:00:00
    task7,2010-03-10,14:00:00,19:00:00
    task8,2010-03-11,09:30:00,13:30:00

Read into R, calculate the

BioMart (and biomaRt)

March 26, 2010

I’ve been vaguely aware of BioMart for a few years. Inexplicably, I’ve only recently started to use it. It’s one of the most useful applications I’ve ever used. The concept is simple. You have a set of identifiers that describe a biological object, such as a gene. These are called filters. They have values –

From the “blogosphere”? Hardly.

January 27, 2010

I generally skip over “From the Blogosphere”, a (mostly) weekly summary of one or two blog posts in Nature’s “Authors” section (here is the latest). Why? Well, I’ve always suspected that the title is rather misleading. Now, I have the hard numbers to prove it. My feed reader contains an archive of 128 articles, dating back

A new twist on the identifier mapping problem

January 11, 2010

Yesterday, Deepak wrote about BridgeDB, a software package to deal with the “identifier mapping problem”. Put simply, biologists can name a biological entity in any way that they like, leading to multiple names for the same object. Easily solved, you might think, by choosing one identifier and sticking to it, but that’s apparently way too

Samples per series/dataset in the NCBI GEO database

January 7, 2010

Andrew asks: I want to get an NCBI GEO report showing the number of samples per series or data set. Short of downloading all of GEO, anyone know how to do this? Is there a table of just metadata hidden somewhere? At work, we joke that GEO is the only database where data goes in,

The Life Scientists at FriendFeed: 2009 summary

December 23, 2009

It’s Christmas Eve tomorrow and so I declare the year over. My Christmas gift to you is a summary of activity in 2009 at the FriendFeed Life Scientists group. It’s crafted using R + Ruby, with raw data and some code snippets available. If you want to see the most popular items from the group

APIs: I wish the life sciences would learn from social networks

December 10, 2009

I was prompted by a thread on the apparent decline of FriendFeed to look for evidence of declining participation in my networks. First, a quick and dirty Ruby script, tls.rb, to grab the Life Scientists feed and count the likes and comments:

    #!/usr/bin/ruby
    require 'rubygems'
    require 'json/pure'
    require 'net/http'
    require 'open-uri'

    def format_date(d)
      if d
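The counting step itself might look something like the sketch below. This is not the original tls.rb: the sample data is invented, and the field names (`entries`, `date`, `likes`, `comments`) are assumptions made for illustration rather than a confirmed FriendFeed API schema. It parses a JSON feed from a string, so it runs without network access.

```ruby
require 'json'

# Hypothetical feed response. The structure and field names are assumed
# for illustration; they are not taken from the FriendFeed API docs.
SAMPLE_FEED = <<~FEED
  {
    "entries": [
      {"date": "2009-12-01T10:00:00Z",
       "likes": [{"from": "a"}, {"from": "b"}],
       "comments": [{"body": "hi"}]},
      {"date": "2009-12-01T15:30:00Z",
       "likes": [],
       "comments": [{"body": "x"}, {"body": "y"}]},
      {"date": "2009-12-02T09:00:00Z",
       "comments": []}
    ]
  }
FEED

# Tally likes and comments per calendar day from a parsed feed.
def count_activity(feed)
  totals = Hash.new { |h, day| h[day] = { likes: 0, comments: 0 } }
  feed.fetch('entries', []).each do |entry|
    day = entry['date'].to_s[0, 10] # "YYYY-MM-DD" prefix of the timestamp
    # A missing "likes" or "comments" key is treated as an empty list.
    totals[day][:likes]    += (entry['likes'] || []).size
    totals[day][:comments] += (entry['comments'] || []).size
  end
  totals
end

counts = count_activity(JSON.parse(SAMPLE_FEED))
counts.sort_by { |day, _| day }.each do |day, c|
  puts "#{day}\tlikes=#{c[:likes]}\tcomments=#{c[:comments]}"
end
```

Grouping by day keeps the output small enough to eyeball for a trend; swapping the ten-character date prefix for a seven-character one would group by month instead.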

A brief survey of R web interfaces

November 29, 2009

I’m looking at ways to provide access to R via a web application. First rule: see what’s available first, before you reinvent the wheel. It’s not pretty. From the R Web Interfaces FAQ:

    Software    Brief notes
    Rweb        Page last updated 1999. Of the 3 example links on the page one ran very slowly, the second not at

R has a JSON package

November 5, 2009

Named rjson, appropriately. It’s quite basic just now, but contains methods for interconversion between R objects and JSON. Something like this:

    > library(rjson)
    > data <- list(a=1,b=2,c=3)
    > json <- toJSON(data)
    > json
    "{\"a\":1,\"b\":2,\"c\":3}"
    > cat(json, file="data.json")

Use cases? I wonder if RApache could be used to build an API that serves R data in JSON format?
