Blog Archives

Analysis of gene expression timecourse data using maSigPro

May 28, 2015

About a year ago, I did a little work on a very interesting project which was trying to identify blood-based biomarkers for the early detection of stroke. The data included gene expression measurements using microarrays at various time points after the onset of ischemia (reduced blood supply). I had not worked with timecourse data before,

Searching for the Steamer retroelement in the ocean metagenome

May 25, 2015

Last week, I was listening to episode 337 of the podcast This Week in Virology. It concerned a retrovirus-like sequence element named Steamer, which is associated with a transmissible leukaemia in soft shell clams. At one point the host and guests discussed the idea of searching for Steamer-like sequences in the data from ocean metagenomics

Some basics of biomaRt

April 27, 2015

One of the commonest bioinformatics questions, at Biostars and elsewhere, takes the form: “I have a list of identifiers (X); I want to relate them to a second set of identifiers (Y)”. HGNC gene symbols to Ensembl Gene IDs, for example. When this occurs I have been known to tweet “the answer is BioMart” (there

R 3.1 -> 3.2 upgrade notes

April 19, 2015

My machines upgraded from R version 3.1.3 to version 3.2.0 last week, which means that existing code suddenly cannot find packages and so fails. Some notes to myself, possibly useful to others, for what to do when this happens. Relevant to Ubuntu-based systems (I use Linux Mint).

1. Update packages

1.1. rJava issues

My rJava

Project Tycho, ggplot2 and the shameless stealing of blog ideas

April 14, 2015

Last week, Mick Watson posted a terrific article on using R to recreate the visualizations in this WSJ article on the impact of vaccination. Someone beat me to the obvious joke: “@BioMickWatson @pathogenomenick Nice quilt plot.” (Ed Yong, @edyong209, April 9, 2015). Someone also beat me to the standard response whenever base R graphics

Configuring the R BatchJobs package for Torque batch queues

March 31, 2015

I was asked recently to look at some R code which performs “embarrassingly parallel” computations (the same function, multiple times, different parameters) and see whether I could modify it to run on one of our high-performance computing clusters. The machine has 63 virtual compute nodes and uses the TORQUE batch queue system to allocate nodes

PubMed retraction reporting update

March 23, 2015

Just a quick update to the previous post. At the helpful suggestion of Steve Royle, I’ve added a new section to the report which attempts to normalise retractions by journal. So for example, J. Biol. Chem. has (as of now) 94 retracted articles and in total 170 842 publications indexed in PubMed. That becomes (100

PMRetract: PubMed retraction reporting rewritten as an interactive RMarkdown document

March 22, 2015

Back in 2010, I wrote a web application called PMRetract to monitor retraction notices in the PubMed database. It was written primarily as a way for me to explore some technologies: the Ruby web framework Sinatra, MongoDB (hosted at MongoHQ, now Compose) and Heroku, where the app was hosted. I automated the update process using

Just how many retracted articles are there in PubMed anyway?

March 19, 2015

I am forever returning to PubMed data, downloaded as XML, trying to extract information from it and becoming deeply confused in the process. Take the seemingly-simple question “how many retracted articles are there in PubMed?” Well, one way is to search for records with the publication type “Retracted Article”. As of right now, that returns

Make prettier documents by reusing chunks in RMarkdown

February 23, 2015

No revelations here, just a little R tip for generating more readable documents. There are times when I want to show code in a document, but I don’t want it to be the first thing that people see. What I want to see first is the output from that code. In this silly example, I
