Blog Archives

R 3.1 -> 3.2 upgrade notes

April 19, 2015

My machines upgraded from R version 3.1.3 to version 3.2.0 last week, which means that existing code suddenly cannot find packages and so fails. Some notes to myself, possibly useful to others, for what to do when this happens. Relevant to Ubuntu-based systems (I use Linux Mint).
1. Update packages
1.1. rJava issues
My rJava

Project Tycho, ggplot2 and the shameless stealing of blog ideas

April 14, 2015

Last week, Mick Watson posted a terrific article on using R to recreate the visualizations in this WSJ article on the impact of vaccination. Someone beat me to the obvious joke: “@BioMickWatson @pathogenomenick Nice quilt plot.” (Ed Yong, @edyong209, April 9, 2015). Someone also beat me to the standard response whenever base R graphics

Configuring the R BatchJobs package for Torque batch queues

March 31, 2015

I was asked recently to look at some R code which performs “embarrassingly parallel” computations (the same function, multiple times, different parameters) and see whether I could modify it to run on one of our high-performance computing clusters. The machine has 63 virtual compute nodes and uses the TORQUE batch queue system to allocate nodes

PubMed retraction reporting update

March 23, 2015

Just a quick update to the previous post. At the helpful suggestion of Steve Royle, I’ve added a new section to the report which attempts to normalise retractions by journal. So for example, J. Biol. Chem. has (as of now) 94 retracted articles and in total 170 842 publications indexed in PubMed. That becomes (100

PMRetract: PubMed retraction reporting rewritten as an interactive RMarkdown document

March 22, 2015

Back in 2010, I wrote a web application called PMRetract to monitor retraction notices in the PubMed database. It was written primarily as a way for me to explore some technologies: the Ruby web framework Sinatra, MongoDB (hosted at MongoHQ, now Compose) and Heroku, where the app was hosted. I automated the update process using

Just how many retracted articles are there in PubMed anyway?

March 19, 2015

I am forever returning to PubMed data, downloaded as XML, trying to extract information from it and becoming deeply confused in the process. Take the seemingly-simple question “how many retracted articles are there in PubMed?” Well, one way is to search for records with the publication type “Retracted Article”. As of right now, that returns

Make prettier documents by reusing chunks in RMarkdown

February 23, 2015

No revelations here, just a little R tip for generating more readable documents. There are times when I want to show code in a document, but I don’t want it to be the first thing that people see. What I want to see first is the output from that code. In this silly example, I

Counting things is hard for a given value of “things”

December 1, 2014

This post is just a summary of some interesting online discussion from last week around open access publishing. I learned a few things about definitions and PubMed/PMC filters. It all begins with an opinion piece, “Open access is tiring out peer reviewers.” With a title like that you might expect rebuttals from people like Michael

Bioinformatics journals: time from submission to acceptance, revisited

October 13, 2014

Before we start: yes, we’ve been here before. There was the Biostars question “Calculating Time From Submission To Publication / Degree Of Burden In Submitting A Paper.” That gave rise to Pierre’s excellent blog post and code + data on Figshare. So why are we here again? 1. It’s been a couple of years. 2.

PubMed Publication Date: what is it, exactly?

September 23, 2014

File this one under “has troubled me (and others) for some years now, let’s try to resolve it.” Let’s use the excellent R/rentrez package to search PubMed for articles that were retracted in 2013. 117 articles. Now let’s fetch the records in XML format. Next question: which XML element specifies the “Date of publication” (PDAT)?
