Blog Archives

Friday fun projects

May 14, 2011

What’s a “Friday fun project”? It’s a small computing project, perfect for a Friday afternoon, which serves the dual purpose of (1) keeping your programming/data analysis skills sharp and (2) providing a mental break from the grind of your day job. Ideally, the skills learned on the project are useful and transferable to your work

Read more »

R 2.12 to 2.13 package upgrade

April 14, 2011

If you:

- use Linux
- have just upgraded your R installation from 2.12 to 2.13
- installed some/all of your packages in your home area (e.g. ~/R/i486-pc-linux-gnu-library/2.12)

and are wondering why R can’t see them any more, just do this:

    # at a shell prompt (-r is needed because the source is a directory)
    cp -r ~/R/i486-pc-linux-gnu-library/2.12 ~/R/i486-pc-linux-gnu-library/2.13
    # in R console
    update.packages(checkBuilt=TRUE, ask=FALSE)
    # back to
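The copy step above can be wrapped in a small shell function. This is only a sketch: `copy_r_library` is a hypothetical helper name, not something R provides, and it assumes the versioned per-user library layout shown in the example path. It copies the contents of the old library rather than the directory itself, so it behaves the same whether or not the 2.13 directory already exists. Packages of the same name already in the new library would be overwritten.

```shell
#!/bin/sh
# Hypothetical helper: copy user-installed R packages from an old
# versioned library directory into a new one.
copy_r_library() {
    old_lib=$1
    new_lib=$2

    # Refuse to continue if the old library is missing.
    if [ ! -d "$old_lib" ]; then
        echo "no such library: $old_lib" >&2
        return 1
    fi

    # Create the new library if R has not created it yet.
    mkdir -p "$new_lib"

    # The trailing "/." copies the contents of old_lib, so an existing
    # new_lib does not end up with a nested copy of the old directory.
    cp -R "$old_lib"/. "$new_lib"/
}
```

It might be called as `copy_r_library ~/R/i486-pc-linux-gnu-library/2.12 ~/R/i486-pc-linux-gnu-library/2.13`, followed by `update.packages(checkBuilt=TRUE, ask=FALSE)` in the R console so that packages built under 2.12 are rebuilt.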

Read more »

Fixing aberrant files using R and the shell: a case study

April 7, 2011

Once in a while, you embark on what looks like a simple computational procedure only to encounter frustration very early on. “I can’t even read my file into R!” you cry. Step back, take a deep breath and take note of what the software is trying to tell you. Most times, you’ve just missed something

Read more »

The RStudio IDE: first impressions are positive

February 28, 2011

Integrated development environments (IDEs) are software development tools, providing an interface that enables you to write, debug, run and view the output of your code. Whether you need an IDE or find them useful depends very much on your own preferences and style of working. In my own case for example, I’ve tried both Eclipse

Read more »

Analysis of retractions in PubMed

November 30, 2010

As so often happens these days, a brief post at FriendFeed got me thinking about data analysis. Entitled “So how many retractions are there every year, anyway?”, the post links to this article at Retraction Watch. It discusses ways to estimate the number of retractions and in particular, a recent article in the Journal of

Read more »

Findings increasingly novel, scientists say…

October 29, 2010

…was the tongue-in-cheek title of an image that I posted to Twitpic this week. It shows the usage of the word “novel” in PubMed article titles over time. As someone correctly pointed out at FriendFeed, it needs to be corrected for total publications per year. It was inspired by a couple of items that caught

Read more »

BioStar users (of the world, unite)

October 9, 2010

Egon writes: Can someone please plot the BioStar users on a Google Map? Sounds like a challenge. Let’s go. 1. Harvesting user IP addresses BioStar user profiles (here’s mine) include a location field. It’s free text and optional, which means that location is missing or inaccurate for many users. However, if you’re logged into BioStar

Read more »

Connecting to a MongoDB database from R using Java

September 24, 2010

It would be nice if there were an R package, along the lines of RMySQL, for MongoDB. For now there is not – so, how best to get data from a MongoDB database into R? One option is to retrieve JSON via the MongoDB REST interface and parse it using the rjson package. Assuming, for

Read more »

GEO database: curation lagging behind submission?

August 30, 2010

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the issue, using the Bioconductor GEOmetadb

Read more »

Abstract word clouds using R

August 23, 2010

A recent question over at BioStar asked whether abstracts returned from a PubMed search could easily be visualised as “word clouds”, using Wordle. This got me thinking about ways to solve the problem using R. Here’s my first attempt, which demonstrates some functions from the RCurl and XML packages. update: corrected a couple of copy/paste

Read more »