Bioinformatics

The avalanche of publications mentioning GO

November 30, 2010 | R on Guangchuang Yu

Gene Ontology is the de facto standard for annotation of gene products. It has been widely used in biological data mining, and I believe it will play a more central role in the future. Publications mentioning GO were collected and deposited on the GO FTP site, and can be accessed (ftp://ftp.geneontology.... [Read more...]

Export R data to tex code

October 12, 2010 | Martin Scharm

We often use GNU R to work on different things and to solve various exercises. It's always a disgusting job to export, e.g., a matrix of probabilities to a LaTeX document to send to our supervisors, but Rumpel just gave me a little hint. [Read more...]

ClusterProfiles

October 12, 2010 | R on Guangchuang Yu

It is very common to cluster genes based on their expression profiles, and also very common to integrate Gene Ontology to observe the distribution of biological processes, molecular functions and cellular components for a given gene list. But what about the two in combination? The Gene Ontology distributions across a ... [Read more...]

BioStar users (of the world, unite)

October 9, 2010 | nsaunders

Egon writes: Can someone please plot the BioStar users on a Google Map? Sounds like a challenge. Let’s go. 1. Harvesting user IP addresses BioStar user profiles (here’s mine) include a location field. It’s free text and optional, which means that location is missing or inaccurate for many ...
[Read more...]

GEO database: curation lagging behind submission?

August 30, 2010 | nsaunders

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the ... [Read more...]

Analysing the ISMB 2010 meeting using R

July 20, 2010 | nsaunders

The colossus of bioinformatics meetings, ISMB, convened in Boston this year from July 9 – 13. As in recent years, the meeting was covered online at its website, FriendFeed and Twitter. I thought it would be fun to run a quick analysis of activity at the FriendFeed room using R. 1. Fetch the data ...
[Read more...]

biomaRt and GenomeGraphs: a worked example

June 6, 2010 | nsaunders

As promised a few posts ago, another demonstration of the excellent biomaRt package, this time in conjunction with GenomeGraphs. Here’s what we’re going to do: grab some public microarray data; normalise and get a list of the most differentially-expressed probesets; use biomaRt to fetch the genes associated with ... [Read more...]

Top 10 Algorithms in Data Mining

April 23, 2010 | Stephen Turner

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public... [Read more...]

A new twist on the identifier mapping problem

January 11, 2010 | nsaunders

Yesterday, Deepak wrote about BridgeDB, a software package to deal with the “identifier mapping problem”. Put simply, biologists can name a biological entity in any way that they like, leading to multiple names for the same object. Easily solved, you might think, by choosing one identifier and sticking to it, ... [Read more...]

RSRuby in the IRB console

August 6, 2009 | nsaunders

R is terrific, of course, for all your statistical needs. But those data structures! “Everything is a list.” Leading to such wondrous ways to access variables as “p
[Read more...]

Select operations on R data frames

July 26, 2009 | Chris

The R language is weird - particularly for those coming from a typical programmer's background, which likely includes OO languages in the curly-brace family and relational databases using SQL. A key data structure in R, the data.frame, is used somethin...
[Read more...]

R String processing

July 2, 2009 | Chris

Here's a little vignette of data munging using the regular expression facilities of R (aka the R-project for statistical computing). Let's say I have a vector of strings that looks like this: coords [1] "chromosome+:157470-158370" "chromosome+:1583...
[Read more...]

An R Wiki

April 21, 2008 | nsaunders

It’s been ages since I visited the R website, so I don’t know how long they’ve had a wiki. It’s built using DokuWiki, one of my personal favourites. This is a great leap forward for R documentation, which is somewhat notorious for being (a) difficult to ... [Read more...]