Posts Tagged ‘Bioinformatics’

ClusterProfiles

October 12, 2010

It is very common to cluster genes based on their expression profiles, and also very common to use Gene Ontology to examine the distribution of biological processes, molecular functions and cellular components for a given gene list. But what if the two are combined? The Gene Ontology distributions across a variety of gene clusters may give us a new...

BioStar users (of the world, unite)

October 9, 2010

Egon writes: “Can someone please plot the BioStar users on a Google Map?” Sounds like a challenge. Let’s go. 1. Harvesting user IP addresses: BioStar user profiles (here’s mine) include a location field. It’s free text and optional, which means that location is missing or inaccurate for many users. However, if you’re logged into BioStar

GEO database: curation lagging behind submission?

August 30, 2010

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the issue, using the Bioconductor GEOmetadb

Analysing the ISMB 2010 meeting using R

July 20, 2010

The colossus of bioinformatics meetings, ISMB, convened in Boston this year from July 9 – 13. As in recent years, the meeting was covered online at its website, FriendFeed and Twitter. I thought it would be fun to run a quick analysis of activity at the FriendFeed room using R. 1. Fetch the data We

biomaRt and GenomeGraphs: a worked example

June 6, 2010

As promised a few posts ago, another demonstration of the excellent biomaRt package, this time in conjunction with GenomeGraphs. Here’s what we’re going to do: grab some public microarray data; normalise and get a list of the most differentially-expressed probesets; use biomaRt to fetch the genes associated with those probesets; plot the data using GenomeGraphs

Top 10 Algorithms in Data Mining

April 23, 2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative public...

A new twist on the identifier mapping problem

January 11, 2010
By
A new twist on the identifier mapping problem

Yesterday, Deepak wrote about BridgeDB, a software package to deal with the “identifier mapping problem”. Put simply, biologists can name a biological entity in any way that they like, leading to multiple names for the same object. Easily solved, you might think, by choosing one identifier and sticking to it, but that’s apparently way too

Samples per series/dataset in the NCBI GEO database

January 7, 2010

Andrew asks: I want to get an NCBI GEO report showing the number of samples per series or data set. Short of downloading all of GEO, anyone know how to do this? Is there a table of just metadata hidden somewhere? At work, we joke that GEO is the only database where data goes in,

RSRuby in the IRB console

August 6, 2009

R is terrific, of course, for all your statistical needs. But those data structures! “Everything is a list.” Leading to such wondrous ways to access variables as “p <- Meta(gds)$platform”, or “last <- mylist]])]”. Sometimes, you want something more familiar. An array, a hash, a hash of arrays. Or, you may
