Blog Archives

Reconstructing Principal Component Analysis Matrix

April 5, 2013

PCA is a widely used method for finding patterns in high-dimensional data. Whether you use it to compress a large matrix or to remove one of the principal components in biological datasets, you’ll end up with the task of performing a series of …
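
For readers who want a feel for the idea before opening the post, here is a minimal sketch of one common way to rebuild a matrix from its principal components with base R's `prcomp`. The toy matrix `X` and the choice to drop the first component are assumptions made for illustration, and the post itself may proceed differently.

```r
# Toy data: 10 observations of 5 variables (illustrative only)
set.seed(1)
X <- matrix(rnorm(50), nrow = 10, ncol = 5)

# PCA on column-centered, unscaled data
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Full reconstruction: scores times transposed loadings, then add the column means back
X_full <- pca$x %*% t(pca$rotation)
X_full <- sweep(X_full, 2, pca$center, "+")

# Reconstruction without the first principal component
keep <- -1
X_reduced <- pca$x[, keep, drop = FALSE] %*% t(pca$rotation[, keep, drop = FALSE])
X_reduced <- sweep(X_reduced, 2, pca$center, "+")
```

Using all components recovers `X` up to floating-point error, while `X_reduced` has the variation along the first component removed.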

ggplot2 multiple boxplots with metadata

January 26, 2013

Recently I was asked for advice on how to plot values with an additional attached condition separating the boxplots. This turns out to be ugly in base graphics, but amazingly simple in ggplot2.

Controlling heatmap colors with ggplot2

November 22, 2012

One of the most popular posts on this blog is the very first one, solving the issue of mapping certain ranges of values to particular colors in heatmaps. Given the abundance of ggplot2 usage in R plotting, I thought I’d …

It Takes 2 Lines of R Code to Discover Interesting Biology

October 23, 2012

The following biological phenomenon demonstrates just how elegant R code can be. In vertebrate genomes, a methyl group (-CH3) can be added to nucleotides. This process, called methylation, is commonly associated with gene suppression. Most of the cytosines in the …

ChIP-seq Analysis with Bioconductor

October 22, 2012

Scientists are often interested in finding the genome-wide binding sites of their protein of interest. R offers an easy way to load and process the sequence files coming from a ChIP-seq experiment. Over the next few weeks I’m going to present a pipeline that …

discrimination between CpG islands and random sequences using Markov chains

February 8, 2012

A major part of modern research is trying to find patterns in a given dataset using learning methods. One method that can use a priori information for this purpose is Markov chains, in which the probability of symbol occurrence …

quick tips: within function assignment and specific object removal

January 31, 2012

If you’re familiar with the faster ways of iterating over objects, such as lapply, sapply, or apply for matrices, you might be surprised that the function call saves new assignments only locally. One of my favorite lines in R comes from the …
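
The local-assignment behavior is easy to see in a few lines. The sketch below uses the superassignment operator `<<-`, which is one standard way in R to modify a variable in an enclosing environment. Whether the post relies on `<<-`, `assign`, or something else is not visible from the excerpt, so treat this as an illustration; the variable names are made up.

```r
# Plain assignment inside the function only changes a local copy
total_local <- 0
invisible(lapply(1:3, function(i) total_local <- total_local + i))
# total_local is still 0 here

# Superassignment updates the variable in the enclosing environment
total_outer <- 0
invisible(lapply(1:3, function(i) total_outer <<- total_outer + i))
# total_outer is now 1 + 2 + 3 = 6
```

In most cases returning values from `lapply` or `sapply` and combining them afterwards is the cleaner design; `<<-` is best kept for small scripts where the side effect is obvious.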

hmm: implementation of viterbi algorithm (Durbin, 1998) Part 2

January 29, 2012

The previous post presented the problem of a dishonest casino that occasionally uses a loaded die. The sequence of the real states is hidden, and we are trying to figure it out just by looking at the observations (symbols). If we apply our implementation …

hmm: implementation of viterbi algorithm (Durbin, 1998) Part 1

January 27, 2012

The example in the mentioned book goes as follows: a dishonest casino uses two types of dice, a fair one that has an equal probability (1/6) of landing on each side, and a loaded one with a 50% chance of rolling a 6. Your task …
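
The two dice can be written down as an emission probability matrix, which is the kind of input a Viterbi implementation works with. This is a minimal sketch: the excerpt only gives the 1/6 and 50% figures, so spreading the remaining 50% evenly over faces 1 to 5 (0.1 each) is an assumption here, as are the object names.

```r
symbols <- 1:6

# Rows are hidden states, columns are observable die faces
emission <- rbind(
  Fair   = rep(1 / 6, 6),
  Loaded = c(rep(0.1, 5), 0.5)  # assumed: the other five faces share the remaining 50%
)
colnames(emission) <- symbols

# Each row is a probability distribution and must sum to 1
row_totals <- rowSums(emission)

# Simulate ten rolls of the loaded die
set.seed(42)
loaded_rolls <- sample(symbols, 10, replace = TRUE, prob = emission["Loaded", ])
```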

heatmaps: controlling the color representation with set data range

January 27, 2012

Often you want to set fixed colors for a particular range of your dataset to be sure that the visual output is correctly represented. This is particularly useful for time series data, where the range of your dataset might drastically …
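
The core of the idea, fixing which value range maps to which color regardless of the data at hand, can be sketched with base R's `cut`. The break points, palette, and sample values below are invented for illustration and are not taken from the post.

```r
# Fixed break points: the same value always lands in the same color bin
breaks  <- c(-Inf, -1, 1, Inf)
palette <- c("blue", "white", "red")

values <- c(-3.2, -0.4, 0.1, 2.7, 8.5)

# cut() returns the bin index for each value when labels = FALSE
bins <- cut(values, breaks = breaks, labels = FALSE)
cell_colors <- palette[bins]
```

Because the breaks do not depend on the data, an outlier such as 8.5 cannot stretch the color scale and shift how the other values are drawn.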
