When I plotted the PCA results (e.g. a scatter plot of PC1 vs. PC2) and was about to annotate the dataset with different covariates (e.g. gender, diagnosis, and ethnic group), I noticed that it's not straightforward to annotate >2 covariates at the same time using ggplot. Here is what works ... [Read more...]
Most people know the KEGG pathway database, but not everyone knows that it costs at least $2000 to subscribe to it. If you want to save some of that cost, you can manually download the KEGG pathway KGML files and install them in SPIA. Here I have a workaround to dow... [Read more...]
Here are two tips I can share if you are also working on a big dataset and want a high-quality heatmap: 1. Don't generate a PDF using pheatmap() or heatmap.2(), as (i) the file is unnecessarily large if you have a lot of data points in the heatmap, so that you ... [Read more...]
Tips below are based on the lessons I learnt from making mistakes during my years of research. It's purely personal opinion. Order doesn't mean anything. If you think I should include something else, please comment below.
Always set a seed number when you run tools with random option, e.g. ...
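As a minimal sketch of the seed tip, assuming base R's random number generator (the seed value 42 and the variable names are arbitrary choices for illustration), resetting the same seed before a random call reproduces the same result:

```r
# Fix the RNG state so the random draw below is reproducible.
set.seed(42)
first_draw <- sample(1:100, 5)

# Resetting the same seed reproduces exactly the same draw.
set.seed(42)
second_draw <- sample(1:100, 5)

# TRUE: both draws are the same because they started from the same seed.
identical(first_draw, second_draw)
```

External tools usually expose their own seed option, so the same idea applies there, but the exact flag depends on the tool.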
This Wednesday’s Powerball grand prize has already climbed to $1.5 BILLION. If you choose to cash out, it would be $930 million. And it keeps increasing… So, what are the odds of winning the jackpot? Here is the game rule according to Powerball.com: …we draw five white balls out ... [Read more...]
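The jackpot odds can be sketched as a counting problem. The teaser cuts off before the full rule, so the numbers here are an assumption based on the Powerball format in use at the time: five white balls drawn from 69 (order does not matter) plus one red Powerball drawn from 26.

```r
# Assumption (not stated in the truncated rule above):
# 5 white balls out of 69, and 1 red Powerball out of 26.
white_combos <- choose(69, 5)          # unordered ways to pick the white balls
jackpot_combos <- white_combos * 26    # times the choices for the red ball

# Chance that a single ticket matches all six numbers.
jackpot_prob <- 1 / jackpot_combos

jackpot_combos
```

Under these assumptions there are 292,201,338 equally likely outcomes, so a single ticket wins the jackpot with probability of about 1 in 292 million.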
There's no shame in writing a note on something that (probably) everyone knows and that you thought you knew but are actually not 100% sure about. Multiple testing is such a piece in my knowledge map. Some terms first: - Type I error (false positive) and Type II error (false negative): When ... [Read more...]
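To make the multiple-testing idea concrete, here is a small sketch using base R's p.adjust(). The p-values are made up for illustration, and picking Bonferroni and Benjamini-Hochberg as the two corrections to compare is my assumption, not necessarily what the full post covers.

```r
# Hypothetical raw p-values from five independent tests.
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.2)

# Bonferroni controls the family-wise error rate: each p is multiplied
# by the number of tests and capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate and is less strict.
p_bh <- p.adjust(p_raw, method = "BH")

data.frame(p_raw, p_bonf, p_bh)
```

The BH-adjusted values are never larger than the Bonferroni ones, which is why FDR control usually keeps more hits at the same threshold.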
Sometimes we want to make our own heatmap using the image() function. I recently found it tricky to set the color option there, as its manual has very little information on col: a list of colors such as that generated by rainbow, heat.colors,... [Read more...]
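One way to pass colors to image() is to build the palette yourself with colorRampPalette(). This is only a sketch under my own assumptions (the toy matrix, the blue-white-red palette, and the number of colors are all arbitrary), not necessarily the approach the full post describes.

```r
# Toy 3 x 4 matrix standing in for real data.
m <- matrix(1:12, nrow = 3)

# Build 12 colors interpolated from blue through white to red.
pal <- colorRampPalette(c("blue", "white", "red"))(12)

# image() splits the value range into length(col) equal-width bins
# and assigns one color to each bin.
image(m, col = pal)
```

For explicit control over which values map to which color, image() also accepts a breaks argument, which must contain one more value than there are colors.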
Nowadays everyone is talking about big data. As a genomic scientist, I am hungry for a collection of tools more specialized for the medium-to-big data we deal with every day. Here are some tips I found useful when getting, processing or visualizing la... [Read more...]
Typically, I log into my remote server/cluster via "ssh -X" and launch R from there for plotting. But after a while, calling functions such as plot() always fails with the error: unable to open connection to X11 display ''. This is very annoying. So that ... [Read more...]
This continues the topic of using the melt/cast functions in reshape to convert a data frame between long and wide format. Here is the example I found helpful in generating the covariate table required for PEER (or Matrix_eQTL) analysis: Here ... [Read more...]
I was wondering how to draw a Venn-diagram-like pie chart in R, to show the distribution of my RNA-seq reads mapped onto different annotation regions (e.g. intergenic, intron, exons etc.). A Google search returns several options, including the nice one from Xiaopeng's bam2x (see below). However, ... [Read more...]
I found this elegant note about reshape2 on Sean Anderson's blog: http://seananderson.ca/2013/10/19/reshape.html Basically, reshape2 is based around two key functions, melt and cast: melt takes wide-format data and melts it into long-format data. cast tak... [Read more...]
Google Docs is a good way to share and manage documents with your colleagues, but sometimes you want to access the data directly in a terminal (e.g. bash) or in a program (e.g. R), without downloading it first. For example, I have a Google Spre... [Read more...]
If it's an internal function of R (e.g. from the base package), just type the function name, like: > rowMeans function (x, na.rm = FALSE, dims = 1L) { if (is.data.frame(x)) x ... [Read more...]
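As a small sketch of the same idea in script form (the variable names are mine), the source that gets printed when you type the name can also be captured as text, which is handy for searching it:

```r
# Typing the bare name at the prompt prints the function's R source.
rowMeans

# deparse() returns that source as a character vector, one element per line.
src_lines <- deparse(rowMeans)

# args() shows just the signature without the body.
args(rowMeans)

# Search the source for a specific call, e.g. the data frame check.
uses_df_check <- any(grepl("is.data.frame", src_lines, fixed = TRUE))
```

This only shows R-level code; functions whose body is a call into compiled code will not reveal their implementation this way.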