Anaerobic Stress in Seeds – A Chemical Similarity Network Story

December 31, 2012

The chemical similarity network (CSN) is a great tool for organizing biological data based on known biochemistry or chemical structural similarity. Here is an example CSN for visualizing metabolomic changes (measured via GC/TOF) due to anaerobic stress in germinating seeds. In this network, edges are formed for chemical similarity scores > 75. Node color describes

Read more »

Getting Genetics Done 2012 In Review

December 31, 2012

Here are links to all of this year's posts (excluding seminar/webinar announcements), with the most visited posts in bold italic. As always, you can follow me on Twitter for more frequent updates. Happy new year! New Year's Resolution: Learn How to Code...

Read more »

The forward (explicit) Euler method

December 31, 2012

The forward (explicit) Euler method is a first-order numerical procedure for solving ODEs with a given initial value. It is said to be the simplest and most obvious numerical ODE integrator. In fact, the simulation using the forward Euler only … Continue reading →

Read more »
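As a rough illustration of the method named in the teaser above, here is a minimal sketch of the explicit Euler update y_{n+1} = y_n + h f(t_n, y_n). The function name `forward_euler` and the test problem y' = -y are assumptions chosen for illustration; they are not taken from the linked post.

```python
import math


def forward_euler(f, t0, y0, h, n_steps):
    """Integrate y' = f(t, y) from (t0, y0) with n_steps explicit Euler steps of size h."""
    t, y = t0, y0
    trajectory = [(t, y)]
    for _ in range(n_steps):
        # Explicit update: the slope is evaluated at the current point only.
        y = y + h * f(t, y)
        t = t + h
        trajectory.append((t, y))
    return trajectory


# Hypothetical test problem: y' = -y with y(0) = 1, whose exact solution is exp(-t).
approx = forward_euler(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
t_end, y_end = approx[-1]
error = abs(y_end - math.exp(-t_end))

# Halving the step size should roughly halve the error for a first-order method.
finer = forward_euler(lambda t, y: -y, 0.0, 1.0, 0.005, 200)
error_finer = abs(finer[-1][1] - math.exp(-finer[-1][0]))
```

Comparing `error` with `error_finer` is a quick way to see the first-order behaviour: the global error shrinks roughly in proportion to the step size.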

Nested loops with mapply

December 31, 2012

So as I sink deeper into the second level of R enlightenment, one thing troubled me. “lapply” is fine for looping over a single vector of elements, but it doesn’t do a nested loop structure. These tend to be pretty ubiquitous for me. I’m forever doing the same thing to a set of two or three

Read more »

Top Posts of 2012

December 31, 2012

This has been a great year for my blog. I've seen tremendous growth in my subscribers. I look forward to engaging with and learning from my followers in 2013 and I plan to offer valuable content in return. If you're interested in following along, you can quickly subscribe via RSS or e-mail. I use Google I encourage you...

Read more »

How odd was the UEFA draw?

December 31, 2012

I've been away for some time without closely following the media, and without significant internet access. When such a period is over it takes some time to regain momentum. Thus my short exit poll series will be continued in 2013. For now I'm still sor...

Read more »

Learning RStudio for R Statistical Computing

December 31, 2012

I am happy to announce that our book on RStudio was released last week.

Read more »

STL random_sample

December 31, 2012

An earlier post looked at random shuffle for permutations. The STL also supports creation of random samples. Alas, it seems that this functionality has not been promoted to the C++ standard yet, so we will have to make do with what is an extension ...

Read more »

Software engineer’s guide to getting started with data science

December 30, 2012

Many of my software engineer friends ask me about learning data science. There are many articles on this subject from renowned data scientists (Dataspora, Gigaom, Quora, Hilary Mason). This post captures my journey (a software engin...

Read more »

Modeling in R with Log Likelihood Function

December 30, 2012

Similar to the NLMIXED procedure in SAS, optim() in R provides the functionality to estimate a model by specifying the log likelihood function explicitly. Below is a demo showing how to estimate a Poisson model with optim() and how it compares with the glm() result.

Read more »
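The idea in the teaser above, fitting a Poisson model by writing out its log likelihood, can be sketched without R as follows. This is only an illustration under stated assumptions: the toy data are hypothetical, and a hand-written Newton iteration stands in for optim(), so it is not the linked post's code.

```python
import math

# Hypothetical toy data, not taken from the post.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]


def log_likelihood(b0, b1):
    """Poisson log likelihood with a log link: log(mu_i) = b0 + b1 * x_i."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        total += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return total


def fit_poisson(n_iter=25):
    """Maximise the log likelihood with Newton's method (a stand-in for optim())."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2x2 information matrix [[a, b], [b, c]].
        a = sum(mu)
        b = sum(mi * xi for xi, mi in zip(x, mu))
        c = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = a * c - b * b
        # Newton step: beta += inverse(information) * score.
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1


b0_hat, b1_hat = fit_poisson()
fitted = [math.exp(b0_hat + b1_hat * xi) for xi in x]
```

At the maximum, the fitted means sum to the observed total, which gives a simple sanity check on any optimiser used in place of this one.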

Tips for R Package Creation

December 30, 2012

I’m being tortured by the mistakes of my past self. I think I’ve made most every mistake possible in creating a package and I want to go back in time and tell year ago me all I know now. But … Continue reading →

Read more »

2012 in review by The WordPress.com

December 30, 2012

Here is the 2012 annual report for this blog, produced by WordPress.com. My top posts this year, based on number of views, are R-Uni (A List of Free R Tutorials and Resources in Universities webpages), February 2012, 29 comments, and R Style Guide, June 2012, 7 comments. Click here to see the complete report.

Read more »

Pearson’s r: Not a good measure of electoral persistence

December 30, 2012

Pearson’s product-moment correlation, \(r\), is an incredibly useful tool for getting some idea about how two variables are (linearly) related. But there are times when using Pearson’s \(r\) is not appropriate and, even if linearity and all other assumptions hold, … Continue reading →

Read more »

Misusage of the new shiny package: A nerdy drink tracker for your next party

December 30, 2012

Currently a lot of people are talking about the new shiny package. So I got curious and built my own more or less useful app: a drink tracker. This app can be used to track how much someone drank and is therefore very useful for every party, especial...

Read more »

RcppClassicExamples 0.1.1

December 30, 2012

Yesterday's initial upload of RcppClassicExamples was lacking a versioned Depends: to prevent builds on older versions of R. This has been added in a new upload 0.1.1. We also added a NEWS file (see below); no code changes were made. Changes in ver...

Read more »

Searching for Structure underlying Customer Satisfaction Ratings: Item Response Theory through the Back Door

December 30, 2012

Variations on a Theme of Negative Skew and Positive Manifold

No one familiar with research on customer satisfaction expects to find uncorrelated ratings or symmetric distributions centered toward the middle of the rating scale. There are forces at work that structure the means and correlations among the items from a customer satisfaction questionnaire. On the one hand, unsatisfied customers churn,...

Read more »

National identification number: Finland

December 30, 2012

The Finnish Social Security number (FSSn) is a common variable in Finnish population-based studies. The FSSn encodes the individual's birthday and gender. We can also check whether the FSSn is correct, because it has a check digit. If the data doesn't have birthday ...

Read more »
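The check described in the teaser above can be sketched as follows. This is a minimal illustration, assuming the classic DDMMYY, century sign, NNN, check character layout with only the century signs "+", "-" and "A"; the function name `parse_fssn` is hypothetical and the linked post's own code may differ.

```python
from datetime import date

# The check character is the 9-digit number DDMMYYNNN modulo 31, looked up here.
CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY"
# Century signs handled by this sketch (an assumption: newer signs are ignored).
CENTURY = {"+": 1800, "-": 1900, "A": 2000}


def parse_fssn(fssn):
    """Return (birth_date, gender) if the identifier passes the check, else None."""
    if len(fssn) != 11 or fssn[6] not in CENTURY:
        return None
    digits = fssn[:6] + fssn[7:10]
    if not digits.isdigit():
        return None
    if CHECK_CHARS[int(digits) % 31] != fssn[10]:
        return None
    day, month, year = int(fssn[:2]), int(fssn[2:4]), int(fssn[4:6])
    try:
        birth = date(CENTURY[fssn[6]] + year, month, day)
    except ValueError:
        # The digits do not form a real calendar date.
        return None
    # An odd individual number means male, an even one means female.
    gender = "male" if int(fssn[7:10]) % 2 == 1 else "female"
    return birth, gender


valid = parse_fssn("131052-308T")
invalid = parse_fssn("131052-308U")
```

Returning `None` for any failed check keeps the function easy to apply across a data column before deriving birthday and gender from the rows that pass.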

STL random_shuffle for permutations

December 30, 2012

The STL also contains random sampling and shuffling algorithms. We start by looking at random_shuffle. There are two forms. The first uses an internal RNG with its own seed; the second form allows for a function object conformant to the STL’s re...

Read more »

Update to Graphing Non-Proportional Hazards in R

December 30, 2012

Update 1 February 2013: I've moved all of the functionality described in this post into an R package called simtvc. Have a look. It is much easier to use. This is a quick update for a previous post on Graphing Non-Proportional Hazards in R. In the previous post I showed how to simulate and graph 1,000...

Read more »

Spirograph with R

December 30, 2012

Just had to figure out how to replicate this old toy of mine with R! I had no idea how long it's been around.

Read more »

An R wish list for 2013

December 29, 2012

First go and read An R wish list for 2012. None of the wishes came true in 2012. Fix the R website? No, it is the same this year. In fact, it is the same as in 2005. Easy to find help? Sorry, next year. Consistency and sane defaults? Coming soon to a theater near

Read more »

UEFA, is that it?

December 29, 2012

Following my previous post, a few more things. As mentioned by Frédéric, it is indeed possible to compute the probability of all pairs. More precisely, not all pairs are equally likely to occur: some teams can play against (almost) everyone, while others cannot. From the previous table, it is possible to compute the probability that the last team plays...

Read more »

Integration of R, RStudio and Hadoop in a VirtualBox Cloudera Demo VM on Mac OS X

December 29, 2012

Motivation

I was inspired by Revolution's blog and a step-by-step tutorial from Jeffrey Breen on setting up a local virtual instance of Hadoop with R. However, that tutorial describes the implementation using VMware's application. One downside to using VMware is that it's not free. I know most people, including me, like to hear the words open-source and free,...

Read more »

Row-wise summary curves in faceted ggplot2 figures

December 29, 2012

I really enjoy reading the Junk Charts blog. A recent post made me wonder how easy it would be to add summary curves for small-multiple type plots, assuming the “small multiples” to summarize were the X component of a ggplot2::facet_grid(Y ~ X) … Continue reading →

Read more »

RcppExamples 0.1.5 and RcppClassicExamples 0.1.0

December 29, 2012

The recent releases of Rcpp 0.10.2 and RcppClassic 0.9.3 had one more repercussion. On that dreaded OS, the linker no longer wanted to instantiate a symbol present in both packages; seems to me that the linker in the other two OSs is a little smarter...

Read more »

High-Dimensional Microarray Data Sets in R for Machine Learning

December 29, 2012

Much of my research in machine learning is aimed at small-sample, high-dimensional bioinformatics data sets. For instance, here is a paper of mine on the topic. A large number of papers proposing new machine-learning methods that target high-dimensional data use the same two data sets and consider few others. These data sets are the 1) Alon colon cancer...

Read more »

Speed skating 10 km

December 29, 2012

It is winter, which makes it time for one of the Netherlands' beloved sports: speed skating. Speed skating is done over various distances, but for me, the most beautiful is the 10 km. The top men do this in about 13 minutes. In this post I try to u...

Read more »

Men who stare at needles

December 29, 2012

Buffon's needle problem is a question first posed in the 18th century by Georges-Louis Leclerc, Comte de Buffon: What is the probability that a needle thrown at a lined sheet of paper will cross a line? This problem can be used to estimate π. If we set the needle length and the line distance = 1, the estimator can be calculated...

Read more »
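For the setting in the teaser above (needle length and line spacing both equal to 1), the crossing probability is 2/π, so π can be estimated as 2n divided by the number of crossings. The following is a minimal Monte Carlo sketch under that assumption; the function name `estimate_pi` and the fixed seed are illustrative choices, and the sketch uses `math.pi` to sample the angle, which is the usual shortcut in simulations of this kind.

```python
import math
import random


def estimate_pi(n_throws, seed=0):
    """Estimate pi with Buffon's needle for needle length = line spacing = 1."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_throws):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines.
        angle = rng.uniform(0.0, math.pi / 2)
        # The needle crosses a line when its half-projection reaches that line.
        if centre <= 0.5 * math.sin(angle):
            crossings += 1
    # P(cross) = 2 / pi, so pi is approximately 2 * n / crossings.
    return 2.0 * n_throws / crossings


pi_hat = estimate_pi(200_000)
```

The estimate converges slowly: the standard error shrinks only with the square root of the number of throws, so several hundred thousand throws are needed for two reliable decimal places.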

STL transform + remove_copy for subsetting

December 29, 2012

We have seen the use of the STL transform functions in the posts STL transform and Transforming a matrix. We use the same logic in conjunction with a logical (i.e. boolean) vector in order to subset an initial vector. #include <Rcpp.h> using namespace...

Read more »
