Blog Archives

Construct a unique index from two integers (Pairing Function)

April 3, 2013

Recently, I needed to construct a unique index from two integers. The best solution I found is the pairing function. A pairing function is a one-to-one and onto function that maps two integers to a single integer. The definition is as follows: If the ordering of x and y is not important, we can swap x ...

Read more »
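
The excerpt does not show which pairing function the post uses. As a minimal sketch, assuming the Cantor pairing function on non-negative integers (the names cantor_pair and cantor_pair_unordered are hypothetical, not the author's):

```r
# Cantor pairing: maps a pair of non-negative integers to a single index.
# Assumption: inputs are non-negative and small enough for exact double arithmetic.
cantor_pair <- function(x, y) {
  stopifnot(all(x >= 0), all(y >= 0))
  (x + y) * (x + y + 1) / 2 + y
}

# If the order of x and y does not matter, sort the pair first so that
# (x, y) and (y, x) receive the same index.
cantor_pair_unordered <- function(x, y) {
  cantor_pair(pmin(x, y), pmax(x, y))
}

cantor_pair(2, 3)            # 18
cantor_pair(3, 2)            # 17
cantor_pair_unordered(3, 2)  # 18
```

Both functions are vectorized, so they can be applied directly to two columns of a data.frame.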

A handy concatenation operator

February 12, 2013

It may be useful to define a concatenation operator for character strings. Sometimes I find this more intuitive and handy than using paste0 or paste. It also makes your code look better when you have nested paste calls, e.g. paste0("Y~", paste0("z", 1:3, "*x", 1:3, collapse="+")). The drawback is that it may reduce the readability of your code to other ...

Read more »
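
The excerpt does not show how the operator is named or defined. One possible sketch, assuming a user-defined infix operator called %+% (a hypothetical name) that simply wraps paste0:

```r
# Hypothetical infix operator: concatenates its two operands with paste0.
`%+%` <- function(a, b) paste0(a, b)

# The nested paste0 call from the text, with the outer call replaced.
rhs <- paste0("z", 1:3, "*x", 1:3, collapse = "+")
"Y~" %+% rhs   # "Y~z1*x1+z2*x2+z3*x3"
```

Because paste0 is vectorized, the operator also works elementwise on character vectors.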

Compute the self excluded sample mean by group

February 12, 2013

egen (a Stata command) computes a summary statistic by group and stores it in a new variable. For example, suppose the data has three variables, id, time and y, and we want to compute the mean of y for each id and store it as a new variable mean_y. In Stata, the command would be In ...

Read more »
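
The post's own code is not part of this excerpt. As a rough sketch of what a self-excluded (leave-one-out) group mean could look like in base R, assuming a data.frame with the columns id and y described above (the example data and the column name mean_y_excl are made up for illustration):

```r
# Illustrative data: two groups identified by id.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  time = c(1, 2, 3, 1, 2),
  y    = c(1, 2, 3, 10, 20)
)

# For each row, average y over the other rows in the same group:
# (group sum - own value) / (group size - 1).
# Groups with a single row yield NaN, since there is nothing left to average.
df$mean_y_excl <- ave(df$y, df$id,
                      FUN = function(v) (sum(v) - v) / (length(v) - 1))

df$mean_y_excl   # 2.5 2.0 1.5 20.0 10.0
```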

How to do egen (Stata cmd) in R

February 12, 2013

egen (a Stata command) computes a summary statistic by group and stores it in a new variable. For example, suppose the data has three variables, id, time and y, and we want to compute the mean of y for each id and store it as a new variable mean_y. In Stata, the command would be In ...

Read more »
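
The R code from the post is not included in this excerpt. One way to get the same effect in base R, offered as a sketch rather than the author's method, is ave(), which returns a group summary aligned with the original rows (the example data is made up):

```r
# Illustrative data with the three variables named in the text.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  time = c(1, 2, 3, 1, 2),
  y    = c(1, 2, 3, 10, 20)
)

# ave() applies FUN (mean by default) within each id and repeats the
# result on every row of that group, so it can be stored as a new column.
df$mean_y <- ave(df$y, df$id, FUN = mean)

df$mean_y   # 2 2 2 15 15
```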

Generating lag/lead variables

March 11, 2012

A few days ago, a friend asked me whether there is any function in R to generate lag/lead variables in a data.frame, or to do something similar to _n in Stata. He wanted to use it to clean up his dataset in R. From the Stata help manual: _n contains the number of the current observation. Here’s an ...

Read more »
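
The example from the post is cut off here. A minimal base R sketch of the idea, assuming a data.frame that is already sorted in the desired order (the data and column names are illustrative, not the author's):

```r
# Illustrative data, already in observation order.
df <- data.frame(id = 1:5, y = c(10, 20, 30, 40, 50))

# Observation number, analogous to Stata's _n.
df$n <- seq_len(nrow(df))

# Lag by one: shift values down and pad the first row with NA.
df$y_lag1 <- c(NA, head(df$y, -1))

# Lead by one: shift values up and pad the last row with NA.
df$y_lead1 <- c(tail(df$y, -1), NA)

df
```

For panel data, the same shifts would need to be applied within each group, for example through ave().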

Overhead cost of a function call

October 1, 2011

Recently, I wanted to apply unit testing to my R programs. The first thing I need to do is chop every few lines of code into functions so that I can test each of them. A question came to mind: what is the overhead cost of a function call? To answer this ...

Read more »
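
How the post measures the overhead is not visible in this excerpt. One simple way to get a rough feel for it, assuming a loop that does the same work inline and through a small helper function (all names here are hypothetical), is to compare timings with system.time():

```r
# Same computation twice: once inline, once via a function call per iteration.
sum_inline <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

add_one_term <- function(s, i) s + i

sum_with_call <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- add_one_term(s, i)
  s
}

# Timings vary by machine and R version, so no expected numbers are shown.
n <- 1e6
system.time(sum_inline(n))
system.time(sum_with_call(n))
```

The difference between the two timings, divided by n, gives a crude per-call estimate; repeating the measurement several times makes it less noisy.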

Call by reference in R

September 11, 2011

Sometimes it is convenient to use “call by reference” evaluation inside an R function. For example, if you want your function to have multiple return values, you can either return a list of values and split them afterward, or return the values via the arguments. For some reasons (I would like to ...

Read more »
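
The technique used in the post is not shown in this excerpt. As one possible sketch, environments are the standard R objects with reference semantics, so a function can write several results into an environment passed as an argument (the names result and fit_summary are hypothetical):

```r
# Writes two results into the environment 'out' instead of returning a list.
# Environments are not copied when passed, so the caller sees the changes.
fit_summary <- function(x, out) {
  out$mean <- mean(x)
  out$sd   <- sd(x)
  invisible(NULL)
}

result <- new.env()
fit_summary(c(2, 4, 6), result)

result$mean   # 4
result$sd     # 2
```

Ordinary vectors and lists passed the same way would be copied on modification, so the caller's objects would remain unchanged.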

A shortcut function for install.packages() and library()

September 10, 2011

I enjoy trying different kinds of R packages. Since I have more than one computer (one at home, one at the office and a laptop), it is troublesome to check whether I have installed a new package on each computer. Therefore I wrote a function to install and load packages at once. If the package does ...

Read more »
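
The author's function is not included in this excerpt. A minimal sketch of the idea, assuming a helper called use_package (a hypothetical name) that installs a package only when it is missing and then attaches it:

```r
# Install the package if it is not available, then attach it.
# Assumption: a CRAN mirror is configured (e.g. via options(repos = ...)),
# otherwise install.packages() may prompt for one or fail when non-interactive.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# 'stats' ships with R, so this call only attaches it.
use_package("stats")
```

To handle several packages at once, the function can be applied over a character vector with lapply() or a for loop.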

A quick way to do row repeat and col repeat (rep.row, rep.col)

September 2, 2011

Today I worked on a simulation program which required me to create a matrix by repeating a vector n times (both by row and by column). Even though the task is extremely simple and takes only one line to finish (10 seconds), I have to think about whether the argument in rep should be each or times and should ...

Read more »
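
The function names rep.row and rep.col come from the post title, but their definitions are not part of this excerpt. One possible way to write them, as a sketch that may differ from the author's version:

```r
# Repeat vector x as n identical rows: rep(each = n) groups equal values
# together, and matrix() fills column by column.
rep.row <- function(x, n) {
  matrix(rep(x, each = n), nrow = n)
}

# Repeat vector x as n identical columns: rep(times = n) repeats the whole
# vector, and each repetition fills one column.
rep.col <- function(x, n) {
  matrix(rep(x, times = n), ncol = n)
}

rep.row(1:3, 2)   # 2 x 3 matrix, every row is 1 2 3
rep.col(1:3, 2)   # 3 x 2 matrix, every column is 1 2 3
```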