Blog Archives

Estimating a Beta Regression with The Variable Dispersion in R

October 19, 2014

Fitting Lasso with Julia

October 7, 2014

Julia code and R code.

By-Group Aggregation in Parallel

October 4, 2014

Like row search, by-group aggregation is a natural use case for demonstrating the power of split-and-conquer with parallelism. The example below shows that a homebrew by-group aggregation built on the foreach package, albeit inefficiently coded, is still a lot faster than the summarize() function in the Hmisc package.

Vector Search vs. Binary Search

October 1, 2014

Row Search in Parallel

September 28, 2014

I’ve always wondered whether row search can be made more efficient by splitting the whole data.frame into chunks and then searching within each chunk in parallel. The R code below compares the standard row search with the parallel row search using the FOREACH …

Chain Operations: An Interesting Feature in dplyr Package

July 28, 2014

Efficiency of Importing Large CSV Files in R

February 10, 2014

Julia and SQLite

February 8, 2014

Like R and pandas in Python, Julia provides a simple yet efficient interface to SQLite databases. In addition, the sqldf() function in the SQLite package, which works almost identically to the sqldf package in R, is extremely handy for data munging.

Simplex Model in R

February 2, 2014

R code, R output, and SAS code and output for comparison.

rPython – R Interface to Python

October 13, 2013