# Going over the speed limit

April 17, 2011

(This article was first published on eKonometrics, and kindly contributed to R-bloggers)

In an earlier post [Speeding tickets for R and Stata], I reported on how R compared with Stata when executing algorithms that involve maximum likelihood estimation. This post offers the following updates to that post:

• Stata is in fact even faster than previously reported.
• The 64-bit version of the newly released R 2.13.0 reports faster times than the 32-bit version of R 2.12.0.
• Limdep/NLogit, a popular econometrics software amongst discrete choice modellers, reported slower execution times than R 2.13 (64-bit version).
• Advice on speed from experienced R users.

My data set (used for the test results reported below) comprised an ordinal dependent variable [5 categories] and categorical explanatory variables, with 63,122 observations. I used a computer running Windows 7 Professional on an Intel Core 2 Quad CPU Q9300 @ 2.5 GHz with 8 GB of RAM. Further details about the tests are listed in the following table.

| Software Routines | Stata 11 (duo core) | R (2.12.0) [32-bit] | R x64 2.13.0 | NLogit/Limdep |
|---|---|---|---|---|
| Commercial license price | US$2,495 | Free | Free | $1,395 |
| Multinomial Logit | mlogit, 9.06 sec (2.89 sec with the “quietly” option) | multinom, 50.59 sec + 52.29 sec; zelig (mlogit), 77.89 sec; VGLM (multinomial), 64.4 sec | multinom, 32.7 sec + 49.8 sec; zelig (mlogit), 69.92 sec; VGLM (multinomial), 63.76 sec | Logit; 36.72 sec |
| Proportional odds model | ologit, 1.69 sec (0.91 sec [quietly]); oprobit, 0.91 sec [quietly] | VGLM (parallel = T), 16.26 sec; polr, 22.62 sec [o.logit] | VGLM (parallel = T), 14.94 sec; polr, 13.49 sec [o.logit]; polr, 14.94 sec [o.probit] | Ordered [Logit] 18.50 sec; Ordered [Probit] 36.33 sec |
| Generalized Logit | gologit2, 18.67 sec (15.1 sec with the “quietly” option) | VGLM (parallel = F), 64.71 sec | VGLM (parallel = F), 64.86 sec | |
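For readers who want to try this kind of timing themselves, here is a minimal sketch of how a fit can be timed in R with system.time(). It is not the script behind the numbers above: the data are simulated stand-ins with a 5-category ordinal outcome and one categorical predictor, the sample size is much smaller than 63,122, and the names d, y, and x are hypothetical.

```r
library(MASS)  # provides polr(); MASS is a recommended package shipped with R

set.seed(42)
n <- 5000  # assumption: far fewer rows than the 63,122 used in the post

# Simulated stand-in data: a categorical predictor and a latent logistic outcome
x <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
latent <- 0.5 * (x == "b") + 1.0 * (x == "c") + rlogis(n)

# Cut the latent variable into 5 ordered categories
y <- cut(latent, breaks = c(-Inf, -1, 0, 1, 2, Inf), ordered_result = TRUE)
d <- data.frame(y = y, x = x)

# Time the ordered logit and ordered probit fits; assigning inside
# system.time() keeps the fitted object without printing it
t_logit  <- system.time(fit_logit  <- polr(y ~ x, data = d, method = "logistic"))
t_probit <- system.time(fit_probit <- polr(y ~ x, data = d, method = "probit"))

# "elapsed" is wall-clock time in seconds
print(c(ologit = unname(t_logit["elapsed"]),
        oprobit = unname(t_probit["elapsed"])))
```

The elapsed times will depend on the machine and the R build, so they are only comparable across runs on the same setup.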

### Stata is even faster

When I reran the models using the quietly option in Stata (which suppresses terminal output), I obtained the actual algorithm convergence times. For the multinomial logit model, Stata took fewer than 3 seconds to converge, making it more than 10 times faster than R. Similar reductions in execution times were observed for the other Stata algorithms reported in the table above.

### 64-bit version of R is faster, sometimes

The 64-bit version of R (2.13.0) reported faster execution times. The same was observed for the 64-bit version of R (2.12.0). Notice in the table above the dramatic reduction in the convergence time for the multinomial logit model (using multinom): R 2.13.0 [64-bit] took 35.4% less time to converge than R 2.12.0 [32-bit] (32.7 versus 50.59 seconds). However, the Zelig and VGLM based algorithms reported only very modest improvements in execution times.

The ordered logit and ordered probit models (executed using the polr algorithm) also reported significant improvements in execution times. The ordered logit model took roughly 40% less time to converge in R 2.13.0 [64-bit] than in R 2.12.0 [32-bit] (13.49 versus 22.62 seconds).
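The percentage reductions quoted above follow directly from the timings in the table. The short sketch below redoes that arithmetic; the helper name pct_reduction is made up for illustration, and the inputs are the table's multinom and polr timings plus Stata's "quietly" time for mlogit.

```r
# Hypothetical helper: percentage reduction from an old time to a new time
pct_reduction <- function(old, new) 100 * (old - new) / old

# multinom: R 2.12.0 [32-bit] vs. R 2.13.0 [64-bit], in seconds
multinom_cut <- pct_reduction(50.59, 32.7)

# polr (ordered logit): R 2.12.0 [32-bit] vs. R 2.13.0 [64-bit], in seconds
polr_cut <- pct_reduction(22.62, 13.49)

# Ratio of R 2.13.0 multinom time to Stata's "quietly" mlogit time
stata_speedup <- 32.7 / 2.89

print(c(multinom_cut = multinom_cut,
        polr_cut = polr_cut,
        stata_speedup = stata_speedup))
```

The first two values come out at roughly 35% and 40%, and the ratio is a little above 11.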

I still do not understand why summary(multinomial logit model) takes an additional 49.8 seconds, on top of the 32.7 seconds needed to fit the model, to report summary results. When I use coef(multinomial logit model) instead of summary(), I get instantaneous output.
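One plausible explanation, offered here as an assumption rather than something tested on the data above: summary() on a multinom fit reports standard errors, and those require the Hessian, which multinom() does not compute by default (its Hess argument defaults to FALSE). coef() only reads the stored estimates. The sketch below compares the two on simulated stand-in data (the names d, y, and x are hypothetical) and also shows the Hess = TRUE variant, which moves the Hessian cost into the fitting step rather than removing it.

```r
library(nnet)  # provides multinom(); nnet is a recommended package shipped with R

set.seed(1)
n <- 5000  # assumption: much smaller than the 63,122 rows used in the post

# Simulated stand-in data: 5-category outcome, one 4-level categorical predictor
d <- data.frame(
  y = factor(sample(1:5, n, replace = TRUE)),
  x = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
)

# Default fit: no Hessian is stored (Hess = FALSE); trace = FALSE hides iterations
fit <- multinom(y ~ x, data = d, trace = FALSE)

# coef() only extracts the stored estimates
t_coef <- system.time(b <- coef(fit))["elapsed"]

# summary() also needs standard errors, so the Hessian is computed here
t_summary <- system.time(s <- summary(fit))["elapsed"]

# Variant: compute and store the Hessian while fitting
fit_h <- multinom(y ~ x, data = d, Hess = TRUE, trace = FALSE)
t_summary_h <- system.time(s_h <- summary(fit_h))["elapsed"]

print(c(coef = unname(t_coef),
        summary = unname(t_summary),
        summary_with_stored_hessian = unname(t_summary_h)))
```

On a data set this small the differences may be hard to see; whether this accounts for the full 49.8 seconds reported above would need to be checked on the original data.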

In summary, it appears that not all algorithms converge faster in the updated 64-bit version of R 2.13.0.

### R is faster than Limdep/NLogit

In comparison, R [2.13.0] offered faster convergence times than NLogit for the multinomial logit, ordered logit, and ordered probit models. This puts R between two popular econometrics packages: Stata is significantly faster than R, and R offers faster execution times than NLogit (see the difference for ordered logit in the table above).

### What R Pros are saying about my post

If you scroll down to the comments section of my last post [Speeding tickets for R and Stata], you’ll notice some advice from experienced users of R. I have been advised to re-run the tests after first obtaining optimised versions of the BLAS and LAPACK libraries. I am not sure how much difference that would make. However, it would be a little difficult for ordinary users of R (such as myself) to determine which BLAS and LAPACK libraries are appropriate for their computer systems, and how to install them.
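As a starting point for anyone weighing that advice, the sketch below shows one way to see which LAPACK version an R session reports and to time a couple of dense linear-algebra operations that go through BLAS/LAPACK. It is only a rough probe under assumed settings (the matrix size is arbitrary), and faster matrix operations here would not necessarily translate into faster model fitting.

```r
# LAPACK version reported by the running R session
la <- La_version()
print(la)

set.seed(1)
# Arbitrary tall matrix so that its cross-product is well conditioned
m <- matrix(rnorm(2000 * 500), nrow = 2000, ncol = 500)

# crossprod() computes t(m) %*% m via BLAS; solve() inverts via LAPACK
t_crossprod <- system.time(xtx <- crossprod(m))["elapsed"]
t_solve     <- system.time(xtx_inv <- solve(xtx))["elapsed"]

print(c(crossprod = unname(t_crossprod), solve = unname(t_solve)))
```

Running the same script before and after switching libraries gives a like-for-like comparison on a given machine.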

If significant speed gains can be achieved by using optimised BLAS and LAPACK libraries, the R installation routines could be improved so that these libraries are made available to novice end users of R.