## R frustration of the day

April 11, 2010
Whenever you take a 1 column slice of a matrix, that gets automatically converted into a vector. But if you take a slice of several columns, it remains a matrix. The problem is you don’t always know in advance how big the slice will be, so if you do this: newMatrix
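
The behaviour described above can be reproduced in a few lines. This is a minimal sketch with a made-up matrix `m` and a hypothetical run-time column selection `cols`; it uses the `drop` argument of R's `[` operator, which is one common way to keep a single-column slice as a matrix and is not necessarily what the original post goes on to recommend.

```r
# Hypothetical 3 x 4 matrix used only for illustration.
m <- matrix(1:12, nrow = 3)

# Column indices chosen at run time; here the selection happens to have length 1.
cols <- 2

v <- m[, cols]                 # default drop = TRUE: the result is a plain vector
k <- m[, cols, drop = FALSE]   # drop = FALSE: the result stays a 3 x 1 matrix

is.matrix(v)   # FALSE
dim(k)         # 3 1
```

With `drop = FALSE` the result has the same class whether `cols` holds one index or several, so later code that calls `ncol()` or `dim()` does not need a special case.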

## Historical / Future Volatility Correlation Stability

April 11, 2010
Michael Stokes, author of the MarketSci blog, recently published a thought-provoking post about the correlation between historical and future volatility (measured as the standard deviation of daily close price percentage changes). This post is intended...

April 11, 2010
Cointegration is a central notion in time series econometrics. Loosely, it refers to finding the long-run equilibrium of two non-stationary series. As the best-known examples of non-stationary series come from finance, cointegration is nowadays a tool for traders (not a common one, though!). They use it as the theory behind pairs trading (aka

## Summarising data using histograms

April 11, 2010
The histogram is a standard type of graphic used to summarise univariate data where the range of values in the data set is divided into regions and a bar (usually vertical) is plotted in each of these regions with height proportional to the frequency of observations in that region. In some cases the proportion of
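
As a rough illustration of the definition above, the sketch below divides a made-up sample into regions of equal width and counts the observations in each. The sample `x`, the region width of 0.5 and the `breaks` vector are all assumptions chosen for the example; `hist()` is the base R function.

```r
set.seed(1)       # fixed seed so the sketch is reproducible
x <- rnorm(200)   # made-up univariate sample

# Regions of width 0.5 spanning the whole range of the data.
breaks <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)

# Compute the histogram without drawing it.
h <- hist(x, breaks = breaks, plot = FALSE)

h$counts                     # frequency of observations in each region
sum(h$counts) == length(x)   # every observation falls in exactly one region
```

Calling `plot(h)` afterwards draws the bars, with heights equal to the frequencies in `h$counts`.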

## Compiling 64-bit R 2.10.1 with MKL in Linux

April 10, 2010
The rationale for compiling R using the Intel Math Kernel Library: Recently, there has been a surge in the use of Intel's Math Kernel Library (MKL; http://software.intel.com/en-us/intel-mkl/) among data analysis packages. MKL is a highly optimized set of...

## Where do you sit? Author position and the h-index

April 10, 2010
I was recently introduced to the concept of the h-index and was compelled to find out my own h-index via Scopus.  Numbers don't matter, but discussion with my colleagues turned to the issue of author position.  We quickly decided that there are three important "positions" in the list of authors for a publication: first, last and everywhere else...
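
For readers unfamiliar with the measure, the h-index is the largest number h such that h of an author's papers have at least h citations each. A minimal sketch of that definition follows; the function name `h_index` and the citation counts are made up for illustration and have nothing to do with Scopus or with author position.

```r
# h-index: the largest h such that h papers have at least h citations each.
h_index <- function(citations) {
  sorted <- sort(citations, decreasing = TRUE)
  # Paper i (1-based, most cited first) counts if it has at least i citations.
  sum(sorted >= seq_along(sorted))
}

# Made-up citation counts for one hypothetical author.
h_index(c(10, 8, 5, 4, 3))   # 4
```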

## Because it’s Friday: Pixels invade New York

April 9, 2010
Posted for no other reason than it warms my gamer-geek heart to see NYC taken over by 8-bit video game characters. The Tetris sequence is particularly cool. Update: The original video was deleted from YouTube, I'm guessing because of copyright issues with the music. This version has no music. (Thanks to reader MB in the comments for the heads-up.)

## REvolution R Community 3.2 now available

April 9, 2010
REvolution R Community, REvolution's free distribution based on R from the R Project, has been updated to version 3.2 and is now available for download for Windows and MacOS. Some features of this release include: Upgraded R engine. This release is based on R 2.10.1, the latest release (as of this writing). This brings many new features to the...

## Chicago R User Group… It’s for the sexy people!

April 9, 2010
I think we all know what Morris Day was talking about when he wrote the lyrics to “The Bird”: Yes! Hold on now, this dance ain’t for everybody. Just the sexy people. White folks, you’re much too tight. You gotta shake your head like the black folks. You might get some tonight. Look out! That’s right, he was talking about the new

## The Future of Math is Statistics

April 9, 2010
The future of math is statistics… and the language of that future is R: I’ve often thought there was way too little “statistical intuition” in the workplace. I think Arthur Benjamin would agree.

## Maximum Probability of Profit

April 9, 2010
To continue with the LSPM examples, this post shows how to optimize a Leverage Space Portfolio for the maximum probability of profit. The data and example are again taken from The Leverage Space Trading Model by Ralph Vince. These optimizations take ...

## GLMM using DPpackage

April 9, 2010
I was able to fit a semi-parametric Bayesian GLMM model using DPpackage. It took me many hours to sample from the posterior distribution (DPM prior): MCMC scan 1000 of 5000 (CPU time: 18950.080 s) MCMC scan 2000 of 5000 (CPU time: 22510.100 s) M...

## Gravity Game in R

April 8, 2010
So why should R only be used for ‘serious’ stuff? No longer! I’ve written the following code in R which executes a little gravitational physics game. The goal of the game is simple. You supply a velocity and direction to a spaceship with the goal of getting the ship to the

## R: heatmaps with gplots

April 8, 2010
I use heatmaps quite a lot for visualizing data: microarrays of course, but also DNA motif enrichment, base composition and other things. I particularly like the heatmap.2 function of the gplots package. It has a couple of defaults that are a little ugly ...

## New R User Group in Dallas

April 8, 2010
Wow, it's a big week for new R User Groups. Larry D'Agostino has started up a new R user group based in Dallas, Texas (USA). It's just getting started, and Larry posted the following request on the r-help mailing list: I would like to know if there is anyone like me interested in an R User Group in Dallas,...

## R: another nifty graph

April 8, 2010
Make sure to click on the image to see the large version. Code for this graph:

    moxbuller = function(n) {
      u = runif(n)
      v = runif(n)
      x = cos(2*pi*u)*sqrt(-2*log(v))
      y = sin(2*pi*v)*sqrt(-2*log(u))
      r = list(x=x, y=y)
      return(r)
    }
    r = moxbuller(50000)
    par(bg="black")
    par(mar=c(0,0,0,0))
    plot(r$x, r$y, pch=".", col="blue", cex=1.2)

## Video of UCLA / LA RUG talk on R and C++ integration

April 7, 2010
Thanks to the efforts of the tireless R User Group organizers Szilard Pafka (in Los Angeles, recording the talk) and Drew Conway (in New York, converting and organising hosting), there is now a video and slide combo of my recent talk about Rcpp and R...

## An obscure integral

April 7, 2010
Here is an email from Thomas I received yesterday about a computation in our book Introducing Monte Carlo Methods with R: I’m currently reading your book “Introduction to Monte Carlo Methods with R” and I quite highly appreciate your work. I’m not able to see how the integral on page 74, that describes the marginal

## Correlation scatter-plot matrix for ordered-categorical data

April 7, 2010
When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R...
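
A Spearman correlation matrix of the kind mentioned above can be sketched with base R's `cor()`. The data frame `likert` and its items `q1` to `q3` are randomly generated placeholders, not data from the original post, and the scatter-plot matrix itself is not reproduced here.

```r
set.seed(42)   # fixed seed so the sketch is reproducible

# Made-up responses: 50 respondents, three Likert items scored 1 to 5.
likert <- data.frame(
  q1 = sample(1:5, 50, replace = TRUE),
  q2 = sample(1:5, 50, replace = TRUE),
  q3 = sample(1:5, 50, replace = TRUE)
)

# Pairwise Spearman (rank) correlations between the items.
rho <- cor(likert, method = "spearman")
round(rho, 2)
```

Spearman's coefficient works on ranks, which is why it suits ordered categories where the distance between adjacent scores is not assumed to be equal.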

## Video: Seamless R Extensions using Rcpp and RInside

April 7, 2010
Dirk Eddelbuettel presented joint work with Romain François on calling C++ from R at the LA R User Group meeting last week. With thanks to Drew Conway of the NY R User Group, video of the presentation is now available. It's also embedded below -- click on it for a larger view. Dirk's slides are also available for...

## Seamless R Extensions using Rcpp and RInside

April 7, 2010
I just added a new video to the R repository, and this one comes from the Los Angeles R Meetup. The folks in LA were fortunate enough to have Dirk Eddelbuettel—renowned R expert and StackOverflow super-user—discuss his joint work with Romain François for interfacing C++ and R code using the Rcpp package. For those

## Seamless R Extensions using Rcpp and RInside

April 7, 2010
Dirk Eddelbuettel discusses his joint work with Romain François for interfacing C++ and R code at the Los Angeles R Users Group on March 30, 2010. Dirk provides a motivation for the Rcpp packages, as well as examples and speed benchmarks.

## Matrix determinant with the Lapack routine dspsv

April 6, 2010
The Lapack routine dspsv solves the linear system of equations Ax=b, where A is a symmetric matrix in packed storage format. However, there appear to be no Lapack functions that compute the determinant of such a matrix. We need to compute the determinant, for instance, in order to compute the multivariate normal density function. The
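
To show why the determinant is needed for the multivariate normal density, here is a minimal sketch in R. It does not call dspsv or use packed storage: it relies on base R's `determinant()`, which uses an LU decomposition, and on `solve()`. The matrix `A`, the mean `mu` and the point `x` are made-up values.

```r
# Made-up symmetric positive-definite matrix, for illustration only.
A <- matrix(c(4, 1, 0,
              1, 3, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)
mu <- c(0, 0, 0)
x  <- c(1, -1, 0.5)

# Log-determinant from base R (LU decomposition, not dspsv).
ld <- determinant(A, logarithm = TRUE)
log_det <- as.numeric(ld$modulus)

# Multivariate normal log-density: solve A z = (x - mu) rather than inverting A.
d <- x - mu
z <- solve(A, d)
log_density <- -0.5 * (length(x) * log(2 * pi) + log_det + sum(d * z))

exp(log_det)       # determinant of A
exp(log_density)   # density of x under N(mu, A)
```

Working with the log-determinant avoids overflow or underflow when the matrix is large, which is a common reason to compute the density on the log scale.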