## Visualizing Likert Items

November 11, 2011

I have become quite a big fan of graphics that combine the features of traditional figures (e.g., bar charts and histograms) with tables. That is, the combination of numerical results with a visual representation has been quite useful for exploring descriptive statistics. I have wrapped two of my favorites (built around ggplot2) and included them as part

## Web Scraping Google+ via XPath

November 11, 2011


Google+ just opened up to allow brands, groups, and organizations to create their very own public Pages on the site. This didn’t bother me too much, but I’ve been hearing a lot about Google+ lately, so I figured it might be fun to set up an XPath scraper to extract information from each post of a status

## Harmonic means, reciprocals, and ratios of random variables

In my last few posts, I have considered “long-tailed” distributions whose probability density decays much more slowly than standard distributions like the Gaussian.  For these slowly-decaying distributions, the harmonic mean often turns out to be a much better (i.e., less variable) characterization than the arithmetic mean, which is generally not even well-defined theoretically for these distributions.  Since the harmonic...

## Propagation of error

November 11, 2011


At the outset, this was strictly an exercise of my own curiosity and I didn't imagine writing this down in any form at all. As someone who has done some modelling work in the past, I'm embarrassed to say that I had never fully grasped how one can gauge the error of a...

## Pre-computing a trading plan in parallel

November 11, 2011


R version 2.14 introduced a new package, called parallel. This new package combines the functionality from two previous packages: snow and multicore. Since I was using multicore to parallelise my computations, I had to migrate to the new package and decided to publish some code. Often trading strategies are tested using the daily closing price

## A chart for marathoners

November 11, 2011


Here's a cool application of calendar heat maps: runner Andy used R to catalogue his daily running mileage over the last 2+ years. There are lots of ways to chart data like this (a simple time-series chart, for example), but sometimes looking at data in new ways offers fresh perspectives. For example, Andy notes: "Apparently I missed running on...

## The Marriage of Hadoop and R: Revolution Analytics at Hadoop World

November 11, 2011


Revolution Analytics CTO David Champagne visited Hadoop World 2011 this week, and delivered a presentation on "The Powerful Marriage of R and Hadoop" to a standing-room-only crowd of R and Hadoop enthusiasts. I've included David's slides below. The talk also generated praise on Twitter. David was also interviewed by The Cube during the conference. In the video...

## Train neural network in R, predict in SAS

November 11, 2011


This R code fits an artificial neural network in R and generates Base SAS code, so new records can be scored entirely in Base SAS. This is intended to be a simple, elegant, fast solution. You don’t need SAS Enterprise …

## RStudio: a cross-platform IDE for R

November 11, 2011


Which text editor do you use? Once in a while this question pops up on the R-help mailing list. Up until recently I used the KDE text editor Kate under Linux. Recently, I came across a new text editor for…

## Resampling and Shrinkage: Solutions to Instability of mean-variance efficient portfolios

November 11, 2011


Small changes in the input assumptions often lead to very different efficient portfolios constructed with mean-variance optimization. I will discuss resampling and the covariance shrinkage estimator, two common techniques for making portfolios on the mean-variance efficient frontier more diversified and more robust to small changes in the input assumptions. Resampling was introduced by Michaud in Efficient

## What you wish you knew before you started a PhD

November 11, 2011


I asked my research group recently what they wished they had learned before they started work on a PhD. Here are some of the responses. More mathematics. Particular topics they named included real analysis, functional analysis, measure theory, algebra, linear algebra. That would have been my response also. I still wish I knew more mathematics than

## Another look at autocorrelation in the S&P 500

November 11, 2011


Casting doubt on the possibility of mean reversion in the S&P 500 lately. Previously: a look at volatility estimates in “The mystery of volatility estimates from daily versus monthly returns” led to considering the possibility of autocorrelation in the returns. I estimated an AR(1) model through time and added a naive confidence interval to the …

## Surviving a binomial mixed model

November 11, 2011


A few years ago we had this really cool idea: we had to establish a trial to understand wood quality in context. Sort of following the saying “we don’t know who discovered water, but we are sure that it wasn’t …





## Plotting implicit functions in R

November 11, 2011


So in prepping for my latest manuscript on population dynamics I have been creating all the necessary figures. One of them I considered was a 2-D surface plot of a modified Ricker equation showing the transitions from extinction to stability, and from stability to limit cycles. Inconveniently, though, the only way to do this is with an implicit function. Since becoming...

## What 5,728.986 miles look like…

November 10, 2011
