Monthly Archives: September 2013

The Problem with Percentiles

September 8, 2013

Percentiles (or, more accurately, quantiles) are deeply embedded in the psyche of actuaries, statisticians and similar beasts. They are referred to implicitly in the Solvency 2 directive (Article 100, Value at Risk) without explanation. They are so ingrained...

Read more »

Visualizing optimization process

September 8, 2013

One approach to graph drawing is the application of so-called force-directed algorithms. In its simplest form, the idea is to lay out the nodes on a plane so that all edges in the graph have approximately equal length. This problem has very intuitive ...

Read more »

Linear regression from a contingency table

September 7, 2013

This morning, Benoit sent me an email about a linear regression exercise he found in an econometrics textbook. Consider the following dataset. Here, variable X denotes the income, and Y the expenses. The goal was to fit a linear regression (actually, the email mentioned that we should try to fit a heteroscedastic model, but let...

Read more »

Vectors, Looping, and Performance

September 7, 2013

Vectors are at the heart of R and represent a true convenience. Moreover, vectors are essential for good performance, especially when you are working with lots of data. We’ll explore these concepts in this posting. As a motivating example, let’s generate a sequence of data from -3 to 3. We’ll also use each point as

Read more »

A bit of benchmarking with string distances

September 7, 2013

After my last post about the stringdist package, Zachary Mayer pointed out to me that the implementations of the Levenshtein and Jaro-Winkler distances in the RecordLinkage package are about two to three times faster. His benchmark compares randomly generated character strings …

Read more »

First post, and it's a doozy!

September 7, 2013

Well, not really a doozy. Just something nice and slow to get me going. So, seeing as I intend to post stuff about R along with the other things, I thought it best to understand how all those great R bloggers embed the highlighted R code into their WordPress blogs. As it turns out, I

Read more »

Fearsome Engines, Part 1

September 7, 2013

Back in June I discovered pqR, Radford Neal’s fork of R designed to improve performance. Then in July, I heard about Tibco’s TERR, a C++ rewrite of the R engine suitable for the enterprise. At this point it dawned on me that R might end up like SQL, with many different implementations of a common

Read more »

Presenting your findings with R

September 7, 2013

R packages that provide a richer, more interactive user experience: knitr, rCharts, slidify, shiny and OpenCPU.

Read more »

Probability of Avoiding a Run-off in the NYC 2013 Democratic Primary Election

September 6, 2013

The New York City mayoral Democratic primary election is taking place this coming Tuesday (Sep. 10th) and there are several candidates in the running. Bill de Blasio is the front-runner and is expected to win. However, there is a catch. Even if he takes a plurality of the vote, he may not actually win.

Read more »