My PLStroika and some thoughts on developing R packages

November 7, 2012
I’ve been using R on a daily basis for more than seven years, and I’ve been developing R packages over the last three years. I consider myself an experienced useR and most of my coworkers consider me an R master. However, there are still a lot of things that I don’t know about R. To be … Continue reading...

Shootout 2012: Test & Val Sets projections

November 7, 2012
It is obvious, after seeing the spectra of the calibration set, that we have at least three clusters, and that this can be related to the concentration of the active ingredient in the tablets. If we see the scores in the PC1-PC2 score map we will se...

Cash–Opportunity Lost or Opportunity Gained

November 7, 2012
Tom Brakke from http://researchpuzzle.com/ wrote a great thought piece Cash as Trash, Cash as King, and Cash as a Weapon for the CFA Institute blog.  My favorite part comes in the last paragraph: “That’s the kind of analysis that should be br...

a two-minute introduction to statistical programming and other short stories

November 7, 2012
some hacker news criticism convinced me to take a stab at the question: "what is R?" answer: the lingua statistica, s'il vous plaît. met my hero and the inspiration behind this website. bonus video: how to make coffee with r in two minutes.

November 7, 2012
You’ve already seen everyone else’s electoral map (see this amazing array of maps from 2008), how would you like to make your own? Today’s Gist allows you to do just that — input (manually!) state-by-state results, and output a...

Project Euler — problem 22

November 7, 2012
Just had my supper. Stomach is full of stewed beef and potato.  I’d like to solve the 22nd Euler problem before tonight’s work (right, I’ll work late in my office). Using names.txt (right click and ‘Save Link/Target As…’), a 46K text file containing over … Continue reading →

Gotcha!

November 7, 2012
I should start this with a disclaimer, i.e. that I'm not really claiming any "success" with this post. But I find it quite interesting that the estimations I produced with this very, very simple model turned out to be quite good. The idea was to use the e...

Nate Silver does it again! Will pundits finally accept defeat?

November 6, 2012
My favorite statistician did it again! Just like in 2008, he predicted the presidential election results almost perfectly. For those that don’t know, Nate Silver is the statistician that runs the fivethirtyeight blog. He combines data from hundreds of polls, uses … Continue reading →

Credit Scoring in R 101

In this post we'll fit some predictive models to (well-known) databases, and evaluate the performance of each model. Disclaimer1: for simplicity, the predictive variables are used without applying any transformation to get a better performance or stabil...

Using R — Calling C Code ‘Hello World!’

November 6, 2012
This entry is part 6 of 11 in the series Using R. One of the reasons that R has so much functionality is that people have incorporated a lot of academic code written in C, C++, Fortran and Java into various …   read more ...

Webinar Thursday: How R is used to optimize tractor production at John Deere

November 6, 2012
I just sat in on the rehearsal for Thursday's webinar by John Deere's Derek Hoffman, Order Fulfillment Forecasting at John Deere: How R Facilitates Creativity and Flexibility. Derek will give a spirited argument of why R is critical for the farming equipment manufacturer's operations: from forecasting demand for equipment, forecasting crop yields (they produce forecasts for more than half...

2012-9 Writing grid Extensions

November 6, 2012
New hook functions, makeContext() and makeContent(), have been added to the grid graphics package. These functions allow an alternative approach to developing custom grobs when a grob can only decide what to draw at drawing time rather than when the … Continue reading →

rfigshare tutorial

November 6, 2012
Recently we at rOpenSci released our rfigshare package on CRAN (or you can check out the most up-to-date version on GitHub). So what’s so great about being able to create figshare articles through R? For some time now I’ve been advocating the use of a workflow that involves documents written in R...

R BLAS: GotoBLAS2 vs OpenBLAS vs MKL

November 6, 2012
Short update to "Speed up R by using a different BLAS implementation": MKL is overall the fastest. OpenBLAS is faster than its parent GotoBLAS and comes close to MKL.
A = matrix(rnorm(n*n),n,n)
A %*% A
solve(A)
svd(A)
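The operations named in this excerpt can be timed with a minimal sketch like the one below. The matrix size, the seed, and the helper name time_blas_ops are assumptions made for illustration; they are not taken from the original post, and this is not the author's benchmark script.

```r
# Sketch: time matrix multiply, solve, and SVD for one n x n matrix.
# time_blas_ops, n = 200 and seed = 1 are illustrative assumptions.
time_blas_ops <- function(n = 200, seed = 1) {
  set.seed(seed)                       # reproducible random matrix
  A <- matrix(rnorm(n * n), n, n)      # dense Gaussian matrix
  c(
    matmul = system.time(A %*% A)[["elapsed"]],   # matrix product
    solve  = system.time(solve(A))[["elapsed"]],  # matrix inverse
    svd    = system.time(svd(A))[["elapsed"]]     # singular value decomposition
  )
}

timings <- time_blas_ops(200)
print(timings)  # elapsed seconds per operation; values depend on machine and BLAS
```

Running the same script against R builds linked to different BLAS libraries, with a larger n, would give comparable numbers for each implementation.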

Reverse engineering the SAS data file format

November 6, 2012
I think it’s rather marvellous that a few expert coders are working on dispelling the cloud of mystery around the proprietary file format used by SAS software. Essentially, saving your data in a SAS format (with a name like mydata.sas7bdat) … Continue reading →

EPS Market Map in R

November 6, 2012
There are a few minor tweaks remaining on this map before it is complete, but I wanted to share the EPS Market Map I put together.  It can be downloaded using this link. This file is meant to be used with R and divides the lower 48 states into the CollegeBoard’s Enrollment Planning Service markets.

ggplot graphs in publications?

November 6, 2012
The grey background and/or default choice of colours for groups makes a ggplot graph stand out to any R user when seen in a presentation. But ggplot graphs get all ninja when it comes to publications, either that or not … Continue reading →

Forest plots in R (ggplot) with side table

November 6, 2012
A friend asked me to help with a forest plot recently. After chatting about what she wanted the end result to look like, this is what I came up with. grid.arrange(data_table, p, ncol=2) ## Warning: Removed 1 rows containing missing … Continue reading →

analyze the national survey on drug use and health (nsduh) with r

November 6, 2012
the national survey on drug use and health (nsduh) monitors illicit drug, alcohol, and tobacco use with more detail than any other survey out there.  if you wanna know the average age at first chewing tobacco dip, the prevalence of needle-sharing,...

Simulating Multiple Asset Paths in R

November 5, 2012
I recently came across the Optimal Rebalancing Strategy Using Dynamic Programming for Institutional Portfolios by W. Sun, A. Fan, L. Chen, T. Schouwenaars, M. Albota paper that examines the cost of different rebalancing methods. For example, one might use calendar rebalancing: i.e. rebalance every month / quarter / year. Or one might use threshold rebalancing:

Quick Post About Getting and Plotting Polls in R

November 5, 2012
With the election nearly upon us, I wanted to share an easy way I just found to download polling data and graph a few with ggplot2. dlinzer at github created a function to download poll data from the Huffington Post's Pollster API. The default is to dow...

Another look at ideology of the US congress

November 5, 2012
In response to last week's post on the rapidly increasing ideology of the US Republican Party, Mike Lawrence suggested another way of looking at the DW-NOMINATE ideology data. Rather than simply looking at boxplots of the congress scores by party over time, we could fit a smooth curve to get a better sense of the trends over time. Mike...

RInside 0.2.9

November 5, 2012
A new version 0.2.9 of RInside arrived on CRAN earlier today; Windows binaries have already been built too. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and ...

OOP with Rcpp modules

November 5, 2012
The purpose of Rcpp modules has always been to make it easy to expose C++ functions and classes to R. Up to now, Rcpp modules did not have a way to declare inheritance between C++ classes. This is now fixed in the development version, and t...

Network visualization in R with the igraph package

November 5, 2012
In this post I showed a visualization of the organizational network of my department. Since several people asked for details on how the plot was produced, I will provide the code and some extensions below. The plot has been done … Continue reading →

If we truly want to foster collaboration, we need to rethink the “independence” criteria during promotion

November 5, 2012
When I talk about collaborative work, I don’t mean spending a day or two helping compute some p-values and ending up as middle author in a subject-matter paper. I mean spending months working on a project, from start to finish, with experts … Continue reading →

Retrieving the VIX term structure in R

November 5, 2012
Much of my time lately has gone into analyzing and trading products in the volatility complex.  As a result, I regularly watch the VIX term structure for continuations or deviations from trend.  To make analysis simpler, I’ve written some… Read more ›

Multi-stage sampling together with hierarchical/ mixed effects models: which packages?

November 5, 2012
Dear R experts, I sent this question to the r-help list but didn’t get much response, probably because it is more of a stats question. But as this blog is syndicated on r-bloggers I thought I would try it again here on this blog. If I am barking up the wrong tree, feel free to