# programming

### Implementing the Exact Binomial Test in Julia

April 14, 2012 |

One major benefit of spending my time recently adding statistical functionality to Julia is that I’ve learned a lot about the inner guts of algorithmic null hypothesis significance testing. Implementing Welch’s two-sample t-test last week was a trivial task because of the symmetry of the null hypothesis, but ... [Read more...]

### Floating Point Arithmetic and The Descent into Madness

April 13, 2012 |

While I should confess upfront that I’ve always had a weaker command of the details of floating point arithmetic than I feel I ought to have, this sort of thing still blows my mind when I stumble upon it. These moments invariably make me realize that floating point math ... [Read more...]

### Low Volatility with R

April 12, 2012 |

Low volatility and minimum variance strategies have been getting a lot of attention lately due to their outperformance in recent years. Let’s take a look at how we can incorporate this low volatility effect into a monthly rotational strategy with a basket of ETFs. Performance Summary from Low Volatility ... [Read more...]

### Comparing Julia and R’s Vocabularies

April 9, 2012 |

While exploring the Julia manual recently, I realized that it might be helpful to put the basic vocabularies of Julia and R side-by-side for easy comparison. So I took Hadley Wickham’s R Vocabulary section from the book he’s putting together on the devtools wiki, put all of the ... [Read more...]

### Resampling Hierarchically Structured Data Recursively

April 4, 2012 |

That's a mouthful! I presented this topic to a group of Vandy statisticians a few days ago. My notes (essentially reproduced in this post) are recorded at the Dept. of Biostatistics wiki: HowToBootstrapCorrelatedData. The presentation covers some bootstrap strategies for hierarchically structured (correlated) data, but focuses on the multi-stage bootstrap; ... [Read more...]

### An unabashedly narcissistic data analysis of my own tweets. The…

April 2, 2012 |

pie( table( whence.i.tweet )) qplot( whence ) + coord_polar() pie( log( table( whence )))+RColorBrewer ggplot (see below) plot( density( tweets.len )) qplot(... stat="density") + geom_density qplot(...stat="bin") + geom_text(...) tweeple tweep... [Read more...]

### Statistics on the length and linguistic complexity of bills

February 13, 2012 |

Where would you go to find out what the longest bill of the 112th Congress was by number of sections (H.R. 1473)? How about by number of unique words (H.R. 3671)? What about by Flesch-Kincaid reading level (S. … Continue reading → [Read more...]

### the Art of R Programming [guest post]

January 30, 2012 |

(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!) As Rob J. Hyndman enthusiastically declares in his blog, “this is a gem of a book”. I would go even further and argue that The Art ... [Read more...]

### Mortgage Refinance Calculator

December 20, 2011 |

Mortgage rates are low, considering historical rates for the last 50 years. It may be timely to consider a mortgage refinance. The image above links to a simple tool for exploring mortgage refinance, built using rapache and the yet-to-be-archived yarr package for R. Hence, there are now two mortgage-related calculators on ... [Read more...]

### Using Sparse Matrices in R

October 31, 2011 |

I’ve recently been working with a couple of large, extremely sparse data sets in R. This has pushed me to spend some time trying to master the CRAN packages that support sparse matrices. This post describes three of them: the Matrix, slam and glmnet packages. The first two ... [Read more...]

### Creating an R package, using developer/productivity tools

October 27, 2011 |

A couple of R programming topics (mainly infrastructure/workflow) were discussed at the Los Angeles R users group in tutorial/demo form (aimed mainly at beginners) by Szilard Pafka and Jeroen Ooms: how easy it is to create a simple package for … Continue reading →

### The Psychology of Music and the ‘tuneR’ Package

October 25, 2011 |

This semester I’m TA’ing a course on the Psychology of Music taught by Phil Johnson-Laird. It’s been a great course to teach because (i) so much of the material is new to me and (ii) because the study of the psychology of music brings together so ... [Read more...]

### Another Mystery: sas7bdat != sd2

October 14, 2011 |

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2 formatted ... [Read more...]

### Ghastly R code

September 27, 2011 |

My R package, R/qtl, contains about 33k lines of R code (and 21k lines of C code). Some of it is quite good; some of it is terrible. Here’s another example of the terrible. I’ve long needed to revise the function scantwo, for performing a two-dimensional genome ... [Read more...]

### Interacting with bioinformatics webservers using R

September 8, 2011 |

In an ideal world, all bioinformatics tools would be made available via the Web as a web service with an API, as well as a standalone package to download for local use. This is rarely the case and sometimes, even where one or the other is available, factors such as ... [Read more...]

### Seriously … why don’t math classes use computers?…

August 31, 2011 |

Seriously … why don’t math classes use computers? Excel, simple Python scripts, Mathematica / Sage, everything beyond the TI-83. Kids could be creating totally sweet visuals instead of cribbing formulae. And thinking instead of copying. I can sa... [Read more...]

### ggplot2 Version of Figures in “25 Recipes for Getting Started with R”

August 16, 2011 |

To make it easy to compare graphs produced by R's built-in plot functions with those produced by ggplot2, I recreated the figures in the book 25 Recipes for Getting Started with R using ggplot2. The code used to create the images is in separate paragraphs, allowing easy comparison. [Read more...]

### Getters and setters in R

August 12, 2011 |

When I first started using R, one of the things that attracted me was its claim to be an object-oriented programming (OOP) language. Coming from a Java background, I was used to designing software with OOP concepts like encapsulation and inheritance but, when I turned my hand to R, ... [Read more...]

### Programmers Should Know R

August 6, 2011 |

Programmers should definitely know how to use R. I don’t mean they should switch from their current language to R, but they should think of R as a handy tool during development. Again and again I find myself working with Java code like the following. ... [Read more...]

### WordPress WordCloud with R

August 3, 2011 |

These days one can frequently read about wordclouds created with R, prompted by the release of the wordcloud package by Ian Fellows on July 23rd. So here I am to put in my two cents. I thought about creating a wordcloud of a complete blog history, so I built a ... [Read more...]