# Monthly Archives: June 2010

## Entropy augmentation the modulo way

June 29, 2010
Long before I had heard about the connection between entropy and probability theory, I knew about it from the physical sciences. This is most likely how you met it, too. You heard that entropy in the universe is always increasing, and, if you’re like me, that made very little sense. Then you may have heard

## Tips for managing memory in R

June 29, 2010
R is an in-memory application, so every new object you create takes up RAM. (Yes, there are ways around that, but that's a topic for another article.) If you're working on a small machine (say, a 32-bit Windows system with 1 GB of RAM or less) you might need to be careful with the objects you create. This StackOverflow question...
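As a rough illustration of keeping an eye on object sizes, the sketch below uses only base R functions (object.size, rm, gc). The environment `env` and the object names `big` and `small` are made up for this example and do not come from the linked question.

```r
# Hypothetical workspace: a scratch environment holding two objects.
env <- new.env()
env$big   <- numeric(1e6)   # one million doubles, roughly 8 MB
env$small <- 1:10

# object.size() gives an estimate of the memory used by a single object.
sizes <- sapply(ls(env), function(nm) {
  as.numeric(object.size(get(nm, envir = env)))
})

# Largest objects first, so the memory hogs are easy to spot.
sizes <- sort(sizes, decreasing = TRUE)
print(sizes)

# Drop what is no longer needed, then ask R to run a garbage collection.
rm("big", envir = env)
invisible(gc())
```

Sorting by size makes it easy to see which objects are worth removing. Calling gc() explicitly is optional, since R collects garbage on its own, but it can make the freed memory visible sooner.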

## Sweave.sh in Eclipse-StatET

June 29, 2010
Sébastien Bihorel sent the following instructions on how to use my sweave.sh shell script in Eclipse-StatET.

1- First, you need to know the path to your TEXINPUTS settings. Type R CMD env | grep TEXINPUTS in a shell. In my installation (openSUSE 11.2), the shell returned the following: TEXINPUTS=.::/usr/lib/R/share/texmf:

2- Edit your .bashrc file (located in your home directory) and add...

## Analyze Twitter Data Using R

June 28, 2010
Twitter data available through its API provides a wealth of real-time information. This article demonstrates a graph of user relationships and an analysis of tweets returned in a search using R. Keep in mind, Twitter has announced that basi...

## Second year of entries!

June 28, 2010
Hello, readers new and old! We started adding examples a year ago, in advance of the book's publication. To mark the occasion, we're closing chapter 7 and starting chapter 8 next week. We've crafted a listing of all entries from the first year and mad...

## Bootstrapping the latest R into Amazon Elastic Map Reduce

June 28, 2010
I’ve been continuing to muck around with using R inside of Amazon Elastic Map Reduce jobs. I’ve been working on abstracting the lapply() logic so that R will farm the pieces out to Amazon EMR. This is coming along really well, thanks in no small part to the Stack Overflow community. I have no

## How to peg 7 cores with doSMP

June 28, 2010
Statistics PhD student Nathan VanHoudnos has an 8-core laptop and, by his own admission, takes "an almost unhealthy pleasure in pushing computer to its limits". It seems he's found an outlet for this passion in the new doSMP library included with Revolution R, which allows him to use all his processors for some gnarly simulations in R:...

## Plot Multiple Time Series using the flow / inkblot / river / ribbon / volcano / hourglass / area / whatchamacallit plots ~ blue whale catch per country w/ ggplot2

June 27, 2010
Ever since I first looked at this NYT visualization by Amanda Cox, I’ve wanted to reproduce it in R. This is a plot that stacks multiple time series onto one another, with the width of the river/ribbon/hourglass representing the strength at each time. The NYT article used box office revenue as the width of

## Another harmonic mean approximation

June 26, 2010
Martin Weinberg posted on arXiv a revision of his paper, Computing the Bayes Factor from a Markov chain Monte Carlo Simulation of the Posterior Distribution, which has been submitted to Bayesian Analysis. I have already mentioned this paper in a previous post, but I remain unconvinced of the appeal of the paper's method, given that it

## Weekend art in R (Part 2)

June 26, 2010
I put together four of the best looking images generated by the code shown here:

    # More aRt
    par(bg="white")
    par(mar=c(0,0,0,0))
    plot(c(0,1),c(0,1),col="white",pch=".",xlim=c(0,1),ylim=c(0,1))
    iters = 500
    for(i in 1:iters) {
      center = runif(2)
      size = 1/rbeta(2,1,3)
      # Let's create random HTML-style colors
      color = sample(c(0:9,"A","B","C","D","E","F"),12,replace=T)
      fill = paste("#", paste(color[1:6],collapse=""),sep="")
      brdr = paste("#", paste(color[7:12],collapse=""),sep="")
      points(center[1], center[2],