R 2.13.2 released

September 30, 2011
The R core team announced today that R 2.13.2 is now available: The byte pixies have rolled up R-2.13.2.tar.gz at 9:00 this morning. This is intended to be the final release of the 2.13 series, for the benefit of those apprehensive of putting 2.14.x into production use. This update fixes a number of minor bugs (for example, pch="." will...

Bootstrapping the Truncated Normal Distribution

March 2, 2011
Here’s a post generated from my own ignorance of statistics (as opposed to just being marred by it)! In Labor Economics we walked through something called the truncated normal distribution. Truncated distributions come up a lot in the sciences because …

Weight Loss Predictor

February 5, 2011
For Christmas 2010 I got a very cool book called "The 4-Hour Body" (thanks, Jose Santos), written by Tim Ferriss, who wrote a previous favorite of mine about productivity, "The 4-Hour Workweek". It's an interesting book, because it has a scientific approach, it ...

Hard drive occupation prediction with R – part 2 – Getting the probability distribution

In the first article, we saw a quick-and-dirty method to predict disk space exhaustion when the usage pattern is rigorously linear. We did that by importing our data into R and fitting a linear regression. In this article we will see the problems with that method, and deploy a more robust solution. Besides robustness, we will also see how we can generate...

Visualizing US House Results with a Seats-Votes curve

November 16, 2010
A few weeks ago I wrote about ways to compare major-party returns in US House elections. I experimented with several visualizations, none as useful as the seats-votes curve. A traditional seats-votes curve measures average party performance against individual US House results. Our simplified curve uses a density plot to measure major-party (Democratic, in this case)

Cooling stations. A UHI Hint

September 29, 2010
Update: Google Earth files in the box: Personally I like to look at things backwards. Why are cool sites cool? So download the kml or kmz file and you can tour 62 sites: All with 90 years of data or more. All with a cooling trend. And all “supposedly” urban. What do you see at

Monte Carlo testing of classification groups

September 1, 2010
This is another article on the theme of defining groups in a hierarchical classification. A previous article described homogeneity analysis to visualize how well any number of groups, defined at the same level, accounts for the variability in the da...

R’s Normal Distribution Functions: rnorm and pals

July 14, 2010
The rnorm() function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands dnorm() (normal density function), pnorm() (cumulative...

Visualizing Drought

March 6, 2010
The impacts of drought depend on time-scale. On short time-scales, drought means dry soil. On long time-scales, it means dry rivers and empty reservoirs. A region may simultaneously experience dry conditions on one time-scale and wet conditions on another, e.g. wet soil but low streamflow, or vice versa. The Standardized Precipitation Index (SPI) is a widely