Monthly Archives: January 2011

Abusing Amazon’s Elastic MapReduce Hadoop service… easily, from R

January 10, 2011

JD Long's experimental segue package makes it easy to use Amazon's Elastic MapReduce service to fire up a Hadoop cluster and use it for non-Big Data, computationally-intensive tasks. The package provides a cluster-aware version of lapply() which "just works".

Read more »

Install R Packages wherever needed

January 10, 2011

I frequently occupy computers everywhere with extensive MCMC tasks. Installing R doesn't take long, but it is very annoying to have to install dozens of R packages by hand before your code can run. Well, now I use the following command ...

Read more »
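The command itself is cut off in the excerpt above, so what follows is only a minimal sketch of the general idea: check which packages are absent and install just those. The helper names `missing_packages` and `install_if_missing`, and the mirror URL, are assumptions made for illustration and are not taken from the original post.

```r
# Hypothetical helper: report which of the requested packages are not installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install whatever is missing and return those names invisibly.
# The CRAN mirror URL is an assumption; substitute your preferred mirror.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(to_install)
}

# "stats" ships with R, so it should never be reported as missing.
missing_packages(c("stats", "coda"))
```

On a fresh machine, a single call such as `install_if_missing(c("coda", "MASS"))` at the top of a script would then fetch only what is not already there.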

General-purpose MCMC draw saver for R

January 10, 2011

If you do MCMC with R, you probably know how nasty "bookkeeping" of draws can be. So I quickly coded up a small function which does everything for you. Every parameter has to begin with "mcmc_" or another to-be-defined string, then just run mcmcsave...

Read more »
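The function from the post is not shown in the excerpt, so here is only a rough sketch of the prefix-based bookkeeping it describes. The name `save_draws`, its arguments, and the list-of-lists storage layout are all assumptions for illustration, not the author's implementation.

```r
# Hypothetical sketch: append the current value of every variable whose name
# starts with `prefix` (looked up in the calling environment) to a list store.
save_draws <- function(store = list(), prefix = "mcmc_", envir = parent.frame()) {
  vars <- ls(envir = envir)
  vars <- vars[startsWith(vars, prefix)]
  for (v in vars) {
    # Each parameter gets its own list; one element is added per call.
    store[[v]] <- c(store[[v]], list(get(v, envir = envir)))
  }
  store
}

# Toy "sampler": three iterations with two prefixed parameters.
set.seed(1)
draws <- list()
for (i in 1:3) {
  mcmc_mu <- rnorm(1)
  mcmc_sigma <- rexp(1)
  draws <- save_draws(draws)
}
length(draws$mcmc_mu)
```

Storing each draw as a list element keeps vector- and matrix-valued parameters intact; for scalar parameters, `unlist(draws$mcmc_mu)` gives a plain numeric vector.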

R function for extracting F-test P-value from linear model object

January 10, 2011

I thought it would be trivial to extract the p-value on the F-test of a linear regression model (testing the null hypothesis R²=0). If I fit the linear model fit <- lm(y ~ x1 + x2), I can't seem to find it in names(fit) or summary(fit). But summary(fit)$fstatistic does give you the F statistic and both degrees of freedom, so I wrote this function to...

Read more »
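The author's function is truncated in the excerpt. As a sketch of the approach it describes, the upper tail of the F distribution can be evaluated with `pf()` using the three values stored in `summary(fit)$fstatistic`. The name `lm_f_pvalue` is hypothetical, and the `mtcars` model is just a stand-in example.

```r
# Hypothetical helper: overall F-test p-value of a fitted linear model.
lm_f_pvalue <- function(fit) {
  stopifnot(inherits(fit, "lm"))
  fstat <- summary(fit)$fstatistic  # named vector: value, numdf, dendf
  if (is.null(fstat)) {
    return(NA_real_)                # e.g. an intercept-only model has no F statistic
  }
  pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]], lower.tail = FALSE)
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
lm_f_pvalue(fit)
```

Using `lower.tail = FALSE` rather than `1 - pf(...)` avoids losing precision when the p-value is very small.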

Really useful bits of code that are missing from R

January 10, 2011

There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere. Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data. geomean <- function(x, na.rm = FALSE, trim = 0, ...) { exp(mean(log(x, ...), na.rm = na.rm,

Read more »
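The `geomean` definition above is cut off mid-expression, so it is left as is. Separately, a minimal sketch of the two quantities it mentions might look like the following; the names `geo_mean` and `geo_sd` are assumptions, and the sketch omits the `trim` argument from the original signature.

```r
# Hypothetical sketch: geometric mean as the exponential of the mean log.
geo_mean <- function(x, na.rm = FALSE) {
  exp(mean(log(x), na.rm = na.rm))
}

# Hypothetical sketch: geometric standard deviation as the exponential of
# the standard deviation of the logs (a multiplicative factor, not a difference).
geo_sd <- function(x, na.rm = FALSE) {
  exp(sd(log(x), na.rm = na.rm))
}

x <- c(1, 10, 100)
geo_mean(x)
geo_sd(x)
```

Both functions assume strictly positive input; zeros or negative values pass through `log()` and give `0` or `NaN` results.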

R interface to Google Chart Tools

January 10, 2011

Hans Rosling, eat your heart out! It is now possible to interface the R statistics software with Google’s Gapminder-inspired Chart Tools. The plots below were produced using the googleVis R package and three datasets from the Gapminder website. The first shows the relationship between income, life expectancy and population for 20 countries with the highest ...

Read more »

EmEditor R code macro – Almost interactive R development for Emeditor

January 10, 2011

Get the new macro, now hosted on GitHub. Edit 18th Jan 2011: The text below refers to the old version of the macro and is no longer relevant; a new post will describe the new macro, and it is also documented on the GitHub site. As a follow ...

Read more »

Using R for Introductory Statistics, Chapter 4, Model Formulae

January 10, 2011

Several R functions take model formulae as parameters. Model formulae are symbolic expressions. They define a relationship between variables rather than an arithmetic expression to be evaluated immediately. Model formulae are defined with the tilde ope...

Read more »
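To make the excerpt above concrete, here is a small sketch showing that a formula is an unevaluated symbolic object that a modelling function later interprets against a data frame. The built-in `mtcars` dataset and the chosen variables are assumptions for illustration, not taken from the book chapter.

```r
# A formula is created with the tilde operator; nothing is computed yet.
f <- mpg ~ wt + hp

class(f)     # the object is of class "formula"
all.vars(f)  # the variable names it refers to, response first

# The variables are only looked up when a function such as lm() uses the formula.
fit <- lm(f, data = mtcars)
coef(fit)
```

Here `+` means "include this term in the model", not arithmetic addition, which is why the formula has to stay symbolic until `lm()` interprets it.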