(This article was first published on

The Gibbs sampler discussed on Darren Wilkinson's blog and also on Dirk Eddelbuettel's blog has been implemented in several languages, the first of which was R. **A second megabyte of memory**, and kindly contributed to R-bloggers)In preparation for a session at useR!2012 on "What other languages should R users know about?", Dirk, Chris Fonnesbeck and I have considered implementations of this simple sampler in other languages. I describe a Julia implementation below. Full details on all of the implementations are available at Chris's github repository.

The task is to create a Gibbs sampler for the unscaled density

` f(x,y) = x x^2 \exp(-xy^2 - y^2 + 2y - 4x)`

using the conditional distributions` x|y \sim Gamma(3, y^2 +4)`

y|x \sim Normal(\frac{1}{1+x}, \frac{1}{2(1+x)})

Dirk's version of Darren's original R function is`Rgibbs <- function(N,thin) {`

mat <- matrix(0,ncol=2,nrow=N)

x <- 0

y <- 0

for (i in 1:N) {

for (j in 1:thin) {

x <- rgamma(1,3,y*y+4)

y <- rnorm(1,1/(x+1),1/sqrt(2*(x+1)))

}

mat[i,] <- c(x,y)

}

mat

}

`RCgibbs <- cmpfun(Rgibbs)`

`examples`

directory of the Rcpp package, Dirk provides an `R`

script using the inline, `Rcpp`

and RcppGSL packages to implement this sampler in `C++`

code callable from `R`

and time the results. On my desktop computer, timing 10 replications of `Rgibbs(20000, 200)`

and the other versions produces` test replications elapsed relative user.self sys.self`

4 GSLGibbs(N, thn) 10 8.338 1.000000 8.336 0.000

3 RcppGibbs(N, thn) 10 13.285 1.593308 13.285 0.000

2 RCgibbs(N, thn) 10 369.843 44.356320 369.327 0.032

1 Rgibbs(N, thn) 10 473.511 56.789518 472.754 0.044

`C`

code for R's d-p-q-r functions for probability densities, cumulative distribution, quantile and random sampling can be compiled into a separate Rmath library. These sources are included with the Julia sources and Julia functions with similar calling sequences are available as "extras/Rmath.jl"`load("extras/Rmath.jl")`

function JGibbs1(N::Int, thin::Int)

mat = Array(Float64, (N, 2))

x = 0.

y = 0.

for i = 1:N

for j = 1:thin

x = rgamma(1,3,(y*y + 4))[1]

y = rnorm(1, 1/(x+1),1/sqrt(2(x + 1)))[1]

end

mat[i,:] = [x,y]

end

mat

end

`JGibbs1`

is essentially the same code as `Rgibbs`

with minor adjustments for syntax. A similar timing on the same computer gives`julia> sum([@elapsed JGibbs1(20000, 200) for i=1:10])`

27.748079776763916

julia> sum([@elapsed JGibbs1(20000, 200) for i=1:10])

27.782002687454224

`Rgibbs`

and 13 times faster than `RCgibbs`

. It's actually within a factor of 2 of the compiled code in `RcppGibbs`

.One of the big differences between this function and the compiled C++ function,

`RcppGibbs`

, is that the compiled function calls the underlying `C`

code for the samplers directly, avoiding the overhead of creating a vector of length 1 and indexing to get the first element. As these operations are done in the inner loop of `JGibbs1`

their overhead mounts up.Fortunately, Julia allows for calling a C function directly. You need the symbol from the dynamically loaded library, the signature of the function and the arguments.

It looks like

`function JGibbs2(N::Int, thin::Int)`

mat = Array(Float64, (N, 2))

x = 0.

y = 0.

for i = 1:N

for j = 1:thin

x = ccall(dlsym(_jl_libRmath, :rgamma),

Float64, (Float64, Float64), 3., (y*y + 4))

y = ccall(dlsym(_jl_libRmath, :rnorm),

Float64, (Float64, Float64), 1/(x+1), 1/sqrt(2(x + 1)))

end

mat[i,:] = [x,y]

end

mat

end

`RcppGibbs`

`julia> sum([@elapsed JGibbs2(20000, 200) for i=1:10])`

13.596416234970093

julia> sum([@elapsed JGibbs2(20000, 200) for i=1:10])

13.584651470184326

`function JGibbs3(N::Int, thin::Int)`

mat = Array(Float64, (N, 2))

x = 0.

y = 0.

for i = 1:N

for j = 1:thin

x = randg(3) * (y*y + 4)

y = 1/(x + 1) + randn()/sqrt(2(x + 1))

end

mat[i,:] = [x,y]

end

mat

end

`julia> sum([@elapsed JGibbs3(20000, 200) for i=1:10])`

6.603794574737549

julia> sum([@elapsed JGibbs3(20000, 200) for i=1:10])

6.58268928527832

`RcppGibbs`

(which is using slower samplers) and `GSLGibbs`

(faster samplers but not as fast as those in Julia) while writing code that looks very much like the original R function.**But wait, there's more!**

This computer has a 4-core processor (AMD Athlon(tm) II X4 635 Processor @ 2.9 GHz) and Julia can take advantage of that. When starting Julia we specify the number of processes

`julia -p 4`

`dJGibbs3a`

, leaves the result as a distributed array and the second, `dJGibbs3b`

, converts the result to a single array in the parent process.`## Distributed versions - keeping the results as a distributed array`

dJGibbs3a(N::Int, thin::Int) = darray((T,d,da)->JGibbs3(d[1],thin), Float64, (N, 2), 1)

## Converting the results to an array controlled by the parent process

dJGibbs3b(N::Int, thin::Int) = convert(Array{Float64,2}, dJGibbs3a(N, thin))

`dJGibbs3a`

is declared with the right-pointing arrow construction `->`

. Its arguments are the type of the array, `T`

, the dimensions of the local chunk, `d`

, and the dimension on which the array is distributed, `da`

. Here we only use the number of rows, `d[1]`

, of the chunk to be generated.The timings,

`julia> sum([@elapsed dJGibbs3a(20000, 200) for i=1:10])`

1.6914057731628418

julia> sum([@elapsed dJGibbs3a(20000, 200) for i=1:10])

1.6724529266357422

julia> sum([@elapsed dJGibbs3b(20000, 200) for i=1:10])

2.2329299449920654

julia> sum([@elapsed dJGibbs3b(20000, 200) for i=1:10])

2.267037868499756

If you haven't looked into Julia before now, you owe it to yourself to do so.

To

**leave a comment**for the author, please follow the link and comment on his blog:**A second megabyte of memory**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...