What you get and what you should be getting: checking numerical code

August 22, 2012
By

(This article was first published on dahtah » R, and kindly contributed to R-bloggers)

[Update 24.08.12: added missing alias following comment by brobar]

Whenever I write numerical code I spend half my time debugging my algebra, painstakingly uncovering one sign mistake after another in my calculations. Usually I have computed by hand the gradient or the integral of some nasty function, and I have to check it against a numerical estimate (for example returned by the wonderful numDeriv package). I have written a small utility called cmpplot, for “comparison plot”, that allows me to quickly plot what I’m getting from my hand-computed function against what I should be getting (the numerical estimate). It’s rather useful if you need to diagnose sign mistakes, forgotten factors, etc. It’s too small to be a R package so I’m just going to put the code below, with an example.

Let’s say that after lengthy calculations we have decided that the derivative of cos(x) is sin(x):

fun <- function(x) cos(x)
dfun <- function(x) sin(x)

Although quite confident in our result, we decide to check against a numerical derivative anyway:

require(numDeriv)
xv <- seq(0,2*pi,l=100)
d.num <- sapply(xv,function(v) grad(fun,v))
d.th <- dfun(xv)

A call to cmpplot reveals an embarassing mistake:

cmpplot(d.num,d.th)


The points should all fall on the diagonal line, and most obviously they don’t. cmpplot also displays the result of a regression of d.th on d.num, which tells us we’ve made a sign mistake and that the derivative of cos(x) is actually -sin(x).

Sometimes a function is only correct for some arguments: for example

fun <- function(x) sign(x)*x^2
dfun <- function(x) 2*x ##WRONG for x < 0

in cases like this calling cmpplot with argument “type = 2″ provides a better way of visualising what’s going on:

cmpplot(d.num,d.th,type=2)

The black dots should all align with the blue dots and only half of them do so.

cmpplot also has methods for matrices: this time we take cos(x) as a function that takes a vector argument and returns a vector argument, and we want to compute its Jacobian. Again we have made a mistake in our theoretical calculations:

fun <- function(x) cos(2*x)
Jfun <- function(x) -diag(sin(2*x))

Checking against a numerical Jacobian:

<pre>x <- rnorm(10)
J.num <- jacobian(fun,x)
J.th <- Jfun(x)
cmpplot(J.th,J.num)

cmpplot produces an image plot of the absolute error, and we see that we have made a mistake on the diagonal.

Other options include:

cmpplot(J.th,J.num,type=2)

to plot just the eigen- or singular values of the two matrices and

cmpplot(J.th,J.num,flatten=T)

to treat the matrices as vectors.

Here’s the full code for cmpplot, just copy and paste!


#Comparison Plots
#Some visual tools to compare what you got and what you should be getting
#Simon Barthelmé (University of Geneva), 2012

cmpplot <- function(x,...) UseMethod("cmpplot",x)

cmpplot.matrix <- function(A,B,flatten=F,type=1,...)
{
name.a <- deparse(substitute(A))
name.b <- deparse(substitute(B))

if (flatten)
{
cmpplot.numeric(as.numeric(A),as.numeric(B),type=type,...)
}
else
{
if (type == 1)
{
A <- Matrix(A)
B <- Matrix(B)
ttl <- sprintf("Absolute difference between %s and %s",name.a,name.b)
image(abs(A-B),sub=ttl,colorkey=T)
}
else if (type == 2)
{
if (is.symmetric(A) & is.symmetric(B))
{
sym <- T
vA <- eigen(A)$val
vB <- eigen(B)$val
ylab <- "Eigenvalue"
}
else
{
sym <- F
vA <- svd(A)$d
vB <- svd(B)$d
ylab <- "Singular value"
}
ttl <- sprintf('Spectrum of %s vs. spectrum of %s',name.a,name.b)
ind <- 1:length(vA)
ylim <- range(c(vA,vB))
plot(ind,vA,ylim=ylim,pch=20,xlab="Index",ylab=ylab,main=ttl)
points(ind,vB,pch=3,col="red")
legend("topright",c(name.a,name.b),pch=c(20,3),col=c('black','blue'),fill="white")
}
}
}

cmpplot.numeric <- function(a,b,type=1,stretch=F)
{
name.a <- deparse(substitute(a))
name.b <- deparse(substitute(b))
if (type==1)
{
if (stretch)
{
xlim <- range(c(a,b))
ylim <- xlim
}
else
{
xlim <- range(a)
ylim <- range(b)
}

plot(a,b,xlab=name.a,ylab=name.b,xlim=xlim,ylim=ylim,col="blue",pch=20)
abline(0,1)
m <- lm(b ~ a)
msg <- sprintf('Regression line: y = %.2f * x + %.2f',coef(m)[2],coef(m)[1])
mtext(msg)
}
else if (type == 2)
{
ylim <- range(c(a,b))
plot(a,xlab="Index",ylab="Value",pch=20,ylim=ylim)
pointsb(b,pch=3)
legend("topright",c(name.a,name.b),pch=c(20,3),col=c('black','blue'),bg="white")

}
else if (type == 3)
{
a <- rank(a);b <- rank(b);
name.a <- paste(name.a,"(rank)")
name.b <- paste(name.b,"(rank)")
plot(a,b,xlab=name.a,ylab=name.b)
abline(0,1)
}

}

is.symmetric <- function(A)
{
if (nrow(A) == ncol(A))
{
all(A==t(A))
}
else
{
FALSE
}
}

pointsb <- function(...)
{
points(...,col="blue")
}

 

To leave a comment for the author, please follow the link and comment on his blog: dahtah » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.