What you get and what you should be getting: checking numerical code

[This article was first published on dahtah » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

[Update 24.08.12: added missing alias following comment by brobar]

Whenever I write numerical code I spend half my time debugging my algebra, painstakingly uncovering one sign mistake after another in my calculations. Usually I have computed by hand the gradient or the integral of some nasty function, and I have to check it against a numerical estimate (for example returned by the wonderful numDeriv package). I have written a small utility called cmpplot, for “comparison plot”, that allows me to quickly plot what I’m getting from my hand-computed function against what I should be getting (the numerical estimate). It’s rather useful if you need to diagnose sign mistakes, forgotten factors, etc. It’s too small to be a R package so I’m just going to put the code below, with an example.

Let’s say that after lengthy calculations we have decided that the derivative of cos(x) is sin(x):

fun <- function(x) cos(x)
dfun <- function(x) sin(x)

Although quite confident in our result, we decide to check against a numerical derivative anyway:

require(numDeriv)
xv <- seq(0,2*pi,l=100)
d.num <- sapply(xv,function(v) grad(fun,v))
d.th <- dfun(xv)

A call to cmpplot reveals an embarassing mistake:

cmpplot(d.num,d.th)


The points should all fall on the diagonal line, and most obviously they don’t. cmpplot also displays the result of a regression of d.th on d.num, which tells us we’ve made a sign mistake and that the derivative of cos(x) is actually -sin(x).

Sometimes a function is only correct for some arguments: for example

fun <- function(x) sign(x)*x^2
dfun <- function(x) 2*x ##WRONG for x < 0

in cases like this calling cmpplot with argument “type = 2″ provides a better way of visualising what’s going on:

cmpplot(d.num,d.th,type=2)

The black dots should all align with the blue dots and only half of them do so.

cmpplot also has methods for matrices: this time we take cos(x) as a function that takes a vector argument and returns a vector argument, and we want to compute its Jacobian. Again we have made a mistake in our theoretical calculations:

fun <- function(x) cos(2*x)
Jfun <- function(x) -diag(sin(2*x))

Checking against a numerical Jacobian:

<pre>x <- rnorm(10)
J.num <- jacobian(fun,x)
J.th <- Jfun(x)
cmpplot(J.th,J.num)

cmpplot produces an image plot of the absolute error, and we see that we have made a mistake on the diagonal.

Other options include:

cmpplot(J.th,J.num,type=2)

to plot just the eigen- or singular values of the two matrices and

cmpplot(J.th,J.num,flatten=T)

to treat the matrices as vectors.

Here’s the full code for cmpplot, just copy and paste!


#Comparison Plots
#Some visual tools to compare what you got and what you should be getting
#Simon Barthelmé (University of Geneva), 2012

cmpplot <- function(x,...) UseMethod("cmpplot",x)

cmpplot.matrix <- function(A,B,flatten=F,type=1,...)
{
name.a <- deparse(substitute(A))
name.b <- deparse(substitute(B))

if (flatten)
{
cmpplot.numeric(as.numeric(A),as.numeric(B),type=type,...)
}
else
{
if (type == 1)
{
A <- Matrix(A)
B <- Matrix(B)
ttl <- sprintf("Absolute difference between %s and %s",name.a,name.b)
image(abs(A-B),sub=ttl,colorkey=T)
}
else if (type == 2)
{
if (is.symmetric(A) & is.symmetric(B))
{
sym <- T
vA <- eigen(A)$val
vB <- eigen(B)$val
ylab <- "Eigenvalue"
}
else
{
sym <- F
vA <- svd(A)$d
vB <- svd(B)$d
ylab <- "Singular value"
}
ttl <- sprintf('Spectrum of %s vs. spectrum of %s',name.a,name.b)
ind <- 1:length(vA)
ylim <- range(c(vA,vB))
plot(ind,vA,ylim=ylim,pch=20,xlab="Index",ylab=ylab,main=ttl)
points(ind,vB,pch=3,col="red")
legend("topright",c(name.a,name.b),pch=c(20,3),col=c('black','blue'),fill="white")
}
}
}

cmpplot.numeric <- function(a,b,type=1,stretch=F)
{
name.a <- deparse(substitute(a))
name.b <- deparse(substitute(b))
if (type==1)
{
if (stretch)
{
xlim <- range(c(a,b))
ylim <- xlim
}
else
{
xlim <- range(a)
ylim <- range(b)
}

plot(a,b,xlab=name.a,ylab=name.b,xlim=xlim,ylim=ylim,col="blue",pch=20)
abline(0,1)
m <- lm(b ~ a)
msg <- sprintf('Regression line: y = %.2f * x + %.2f',coef(m)[2],coef(m)[1])
mtext(msg)
}
else if (type == 2)
{
ylim <- range(c(a,b))
plot(a,xlab="Index",ylab="Value",pch=20,ylim=ylim)
pointsb(b,pch=3)
legend("topright",c(name.a,name.b),pch=c(20,3),col=c('black','blue'),bg="white")

}
else if (type == 3)
{
a <- rank(a);b <- rank(b);
name.a <- paste(name.a,"(rank)")
name.b <- paste(name.b,"(rank)")
plot(a,b,xlab=name.a,ylab=name.b)
abline(0,1)
}

}

is.symmetric <- function(A)
{
if (nrow(A) == ncol(A))
{
all(A==t(A))
}
else
{
FALSE
}
}

pointsb <- function(...)
{
points(...,col="blue")
}

 

To leave a comment for the author, please follow the link and comment on their blog: dahtah » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)