one-dimensional integrals

December 25, 2010
By

(This article was first published on YGC » R, and kindly contributed to R-bloggers)

The foundamental idea of numerical integration is to estimate the area of the region in the xy-plane bounded by the graph of function f(x). The integral was esimated by divide x to small intervals, then add all the small approximations to give a total approximation.

Numerical integration can be done by trapezoidal rule, simpson’s rule and quadrature rules. R has a built-in function integrate, which performs adaptive quadrature.

Trapezoidal rule works by approximating the region under the graph f(x) as a trapezoid and calculating its area.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
trapezoid <- function(fun, a, b, n=100) {
	# numerical integral of fun from a to b
	# using the trapezoid rule with n subdivisions
	# assume a < b and n is a positive integer
	h <- (b-a)/n
	x <- seq(a, b, by=h)
	y <- fun(x)
	s <- h * (y[1]/2 + sum(y[2:n]) + y[n+1]/2)
	return(s)
}

Simpson’s rule subdivides the interval [a,b] into n subintervals, where n is even, then on each consecutive pairs of subintervals, it approximates the behaviour of f(x) by a parabola (polynomial of degree 2) rather than by the straight lines used in the trapezoidal rule.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
simpson <- function(fun, a, b, n=100) {
	# numerical integral using Simpson's rule
	# assume a < b and n is an even positive integer
	h <- (b-a)/n
	x <- seq(a, b, by=h)
	if (n == 2) {
		s <- fun(x[1]) + 4*fun(x[2]) +fun(x[3])
	} else {
		s <- fun(x[1]) + fun(x[n+1]) + 2*sum(fun(x[seq(2,n,by=2)])) + 4 *sum(fun(x[seq(3,n-1, by=2)]))
	}
	s <- s*h/3
	return(s)
}

To calculate an integrate over infinite interval, one way is to transform it into an integral over a finite interval as introduce in wiki.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
simpson_v2 <- function(fun, a, b, n=100) {
	# numerical integral using Simpson's rule
	# assume a < b and n is an even positive integer
	if (a == -Inf & b == Inf) {
		f <- function(t) (fun((1-t)/t) + fun((t-1)/t))/t^2
		s <- simpson_v2(f, 0, 1, n)
	} else if (a == -Inf & b != Inf) {
		f <- function(t) fun(b-(1-t)/t)/t^2
		s <- simpson_v2(f, 0, 1, n)
	} else if (a != -Inf & b == Inf) {
		f <- function(t) fun(a+(1-t)/t)/t^2
		s <- simpson_v2(f, 0, 1, n)
	} else {
		h <- (b-a)/n
		x <- seq(a, b, by=h)
		y <- fun(x)
		y[is.nan(y)]=0
		s <- y[1] + y[n+1] + 2*sum(y[seq(2,n,by=2)]) + 4 *sum(y[seq(3,n-1, by=2)])
		s <- s*h/3
	}
	return(s)
}
> phi <- function(x) exp(-x^2/2)/sqrt(2*pi)  ##normal distribution
> simpson_v2(phi, 0, Inf, n=100)
[1] 0.4986569
> simpson_v2(phi, -Inf,3, n=100)
[1] 0.998635
> pnorm(3)
[1] 0.9986501

Simpson’s rule is more accuracy than trapezoidal rule. To compare the accuracy between simpson’s rule and trapezoidal rule, I estimated  int_{0.01}^{1} 1/x dx = -log(0.01) for a sequence of increasing values of n.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
f <- function(x) 1/x
#integrate(f, 0.01, 1) == -log(0.01)
S.trapezoid <- function(n) trapezoid(f, 0.01, 1, n)
S.simpson <- function(n) simpson(f, 0.01, 1, n)
 
n <- seq(10, 1000, by = 10)
St <- sapply(n, S.trapezoid)
Ss <- sapply(n, S.simpson)
 
opar <- par(mfrow = c(2, 2))
plot(n,St + log(0.01), type="l", xlab="n", ylab="error", main="Trapezoidal rule")
plot(n,Ss + log(0.01), type="l", xlab="n", ylab="error",main="Simpson's rule")
plot(log(n), log(St+log(0.01)),type="l", xlab="log(n)", ylab="log(error)",main="Trapezoidal rule")
plot(log(n), log(Ss+log(0.01)),type="l", xlab="log(n)", ylab="log(error)",main="Simpson's rule")


The plot showed that log(error) against log(n) appears to have a slope of -1.90 and -3.28 for trapezoidal rule and simpson’s rule respectively.

Another way to compare their accuracy is to calculate how large of the partition size n for reaching a specific tolerance.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
yIntegrate <- function(fun, a, b, tol=1e-8, method= "simpson", verbose=TRUE) {
	# numerical integral of fun from a to b, assume a < b 
	# with tolerance tol
	n <- 4
	h <- (b-a)/4
	x <- seq(a, b, by=h)
	y <- fun(x)
	yIntegrate_internal <- function(y, h, n, method) {
		if (method == "simpson") {
			s <- y[1] + y[n+1] + 4*sum(y[seq(2,n,by=2)]) + 2 *sum(y[seq(3,n-1, by=2)])
			s <- s*h/3
		} else if (method == "trapezoidal") {
			s <- h * (y[1]/2 + sum(y[2:n]) + y[n+1]/2)
		} else {
		}		
		return(s)
	}
 
	s <- yIntegrate_internal(y, h, n, method)
	s.diff <- tol + 1 # ensures to loop at once.
	while (s.diff > tol ) {
		s.old <- s
		n <- 2*n
		h <- h/2
		y[seq(1, n+1, by=2)] <- y ##reuse old fun values
		y[seq(2,n, by=2)] <- sapply(seq(a+h, b-h, by=2*h), fun)
		s <- yIntegrate_internal(y, h, n, method)
		s.diff <- abs(s-s.old)
	}
	if (verbose) {
		cat("partition size", n, "n")
	}
	return(s)
}
> fun <- function(x) exp(x) - x^2
> yIntegrate(fun, 3,5,tol=1e-8, method="simpson")
partition size 512
[1] 95.66096
> yIntegrate(fun, 3,5,tol=1e-8, method="trapezoidal")
partition size 131072
[1] 95.66096

As show above, simpson’s rule converge much faster than trapezoidal rule.

Quadrature rule is more efficient than traditional algorithms. In adaptive quadrature, the subinterval width h is not constant over the interval [a,b], but instead adapts to the function.
Here, I presented the adaptive simpson’s method.

?View Code RSPLUS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
quadrature <- function(fun, a, b, tol=1e-8) {
	# numerical integration using adaptive quadrature
 
	quadrature_internal <- function(S.old, fun, a, m, b, tol, level) {
		level.max <- 100
		if (level > level.max) {
			cat ("recursion limit reached: singularity likelyn")
			return (NULL)
		}
		S.left <- simpson(fun, a, m, n=2) 
		S.right <- simpson(fun, m, b, n=2)
		S.new <- S.left + S.right
		if (abs(S.new-S.old) > tol) {
			S.left <- quadrature_internal(S.left, fun, a, (a+m)/2, m, tol/2, level+1)
			S.right <- quadrature_internal(S.right, fun, m, (m+b)/2, b, tol/2, level+1)
			S.new <- S.left + S.right
		}
		return(S.new)
	}
 
	level = 1
	S.old <- (b-a) * (fun(a) + fun(b))/2
	S.new <- quadrature_internal(S.old, fun, a, (a+b)/2, b, tol, level+1)
	return(S.new)
}

Quadrature rule is effective when f(x) is steep.

> fun <- function(x) return(1.5 * sqrt(x))
> system.time(yIntegrate(fun, 0,1, tol=1e-9, method="simpson"))
partition size 524288
   user  system elapsed
   1.58    0.00    1.58
> system.time(quadrature(fun, 0,1, tol=1e-9))
   user  system elapsed
   0.25    0.00    0.25

Reference:
1. Robinson A., O. Jones, and R. Maillardet. 2009. Introduction to Scientific Programming and Simulation Using R. Chapman and Hall.
2. http://en.wikipedia.org/wiki/Numerical_integration
3. http://en.wikipedia.org/wiki/Trapezoidal_rule
4. http://en.wikipedia.org/wiki/Simpson%27s_rule
5. http://en.wikipedia.org/wiki/Adaptive_quadrature

Related Posts

To leave a comment for the author, please follow the link and comment on his blog: YGC » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , ,

Comments are closed.