A Practical Example of Calculating Padé Approximant Coefficients Using R


Introduction

I recently had the opportunity to use Padé approximants. There is a lot of good information available online about the theory and applications of Padé approximants, but I had trouble finding a good example explaining just how to calculate the coefficients.

Basic Background

Hearken back to undergraduate calculus for a moment. For a given function, the Taylor series is the “best” polynomial representation of that function. If the function is being evaluated at 0, the Taylor series representation is also called the Maclaurin series. The error is proportional to the first “left-off” term, and the series is only a good estimate in a small radius around the point at which it is calculated (e.g. 0 for a Maclaurin series).

Padé approximants estimate functions as the quotient of two polynomials. Specifically, given a Taylor series expansion of a function T(x) of order m + n, there are two polynomials, P(x) of order m and Q(x) of order n, such that \frac{P(x)}{Q(x)}, called the Padé approximant of order [m/n], “agrees” with the original function up to order m + n. You may ask, “but isn’t the Taylor series from which it is derived also of order m + n?” And you would be correct. However, the Padé approximant tends to have a wider radius of convergence than its parent Taylor series and, being a quotient, is composed of lower-degree polynomials. With the normalization that the first term of Q(x) is always 1, there is a set of linear equations which generates the unique Padé approximant coefficients. Letting a_n be the coefficients of the Taylor series, one can solve:

(1)   \begin{align*} a_0 &= p_0\\ a_1 + a_0q_1 &= p_1\\ a_2 + a_1q_1 + a_0q_2 &= p_2\\ a_3 + a_2q_1 + a_1q_2 + a_0q_3 &= p_3\\ a_4 + a_3q_1 + a_2q_2 + a_1q_3 + a_0q_4 &= p_4\\ &\vdots\\ a_{m+n} + a_{m+n-1}q_1 + \ldots + a_0q_{m+n} &= p_{m+n} \end{align*}

remembering that all p_k for k > m and all q_k for k > n are 0.
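Writing the system above as a matrix equation makes it straightforward to solve for an arbitrary [m/n] order. Below is a minimal sketch of a general solver; pade_coef is a hypothetical helper name (not a base R or package function), and it assumes the Taylor coefficients a_0 through a_{m+n} are supplied in order:

pade_coef <- function(a, m, n) {
  # a: Taylor coefficients a_0, ..., a_{m+n}; returns the [m/n] Pade
  # coefficients under the normalization q_0 = 1.
  N <- m + n + 1                        # one equation per order 0 .. m+n
  stopifnot(length(a) == N)
  A <- matrix(0, N, N)
  # Unknowns are ordered (q_1, ..., q_n, p_0, ..., p_m).
  for (k in 1:N) {                      # equation for order k - 1
    for (j in seq_len(n)) {
      if (k - j >= 1) A[k, j] <- a[k - j]   # coefficient of q_j is a_{k-1-j}
    }
    if (k <= m + 1) A[k, n + k] <- -1       # -p_{k-1}, moved to the left side
  }
  x <- solve(A, -a)                     # right-hand sides are -a_{k-1}
  list(p = x[(n + 1):N], q = c(1, x[seq_len(n)]))
}

For the example worked below, pade_coef(c(0, 1, -1/2, 1/3, -1/4, 1/5, -1/6), 3, 3) should reproduce the coefficients derived by hand.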

There is a lot of research on the theory of Padé approximants and Padé tables, how they work, their relationship to continued fractions, and why they work so well. For example, the interested reader is directed to Baker (1975), Van Assche (2006), Wikipedia, and MathWorld for more.

Practical Example

The function \log(1 + x) will be used as the example. This function has a special implementation in almost all computer languages, often called log1p(x), because the naïve implementation log(1 + x) suffers catastrophic floating-point error for x near 0.
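A quick illustration of the cancellation problem: for small enough x, the intermediate sum 1 + x rounds to exactly 1 in double precision, so the naïve form returns 0:

x <- 1e-17
log(1 + x)  # 1 + x rounds to exactly 1, so this returns 0
log1p(x)    # correctly returns ~1e-17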

The Maclaurin series expansion for \log(1 + x) is:

    \[ \sum_{k=1}^\infty (-1)^{k+1}\frac{x^k}{k} = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \frac{1}{6}x^6\ldots \]
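Rather than typing the coefficients by hand, they can be generated directly from the series formula:

k <- 1:6
a <- c(0, (-1)^(k + 1) / k)  # a_0 = 0, a_k = (-1)^(k+1) / k
print(a)

## [1]  0.0000000  1.0000000 -0.5000000  0.3333333 -0.2500000  0.2000000 -0.1666667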

The code below will compare a Padé [3/3] approximant with the six-term Maclaurin series, which is itself the Padé [6/0] approximant. First, we calculate the coefficients. We know the Maclaurin coefficients: they are 0, 1, -\frac{1}{2}, \frac{1}{3}, -\frac{1}{4}, \frac{1}{5}, -\frac{1}{6}. Therefore, the system of linear equations looks like this:

(2)   \begin{align*} 0 &= p_0\\ 1 + 0q_1 &= p_1\\ -\frac{1}{2} + q_1 + 0q_2 &= p_2\\ \frac{1}{3} - \frac{1}{2}q_1 + q_2 + 0q_3 &= p_3\\ -\frac{1}{4} + \frac{1}{3}q_1 - \frac{1}{2}q_2 + q_3 &= 0\\ \frac{1}{5} - \frac{1}{4}q_1 + \frac{1}{3}q_2 - \frac{1}{2}q_3 &= 0\\ -\frac{1}{6} + \frac{1}{5}q_1 - \frac{1}{4}q_2 + \frac{1}{3}q_3 &= 0 \end{align*}

Moving the known a_n coefficients to the right-hand side, we get:

(3)   \begin{align*} -p_0 &= 0\\ 0q_1 - p_1 &= -1\\ q_1 + 0q_2 - p_2 &= \frac{1}{2}\\ -\frac{1}{2}q_1 + q_2 + 0q_3 - p_3 &= -\frac{1}{3}\\ \frac{1}{3}q_1 - \frac{1}{2}q_2 + q_3 &= \frac{1}{4}\\ -\frac{1}{4}q_1 + \frac{1}{3}q_2 - \frac{1}{2}q_3 &= -\frac{1}{5}\\ \frac{1}{5}q_1 - \frac{1}{4}q_2 + \frac{1}{3}q_3 &= \frac{1}{6} \end{align*}

We can solve this in R using solve, ordering the unknowns as (q_1, q_2, q_3, p_0, p_1, p_2, p_3):

# Each block of seven values fills one column of A (column-major),
# i.e. the coefficients of one unknown across the seven equations.
A <- matrix(c(0, 0, 1, -0.5, 1 / 3, -0.25, 0.2,   # q1
              0, 0, 0, 1, -0.5, 1 / 3, -0.25,     # q2
              0, 0, 0, 0, 1, -0.5, 1 / 3,         # q3
              -1, 0, 0, 0, 0, 0, 0,               # p0
              0, -1, 0, 0, 0, 0, 0,               # p1
              0, 0, -1, 0, 0, 0, 0,               # p2
              0, 0, 0, -1, 0, 0, 0),              # p3
            ncol = 7)
B <- c(0, -1, 0.5, -1 / 3, 0.25, -0.2, 1 / 6)  # right-hand side of (3)
P_Coeff <- solve(A, B)  # solution order: q1, q2, q3, p0, p1, p2, p3
print(P_Coeff)

## [1] 1.5000000 0.6000000 0.0500000 0.0000000 1.0000000 1.0000000 0.1833333

Now we can create the estimating functions:
# Six-term Maclaurin series
ML <- function(x){x - .5 * x ^ 2 + x ^ 3 / 3 - .25 * x ^ 4 + .2 * x ^ 5 - x ^ 6 / 6}

# Pade [3/3] approximant; 0.18333... is 11/60, and the denominator
# coefficients are 3/2, 3/5, and 1/20.
PD33 <- function(x){
  NUMER <- x + x ^ 2 + 0.1833333333333333 * x ^ 3
  DENOM <- 1 + 1.5 * x + .6 * x ^ 2 + 0.05 * x ^ 3
  return(NUMER / DENOM)
}
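As a quick sanity check before plotting, we can evaluate both approximations at a point away from 0 (the values in the comment are approximate):

x <- 0.5
c(Exact = log1p(x), MacL = ML(x), Pade = PD33(x))
# Roughly 0.405465, 0.404688, and 0.405465: even at x = 0.5 the Pade [3/3]
# agrees with log1p to within about 1e-6, while the Maclaurin series
# already differs in the third decimal place.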

Let’s compare the behavior of these functions around 0 with the naïve and sophisticated implementations of \log(1+x) in R.
library(dplyr)
library(ggplot2)
library(tidyr)

D <- seq(-1e-2, 1e-2, 1e-6)
RelErr <- as_tibble(data.frame(X = D,
                               Naive = (log(1 + D) - log1p(D)) / log1p(D),
                               MacL = (ML(D) - log1p(D)) / log1p(D),
                               Pade = (PD33(D) - log1p(D)) / log1p(D)))
RelErr2 <- gather(RelErr, Type, Error, -X)
RelErr2 %>%
  group_by(Type) %>%
  summarize(MeanError = mean(Error, na.rm = TRUE)) %>%
  knitr::kable(digits = 18)

|Type  |   MeanError|
|:-----|-----------:|
|Naive | -4.3280e-15|
|MacL  | -2.0417e-14|
|Pade  | -5.2000e-17|
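Since positive and negative errors partially cancel in a mean, the worst-case error may be a fairer comparison; a quick check reusing the same pipeline:

RelErr2 %>%
  group_by(Type) %>%
  summarize(MaxAbsError = max(abs(Error), na.rm = TRUE))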

Graphing the relative error in a small area around 0 shows the differing behaviors. First, compared with the naïve implementation, both estimates do much better.

ggplot(RelErr2, aes(x = X)) + geom_point(aes(y = Error, colour = Type), alpha = 0.5)

[Graph 1: relative error of the naïve, Maclaurin, and Padé [3/3] implementations near 0]

But when compared against each other, the Padé approximant (blue) shows better behavior than the Maclaurin series (red), and its relative error stays below machine epsilon (.Machine$double.eps, about 2.22e-16) for a wider swath.

ggplot(RelErr, aes(x = X)) +
  geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) +
  geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

[Graph 2: relative error of the Maclaurin (red) and Padé (blue) approximations near 0]

Just for fun, restricting the y-axis to the same range as above, overlaying the naïve formulation (green) looks like this:

ggplot(RelErr, aes(x = X)) +
  geom_point(aes(y = Naive), colour = 'green', alpha = 0.5) +
  scale_y_continuous(limits = c(-1.5e-13, 0)) +
  geom_point(aes(y = MacL), colour = 'red', alpha = 0.5) +
  geom_point(aes(y = Pade), colour = 'blue', alpha = 0.5)

[Graph 3: same comparison with the naïve implementation (green) overlaid]

There are certainly more efficient and elegant ways to calculate Padé approximants, but I found this exercise helpful, and I hope you do as well!

References

  • Baker, G. A. (1975). Essentials of Padé Approximants. Academic Press.
  • Van Assche, W. (2006). Padé and Hermite-Padé approximation and orthogonality. ArXiv Mathematics e-prints.
