Monte Carlo-based prediction intervals for nonlinear regression

May 11, 2018

(This article was first published on Rmazing, and kindly contributed to R-bloggers)

Calculation of the propagated uncertainty \sigma_y using \nabla \Sigma \nabla^T (1), where \nabla is the gradient and \Sigma the covariance matrix of the coefficients \beta_i, is called the “Delta Method” and is widely applied in nonlinear least-squares (NLS) fitting. However, this method is based on first-order Taylor expansion and thus assummes linearity around \hat{y} = f(x, \hat{\beta}). The second-order approach can partially correct for this restriction by using a second-order polynomial around \hat{y}, which is \nabla \Sigma \nabla^T + \frac{1}{2} \rm{tr}(H \Sigma H \Sigma) (2), where \rm{tr}(\cdot) is the matrix trace and \rm{H} is the Hessian.
Confidence and prediction intervals for NLS models are calculated using t(1 - \frac{\alpha}{2}, \nu) \cdot \sigma_y (3) or t(1 - \frac{\alpha}{2}, \nu) \cdot \sqrt{\sigma_y^2 + \sigma_r^2} (4), respectively, where the residual variance \sigma_r^2 = \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\nu} (5).
Now, how can we employ the matrix notation of error propagation for creating Taylor expansion- and Monte Carlo-based prediction intervals?
The inclusion of \sigma_r^2 in the prediction interval can be implemented as an extended gradient and “augmented” covariance matrix. So instead of using \hat{y} = f(x, \hat{\beta}) (6), we take \hat{y} = f(x, \hat{\beta}) + \sigma_r^2 (7) as the expression and augment the i \times i covariance matrix \Sigma to an (i+1) \times (i+1) covariance matrix, where \Sigma_{i+1, i+1} = \sigma_r^2. Partial differentiation and matrix multiplication will then yield, for example with two coefficients \beta_1 and \beta_2 and their corresponding covariance matrix \Sigma:
\left[\frac{\partial f}{\partial \beta_1}\; \frac{\partial f}{\partial \beta_2}\; 1\right] \left[ \begin{array}{ccc} \;\sigma_1^2\;\;\; \sigma_1\sigma_2\;\; 0 \\ \sigma_2\sigma_1\;\; \sigma_2^2\;\;\;\; 0 \\ \;\;\;0\;\;\;\;\;\; 0\;\;\;\;\; \sigma_r^2 \end{array} \right] \left[ \begin{array}{c} \frac{\partial f}{\partial \beta_1} \\ \frac{\partial f}{\partial \beta_2} \\ 1 \end{array} \right] (8)
= \left(\frac{\partial f}{\partial \beta_1}\right)^2\sigma_1^2 + 2 \frac{\partial f}{\partial \beta_1} \frac{\partial f}{\partial \beta_2} \sigma_1 \sigma_2 + \left(\frac{\partial f}{\partial \beta_2}\right)^2\sigma_2^2 + \sigma_r^2 (9)
\equiv \sigma_y^2 + \sigma_r^2, which then goes into (4).
The advantage of the augmented covariance matrix is that it can be exploited for creating Monte Carlo-based prediction intervals. This is new from propagate version 1.0-6 and is based on the paradigm that we add another dimension by employing the augmented covariance matrix of (8) in the multivariate t-distribution random number generator (in our case rtmvt), with \mu = 0. All n simulations are then evaluated with (7) and the usual [1 - \frac{\alpha}{2}, \frac{\alpha}{2}] quantiles calculated for the prediction interval. Using the original covariance matrix with (6) will deliver the MC-based confidence interval.
Application of second-order Taylor expansion or the MC-based approach demonstrates nicely that for the majority of nonlinear models, the confidence/prediction intervals around \hat{y} are quite asymmetric, which the classical Delta Method does not capture:


DNase1 <- subset(DNase, Run == 1)
fm3DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
data = DNase1, start = list(Asym = 3, xmid = 0, scal = 1))
## first-order prediction interval
PROP1 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000,
second.order = FALSE, interval = "prediction")

\begin{array}{cc}  Prop.Mean.1 & 0.74804722 \\ & 0.02081131 \\  Prop.2.5 & 0.70308712 \\  Prop.97.5 & 0.79300731 \\  \end{array}

## second-order prediction interval and MC
PROP2 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000,
second.order = TRUE, interval = "prediction")

\begin{array}{cc}  Prop.Mean.1 & 0.74804722 \\  Prop.Mean.2 & 0.74815136 \\ &  0.02081131 \\ &  0.02081520 \\  Prop.2.5 &  0.70318286 \\  Prop.97.5 & 0.79311987 \\  Sim.Mean & 0.74815598 \\ & 0.02261884 \\  Sim.2.5 & 0.70317629 \\  Sim.97.5 & 0.79309874 \\  \end{array}

What we see here is that
i) the first-order prediction interval [0.70308712; 0.79300731] is symmetric and slightly down-biased compared to the second-order one [0.70317629; 0.79309874],
ii) the second-order prediction interval tallies nicely up to the 4th decimal with the new MC-based interval (0.70318286 and 0.70317629; 0.79311987 and 0.79309874).
I believe this clearly demonstrates the usefulness of the MC-based approach for NLS prediction interval estimation…

To leave a comment for the author, please follow the link and comment on their blog: Rmazing. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)