Monte Carlo-based prediction intervals for nonlinear regression

[This article was first published on Rmazing, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Calculation of the propagated uncertainty \sigma_y using \nabla \Sigma \nabla^T (1), where \nabla is the gradient and \Sigma the covariance matrix of the coefficients \beta_i, is called the “Delta Method” and is widely applied in nonlinear least-squares (NLS) fitting. However, this method is based on first-order Taylor expansion and thus assummes linearity around \hat{y} = f(x, \hat{\beta}). The second-order approach can partially correct for this restriction by using a second-order polynomial around \hat{y}, which is \nabla \Sigma \nabla^T + \frac{1}{2} \rm{tr}(H \Sigma H \Sigma) (2), where \rm{tr}(\cdot) is the matrix trace and \rm{H} is the Hessian.
Confidence and prediction intervals for NLS models are calculated using t(1 - \frac{\alpha}{2}, \nu) \cdot \sigma_y (3) or t(1 - \frac{\alpha}{2}, \nu) \cdot \sqrt{\sigma_y^2 + \sigma_r^2} (4), respectively, where the residual variance \sigma_r^2 = \frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{\nu} (5).
Now, how can we employ the matrix notation of error propagation for creating Taylor expansion- and Monte Carlo-based prediction intervals?
The inclusion of \sigma_r^2 in the prediction interval can be implemented as an extended gradient and “augmented” covariance matrix. So instead of using \hat{y} = f(x, \hat{\beta}) (6), we take \hat{y} = f(x, \hat{\beta}) + \sigma_r^2 (7) as the expression and augment the i \times i covariance matrix \Sigma to an (i+1) \times (i+1) covariance matrix, where \Sigma_{i+1, i+1} = \sigma_r^2. Partial differentiation and matrix multiplication will then yield, for example with two coefficients \beta_1 and \beta_2 and their corresponding covariance matrix \Sigma:
\left[\frac{\partial f}{\partial \beta_1}\; \frac{\partial f}{\partial \beta_2}\; 1\right] \left[ \begin{array}{ccc} \;\sigma_1^2\;\;\; \sigma_1\sigma_2\;\; 0 \\ \sigma_2\sigma_1\;\; \sigma_2^2\;\;\;\; 0 \\ \;\;\;0\;\;\;\;\;\; 0\;\;\;\;\; \sigma_r^2 \end{array} \right] \left[ \begin{array}{c} \frac{\partial f}{\partial \beta_1} \\ \frac{\partial f}{\partial \beta_2} \\ 1 \end{array} \right] (8)
= \left(\frac{\partial f}{\partial \beta_1}\right)^2\sigma_1^2 + 2 \frac{\partial f}{\partial \beta_1} \frac{\partial f}{\partial \beta_2} \sigma_1 \sigma_2 + \left(\frac{\partial f}{\partial \beta_2}\right)^2\sigma_2^2 + \sigma_r^2 (9)
\equiv \sigma_y^2 + \sigma_r^2, which then goes into (4).
The advantage of the augmented covariance matrix is that it can be exploited for creating Monte Carlo-based prediction intervals. This is new from propagate version 1.0-6 and is based on the paradigm that we add another dimension by employing the augmented covariance matrix of (8) in the multivariate t-distribution random number generator (in our case rtmvt), with \mu = 0. All n simulations are then evaluated with (7) and the usual [1 - \frac{\alpha}{2}, \frac{\alpha}{2}] quantiles calculated for the prediction interval. Using the original covariance matrix with (6) will deliver the MC-based confidence interval.
Application of second-order Taylor expansion or the MC-based approach demonstrates nicely that for the majority of nonlinear models, the confidence/prediction intervals around \hat{y} are quite asymmetric, which the classical Delta Method does not capture:

DNase1 <- subset(DNase, Run == 1) fm3DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)), data = DNase1, start = list(Asym = 3, xmid = 0, scal = 1)) ## first-order prediction interval set.seed(123) PROP1 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000, second.order = FALSE, interval = "prediction") t(PROP1$summary)

\begin{array}{cc}  Prop.Mean.1 & 0.74804722 \\ & 0.02081131 \\  Prop.2.5 & 0.70308712 \\  Prop.97.5 & 0.79300731 \\  \end{array}

## second-order prediction interval and MC


PROP2 <- predictNLS(fm3DNase1, newdata = data.frame(conc = 2), nsim = 1000000, second.order = TRUE, interval = "prediction") t(PROP2$summary)

\begin{array}{cc}  Prop.Mean.1 & 0.74804722 \\  Prop.Mean.2 & 0.74815136 \\ &  0.02081131 \\ &  0.02081520 \\  Prop.2.5 &  0.70318286 \\  Prop.97.5 & 0.79311987 \\  Sim.Mean & 0.74815598 \\ & 0.02261884 \\  Sim.2.5 & 0.70317629 \\  Sim.97.5 & 0.79309874 \\  \end{array}

What we see here is that
i) the first-order prediction interval [0.70308712; 0.79300731] is symmetric and slightly down-biased compared to the second-order one [0.70317629; 0.79309874],
ii) the second-order prediction interval tallies nicely up to the 4th decimal with the new MC-based interval (0.70318286 and 0.70317629; 0.79311987 and 0.79309874).
I believe this clearly demonstrates the usefulness of the MC-based approach for NLS prediction interval estimation…

To leave a comment for the author, please follow the link and comment on their blog: Rmazing. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)