# SAS PROC MCMC example in R: Nonlinear Poisson Regression Multilevel Random-Effects Model

March 8, 2015
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

I am slowly working my way through the PROC MCMCexamples. Regarding these data, the SAS manual says: ‘This example uses the pump failure data of Gaver and O’Muircheartaigh (1987) to illustrate how to fit a multilevel random-effects model with PROC MCMC. The number of failures and the time of operation are recorded for 10 pumps. Each of the pumps is classified into one of two groups that correspond to either continuous or intermittent operation’.
Regarding this example, while I was able to reproduce it well in JAGS and STAN, I was not able to get the LaplacesDemon perfect.

Previous post in the series PROC MCMC examples programmed in R were: example 1: sampling, example 2: Box Cox transformation, example 5: Poisson regression, example 6: Non-Linear Poisson Regression and example 6: Logistic Regression Random-Effects Model.

### Data

observed <-
‘5  94.320 1  1 15.720 2    5 62.880 1
14 125.760 1  3  5.240 2   19 31.440 1
1   1.048 2  1  1.048 2    4  2.096 2
22  10.480 2′
observed <- scan(text=gsub(‘[[:space:]]+’,’ ‘,observed),
what=list(y=integer(),t=double(),group=integer()),
sep=’ ‘
)
observed <- as.data.frame(observed)
observed\$pump <- factor(1:nrow(observed))
observed\$group <- factor(observed\$group)
observed\$logtstd <- log(observed\$t)
observed\$logtstd <- observed\$logtstd-mean(observed\$logtstd)

### Model

This example has the following model in SAS:

`proc mcmc data=pump outpost=postout_c seed=248601 nmc=10000   plots=trace diag=none;   ods select tracepanel postsumint;   array u[2] alpha beta;   array mu[2] (0 0);   parms s2 1;   prior s2 ~ igamma(0.01, scale=0.01);   random u ~ mvnar(mu, sd=1e6, rho=0) subject=group monitor=(u);   w = alpha + beta * logtstd;   random llambda ~ normal(w, var = s2) subject=pump monitor=(random(1));   lambda = exp(llambda);   model y ~ poisson(lambda);run;`

The most interesting bit is the llambda distribution statement. Somehow, it contributes to the model fit, and it fills llambda with values. In JAGS a similar construction can be made, but not in LaplacesDemon.

#### JAGS solution

This is a fairly straightforward transformation of the SAS model.
library(R2jags)
jagsdata <- list(
N=nrow(observed),
group=c(1,2)[observed\$group],
logtstd=observed\$logtstd,
y=observed\$y)
jagsm <- function() {
for(i in 1:2) {
alpha[i] ~ dnorm(0.0,1.0E-3)
beta[i]~dnorm(0.0,1.0e-3)
}
for(i in 1:N) {
w[i] <- alpha[group[i]]+beta[group[i]]*logtstd[i]
llambda[i] ~ dnorm(w[i],tau)
lambda[i] <- exp(llambda[i])
y[i] ~ dpois(lambda[i])
}
tau ~ dgamma(.01,0.01)
s2 <- 1/tau
ll8 <- llambda[8]
}
params <- c(‘alpha’,’beta’,’s2′,’ll8′)
myj <-jags(model=jagsm,
data = jagsdata,
parameters=params)
myj

Inference for Bugs model at “…model9ef72aab174.txt”, fit using jags,
3 chains, each with 2000 iterations (first 1000 discarded)
n.sims = 3000 iterations saved
mu.vect sd.vect   2.5%    25%    50%    75%  97.5%  Rhat n.eff
alpha[1]   2.958   2.482 -2.154  1.548  2.945  4.397  7.878 1.004   640
alpha[2]   1.618   0.887 -0.240  1.130  1.686  2.159  3.223 1.001  2800
beta[1]   -0.444   1.322 -3.083 -1.217 -0.422  0.306  2.241 1.004   600
beta[2]    0.581   0.575 -0.640  0.241  0.604  0.940  1.639 1.003   910
ll8       -0.077   0.835 -1.919 -0.559  0.021  0.503  1.298 1.003  1400
s2         1.792   1.950  0.289  0.717  1.232  2.182  6.463 1.002  3000
deviance  43.132   4.266 36.715 39.992 42.425 45.679 53.082 1.001  3000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 9.1 and DIC = 52.2
DIC is an estimate of expected predictive error (lower deviance is better).

#### Jags Solution 2

If I want to replace the llambda distribution by its contributing parts, the model becomes as below. There is now an explicit parameter for the pumps, which shows how much each pump deviates from the hyperdistribution.
jagsm <- function() {
for(i in 1:2) {
alpha[i] ~ dnorm(0.0,1.0E-3)
beta[i]~dnorm(0.0,1.0e-3)
}
for(i in 1:N) {
w[i] <- alpha[group[i]]+beta[group[i]]*logtstd[i]
pumps[i]  ~ dnorm(0,tau)
llambda[i] <- pumps[i]+w[i]

lambda[i] <- exp(llambda[i])
y[i] ~ dpois(lambda[i])
}
tau ~ dgamma(.01,0.01)
s2 <- 1/tau
ll8 <- llambda[8]
}

Inference for Bugs model at “/tmp/RtmpdTDOhm/model9ef186d53b5.txt”, fit using jags,
3 chains, each with 2000 iterations (first 1000 discarded)
n.sims = 3000 iterations saved
mu.vect sd.vect   2.5%    25%    50%    75%  97.5%  Rhat n.eff
alpha[1]   2.948   2.331 -1.735  1.574  2.930  4.292  7.846 1.004   830
alpha[2]   1.610   0.858 -0.235  1.138  1.658  2.151  3.126 1.007   590
beta[1]   -0.450   1.252 -3.055 -1.153 -0.432  0.284  2.056 1.004   960
beta[2]    0.568   0.570 -0.590  0.232  0.562  0.901  1.670 1.004  1800
ll8       -0.048   0.833 -1.849 -0.571  0.031  0.542  1.405 1.002  1100
s2         1.692   1.796  0.252  0.719  1.173  1.980  6.400 1.010   230
deviance  43.227   5.903 36.514 39.652 42.404 45.584 53.990 1.004   810

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 17.4 and DIC = 60.6
DIC is an estimate of expected predictive error (lower deviance is better).

#### STAN

In STAN it is fairly simple to program the explicit pump effect. However, the code is slightly longer.
stanData <- list(
N=nrow(observed),
group=c(1,2)[observed\$group],
logtstd=observed\$logtstd,
y=observed\$y)

library(rstan)
smodel <- ‘
data {
int N;
int group[N];
int  y[N];
real logtstd[N];
}
parameters {
real Beta[2];
real Alpha[2];
real s2;
vector[N] pumps;
}
transformed parameters {
vector[N] w;
vector[N] llambda;
vector[N] lambda;
for (i in 1:N) {
w[i] <- Alpha[group[i]]+Beta[group[i]]*logtstd[i];
}
llambda <- w+pumps;
lambda <- exp(llambda);
}
model {
y ~ poisson(lambda);
s2 ~ inv_gamma(.01,.01);
pumps ~ normal(0,sqrt(s2));
Alpha ~ normal(0,1000);
Beta ~ normal(0,1000);
}
generated quantities {
real ll8;
ll8 <- llambda[8];
}
‘
fstan <- stan(model_code = smodel,
data = stanData,
pars=c(‘Beta’,’Alpha’,’s2′,’ll8′))
fstan
Inference for Stan model: smodel.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

mean se_mean   sd  2.5%   25%    50%    75%  97.5% n_eff Rhat
Beta[1]  -0.40    0.04 1.29 -3.08 -1.12  -0.41   0.36   2.19  1317 1.00
Beta[2]   0.58    0.02 0.60 -0.69  0.23   0.60   0.97   1.72  1365 1.00
Alpha[1]  2.89    0.06 2.41 -1.93  1.48   2.89   4.26   7.90  1388 1.00
Alpha[2]  1.60    0.03 0.92 -0.39  1.09   1.66   2.18   3.28  1211 1.00
s2        1.79    0.07 1.95  0.31  0.76   1.24   2.12   6.27   710 1.01
ll8      -0.12    0.02 0.88 -2.10 -0.64  -0.01   0.52   1.30  3164 1.00
lp__     99.76    0.13 3.57 91.68 97.55 100.09 102.35 105.74   705 1.00

Samples were drawn using NUTS(diag_e) at Sat Mar  7 19:19:20 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

#### LaplacesDemon

This was a problem. Given the code above, it should be straightforward. However, it seems not to give decent parameter values. The code below does use a normal distribution for sqrt(s2). I must say that the sensitivity for various priors of s2 in this model is somewhat worrisome.
library(‘LaplacesDemon’)
mon.names <- “LP”
parm.names <- c(paste(rep(c(‘alpha’,’beta’),each=2),c(1,2,1,2),sep=”),
‘s2’,
paste(‘pump’,1:10,sep=”))
PGF <- function(Data) {
c(runif(5,0,2),
rnorm(10,0,1))
}

MyData <- list(mon.names=mon.names,
parm.names=parm.names,
PGF=PGF,
group=c(1,2)[observed\$group],
logtstd=observed\$logtstd,
y=observed\$y)
Model <- function(parm, Data)
{
beta=parm
alpha=parm
s2=parm[5]
pumps <- parm[6:15]
w <- alpha[Data\$group]+beta[Data\$group]*Data\$logtstd
lambda  <- exp(w+pumps)
if(s2<0 ) {
LL <- -Inf
LP <- -Inf
} else {
sd=sqrt(s2)
LL <- sum(dpois(Data\$y,lambda,log=TRUE)) +
sum(dnorm(pumps,0,sd,log=TRUE))
prior <- sum(dnorm(parm[1:4],0,1e6,log=TRUE)) +
dnorm(sd,0,10,log=TRUE)
LP=LL+prior
}
Modelout <- list(LP=LP, Dev=-2*LL, Monitor=LP,
yhat=lambda,
parm=parm)
return(Modelout)
}
Initial.Values <- GIV(Model, MyData, PGF=TRUE)
Model(Initial.Values,MyData)

LA <- LaplaceApproximation(Model,
parm=Initial.Values,
Data=MyData,
Method=’BFGS’,
Iterations=2000,
Stop.Tolerance=.001)

#LA\$Covar       # covariance
Fit <- LaplacesDemon(Model,
Data=MyData,
Covar=LA\$Covar,
Algorithm=’IM’,
Specs=list(mu=LA\$Summary2[1:15,1]),
Initial.Values = LA\$Summary2[1:15,1],
Iterations=40000,Status=1000,Thinning=40
)
Fit
Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = LA\$Summary2[1:15,
1], Covar = LA\$Covar, Iterations = 40000, Status = 1000,
Thinning = 40, Algorithm = “IM”, Specs = list(mu = LA\$Summary2[1:15,
1]))

Acceptance Rate: 0.04538
Algorithm: Independence Metropolis
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
alpha1     alpha2      beta1      beta2         s2      pump1      pump2
0.35685994 0.04768758 0.10016722 0.03220140 0.01307985 0.05277034 0.07376908
pump3      pump4      pump5      pump6      pump7      pump8      pump9
0.04553773 0.06197309 0.05189976 0.08482326 0.06433302 0.07132478 0.06423315
pump10
0.06345573

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
All Stationary
Dbar 57.754     58.036
pD    2.056      1.696
DIC  59.811     59.733
Initial Values:
alpha1      alpha2       beta1       beta2          s2       pump1
3.02746793  1.84533490 -0.40853043  0.64195643  0.45668880 -0.41735869
pump2       pump3       pump4       pump5       pump6       pump7
-0.97307147 -0.59334362  0.47363733 -0.19075343  0.23899690 -0.13979169
pump8       pump9      pump10
-0.05190615  0.41583871  1.18566087

Iterations: 40000
Log(Marginal Likelihood): -24.60551
Minutes of run-time: 0.19
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 15
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 100
Recommended Burn-In of Un-thinned Samples: 4000
Recommended Thinning: 160
Specs: (NOT SHOWN HERE)
Status is displayed every 1000 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 1000
Thinning: 40

Summary of All Samples
Mean        SD        MCSE       ESS           LB       Median
alpha1     3.02902055 0.5976765 0.033875624 465.42785   1.94620166   3.02746793
alpha2     1.85405458 0.2184840 0.015060612 349.50295   1.39240432   1.84803324
beta1     -0.41890176 0.3166503 0.018997023 481.52841  -1.05188183  -0.40853043
beta2      0.63293644 0.1795371 0.011348265 436.71962   0.28758602   0.63737896
s2         0.43066261 0.1144214 0.007302261 274.64302   0.22155991   0.43348687
pump1     -0.37243690 0.2298285 0.014321032 391.96002  -0.82709530  -0.38552546
pump2     -0.93920883 0.2717384 0.016885114 404.84643  -1.48440399  -0.95172632
pump3     -0.54940028 0.2134980 0.013956405 447.00328  -0.94402826  -0.57351952
pump4      0.48362669 0.2490683 0.012248080 522.99249  -0.01000839   0.47363733
pump5     -0.18414709 0.2279291 0.013777023 446.75275  -0.63917122  -0.19075343
pump6      0.25575145 0.2913896 0.018120358 443.85113  -0.33560223   0.23899690
pump7     -0.11427184 0.2537652 0.012513772 614.87586  -0.60451815  -0.10665053
pump8     -0.06221604 0.2672004 0.015234186 433.78090  -0.59796589  -0.05190615
pump9      0.38098070 0.2535670 0.015870761 514.86989  -0.15911780   0.41583871
pump10     1.16241102 0.2520292 0.016987945 337.82320   0.70150436   1.18413142
Deviance  57.75419036 2.0279787 0.277358759  64.22020  54.20970306  57.65028176
LP       -91.03856848 1.0140989 0.138688445  64.23136 -93.07403549 -90.98684715
UB
alpha1     4.21405063
alpha2     2.28149310
beta1      0.17711412
beta2      1.00444603
s2         0.68595978
pump1      0.05950237
pump2     -0.35260941
pump3     -0.10527832
pump4      1.00662909
pump5      0.30802060
pump6      0.78622765
pump7      0.36687146
pump8      0.50955936
pump9      0.86326767
pump10     1.69605148
Deviance  61.82358222
LP       -89.26645496

Summary of Stationary Samples
Mean        SD        MCSE      ESS          LB       Median
alpha1     3.03336875 0.6197257 0.037684775 419.6243   1.9462017   3.02051199
alpha2     1.85320726 0.2265947 0.016535659 309.2038   1.3924043   1.86648271
beta1     -0.42224878 0.3281491 0.021042445 434.6613  -1.0589187  -0.39535607
beta2      0.62932660 0.1858990 0.012220603 393.5420   0.2875860   0.61842982
s2         0.42761565 0.1194602 0.007947874 330.2853   0.2074068   0.42158060
pump1     -0.36915544 0.2395712 0.015756385 348.6680  -0.8440599  -0.35070290
pump2     -0.93369795 0.2832837 0.018533606 362.1901  -1.4896206  -0.93595868
pump3     -0.54539147 0.2221992 0.015225595 399.3457  -0.9580968  -0.54747889
pump4      0.48469802 0.2601885 0.013553884 516.0653  -0.0511650   0.47031679
pump5     -0.18647603 0.2332461 0.014356623 418.1546  -0.6435100  -0.19870380
pump6      0.25620372 0.3030826 0.020288966 398.5128  -0.3356022   0.25472223
pump7     -0.11024675 0.2634318 0.013481854 555.4753  -0.6184502  -0.09874182
pump8     -0.06356307 0.2750350 0.017000257 387.8663  -0.6158914  -0.08068829
pump9      0.37664857 0.2636891 0.017295769 464.9818  -0.1592315   0.39456739
pump10     1.16067466 0.2622398 0.018642621 303.4919   0.6766951   1.16507221
Deviance  58.03623819 1.8418930 0.200950651 146.1974  54.5450513  57.89738441
LP       -91.17957717 0.9211010 0.138688445 146.1838 -93.1095221 -91.10993804
UB
alpha1     4.24847266
alpha2     2.28149310
beta1      0.18123920
beta2      1.01535987
s2         0.68595978
pump1      0.05950237
pump2     -0.35260941
pump3     -0.09900105
pump4      1.01182177
pump5      0.30716549
pump6      0.81374718
pump7      0.36734086
pump8      0.51642505
pump9      0.87133356
pump10     1.69605148
Deviance  61.89407135
LP       -89.43333665

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...