[This article was first published on SAS and R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Shared frailty models are a way of allowing correlated observations into proportional hazards models. Briefly, instead of l_i(t) = l_0(t)e^(x_iB), we allow l_ij(t) = l_0(t)e^(x_ijB + g_i), where observations j are in clusters i, g_i is typically normal with mean 0, and g_i is uncorrelated with g_i’. The nomenclature frailty comes from examining the logs of the g_i and rewriting the model as l_ij(t) = l_0(t)u_i*e^(xB) where the u_i are now lognormal with median 0. Observations j within cluster i share the frailty u_i, and fail faster (are frailer) than average if u_i > 1.

In SAS 9.2, this model could not be fit, though it is included in the survival package in R. (Section 4.3.2) With SAS 9.3, it can now be fit. We explore here through simulation, extending the approach shown in example 7.30.

SAS
To include frailties in the model, we loop across the clusters to first generate the frailties, then insert the loop from example 7.30, which now represents the observations within cluster, adding the frailty to the survival time model. There’s no need to adjust the censoring time.

```data simfrail;
beta1 = 2;
beta2 = -1;
lambdat = 0.002; *baseline hazard;
lambdac = 0.004; *censoring hazard;
do i = 1 to 250; *new frailty loop;
frailty = normal(1999) * sqrt(.5);
do j = 1 to 5; *original loop;
x1 = normal(0);
x2 = normal(0);
* new model of event time, with frailty added;
linpred = exp(-beta1*x1 - beta2*x2 + frailty);
t = rand("WEIBULL", 1, lambdaT * linpred);
* time of event;
c = rand("WEIBULL", 1, lambdaC);
* time of censoring;
time = min(t, c);    * which came first?;
censored = (c lt t);
output;
end;
end;
run;
```

For comparison’s sake, we replicate the naive model assuming independence:
```proc phreg data=simfrail;
model time*censored(1) = x1 x2;
run;

Parameter   Standard                         Hazard
Parameter DF   Estimate      Error Chi-Square Pr > ChiSq    Ratio

x1         1    1.68211    0.05859   824.1463     ```

The parameter estimates are rather biased. In contrast, here is the correct frailty model.
```proc phreg data=simfrail;
class i;
model time*censored(1) = x1 x2;
random i / noclprint;
run;
Cov         REML    Standard
Parm    Estimate       Error

i         0.5329     0.07995

Parameter   Standard                         Hazard
Parameter DF   Estimate      Error Chi-Square Pr > ChiSq    Ratio

x1         1    2.03324    0.06965   852.2179     ```

This returns estimates gratifyingly close to the truth. The syntax of the random statement is fairly straightforward– the noclprint option prevents printing all the values of i. The clustering variable must be specified in the class statement. The output shows the estimated variance of the g_i.

R
In our book (section 4.16.14) we show an example of fitting the uncorrelated data model, but we don’t display a frailty model. Here, we use the data generated in SAS, so we omit the data simulation in R. As in SAS, it would be a trivial extension of the code presented in example 7.30. For parallelism, we show the results of ignoring the correlation, first.
```> library(survival)
> with(simfrail, coxph(formula = Surv(time, 1-censored) ~ x1 + x2))

coef exp(coef) se(coef)     z p
x1  1.682     5.378   0.0586  28.7 0
x2 -0.886     0.412   0.0439 -20.2 0
```

with identical results to above. Note that the Surv function expects an indicator of the event, vs. SAS expecting a censoring indicator.

As with SAS, the syntax for incorporating the frailty is simple.
```> with(simfrail, coxph(formula = Surv(time, 1-censored) ~ x1 + x2
+ frailty(i)))

coef se(coef) se2    Chisq DF  p
x1          2.02 0.0692   0.0662 850     1 0
x2         -1.00 0.0506   0.0484 393     1 0
frailty(i)                       332   141 0

Variance of random effect= 0.436
```

Here, the results differ slightly from the SAS model, but still return parameter estimates that are quite similar. We’re not familiar enough with the computational methods to diagnose the differences.

To leave a comment for the author, please follow the link and comment on their blog: SAS and R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts.(You will not see this message again.)

Click here to close (This popup will not appear again)