Multistate Models for Medical Applications
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Clinical research studies and healthcare economics studies are frequently concerned with assessing the prognosis for survival in circumstances where patients suffer from a disease that progresses from state to state. Standard survival models only directly model two states: alive and dead. Multi-state models enable directly modeling disease progression where patients are observed to be in various states of health or disease at random intervals, but for which, except for death, the times of entering or leaving states are unknown. Multi-state models easily accommodate interval censored intermediate states while making the usual assumption that death times are known but may be right censored.
A natural way to conceptualize modeling the dynamics of disease progression with interval censored states is as continuous time Markov chains. The following diagram illustrates a possible disease progression model where there is some possibility of dying from any state, but otherwise a patient would progress from being healthy, to mild disease, to severe disease and then perhaps death.
## Loading required package: shape
It is true that in that the Markov assumption implies that the time patients spend in the various states are exponentially distributed. However, the mathematical theory of stochastic multi-state processes is very rich and can accommodate more realistic models with state dependent hazard rates that vary over time and other relaxations of the Markov assumption. Moreover, there is robust software in R (and other languages) that make multi-state stochastic survival models practical.
In the remainder of this post, I present a variation of a disease progression model discussed by Ardo van den Hout in some detail in his incredibly informative and very readable monograph Multi-State Survival Models for Interval Censored Data . Also note that van den Hout’s model is itself an elaboration of the main example presented by Christopher Jackson in the vignette to his
msmpackage. This post presents a slower development of the model developed by Jackson and van den Hout that might be easier for a person not already familiar with the
msm package to follow.
library(tidyverse) library(tidymodels) library(msm)
The data set explored by both Jackson and van den Hout is the Cardiac Allograft Vasculopathy (CAV) data set which contains the individual histories of angiographic examinations of 622 heart transplant recipients collected at the Papworth Hospital in the United Kingdom. This data is included in the
msm package and is a good candidate to be the iris dataset for progressive disease models. It is a rich data set with 2846 rows and multiple covariates, including patient age and time time since transplant, both of which can be use for time scales, multiple state transitions among four states and no missing values. Observations of intermediate states are interval censored and have been recorded varying time intervals. Deaths are “exact” or right censored.
The following code creates a new variable that preserves the original state data for each observation and displays the data in tibble format.
set.seed(1234) df <- tibble(cav) %>% mutate(o_state = state) df ## # A tibble: 2,846 × 11 ## PTNUM age years dage sex pdiag cumrej state firstobs statemax o_state ## <int> <dbl> <dbl> <int> <int> <fct> <int> <int> <int> <dbl> <int> ## 1 100002 52.5 0 21 0 IHD 0 1 1 1 1 ## 2 100002 53.5 1.00 21 0 IHD 2 1 0 1 1 ## 3 100002 54.5 2.00 21 0 IHD 2 2 0 2 2 ## 4 100002 55.6 3.09 21 0 IHD 2 2 0 2 2 ## 5 100002 56.5 4 21 0 IHD 3 2 0 2 2 ## 6 100002 57.5 5.00 21 0 IHD 3 3 0 3 3 ## 7 100002 58.4 5.85 21 0 IHD 3 4 0 4 4 ## 8 100003 29.5 0 17 0 IHD 0 1 1 1 1 ## 9 100003 30.7 1.19 17 0 IHD 1 1 0 1 1 ## 10 100003 31.5 2.01 17 0 IHD 1 3 0 3 3 ## # ℹ 2,836 more rows
The state table which presents the number of times each pair of states were observed in successive observation times shows that 46 transitions from state 2 (Mild CAV) to state 1 (No CAV), 4 transitions from state 3 (Severe CAV) to Healthy and 13 transitions from Severe CAV to Mild CAV.
statetable.msm(state = state, subject = PTNUM, data = df) ## to ## from 1 2 3 4 ## 1 1367 204 44 148 ## 2 46 134 54 48 ## 3 4 13 107 55
I will follow van den Hout and assume these backward transitions are misclassified and alter the state variable so there is no back sliding. The following code does this in a tidy way and also creates a new variable b_age which records the baseline age at which patients entered the study. (Note: you can find van den Hout’s code here)
df1 <- df %>% group_by(PTNUM) %>% mutate(b_age = min(age), state = cummax(state) )
This transformation will make the state transition table conform to the diagram above, but with state 1 representing No CAV rather than Health.
statetable.msm(state = state, subject = PTNUM, data = df1) ## to ## from 1 2 3 4 ## 1 1336 185 40 139 ## 2 0 220 52 49 ## 3 0 0 140 63
Setting Up and Running the Model
The next step is to set up the model using the function
msm() whose great flexibility means that some care must be taken to set parameter values.
First, we set up the initial guess for the intensity matrix, Q, which determines the transition rates among states for a continuous time Markov chain. For the
msm function, positive values indicate possible transitions.
# Intensity matrix Q: q <- 0.01 Q <- rbind(c(0,q,0,q), c(0,0,q,q),c(0,0,0,q),c(0,0,0,0)) qnames <- c("q12","q14","q23","q24","q34")
Next, we set up the covariate structure which van den Hout discusses in his monograph, but does not show in the code on the book’s website referenced above. For this model, transitions from state 1 to state 2 and from state 1 to state 4 depend on time,
years, the age of the patient at transplant time
dage, the age of the donor. The other transitions depend only on
dage. So, we see that
msm() can deal with time varying covariates as well as permitting individual state transitions to be driven by different covariates.
covariates = list("1-2" = ~ years + b_age + dage , "1-4" = ~ years + b_age + dage , "2-3" = ~ dage, "2-4" = ~ dage, "3-4" = ~ dage)
Now, we set the remaining parameters for the model.
obstype <- 1 center <- FALSE deathexact <- TRUE method <- "BFGS" control <- list(trace = 0, REPORT = 1)
obstype = 1 indicates that observations have been taken at arbitrary time points. They are snapshots of the process that are common for panel data.
center = FALSE means that covariates will not be centered at their means during the maximum likelihood estimation process. The default for this parameter is TRUE.
deathexact = TRUE indicates that the final absorbing state is exactly observed. This is the defining assumption survival data. In
msm this is equivalent to setting obstupe = 3 for state 4, our absorbing state.
** method = BFGS** signals
optim() to use the optimization method published simultaneously in 1970 by Broyden, Fletcher, Goldfarb and Shanno. (look here). This is a quasi-Newton method that uses function values and gradients to build up a picture of the surface to be optimized.
control = list(trace=0,REPORT=1) indicates more parameters that will be passed to
REPORT sets the the frequency of reports for the “BFGS”, “L-BFGS-B” and “SANN” methods if control$trace is positive. Defaults to every 10 iterations for “BFGS” and “L-BFGS-B”, or every 100 temperatures for “SANN”. (Note: SANN is a variant of the simulated annealing method presented by C. J. P. Belisle (1992) Convergence theorems for a class of simulated annealing algorithms on Rd Journal of Applied Probability.)
trace is also passed to
optim(). trace must be a non-negative integer. If positive, tracing information on the progress of the optimization is produced. Higher values may produce more tracing information.
model_1 <- msm(state~years, subject = PTNUM, data = df1, center= center, qmatrix=Q, obstype = obstype, deathexact = deathexact, method = method, covariates = covariates, control = control)
First check to see if the model has converged. For the BFGS method, possible convergence codes returned by
0 indicates convergence, 1 indicates that the maximum iteration limit has been reached, 51 and 52 indicate warnings.
#Model Status conv <- model_1$opt$convergence; cat("Convergence code =", conv,"\n") ## Convergence code = 0
Next, look at a measure of how well the model fits the data proposed by using a visual test proposed by Gentleman et al. (1994) which plots the observed numbers of individuals occupying a state at a series of times against forecasts from the fitted model, for each state. The
plot.prevalence.msm() produces a perfectly adequate base R plot. However, to emphasize that
msm users are not limited to base R plots, I’ll do a little extra work to use
ggplot(). When a package author is kind enough to provide an extractor function you can do anything you want with the data.
prevalence.msm() function extracts both the observed and forecast prevalence matrices.
prev <- prevalence.msm(model_1)
This not very elegant, but straightforward code reshapes the data and plots.
# reshape observed prevalence do1 <-as_tibble(row.names(prev$Observed)) %>% rename(time = value) %>% mutate(time = as.numeric(time)) do2 <-as_tibble(prev$Observed) %>% mutate(type = "observed") do <- cbind(do1,do2) %>% select(-Total) do_l <- do %>% gather(state, number, -time, -type) # reshape expected prevalence de1 <-as_tibble(row.names(prev$Expected)) %>% rename(time = value) %>% mutate(time = as.numeric(time)) de2 <-as_tibble(prev$Expected) %>% mutate(type = "expected") de <- cbind(de1,de2) %>% select(-Total) de_l <- de %>% gather(state, number, -time, -type) # bind into a single data frame prev_l <-rbind(do_l,de_l) %>% mutate(type = factor(type), state = factor(state), time = round(time,3)) # plot for comparison prev_gp <- prev_l %>% group_by(state) pp <- prev_l %>% ggplot() + geom_line(aes(time, number, color = type)) + xlab("time") + ylab("") + ggtitle("") pp + facet_wrap(~state)
The agreement of the observed and forecast prevalence for states 1 through 3 look pretty good. After about 8 years the observed deaths are notably higher than the forecast. As Jackson points out (See the msm Manual page 33), this kind of discrepancy could indicate that the underlying process is not homogeneous. I have attempted to capture this non-homogeneity by having some of the transitions depend on time. And, although the plot above looks a little better that than the plot in the manual, which does not attempt to model non-homogeneity, it is apparent that there is room to find a better model!
Survival Curves and Calculations
Now, we can jump straight to the major result and look at the fitted survival curves. There is a
plot() method for
msm that will directly plot these curves. However, just to emphasize that if a package author is kind enough to provide a
plot method, it will probably not be too difficult to hack the code for the method to use an alternative plotting system. To save space, I will not show my code, but you can easily recreate it by stating with the
plot.msm() function, deleting the plotting parts and returning the values for time and the states “Health, Mild_CAV, and Severe_CAV which are used int the code below. Check my hack by running
# plot_prep was obtained from plot.msm() res <- plot_prep(model_1) time <- res[] Health <- res[] Mild_CAV <- res[] Severe_CAV <- res[] df_w <- tibble(time,Health, Mild_CAV, Severe_CAV) df_l <- df_w %>% gather("state", "survival", -time) p <- df_l %>% ggplot(aes(time, 1 - survival, group = state)) + geom_line(aes(color = state)) + xlab("Years") + ylab("Probability") + ggtitle("Fitted Survival Probabilities") p
These curves indicate that a treatment that could prevent CAV or at least delay progression from mild CAV to severe CAV might prolong survival. Additionally, the Markov structure of the model permits extracting information that relates to disease progression and the total time spent in each state.
totlos.msm() estimates the total expected time that a patient will spend in each state. Parameter settings for this function include:
start = c(1,0,0,0) specifies that patients will start in state 1 with probability 0.
fromt = 0 indicates starting at the beginning of the process.
covariates = “mean” indicates that the covariates will be set to their mean values for the calculation.
total_state_time <-totlos.msm(model_1,start = c(1,0,0,0), from = 0, covariates = "mean") total_state_time ## State 1 State 2 State 3 State 4 ## 7.002 2.473 1.621 Inf
The table indicates that the mean time a patient can expect to avoid CAV is about 7 years. After progressing to a Mild_CAV, a patient can expect five additional years.
A more direct calculation based on the intensity matrix, Q, give the expected time to the “absorbing” state, Death, from each of the “transient” living states.
time_to_death <- efpt.msm(model_1, tostate = 4, covariates = "mean") time_to_death ##  11.097 5.836 3.005 0.000
This agrees with the total state times calculated above.
sum(total_state_time[1:3]) ##  11.1
Within the scope of the information provided by the covariates, it is also possible to generate more individualized forecasts. For example, here is the expected time to death for a person starting off with no CAV at age 60, who received a heart from a 20 year old donor, 5 years after the transplant.
efpt.msm(model_1, tostate = 4, start = c(1,0,0,0), covariates = list(years = 5, b_age = 60, dage = 20)) ## [,1] ## [1,] 7.953
A related quantity, mean sojourn time, is the mean time that each visit to each state is expected to last. Since, we are assuming a progressive disease model where each patient visits each state only once, the estimate should be close to total time spent in each state. However, Jackson notes that in a progressive model, sojourn time in the disease state will be greater than the expected length of stay in the disease state because the mean sojourn time in a state is conditional on entering the state, whereas the expected total time in a diseased state is a forecast for an individual, who may die before getting the disease. (See help(totlos.msm)). And indeed, that is what we see here for states 2 and 3.
sojourn.msm(model_1, covariates="mean") ## estimates SE L U ## State 1 7.002 0.4024 6.256 7.837 ## State 2 3.525 0.3020 2.980 4.169 ## State 3 3.005 0.3748 2.353 3.837
model_1 will also product estimates of hazard ratios which show the estimate effect on transition intensities for each state.
Here is the table of Hazard Ratios:
model_1 ## ## Call: ## msm(formula = state ~ years, subject = PTNUM, data = df1, qmatrix = Q, obstype = obstype, covariates = covariates, deathexact = deathexact, center = center, method = method, control = control) ## ## Maximum likelihood estimates ## Baselines are with covariates set to 0 ## ## Transition intensities with hazard ratios for each covariate ## Baseline years ## State 1 - State 1 -0.032750 (-0.0607897,-0.017644) ## State 1 - State 2 0.030957 ( 0.0160470, 0.059721) 1.112 (1.061,1.166) ## State 1 - State 4 0.001793 ( 0.0004703, 0.006836) 1.093 (1.012,1.182) ## State 2 - State 2 -0.395633 (-0.6488723,-0.241227) ## State 2 - State 3 0.264310 ( 0.1488153, 0.469441) 1.000 ## State 2 - State 4 0.131323 ( 0.0330133, 0.522385) 1.000 ## State 3 - State 3 -0.434548 (-0.9113857,-0.207192) ## State 3 - State 4 0.434548 ( 0.2071918, 0.911386) 1.000 ## b_age dage ## State 1 - State 1 ## State 1 - State 2 1.001 (0.9884,1.014) 1.0281 (1.0159,1.040) ## State 1 - State 4 1.053 (1.0271,1.079) 1.0208 (1.0039,1.038) ## State 2 - State 2 ## State 2 - State 3 1.000 0.9932 (0.9756,1.011) ## State 2 - State 4 1.000 0.9757 (0.9298,1.024) ## State 3 - State 3 ## State 3 - State 4 1.000 0.9906 (0.9672,1.015) ## ## -2 * log-likelihood: 3466
The table shows that time, the covariate
years, affects disease progression represented by the the transition from state 1 to state 2, but has a smaller effect on the transition from state 1 to state 4.
b_age, the baseline age of patient at transplant time has a larger effect on dying before the onset of CAV than on the transition to CAV.
dage has a minor effect on the transitions from state 1 but apparently has no effect thereafter.
The hazard ratios are computed by calculating the exponential of the estimated covariate effects on the log-transition intensities for the Markov process which are stored in the model object.
To see how these work, first look at the baseline hazard ratios.
model_1$Qmatrices$baseline ## State 1 State 2 State 3 State 4 ## State 1 -0.03275 0.03096 0.0000 0.001793 ## State 2 0.00000 -0.39563 0.2643 0.131323 ## State 3 0.00000 0.00000 -0.4345 0.434548 ## State 4 0.00000 0.00000 0.0000 0.000000
These baseline hazard ratios are computed from the model intensity matrix, Q, assuming no covariates. They can also be directly extracted from the model by
qmatrix.msm() extractor function with covariates set to zero.
qmatrix.msm(model_1, covariates = 0) ## State 1 State 2 ## State 1 -0.032750 (-0.0607897,-0.017644) 0.030957 ( 0.0160470, 0.059721) ## State 2 0 -0.395633 (-0.6488723,-0.241227) ## State 3 0 0 ## State 4 0 0 ## State 3 State 4 ## State 1 0 0.001793 ( 0.0004703, 0.006836) ## State 2 0.264310 ( 0.1488153, 0.469441) 0.131323 ( 0.0330133, 0.522385) ## State 3 -0.434548 (-0.9113857,-0.207192) 0.434548 ( 0.2071918, 0.911386) ## State 4 0 0
The 95% confidence limits are computed by assuming normality of the log-effect.
A more representative value for the intensity matrix for this model can be obtained by setting the covariates to their expected mean values.
qmatrix.msm(model_1, covariates = "mean") ## State 1 State 2 ## State 1 -0.14281 (-0.15984,-0.12760) 0.10019 ( 0.08742, 0.11482) ## State 2 0 -0.28369 (-0.33556,-0.23984) ## State 3 0 0 ## State 4 0 0 ## State 3 State 4 ## State 1 0 0.04262 ( 0.03339, 0.05441) ## State 2 0.21821 ( 0.17636, 0.26999) 0.06548 ( 0.03872, 0.11075) ## State 3 -0.33281 (-0.42498,-0.26064) 0.33281 ( 0.26064, 0.42498) ## State 4 0 0
Next, we may want to examine the contribution of the covariate covariates to the hazard ratios. To take a particular example, look at the
dage to the hazard ratios
model_1$Qmatrices$dage ## State 1 State 2 State 3 State 4 ## State 1 0 0.02771 0.000000 0.020575 ## State 2 0 0.00000 -0.006783 -0.024625 ## State 3 0 0.00000 0.000000 -0.009439 ## State 4 0 0.00000 0.000000 0.000000
and focus on the contribution of
dage to the intensity matrix for the transition from state 3 to state 4 which is given as -0.009439 in the table above. Taking the exponential of this value, yields the hazard ratio for the
dage state 3 to 4 transition in the hazard ratio’s table we got by printing out
exp(-.009439) ##  0.9906
The hazard ratio tables for the remaining covariates are given by:
model_1$Qmatrices$years ## State 1 State 2 State 3 State 4 ## State 1 0 0.1064 0 0.08933 ## State 2 0 0.0000 0 0.00000 ## State 3 0 0.0000 0 0.00000 ## State 4 0 0.0000 0 0.00000
model_1$Qmatrices$b_age ## State 1 State 2 State 3 State 4 ## State 1 0 0.0009645 0 0.05152 ## State 2 0 0.0000000 0 0.00000 ## State 3 0 0.0000000 0 0.00000 ## State 4 0 0.0000000 0 0.00000
Exploring Transition Probabilities and Intensities
It is also possible to look at the state transition matrix at different times and see how these probabilities change over time. Here we compute the transition matrix at 1 year.
pmatrix.msm(model_1, t = 1, covariates = "mean") ## State 1 State 2 State 3 State 4 ## State 1 0.8669 0.08101 0.008493 0.04357 ## State 2 0.0000 0.75300 0.160340 0.08666 ## State 3 0.0000 0.00000 0.716903 0.28310 ## State 4 0.0000 0.00000 0.000000 1.00000
and at 5 years.
pmatrix.msm(model_1, t = 5, covariates = "mean") ## State 1 State 2 State 3 State 4 ## State 1 0.4897 0.1761 0.07871 0.2556 ## State 2 0.0000 0.2421 0.23419 0.5237 ## State 3 0.0000 0.0000 0.18937 0.8106 ## State 4 0.0000 0.0000 0.00000 1.0000
Additionally, it is possible to examine the effect of covariates on transition probabilities. Here are the 5 year transition probabilities for a patient with a baseline age of 35.
pmatrix.msm(model_1, t = 5, covariates = list(years = 5, b_age = 35, dage = 20)) ## State 1 State 2 State 3 State 4 ## State 1 0.5474 0.1674 0.07575 0.2095 ## State 2 0.0000 0.2112 0.21622 0.5726 ## State 3 0.0000 0.0000 0.16547 0.8345 ## State 4 0.0000 0.0000 0.00000 1.0000
and those who had the procedure at age 60.
pmatrix.msm(model_1, t = 5, covariates = list(years = 5, b_age = 60, dage = 20)) ## State 1 State 2 State 3 State 4 ## State 1 0.3863 0.1409 0.06784 0.4050 ## State 2 0.0000 0.2112 0.21622 0.5726 ## State 3 0.0000 0.0000 0.16547 0.8345 ## State 4 0.0000 0.0000 0.00000 1.0000
Note that the transitions from CAV states are unaffected.
To summarize: Continuous Time Markov Chains provide a natural framework for working with multi-state survival models. The
msm package is sufficiently sophisticated to permit modeling clinical process with level of fidelity that may provide insight about clinically observed disease progression. The software is relatively easy to use and there is plenty of documentation to help you get started.
Learning More About Multi-State Survival Models
To dive deeper into multi-state survival models, I am sure you will find Ardo van den Hout’ Multi-State Survival Models for Interval-Censored Data extraordinarily helpful. There are many good textbooks about the basics of Continuous Time Markov Chains. I recommend J.R.Norris’ - Markov Chains which is still modestly priced. There are also many expositions freely available on the internet including:
David F. Anderson - Chapter 6: Continuous Time Markov Chains from Lecture Notes on Stochastic Processes with Applications in Biology
Miranda Holmes-Cerfon - Lecture 4: Continuous-time Markov Chains
Søren Feodor Nielsen - Continuous-time homogeneous Markov chains
Karl Sigman - Continuous-Time Markov Chains
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.