Featured

Recent developments in joint modelling of longitudinal and survival data: dynamic predictions

by Guest Blogger Ipek Guler

Following previous posts on longitudinal analysis with time-to-event data, I would like to resume recent developments in joint modelling approaches which have gained a remarkable attention in the literature over recent years.

Joint modelling approaches to the analysis of longitudinal and time-to-event data are used to handle the association between longitudinal biomarkers and time-to-event data on follow-up studies. Previous research on joint modelling has mostly concentrated on single longitudinal and single time-to-event data. This methodological research has a wide range of biomedical applications and statistical software package facilities. There are also several extensions to joint modelling approaches such as the use of flexible longitudinal profiles using multiplicative random effects (Ding and Wang, 2008), alternatives to the common parametric assumptions for the random effects distribution (Brown and Ibrahim, 2003), and handling multiple failure times (Elashoff et al, 2008). For nice overviews on the topic, read Tsiatis and Davidian (2004) and Gould et al (2015). Beside these developments currently the interest lies on multiple longitudinal and time-to-event data (you can find a nice overview of multivariate joint modelling in Hickey et al. (2016)).

In this post, I will focus on an interesting feature of joint modelling approaches linked to an increasing interest in medical research towards personalized medicine.  Decision making based on the characteristics of individual patients optimizes the medical care and it is hoped that patients who are informed about their individual health risk will adjust their lifestyles according to their illness. That information often includes survival probabilities and predictions for future biomarker levels which joint models provide.

Specifically, subject-specific predictions for longitudinal and survival outcomes can be derived from the joint model (Rizopoulos 2011 & 2012). Those predictions have a dynamic nature coming from the effect of repeated measures taken in time t to the survival up to time t. This allows updating the prediction when we have new information recorded for the patient.

Rizopoulos (2011) uses a Bayesian formulation of the problem and Monte Carlo estimates of the conditional predictions with the MCMC sample from the posterior distribution of the parameters for the original data. The R package JMbayes calculates these subject-specific predictions for the survival and longitudinal outcomes using  the functions survfitJM() and predict(), respectively. As an illustration, the following functions can be utilized to derive predictions for a specific patient from the Mayo Clinic Primary Biliary Cirrhosis Data (PBC) dataset using a joint model.

library("JMbayes")
library("lattice")
data("pbc2")
data("pbc2.id")

pbc2$status2 <- as.numeric(pbc2$status != "alive")
pbc2.id$status2 <- as.numeric(pbc2.id$status != "alive")

## First we fit the joint model for repeated serum bilirumin
## measurements and the risk of death (Fleming, T., Harrington, D., 1991)

lmeFit.pbc <-
  lme(log(serBilir) ~ ns(year, 2),
      data = pbc2,
      random = ~ ns(year, 2) | id)
coxFit.pbc <-
  coxph(Surv(years, status2) ~ drug * age, data = pbc2.id, x = TRUE)
jointFit.pbc <-
  jointModelBayes(lmeFit.pbc, coxFit.pbc, timeVar = "year", n.iter = 30000)

## We extract the data of the patient 2 in a separate data frame
## for a specific dynamic predictions:

ND <- pbc2[pbc2$id == 2,]
write.table(ND,"pbc2.csv",sep=";",dec=",",row.names=FALSE)

sfit.pbc <- survfitJM(jointFit.pbc, newdata = ND)

#The plot()
#method for objects created by survfitJM() produces the figure
#of estimated conditional survival probabilities for Patient 2

plot(
  sfit.pbc,
  estimator = "mean",
  include.y = TRUE,
  conf.int = TRUE,
  fill.area = TRUE,
  col.area = "lightgrey"
)

## In a similar manner, predictions for the longitudinal outcome
## are calculated by the predict() function
## For example, predictions of future log serum bilirubin
## levels for Patient 2 are produced with the following code: 

Ps.pbc <- predict(
  jointFit.pbc,
  newdata = ND,
  type = "Subject",
  interval = "confidence",
  return = TRUE
)

## Plotting the dynamic predictions of the longitudinal measurements

last.time <- with(Ps.pbc, year[!is.na(low)][1])
xyplot(
  pred  + low + upp ~ year,
  data = Ps.pbc,
  type = "l",
  lty = c(1, 2, 2),
  col = c(2, 1, 1),
  abline = list(v = last.time, lty = 3),
  xlab = "Time (years)",
  ylab = "Predicted log(serum bilirubin)"
)

Furthermore, Rizopoulos (2014) presents a very useful tool for clinicians to present the results of joint models via a web interface using RStudio Shiny (see a previous post by Pilar on this here).  This is available in the demo folder of the package JMbayes and can be invoked with a call to the runDynPred() function. Several options are provided in the web interface such as predictions in case you have more than one model in the workspace based on different joint models, obtaining estimates at specific horizon times and extracting a dataset with the estimated conditional survival probabilities. Load your workspace and your new data (as described in the data tab just after you load your workspace), choose your model and select one of the interesting plots and representative tools. A detailed description of the options in this app is provided in the “Help” tab within the app.

Just try the code above and see!

References

  • Brown, E. R., Ibrahim, J. G. and Degruttola, V. (2005). A flexible B-spline model for multiple longitudinal biomarkers and survival. Biometrics 61, 6473.
  • Ding, J. and Wang, J.-L. (2008). Modeling longitudinal data with nonparametric multiplicative random effects jointly with survival data. Biometrics 64, 546 – 556.
  • Gould AL, Boye ME, Crowther MJ, Ibrahim JG, Quartey G, Micallef S, Bois FY.  (2015) Joint modeling of survival and longitudinal non-survival data: current methods and issues. Report of the DIA Bayesian joint modeling working group. Stat Med., 34, 2181–2195.
  • Hickey, G.L., Philipson, P., Jorgensen, A. et al. (2016) Joint modelling of time-to-event and multivariate longitudinal outcomes: recent developments and issues. BMC Medical Research Methodology. 16: 117.
  • Rizopoulos D (2011). Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-to-Event Data.  Biometrics, 67, 819–829.
  • Rizopoulos D (2012). Joint Models for Longitudinal and Time-to-Event Data, with Applications in R. Chapman & Hall/CRC, Boca Raton.
  • Tsiatis, A. and Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: An overview. Statistica Sinica 14, 809 – 834.
Advertisements
Featured

Sharing statistical analysis and results through web applications

I have to admit I am not completely comfortable with the RStudio IDE yet. However, RStudio projects are of great interest, and the new package Shiny – released last November- is no exception.

Shiny allows you to build web applications in R, so anyone might be able to access your analysis results through an interactive web browser interface. As they state in their website “…your users choose input parameters using friendly controls like sliders, drop-downs, and text fields. Easily incorporate any number of outputs like plots, tables, and summaries..”

The application is structured in two components: a user interface script named ui.R  and a server script – server.R. The first one deals with the input and output format, the second one contains the code to run the analysis.

I made a quick trial following the tutorial and my (very simple) web app was ready in no time. It is based on some of the examples of the SNPassoc vignette, an R package designed to perform genetic association studies. In this application, you can check the summary for a small set of SNPs and get both a plot showing the frequencies and the results of the association study for a given SNP.

By using Shiny, you can run your applications locally or share them with other users so they can run the applications in their own computers. There are several options to distribute your apps, you can check all of them here. In this case, the app can be accessed through the GitHub Git repository. Once the Shiny package is installed (along with the SNPassoc package used in this example) you just have to run the following code:

library(shiny)
shiny:: runGist('4e5e618431a59abe692b')

In my opinion this tool has great potential for sharing analysis results through an interactive and friendly interface.  It might replace files containing a lot of graphics and tables while saving time both to the data analyst and the end user. Have you tried it? Do you want to share your apps with us?