Recent developments in joint modelling of longitudinal and survival data: dynamic predictions

by Guest Blogger Ipek Guler

Following previous posts on longitudinal analysis with time-to-event data, I would like to summarize recent developments in joint modelling approaches, which have gained remarkable attention in the literature over recent years.

Joint modelling approaches to the analysis of longitudinal and time-to-event data are used to handle the association between longitudinal biomarkers and time-to-event outcomes in follow-up studies. Previous research on joint modelling has mostly concentrated on a single longitudinal outcome and a single time-to-event outcome. This methodological research has a wide range of biomedical applications and is well supported by statistical software packages. There are also several extensions to joint modelling approaches, such as the use of flexible longitudinal profiles using multiplicative random effects (Ding and Wang, 2008), alternatives to the common parametric assumptions for the random effects distribution (Brown and Ibrahim, 2003), and handling multiple failure times (Elashoff et al, 2008). For nice overviews of the topic, read Tsiatis and Davidian (2004) and Gould et al (2015). Beyond these developments, current interest lies in multiple longitudinal and time-to-event data (you can find a nice overview of multivariate joint modelling in Hickey et al. (2016)).

In this post, I will focus on an interesting feature of joint modelling approaches linked to an increasing interest in medical research towards personalized medicine. Decision making based on the characteristics of individual patients optimizes medical care, and it is hoped that patients who are informed about their individual health risk will adjust their lifestyles according to their illness. That information often includes survival probabilities and predictions of future biomarker levels, both of which joint models provide.

Specifically, subject-specific predictions for longitudinal and survival outcomes can be derived from the joint model (Rizopoulos 2011 & 2012). These predictions are dynamic: they condition on the repeated measurements recorded up to time t, which also imply that the patient has survived up to time t. The prediction can therefore be updated whenever new information is recorded for the patient.
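To make the target of these predictions concrete, the conditional survival probability can be written as below. This is only a sketch in the notation of Rizopoulos (2011); the symbols are not defined in this post and are introduced here purely for illustration.

```latex
% Probability that patient i survives up to a future time u,
% given survival up to t and the measurements recorded so far
\pi_i(u \mid t) \;=\;
  \Pr\bigl( T_i^* \ge u \;\bigm|\; T_i^* > t,\ \mathcal{Y}_i(t),\ \mathcal{D}_n \bigr),
  \qquad u > t,
% where
%   T_i^*              : true event time of patient i
%   \mathcal{Y}_i(t)   : \{ y_i(s),\ 0 \le s \le t \}, longitudinal measurements up to t
%   \mathcal{D}_n      : the sample on which the joint model was fitted
```

When a new measurement arrives at a later time t' > t, the same quantity is recomputed as \pi_i(u \mid t'), which is what makes the prediction dynamic.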

Rizopoulos (2011) uses a Bayesian formulation of the problem and obtains Monte Carlo estimates of the conditional predictions using the MCMC sample from the posterior distribution of the parameters given the original data. The R package JMbayes calculates these subject-specific predictions for the survival and longitudinal outcomes using the functions survfitJM() and predict(), respectively. As an illustration, the following code derives predictions for a specific patient from the Mayo Clinic Primary Biliary Cirrhosis (PBC) dataset using a joint model.
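The Monte Carlo scheme can be sketched roughly as follows. The notation is again borrowed from Rizopoulos (2011) and is an assumption of this sketch, not something defined in the post: \theta denotes the model parameters, b_i the random effects of patient i, and S_i the survival function implied by the joint model.

```latex
% Posterior predictive form of the conditional survival probability
\Pr\bigl( T_i^* \ge u \mid T_i^* > t,\ \mathcal{Y}_i(t),\ \mathcal{D}_n \bigr)
  = \int \Pr\bigl( T_i^* \ge u \mid T_i^* > t,\ \mathcal{Y}_i(t);\ \theta \bigr)\,
         p(\theta \mid \mathcal{D}_n)\, d\theta

% Monte Carlo approximation over L draws:
%   \theta^{(l)} taken from the MCMC sample of p(\theta \mid \mathcal{D}_n)
%   b_i^{(l)}    drawn from p(b_i \mid T_i^* > t,\ \mathcal{Y}_i(t),\ \theta^{(l)})
  \approx \frac{1}{L} \sum_{l=1}^{L}
     \frac{ S_i\bigl( u \mid b_i^{(l)}, \theta^{(l)} \bigr) }
          { S_i\bigl( t \mid b_i^{(l)}, \theta^{(l)} \bigr) }
```

Averaging the simulated ratios gives a point estimate, and their empirical quantiles give an interval. This is presumably what the mean estimator and the confidence band summarise in the plot produced further down.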


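## Load JMbayes (provides the pbc2 and pbc2.id data, jointModelBayes()
## and survfitJM()), plus splines for ns() and lattice for xyplot()
library(JMbayes)
library(splines)
library(lattice)

## Event indicator: 1 when status is anything other than "alive"
pbc2$status2 <- as.numeric(pbc2$status != "alive")
pbc2.id$status2 <- as.numeric(pbc2.id$status != "alive")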

## First we fit the joint model for repeated serum bilirubin
## measurements and the risk of death (Fleming, T., Harrington, D., 1991)

lmeFit.pbc <-
  lme(log(serBilir) ~ ns(year, 2),
      data = pbc2,
      random = ~ ns(year, 2) | id)
coxFit.pbc <-
  coxph(Surv(years, status2) ~ drug * age, data = pbc2.id, x = TRUE)
jointFit.pbc <-
  jointModelBayes(lmeFit.pbc, coxFit.pbc, timeVar = "year", n.iter = 30000)

## We extract the data of Patient 2 into a separate data frame
## to compute dynamic predictions for this specific patient:

ND <- pbc2[pbc2$id == 2, ]

sfit.pbc <- survfitJM(jointFit.pbc, newdata = ND)

## The plot() method for objects created by survfitJM() produces the figure
## of estimated conditional survival probabilities for Patient 2

plot(
  sfit.pbc,
  estimator = "mean",
  include.y = TRUE,
  conf.int = TRUE,
  fill.area = TRUE,
  col.area = "lightgrey"
)

## In a similar manner, predictions for the longitudinal outcome
## are calculated by the predict() function
## For example, predictions of future log serum bilirubin
## levels for Patient 2 are produced with the following code: 

Ps.pbc <- predict(
  jointFit.pbc,
  newdata = ND,
  type = "Subject",
  interval = "confidence",
  returnData = TRUE
)

## Plotting the dynamic predictions of the longitudinal measurements

## Last time point at which Patient 2 was observed (marked by a dotted line)
last.time <- with(Ps.pbc, year[!is.na(low)][1])
xyplot(
  pred + low + upp ~ year,
  data = Ps.pbc,
  type = "l",
  lty = c(1, 2, 2),
  col = c(2, 1, 1),
  abline = list(v = last.time, lty = 3),
  xlab = "Time (years)",
  ylab = "Predicted log(serum bilirubin)"
)

Furthermore, Rizopoulos (2014) presents a very useful tool for clinicians to present the results of joint models via a web interface built with RStudio Shiny (see a previous post by Pilar on this here). The app is available in the demo folder of the JMbayes package and can be launched with a call to the runDynPred() function. The web interface offers several options, such as choosing among different joint models when more than one is in the workspace, obtaining estimates at specific horizon times, and extracting a dataset with the estimated conditional survival probabilities. Load your workspace and your new data (as described in the data tab just after you load your workspace), choose your model, and select one of the available plots and summary tools. A detailed description of the options in this app is provided in the “Help” tab within the app.

Just try the code above and see!


  • Brown, E. R., Ibrahim, J. G. and Degruttola, V. (2005). A flexible B-spline model for multiple longitudinal biomarkers and survival. Biometrics 61, 64–73.
  • Ding, J. and Wang, J.-L. (2008). Modeling longitudinal data with nonparametric multiplicative random effects jointly with survival data. Biometrics 64, 546–556.
  • Gould AL, Boye ME, Crowther MJ, Ibrahim JG, Quartey G, Micallef S, Bois FY. (2015) Joint modeling of survival and longitudinal non-survival data: current methods and issues. Report of the DIA Bayesian joint modeling working group. Stat Med., 34, 2181–2195.
  • Hickey, G.L., Philipson, P., Jorgensen, A. et al. (2016) Joint modelling of time-to-event and multivariate longitudinal outcomes: recent developments and issues. BMC Medical Research Methodology. 16: 117.
  • Rizopoulos D (2011). Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-to-Event Data. Biometrics, 67, 819–829.
  • Rizopoulos D (2012). Joint Models for Longitudinal and Time-to-Event Data, with Applications in R. Chapman & Hall/CRC, Boca Raton.
  • Tsiatis, A. and Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: An overview. Statistica Sinica 14, 809–834.

Reflections from “Dance your PhD”

by Guest Blogger Ipek Guler

In 2015, Ipek Guler submitted a video to the competition “Dance your PhD”, sponsored by Science/AAAS, which each year encourages researchers to represent their work in the form of a dance.

You can check out her video below:

and read her reflections on the experience below:

How did you hear about the competition and why did you decide to enter?

I heard about the competition in John Bohannon’s TED talk, where he gives a brilliant example of how to turn a presentation into a dance with a professional contemporary dance company. He talks about how lasers cool down matter. Amazing! I think I had already decided to do it right at the beginning of the talk. As I perform contemporary dance as a semi-professional dancer, the idea was perfect for me.

Where did you find the inspiration to translate biostatistical concepts into dance?

There are some inspiring sentences on the Dance Your PhD website: “‘So, what’s your Ph.D. research about?’ You take a deep breath and launch into the explanation. People’s eyes begin to glaze over… At times like these, don’t you wish you could just turn to the nearest computer and show people an online video of your Ph.D. thesis interpreted in dance form?”

So this was my starting point. I was very excited to finally be able to explain my research to my friends, parents and relatives. The other good point was that I used it to introduce different concepts and feelings into my dance performances; this is something contemporary dance lets you do that other forms of academic dissemination do not allow.

How long did it take you to finish the video?

For a long time I had already been trying to summarise my PhD research for people who have no idea about statistics. The process of translating it into dance helped me a lot with my later presentations and with my thesis.

The choreography took a few months to create in my mind. The next step was the rehearsal with my dance group. It was the fastest and easiest part of the process and took only a few days, because we had been creating, dancing and improvising together for years. Finally we shot the video in one day and had lots of fun (we also added some of these moments at the end of the video :)).

Would you recommend it to other PhDs in Biostatistics?

I definitely recommend it to other researchers in Biostatistics who have just finished their PhD or are still PhD students. First of all, you will be able to summarise your principal aim and the most important points of your PhD research. Then you’ll have a great product to show whenever you struggle to explain your work to people who have no idea what you are doing. Especially in biostatistics, people sometimes don’t understand what you’re really doing, so this gives you a brilliant option. Believe me, it works!

You can watch other Dance your PhD videos on mathematics here and here, and some biomedical ones too here and here.