A quantile regression model for bounded longitudinal data and survival data

Divan A. Burger, Sean van der Merwe, Janet van Niekerk, Emmanuel Lesaffre, Antoine Pironet

Introduction

In this study you are interested in how long people are going to keep taking the medication and how consistently they are taking the medication. Further, you know that these two behaviours interact, and you want to know what the effect of an intervention is on the joint behaviour.

Source and authors

This presentation is drawn from our paper that is under revision/review for Statistical Methods in Medical Research, following minor corrections.

Divan A. Burger\(^{1,2,3}\), Sean van der Merwe\(^{3}\), Janet van Niekerk\(^{2,4}\), Emmanuel Lesaffre\(^{5}\), Antoine Pironet\(^{6}\)

  1. Syneos Health, Bloemfontein, Free State, South Africa
  2. Department of Statistics, University of Pretoria, Pretoria, South Africa
  3. Department of Mathematical Statistics and Actuarial Science, University of the Free State, Bloemfontein, South Africa
  4. CEMSE Division, 4700 King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
  5. I-BioStat, KU Leuven, Leuven, Belgium
  6. AARDEX Group, 15/1, Rue Bois St Jean, 4102, Liège, Belgium

Outline

  • The problem
  • The proposed model
  • The model fitting
  • The results
  • Making the results practical for a future person

Abstract

  • This study introduces a novel joint modelling framework
    • integrating quantile regression for longitudinal continuous proportions data
    • with Cox regression for time-to-event analysis
  • Employing integrated nested Laplace approximation (INLA) for Bayesian inference
  • Particularly (but not exclusively) in the context of medication adherence and persistence

Abstract (continued)

  • Applying this model to a dataset of patients who underwent treatment with atorvastatin, we see the
  • Impact of targeted interventions on improving medication adherence and persistence
  • We have developed a dynamic prediction method
  • The simulation study validates the reliability of our modelling approach

The problem

Adherence to a full course of medication

  • Vrijens et al. (2006) evaluated the influence of a pharmaceutical care program on the adherence and persistence of patients to a prescribed regimen of once-daily atorvastatin, a medication for managing cholesterol levels and mitigating the risk of cardiovascular diseases.
  • The study divided participants into two cohorts:
    • a control group that received standard care and
    • an intervention group engaged in a pharmaceutical care program.
  • The study required scheduled follow-up visits over 12 months based on the dosing frequency prescribed.

The variable types

  • Patients’ adherence is a continuous proportion outcome, tracked monthly and reflecting compliance in the last month.
    • This adherence data is derived from electronically compiled dosing histories.
  • Persistence is a time-to-event outcome that measures the treatment regimen’s duration from initiation to discontinuation.
  • The initial day of the study was recorded as the day the patients first opened their pill containers, initiating the baseline period that lasted until the second visit to the pharmacy. The duration of this baseline period varied among individuals.

Interactions and covariates

  • People who are taking the medication (adherence) daily seldom go cold-turkey,
  • but people who barely remember to take a pill in a week have no problem going all the way to zero.
  • This relationship should be part of the model for how long they will keep taking the pills (persistence).
  • People who receive the intervention, explaining/reminding them of the importance of taking the pills, will hopefully be more likely to both adhere and persist, although not necessarily to the same extent.
  • There may be other measurable factors that influence adherence and persistence (covariates).

The proposed model

The Kumaraswamy distribution

\[f\left(y;\alpha_1,\alpha_2\right)=\alpha_1\alpha_2y^{\alpha_1-1}\left(1-y^{\alpha_1}\right)^{\alpha_2-1},\] where \(0<y<1\) and \(\alpha_1,\alpha_2>0\) are shape parameters (Mitnik and Baek 2013).

An alternative parameterization of the Kumaraswamy distribution involves expressing \(\alpha_1\) and \(\alpha_2\) in terms of the \(q^\text{th}\) quantile location parameter \(\kappa_q\), defined such that \(F^{-1}\left(q\right)=\kappa_q\), where \(F^{-1}\) is the inverse of the CDF (Burger et al. 2023).

In this parameterization, \(\alpha_1\) and \(\alpha_2\) are given by \(\alpha_1=\frac{\log\left(1-\left(1-q\right)^\frac{1}{\alpha_2}\right)}{\log\left(\kappa_q\right)}\) and \(\alpha_2=\frac{\log\left(1-q\right)}{\log\left(1-e^{-\psi}\right)}\), with \(\psi>0\) as the precision parameter.

Continuous proportions quantile regression

Suppose that \(y_{ij}\) is the continuous proportions outcome for patient \(i=1,\ldots,I\) and time point \(j=1,\ldots,J_i\). The regression model of the \(q^\text{th}\) conditional quantile of \(y_{ij}\) is written as \(\kappa_{q_{ij}} = g^{-1}(\eta_i\left(t_{j}\right)),\)

where \(\eta_{q,i}\left(t_{ij}\right) = \left(\beta_0 + b_{0_i}\right) + \left(\beta_{\text{time}} + b_{\text{time}_i} + \mathbf{z}_i^{\prime} {\boldsymbol\beta}_{\text{tx}}\right) t_{ij} + \mathbf{x}_i^{\prime} {\boldsymbol\beta}_{\text{cov}}\)

is the linear predictor corresponding to subject \(i\) at time point \(t_{ij}\).

\(g\) is the logit link function; \(\beta_0\) is the fixed intercept; \(\beta_{\text{time}}\) is the fixed coefficient for the effect of time; \({\boldsymbol\beta}_{\text{cov}}\) is a vector of fixed coefficients corresponding to the covariates in \(\mathbf{x}_i\); \(\mathbf{z}_i\) is a vector of covariates for subject \(i\) that interact with time, and \({\boldsymbol\beta}_{\text{tx}}\) is a vector of coefficients for these interactions; \(b_{0_i}\) and \(b_{\text{time}_i}\) are the random intercept and slope for subject \(i\).

Time-to-event submodel

  • The time-to-event submodel is formulated using the Cox proportional hazards model to analyse the time until an event of interest occurs, considering both the event time and censoring.
  • The hazard function for patient \(i\) at time \(t\) in the Cox model is given by: \[h_i\left(t\right) = h_0\left(t\right)\exp\left(\phi\eta_i\left(t\right) + \mathbf{w}_i^{'}{\boldsymbol\gamma}\right),\] where \(h_i\left(t\right)\) denotes the hazard function for patient \(i\) at time \(t\); \(h_0\left(t\right)\) is the baseline hazard function; \(\eta_i\left(t\right)\) represents the linear predictor derived from the longitudinal submodel for patient \(i\) at time \(t\); and \(\phi\) measures the impact of the longitudinal process on the event risk; \(\mathbf{w}_i\) is a design matrix for patient \(i\) containing additional covariates; and \(\boldsymbol\gamma\) are coefficients.

The baseline hazard

In the INLA framework, the baseline hazard \(h_0\left(t\right)\) is discretized into intervals to allow flexibility in modelling the hazard over time. Specifically, for each interval \(k = 1, \ldots, K\), the hazard is assumed to be constant and is defined in terms of the RW2 process:

\[\delta_k = 2\delta_{k-1} - \delta_{k-2} + \epsilon_k,\] with \(\epsilon_k \sim N(0, \sigma^2)\). This formulation ensures a smoothing effect, reflecting the change from \(\delta_{k-1}\) to \(\delta_k\) based on the previous change (from \(\delta_{k-2}\) to \(\delta_{k-1}\)). Then \[h_0\left(t\right) = \exp(\delta_k) \quad \text{for } t \in (s_{k-1}, s_k],\] where \(\delta_k\) represents the log-baseline hazard at time \(t\) within the interval \(k\), and \((s_{k-1}, s_k]\) denotes the time interval.

The survival function

The survival function for patient \(i\), denoted as \(S_i\left(t\right)\), is defined as the probability of survival beyond time \(t\), expressed as the exponential of the negative cumulative hazard function: \[S_i\left(t\right) = \exp\left(-H_i\left(t\right)\right),\] where \(H_i\left(t\right)\) is the cumulative hazard function for patient \(i\). In the context of the RW2 model, the cumulative hazard is calculated as: \[H_i\left(t\right) = \sum_{k=1}^{K\left(t\right)} \left[\int_{s_{k-1}}^{\min\left(t, s_k\right)}\exp\left(\delta_k + \phi \eta_i\left(u\right)+\mathbf{w}_i^{'}{\boldsymbol\gamma}\right)du\right],\] where \(K\left(t\right)\) is the number of intervals up to time \(t\).

The model fitting

The INLA tool

  • We employ the INLA methodology (Rue, Martino, and Chopin 2009) to estimate the posterior distribution of the model parameters.
  • We generate samples from the approximated posterior of the fitted model, enabling us to compute
    • the longitudinal profiles,
    • hazard functions, and
    • survival functions
    • alongside their respective 95% credible intervals.

The speed of INLA

  • It fits large and complex models in seconds.
  • The incredible speed of INLA facilitates the calculation of dynamic predictions,
    • permitting us to estimate future probabilities of persistence based on the latest available longitudinal data.
  • Only works for models built into the framework, but
    • The list of models it can handle is massive
    • It is particularly good at spatial models (the gold standard really)
    • And very good for generalised linear mixed effects models on large data sets

The INLA process

Works in two steps:

  1. Calculate numeric approximations to the marginal posterior distributions and the correlation structure of the parameters.
  2. Draw samples from the approximate joint posterior distribution using the above approximation.

The second step is optional for many cases, but useful when making predictions. If you just want to analyse the parameters (e.g. coefficients) then the first step is enough.

  • The approximation is not perfect, but darn close for typical problems and medium or larger samples.
    • For small samples and weird problems I will always go to Stan.

The results

If plots appear badly sized please refresh (F5).

Coefficients

Partial dependence plots

  • When reviewing the fit of a generalised model, the interpretation of parameter values can be complex
    • Generalised models here include everything from GLMs (like logistic regression) to deep neural networks
  • The relationship between \(y\) and \(\mathbf{x}\) is not straightforward, and changes depending on the values of \(\mathbf{x}\)
  • Thus, it is important to consider the changing relationship as \(\mathbf{x}\) changes
  • But \(\mathbf{x}\) can have many components and plots have limited dimensions
  • Solution: fix most components of \(\mathbf{x}\), say at average values, while varying only 1 or 2 at a time

Parameters of interest

  • Primary variables of interest are treatment and time
    • These have strong interactions
    • So these are the primary axes
  • In this study the averages are 61.1 for age, 11.4 for cardiac risk score, and 56.6 for cholesterol score
    • These are of secondary interest
    • So these are typically fixed at the averages
    • Or illustrated at select values

Longitudinal adherence treatment effect

In the plots to follow each line has two intervals: a narrower interval representing the average subject, and a wider interval representing a random future subject.

By age

Hazard function

This boring plot is just a check that the modelled hazards are indeed proportional. If the estimate lines are parallel on the log scale then all is well, if not then we messed up the modelling. Parallel lines do not imply that the true hazards are proportional.

Survival curves

This plot is far more interesting as it shows the modelled persistence.

Kaplan-Maier analysis

Raw KM plot

Issues with KM

  • Wide intervals
  • Unstable
  • Doesn’t predict future subject
  • Specific to sample properties

If a future subject comes along the KM assumes that they have the same properties as the people in the study, but it is well known that study subjects differ from the broader population.

Making the results practical for a future person

Web interface

  • The company that owns the data (typically the one who developed the medication) can host an app on their website
  • That would allow a doctor or pharmacist to type in a patient’s record anonymously
  • And get reasonable predictions in seconds
    • of how well the patient will take the pills (adherence) and
    • how long a patient will keep taking the pills at all (persistance)

Mockup

  • Here is a look at what the app interface could look like:

Interactive version

Please ask your presenter to show you an interactive demo of the app now.

  • I wrote a guide for fitting and visualisation of the model in R
    • And the app that goes with it.
  • These use simulated data that is similar to the real data.
  • The demo code can be downloaded from Janet’s Github Repository

Small simulation study

Summary of one scenario

Parameter Value Bias Coverage
\(\beta_0\) 1.5 -0.056 0.933
\(\beta_{\text{time}}\) -0.4 0.010 0.933
\(\beta_{\text{bin}}\) 0.4 -0.023 0.957
\(\beta_{\text{cont}}\) 0.5 -0.012 0.937
\(\psi\) 0.3 0.007 0.917
\(\sigma_{b_0}\) 0.2 0.276 0.920
\(\sigma_{b_1}\) 0.2 0.267 0.970
\(\rho_{{b_0},{b_1}}\) 0.2 -0.211 0.957
\(\phi\) -0.4 -0.010 0.940

Conclusion

Summary

  • We saw that it is both possible and sensible to model the joint processes together.
  • The data types and processes might be very different, but they are connected, so statistical creativity and ingenuity is needed.
  • As mathematically intricate as this model may seem, it is ultimately built on an understanding of the real world data generating process, and driven by the need to help address a real world problem.

This presentation was created using the Reveal.js format in Quarto, using the RStudio IDE. Font and line colours according to UFS branding, and background image by Copilot (DALL-E_v3) using image editor GIMP.

References

Burger, D. A., S. van der Merwe, E. Lesaffre, P. C. le Roux, and M. J. Raath-Krüger. 2023. A robust mixed-effects parametric quantile regression model for continuous proportions: Quantifying the constraints to vitality in cushion plants.” Statistica Neerlandica 77 (4): 444–70.
Mitnik, P. A., and S. Baek. 2013. The Kumaraswamy distribution: median-dispersion re-parameterizations for regression modeling and simulation-based estimation.” Statistical Papers 54 (1): 177–92.
Rue, H., S. Martino, and N. Chopin. 2009. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.” Journal of the Royal Statistical Society: Series B 71 (2): 319–92.
Vrijens, B., A. Belmans, K. Matthys, de Klerk, E., and E. Lesaffre. 2006. Effect of intervention through a pharmaceutical care program on patient adherence with prescribed once-daily atorvastatin.” Pharmacoepidemiology and Drug Safety 15 (2): 115–21.