In this study you are interested in how long people are going to keep taking the medication and how consistently they are taking the medication. Further, you know that these two behaviours interact, and you want to know what the effect of an intervention is on the joint behaviour.
This presentation is drawn from our paper that is under revision/review for Statistical Methods in Medical Research, following minor corrections.
Divan A. Burger\(^{1,2,3}\), Sean van der Merwe\(^{3}\), Janet van Niekerk\(^{2,4}\), Emmanuel Lesaffre\(^{5}\), Antoine Pironet\(^{6}\)
\[f\left(y;\alpha_1,\alpha_2\right)=\alpha_1\alpha_2y^{\alpha_1-1}\left(1-y^{\alpha_1}\right)^{\alpha_2-1},\] where \(0<y<1\) and \(\alpha_1,\alpha_2>0\) are shape parameters (Mitnik and Baek 2013).
An alternative parameterization of the Kumaraswamy distribution involves expressing \(\alpha_1\) and \(\alpha_2\) in terms of the \(q^\text{th}\) quantile location parameter \(\kappa_q\), defined such that \(F^{-1}\left(q\right)=\kappa_q\), where \(F^{-1}\) is the inverse of the CDF (Burger et al. 2023).
In this parameterization, \(\alpha_1\) and \(\alpha_2\) are given by \(\alpha_1=\frac{\log\left(1-\left(1-q\right)^\frac{1}{\alpha_2}\right)}{\log\left(\kappa_q\right)}\) and \(\alpha_2=\frac{\log\left(1-q\right)}{\log\left(1-e^{-\psi}\right)}\), with \(\psi>0\) as the precision parameter.
Suppose that \(y_{ij}\) is the continuous proportions outcome for patient \(i=1,\ldots,I\) and time point \(j=1,\ldots,J_i\). The regression model of the \(q^\text{th}\) conditional quantile of \(y_{ij}\) is written as \(\kappa_{q_{ij}} = g^{-1}(\eta_i\left(t_{j}\right)),\)
where \(\eta_{q,i}\left(t_{ij}\right) = \left(\beta_0 + b_{0_i}\right) + \left(\beta_{\text{time}} + b_{\text{time}_i} + \mathbf{z}_i^{\prime} {\boldsymbol\beta}_{\text{tx}}\right) t_{ij} + \mathbf{x}_i^{\prime} {\boldsymbol\beta}_{\text{cov}}\)
is the linear predictor corresponding to subject \(i\) at time point \(t_{ij}\).
\(g\) is the logit link function; \(\beta_0\) is the fixed intercept; \(\beta_{\text{time}}\) is the fixed coefficient for the effect of time; \({\boldsymbol\beta}_{\text{cov}}\) is a vector of fixed coefficients corresponding to the covariates in \(\mathbf{x}_i\); \(\mathbf{z}_i\) is a vector of covariates for subject \(i\) that interact with time, and \({\boldsymbol\beta}_{\text{tx}}\) is a vector of coefficients for these interactions; \(b_{0_i}\) and \(b_{\text{time}_i}\) are the random intercept and slope for subject \(i\).
In the INLA framework, the baseline hazard \(h_0\left(t\right)\) is discretized into intervals to allow flexibility in modelling the hazard over time. Specifically, for each interval \(k = 1, \ldots, K\), the hazard is assumed to be constant and is defined in terms of the RW2 process:
\[\delta_k = 2\delta_{k-1} - \delta_{k-2} + \epsilon_k,\] with \(\epsilon_k \sim N(0, \sigma^2)\). This formulation ensures a smoothing effect, reflecting the change from \(\delta_{k-1}\) to \(\delta_k\) based on the previous change (from \(\delta_{k-2}\) to \(\delta_{k-1}\)). Then \[h_0\left(t\right) = \exp(\delta_k) \quad \text{for } t \in (s_{k-1}, s_k],\] where \(\delta_k\) represents the log-baseline hazard at time \(t\) within the interval \(k\), and \((s_{k-1}, s_k]\) denotes the time interval.
The survival function for patient \(i\), denoted as \(S_i\left(t\right)\), is defined as the probability of survival beyond time \(t\), expressed as the exponential of the negative cumulative hazard function: \[S_i\left(t\right) = \exp\left(-H_i\left(t\right)\right),\] where \(H_i\left(t\right)\) is the cumulative hazard function for patient \(i\). In the context of the RW2 model, the cumulative hazard is calculated as: \[H_i\left(t\right) = \sum_{k=1}^{K\left(t\right)} \left[\int_{s_{k-1}}^{\min\left(t, s_k\right)}\exp\left(\delta_k + \phi \eta_i\left(u\right)+\mathbf{w}_i^{'}{\boldsymbol\gamma}\right)du\right],\] where \(K\left(t\right)\) is the number of intervals up to time \(t\).
Works in two steps:
The second step is optional for many cases, but useful when making predictions. If you just want to analyse the parameters (e.g. coefficients) then the first step is enough.
If plots appear badly sized please refresh (F5).
In the plots to follow each line has two intervals: a narrower interval representing the average subject, and a wider interval representing a random future subject.
This boring plot is just a check that the modelled hazards are indeed proportional. If the estimate lines are parallel on the log scale then all is well, if not then we messed up the modelling. Parallel lines do not imply that the true hazards are proportional.
This plot is far more interesting as it shows the modelled persistence.
If a future subject comes along the KM assumes that they have the same properties as the people in the study, but it is well known that study subjects differ from the broader population.
Please ask your presenter to show you an interactive demo of the app now.
Parameter | Value | Bias | Coverage |
---|---|---|---|
\(\beta_0\) | 1.5 | -0.056 | 0.933 |
\(\beta_{\text{time}}\) | -0.4 | 0.010 | 0.933 |
\(\beta_{\text{bin}}\) | 0.4 | -0.023 | 0.957 |
\(\beta_{\text{cont}}\) | 0.5 | -0.012 | 0.937 |
\(\psi\) | 0.3 | 0.007 | 0.917 |
\(\sigma_{b_0}\) | 0.2 | 0.276 | 0.920 |
\(\sigma_{b_1}\) | 0.2 | 0.267 | 0.970 |
\(\rho_{{b_0},{b_1}}\) | 0.2 | -0.211 | 0.957 |
\(\phi\) | -0.4 | -0.010 | 0.940 |
This presentation was created using the Reveal.js format in Quarto, using the RStudio IDE. Font and line colours according to UFS branding, and background image by Copilot (DALL-E_v3) using image editor GIMP.
2024/08/16 - Joint Model