This presentation focuses on parametric quantile regression for settings where the dependent variable is measured an unequal number of times across subjects (unbalanced longitudinal data).
It is drawn from a paper that we plan to submit this month.
Divan A. Burger\(^{1,2}\), Sean van der Merwe\(^{2}\), Emmanuel Lesaffre\(^{3,4}\)
The ACTG 315 dataset, available in the ushr R package, includes longitudinal measurements of HIV viral load (log\(_{10}\) RNA copies/mL) over time. It features data on 46 patients, with the latest measurement recorded on Day 196 after baseline (Day 0).
For interest, note that the median \(\log_{10}\)RNA on Day 0 was 4.9395.
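As a rough sketch of how this number could be reproduced in R: the data frame name actg315 and the columns Patid and Day are taken from the mixed-model output shown later, but the RNA column name below is an assumption.

```r
# Sketch: actg315 is assumed to be a data frame with one row per measurement
length(unique(actg315$Patid))      # 46 patients
max(actg315$Day)                   # last measurement day (196)
baseline <- subset(actg315, Day == 0)
median(baseline$RNA)               # baseline median log10 RNA (quoted above); column name assumed
```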
We observe two kinds of censoring:
We address these in different ways:
We build up the model step by step.
A GLM revolves around two things:
\[g(E[Y])=X\boldsymbol\beta \]
For continuous data we might assume that \(y_i \sim N\left(\mu_i, \sigma^2\right)\),
and that \(\mu_i=\mathbf{x}_i\boldsymbol\beta\) with \(g(\mu)=\mu\).
For location-scale distributions such as the Gaussian:
\[y_i \sim N\left(\mu_i, \sigma^2\right) \Leftrightarrow y_i=\mu_i+\varepsilon_i,\ \varepsilon_i\sim i.i.d.\ N\left(0, \sigma^2\right)\]
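A tiny simulated R example of the Gaussian case with identity link; a Gaussian GLM and ordinary least squares give the same fit:

```r
set.seed(1)
x <- runif(100)
y <- 2 + 3 * x + rnorm(100, sd = 0.5)            # y_i = mu_i + eps_i with mu_i = 2 + 3 x_i
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))
fit_lm  <- lm(y ~ x)
cbind(glm = coef(fit_glm), lm = coef(fit_lm))    # identical coefficient estimates
```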
Random effects help address the fact that repeated measurements on the same patient are correlated, so the residuals are not independent.
A standard linear mixed model fit for the CD4 counts yields:
Linear mixed model fit by REML ['lmerMod']
Formula: CD4 ~ Day + (1 | Patid)
Data: actg315
REML criterion at convergence: 782.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.0818 -0.5632 -0.0200 0.5566 3.6203
Random effects:
Groups Name Variance Std.Dev.
Patid (Intercept) 0.5093 0.7137
Residual 0.3594 0.5995
Number of obs: 361, groups: Patid, 46
Fixed effects:
Estimate Std. Error t value
(Intercept) 2.2692038 0.1126205 20.149
Day 0.0066982 0.0006782 9.876
Correlation of Fixed Effects:
(Intr)
Day -0.216
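For reference, a call of the following shape would produce output like the above (a sketch; the data frame name and column names are taken from the output header):

```r
library(lme4)
# Random intercept per patient; REML estimation is the lmer default
fit_cd4 <- lmer(CD4 ~ Day + (1 | Patid), data = actg315)
summary(fit_cd4)
```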
\[\mu_{ij}\left(p_0,\mathbf{v}_i,\boldsymbol\beta\right) = \log_{10}\left( P_1^{\left(i\right)}\exp\left(-\lambda_1^{\left(i\right)}t_{ij}\right)+ P_2^{\left(i\right)}\exp\left(-\lambda_2^{\left(ij\right)}t_{ij}\right) \right)\]
where \(\mu_{ij}\left(p_0,\mathbf{v}_i,\boldsymbol\beta\right)\) is the \(p_0\) quantile under either SL- or SEP-distributed errors (to be defined soon).
For patient \(i\),
\[\begin{aligned} P_1^{\left(i\right)} &= \exp\left(\beta_1 + v_{i,1}\right), & \lambda_1^{\left(i\right)} &= \beta_2 + v_{i,2}, \\ P_2^{\left(i\right)} &= \exp\left(\beta_3 + v_{i,3}\right), & \lambda_2^{\left(ij\right)} &= \beta_4 + v_{i,4}+\gamma\mathrm{cd4}_{ij} \end{aligned}\]
The fixed effects vector \(\boldsymbol\beta=\left(\beta_1,\dots,\beta_4\right)^{\top}\) sets the population-level coefficients for the \(p_{0}^{th}\) quantile model, and the terms \(v_{i,k}\) capture patient-specific deviations from those coefficients.
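To make the structure concrete, here is a small R sketch that evaluates the quantile trajectory \(\mu_{ij}\) for one patient at illustrative, made-up parameter values (none of these numbers are estimates from the model):

```r
# Illustrative values only
beta  <- c(11.6, 0.34, 6.8, 0.01)   # beta_1, ..., beta_4
v_i   <- c(0, 0, 0, 0)              # patient-specific deviations (zero = "typical" patient)
gamma <- 0.005
cd4   <- 2.5                        # CD4 covariate; held constant here, time-varying in the model
t     <- 0:56                       # days since baseline

P1      <- exp(beta[1] + v_i[1])
lambda1 <- beta[2] + v_i[2]
P2      <- exp(beta[3] + v_i[3])
lambda2 <- beta[4] + v_i[4] + gamma * cd4

mu <- log10(P1 * exp(-lambda1 * t) + P2 * exp(-lambda2 * t))
plot(t, mu, type = "l", xlab = "Day", ylab = "log10 RNA quantile")
```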
When building models, ask yourself whether you are trying to model or to predict, and then ask yourself that question again.
If you think of your audience, what would they consider the typical value for something like income? Is it any of these?
What is the explanatory structure actually related to?
The simplest quantile regression is median regression. One way to do it is to just replace the normal distribution with a Laplace distribution (double exponential - one exponential tail pointing left and one pointing right).
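As a quick frequentist illustration (not the Bayesian machinery used later): maximising a Laplace likelihood for the errors is the same as minimising the sum of absolute residuals, which is what quantreg::rq does at tau = 0.5. A sketch on simulated, heavy-tailed data:

```r
library(quantreg)
set.seed(2)
x <- runif(200)
y <- 1 + 2 * x + rt(200, df = 2)          # heavy-tailed errors
fit_med <- rq(y ~ x, tau = 0.5)           # median regression
fit_ols <- lm(y ~ x)
rbind(median_reg = coef(fit_med), least_squares = coef(fit_ols))
```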
From the Laplace distribution the logical evolution is the Skew Laplace distribution.
We will look at the definition shortly, but let’s just visualise it in two plots first and discuss what we see:
Note that lower quantiles produce right skew and upper quantiles produce left skew. This makes sense when modelling a simple sample:
But it makes less sense when modelling residuals, where upper observations could easily have right skew residuals:
The dependent values are assumed to follow one of two quantile-parameterized distributions, conditional on explanatory factors that may include both fixed and random effects. The two distributions are the skew Laplace distribution and the skew exponential power distribution.
Let \(\mu\), \(\sigma\), and \(p_0\) specify the SL distribution, where \(\mu\) is set to correspond to the \(p_0^{th}\) quantile, \(\sigma > 0\) is the scale parameter, and \(p_0 \in \left(0,1\right)\) governs the skewness (Geraci and Bottai 2007).
The SL PDF is given by:
\[f_{\text{SL}} \left(y \left| \mu, \sigma, p_0 \right.\right) = \begin{cases} \displaystyle \frac{2 p_0 \left(1 - p_0\right)}{\sigma} \exp\left(\frac{2\left(y - \mu\right)\left(1 - p_0\right)}{\sigma}\right), & \text{if } y \leq \mu, \\ \displaystyle \frac{2 p_0 \left(1 - p_0\right)}{\sigma} \exp\left(-\frac{2 p_0 \left(y - \mu\right)}{\sigma}\right), & \text{if } y > \mu. \end{cases}\]
The corresponding CDF is:
\[F_{\text{SL}} \left(y \left| \mu, \sigma, p_0 \right.\right) = \begin{cases} \displaystyle p_0 \exp\left(\frac{2\left(y - \mu\right)\left(1 - p_0\right)}{\sigma}\right), & \text{if } y \leq \mu, \\ \displaystyle 1 - \left(1 - p_0\right) \exp\left(-\frac{2 p_0 \left(y - \mu\right)}{\sigma}\right), & \text{if } y > \mu. \end{cases}\]
The SL distribution has moderately heavy tails.
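These two formulas translate directly into R (a sketch; dsl and psl are ad hoc names, not functions from an existing package), which also makes it easy to check the quantile property and to visualise the skewness:

```r
# Skew Laplace PDF and CDF as defined above (ad hoc helper functions)
dsl <- function(y, mu = 0, sigma = 1, p0 = 0.5) {
  ifelse(y <= mu,
         2 * p0 * (1 - p0) / sigma * exp( 2 * (y - mu) * (1 - p0) / sigma),
         2 * p0 * (1 - p0) / sigma * exp(-2 * p0 * (y - mu) / sigma))
}
psl <- function(y, mu = 0, sigma = 1, p0 = 0.5) {
  ifelse(y <= mu,
         p0 * exp(2 * (y - mu) * (1 - p0) / sigma),
         1 - (1 - p0) * exp(-2 * p0 * (y - mu) / sigma))
}
# Sanity checks: the density integrates to 1 and mu sits at the p0 quantile
integrate(dsl, -Inf, Inf, p0 = 0.25)$value            # ~ 1
psl(0, p0 = 0.25)                                     # ~ 0.25
# Low quantiles give right skew, high quantiles left skew (as in the plots)
curve(dsl(x, p0 = 0.1), -4, 4, ylab = "density")
curve(dsl(x, p0 = 0.9), -4, 4, add = TRUE, lty = 2)
```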
To achieve greater flexibility in both skewness and tail thickness while retaining an analytical CDF, one can adopt the SEP distribution. Let \(\mu\) denote the \(p_0^{th}\) quantile (\(0<p_{0}<1\)), \(\sigma>0\) the scale, and \(\kappa_{1},\kappa_{2}>0\) the left- and right-tail shape parameters. Following Zhu and Zinde-Walsh (2009), the PDF is
\[f_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right) = \begin{cases} \frac{1}{\sigma} \exp\left(-\frac{1}{\kappa_1}\left(\frac{\mu - y}{2p_0\sigma K_1}\right)^{\kappa_1}\right), & y \leq \mu, \\ \frac{1}{\sigma} \exp\left(-\frac{1}{\kappa_2}\left(\frac{y - \mu}{2\left(1 - p_0\right)\sigma K_2}\right)^{\kappa_2}\right), & y > \mu, \end{cases}\]
where the normalizing constants are
\[K_j = \frac{\kappa_j^{-1/\kappa_j}}{2 \Gamma\left(1 + \frac{1}{\kappa_j}\right)}, \quad j=1,2.\]
The distribution satisfies:
\[\int_{-\infty}^{\mu}f_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right)\mathrm{d}y = p_0,\]
thus defining \(\mu\) as the \(p_0\) quantile.
A corrected CDF derivation appears in the paper’s supplementary material. It fixes a scaling error in the incomplete gamma term of the expression reported by Zhu and Zinde-Walsh (2009). The resulting CDF is
\[F_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right) = \begin{cases} p_0 \left(1 - G\left(\frac{1}{\kappa_1}\left(\frac{\mu - y}{2 p_0 \sigma K_1}\right)^{\kappa_1}, \frac{1}{\kappa_1}\right)\right), & y \leq \mu, \\ p_0 + \left(1 - p_0\right) G\left(\frac{1}{\kappa_2}\left(\frac{y - \mu}{2\left(1 - p_0\right) \sigma K_2}\right)^{\kappa_2}, \frac{1}{\kappa_2}\right), & y > \mu, \end{cases}\]
where \(G\left(a, b\right)\) is the regularized incomplete gamma function:
\[G\left(a, b\right) = \frac{1}{\Gamma\left(b\right)} \int_{0}^{a} t^{b-1} e^{-t} \mathrm{d}t.\]
The SEP distribution reduces to a normal distribution with mean \(\mu\) and standard deviation \(\sigma/\sqrt{2\pi}\) when \(p_0 = 0.5\) and \(\kappa_1 = \kappa_2 = 2\). When \(p_0 = 0.5\) and \(\kappa_1 = \kappa_2 = 1\), it becomes the symmetric Laplace distribution. As \(\left(\kappa_1, \kappa_2\right) \to \infty\), the SEP distribution approaches a uniform distribution.
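These expressions also translate directly into R (a sketch; Kfun, dsep, and psep are ad hoc names, not from a package), so the quantile property, the corrected CDF, and the normal special case can all be checked numerically:

```r
# SEP PDF and CDF as defined above (ad hoc helper functions)
Kfun <- function(kappa) kappa^(-1 / kappa) / (2 * gamma(1 + 1 / kappa))
dsep <- function(y, mu = 0, sigma = 1, k1 = 1, k2 = 1, p0 = 0.5) {
  K1 <- Kfun(k1); K2 <- Kfun(k2)
  # pmax keeps the unused branch of ifelse from producing NaN warnings
  ifelse(y <= mu,
         exp(-(pmax(mu - y, 0) / (2 * p0 * sigma * K1))^k1 / k1) / sigma,
         exp(-(pmax(y - mu, 0) / (2 * (1 - p0) * sigma * K2))^k2 / k2) / sigma)
}
psep <- function(y, mu = 0, sigma = 1, k1 = 1, k2 = 1, p0 = 0.5) {
  K1 <- Kfun(k1); K2 <- Kfun(k2)
  # pgamma(a, shape = b) with rate 1 is the regularized incomplete gamma function G(a, b)
  ifelse(y <= mu,
         p0 * (1 - pgamma((pmax(mu - y, 0) / (2 * p0 * sigma * K1))^k1 / k1, shape = 1 / k1)),
         p0 + (1 - p0) * pgamma((pmax(y - mu, 0) / (2 * (1 - p0) * sigma * K2))^k2 / k2, shape = 1 / k2))
}
# Numerical checks of the properties quoted above
integrate(dsep, -Inf, Inf, k1 = 0.7, k2 = 2, p0 = 0.3)$value   # ~ 1
psep(0, k1 = 0.7, k2 = 2, p0 = 0.3)                            # ~ 0.3: mu is the p0 quantile
integrate(dsep, -Inf, 1.5, k1 = 0.7, k2 = 2, p0 = 0.3)$value   # agrees with psep(1.5, ...)
psep(1.5, k1 = 0.7, k2 = 2, p0 = 0.3)
all.equal(dsep(0.3, k1 = 2, k2 = 2, p0 = 0.5),
          dnorm(0.3, sd = 1 / sqrt(2 * pi)))                   # normal special case
```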
Varying \(\kappa_2\) changes the shape of the right tail independently of \(p_0\).
It is not radically different from the established skew Laplace (SL) model, but it is far more flexible.
Let’s get practical…
model {
  # Cholesky-factor parameterisation of the random-effects precision matrix
  for (r in 1:4) {
    L_chol_v_rna[r, r] ~ dt(0, 0.5, 3)T(0, )            # half-t prior, truncated at zero
    for (c in 1:(r - 1)) {
      L_chol_v_rna[r, c] ~ dnorm(0, 0.001)              # vague prior on lower triangle
    }
    for (c in (r + 1):4) {
      L_chol_v_rna[r, c] <- 0                           # upper triangle fixed at zero
    }
  }
  for (r in 1:4) {
    for (c in 1:4) {
      inv_Sigma_v_rna[r, c] <- inprod(L_chol_v_rna[r, 1:4], L_chol_v_rna[c, 1:4])
    }
  }
  # Patient-level random coefficients, centred on the fixed effects
  for (i in 1:N) {
    v_rna[i, 1:4] ~ dmnorm(beta_rna[1:4], inv_Sigma_v_rna[, ])
  }
  for (i in 1:N_total) {
    log_P1[i] <- v_rna[id[i], 1]
    lambda1[i] <- v_rna[id[i], 2]
    log_P2[i] <- v_rna[id[i], 3]
    lambda2[i] <- v_rna[id[i], 4] + gamma_rna*cd4[i]
    log(P1[i]) <- log_P1[i]
    log(P2[i]) <- log_P2[i]
    # Biexponential decay on the log10 scale: the p0 quantile of log10 RNA
    mu[i] <- log(P1[i]*exp(-lambda1[i]*time[i]) + P2[i]*exp(-lambda2[i]*time[i]))/log(10)
    delta[i] <- step(y[i] - mu[i])                      # 1 if y[i] >= mu[i], else 0
    # Skew Laplace density, split into its two branches
    pdf_left[i] <- (2*p0*(1 - p0)/sigma_rna_resid)*
      exp( 2*(y[i] - mu[i])*(1 - p0)/sigma_rna_resid)*
      (1 - delta[i])
    pdf_right[i] <- (2*p0*(1 - p0)/sigma_rna_resid)*
      exp(-2*p0*(y[i] - mu[i])/sigma_rna_resid)*
      delta[i]
    pdf[i] <- pdf_left[i] + pdf_right[i]
    # Skew Laplace CDF, used for the censored (below detection limit) records
    cdf_left[i] <- p0*exp(2*(y[i] - mu[i])*(1 - p0)/sigma_rna_resid)
    cdf_right[i] <- 1 - (1 - p0)*exp(-2*p0*(y[i] - mu[i])/sigma_rna_resid)
    cdf[i] <- cdf_left[i]*(1 - delta[i]) + cdf_right[i]*delta[i]
    # censor[i] = 1: observed value (density); censor[i] = 0: censored value (CDF)
    like[i] <- pow(pdf[i], censor[i])*pow(cdf[i], 1 - censor[i])
    # Ones trick: scale by a large constant C (supplied as data) so that p[i] < 1
    p[i] <- like[i]/C
    ones[i] ~ dbern(p[i])
  }
  for (j in 1:4) {
    beta_rna[j] ~ dnorm(0, 0.001)
  }
  gamma_rna ~ dnorm(0, 0.001)
  sigma_rna_resid ~ dt(0, 0.5, 3)T(0, )
}
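A sketch of how this model could be run from R with rjags; the file name, the constant C, the chain settings, and the iteration counts below are illustrative choices, not the settings used in the paper:

```r
library(rjags)

# Data list matching the names used in the model block above; y, time, cd4, id and
# censor are assumed to already exist as vectors in the R session
jags_data <- list(
  y = y, time = time, cd4 = cd4, id = id, censor = censor,
  N = length(unique(id)), N_total = length(y),
  p0 = 0.5,                      # quantile being modelled
  ones = rep(1, length(y)),      # "ones trick" response
  C = 10000                      # large constant so that like[i]/C stays below 1
)

# "skl_model.txt" is a placeholder file name containing the model block above
mod <- jags.model("skl_model.txt", data = jags_data, n.chains = 3)
update(mod, 5000)                # burn-in
fit <- coda.samples(mod, c("beta_rna", "gamma_rna", "sigma_rna_resid"), n.iter = 20000)
summary(fit)
```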
model {
  # Cholesky-factor parameterisation of the random-effects precision matrix
  for (r in 1:4) {
    L_chol_v_rna[r, r] ~ dt(0, 0.5, 3)T(0, )            # half-t prior, truncated at zero
    for (c in 1:(r - 1)) {
      L_chol_v_rna[r, c] ~ dnorm(0, 0.001)              # vague prior on lower triangle
    }
    for (c in (r + 1):4) {
      L_chol_v_rna[r, c] <- 0                           # upper triangle fixed at zero
    }
  }
  for (r in 1:4) {
    for (c in 1:4) {
      inv_Sigma_v_rna[r, c] <- inprod(L_chol_v_rna[r, 1:4], L_chol_v_rna[c, 1:4])
    }
  }
  # Patient-level random coefficients, centred on the fixed effects
  for (i in 1:N) {
    v_rna[i, 1:4] ~ dmnorm(beta_rna[1:4], inv_Sigma_v_rna[, ])
  }
  # Normalising constants K_1 and K_2 of the SEP distribution
  K_1 <- pow(kappa_left, -1/kappa_left)/(2*exp(loggam(1 + 1/kappa_left)))
  K_2 <- pow(kappa_right, -1/kappa_right)/(2*exp(loggam(1 + 1/kappa_right)))
  for (i in 1:N_total) {
    log_P1[i] <- v_rna[id[i], 1]
    lambda1[i] <- v_rna[id[i], 2]
    log_P2[i] <- v_rna[id[i], 3]
    lambda2[i] <- v_rna[id[i], 4] + gamma_rna*cd4[i]
    log(P1[i]) <- log_P1[i]
    log(P2[i]) <- log_P2[i]
    # Biexponential decay on the log10 scale: the p0 quantile of log10 RNA
    mu[i] <- log(P1[i]*exp(-lambda1[i]*time[i]) + P2[i]*exp(-lambda2[i]*time[i]))/log(10)
    delta[i] <- step(y[i] - mu[i])                      # 1 if y[i] >= mu[i], else 0
    left_term[i] <- pow((mu[i] - y[i])/(2*p0*sigma_rna_resid*K_1), kappa_left)
    right_term[i] <- pow((y[i] - mu[i])/(2*(1 - p0)*sigma_rna_resid*K_2), kappa_right)
    # SEP density, split into its two branches
    pdf_left[i] <- (1/sigma_rna_resid)*exp(-(1/kappa_left)*left_term[i])*(1 - delta[i])
    pdf_right[i] <- (1/sigma_rna_resid)*exp(-(1/kappa_right)*right_term[i])*delta[i]
    pdf[i] <- pdf_left[i] + pdf_right[i]
    # SEP CDF via the regularized incomplete gamma function (pgamma with rate 1)
    a_left[i] <- (1/kappa_left)*left_term[i]
    a_right[i] <- (1/kappa_right)*right_term[i]
    G_left[i] <- pgamma(a_left[i], 1/kappa_left, 1)
    G_right[i] <- pgamma(a_right[i], 1/kappa_right, 1)
    cdf_left[i] <- p0*(1 - G_left[i])
    cdf_right[i] <- p0 + (1 - p0)*G_right[i]
    cdf[i] <- cdf_left[i]*(1 - delta[i]) + cdf_right[i]*delta[i]
    # censor[i] = 1: observed value (density); censor[i] = 0: censored value (CDF)
    like[i] <- pow(pdf[i], censor[i])*pow(cdf[i], 1 - censor[i])
    # Ones trick: scale by a large constant C (supplied as data) so that p[i] < 1
    p[i] <- like[i]/C
    ones[i] ~ dbern(p[i])
  }
  for (j in 1:4) {
    beta_rna[j] ~ dnorm(0, 0.001)
  }
  gamma_rna ~ dnorm(0, 0.001)
  sigma_rna_resid ~ dt(0, 0.5, 3)T(0, )
  kappa_left ~ dt(0, 0.5, 3)T(0, )
  kappa_right ~ dt(0, 0.5, 3)T(0, )
}
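The same R driver sketched after the SKL block applies here: only the model file changes, and kappa_left and kappa_right would be added to the list of monitored parameters.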
\(p_{0}\) | \(\beta_{1}\) | \(\beta_{2}\) | \(\beta_{3}\) | \(\beta_{4}\) | \(\gamma\) | \(\sigma\) | \(\kappa_{1}\) | \(\kappa_{2}\) |
---|---|---|---|---|---|---|---|---|
0.05 | 10.96 | 0.33 | 5.88 | -0.01 | 0.01 | 0.52 | 0.35 | 1.43 |
0.25 | 11.28 | 0.33 | 6.32 | 0.00 | 0.00 | 0.33 | 0.57 | 0.83 |
0.50 | 11.63 | 0.34 | 6.82 | 0.01 | 0.00 | 0.33 | 0.91 | 0.62 |
0.75 | 12.02 | 0.35 | 7.24 | 0.01 | 0.00 | 0.48 | 1.69 | 0.49 |
0.95 | 12.38 | 0.36 | 7.49 | 0.00 | 0.00 | 0.68 | 2.98 | 0.35 |
It looks like the patients who were less sick were more affected by their CD4 counts.
Is it possible that people who feel healthier are less diligent in taking their medication?
Marginal log-likelihood (larger is better)
Here we consider Subject 8. Thanks to the use of random effects we obtain personalised predictions:
A simulation study was performed under a few scenarios similar to the data set being studied.
This allows us to confirm that the model fitting works as intended.
A model that can’t fit data meeting its assumptions is definitely not going to fit real data.
| Cen. | \(p_0\) | \(\kappa_1\) | \(\kappa_2\) | Param. | TRUE | SKL.Bias | SKL.RMSE | SKL.Len | SKL.CP | SEP.Bias | SEP.RMSE | SEP.Len | SEP.CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.1230 | 0.1610 | 0.4641 | 0.840 | 0.0254 | 0.1054 | 0.4430 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0031 | 0.1821 | 0.7487 | 0.933 | 0.0037 | 0.1808 | 0.7371 | 0.940 |
| 0.05 | 0.5 | 1 | 1 | \(\beta_{1}\) | 5.00 | -0.0075 | 0.0939 | 0.4220 | 0.960 | -0.0077 | 0.0937 | 0.4315 | 0.967 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0124 | 0.1865 | 0.7301 | 0.953 | 0.0124 | 0.1863 | 0.7312 | 0.957 |
| 0.05 | 0.5 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.1089 | 0.1514 | 0.4592 | 0.857 | -0.0110 | 0.1040 | 0.4425 | 0.973 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0149 | 0.1797 | 0.7525 | 0.957 | 0.0132 | 0.1786 | 0.7485 | 0.953 |
| 0.05 | 0.8 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.0829 | 0.1400 | 0.4586 | 0.910 | 0.0252 | 0.1054 | 0.4344 | 0.957 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0056 | 0.1631 | 0.7247 | 0.963 | 0.0052 | 0.1618 | 0.7176 | 0.953 |
| 0.05 | 0.8 | 1 | 1 | \(\beta_{1}\) | 5.00 | 0.0014 | 0.0971 | 0.4132 | 0.943 | 0.0150 | 0.0989 | 0.4199 | 0.943 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0136 | 0.1881 | 0.7340 | 0.927 | 0.0137 | 0.1879 | 0.7339 | 0.930 |
| 0.05 | 0.8 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | 0.0113 | 0.1025 | 0.4875 | 0.977 | 0.0055 | 0.0994 | 0.4650 | 0.983 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0061 | 0.1930 | 0.7459 | 0.960 | -0.0063 | 0.1911 | 0.7432 | 0.960 |
| Cen. | \(p_0\) | \(\kappa_1\) | \(\kappa_2\) | Param. | TRUE | SKL.Bias | SKL.RMSE | SKL.Len | SKL.CP | SEP.Bias | SEP.RMSE | SEP.Len | SEP.CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.11 | 0.5 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.1040 | 0.1486 | 0.4757 | 0.897 | 0.0120 | 0.1042 | 0.4543 | 0.970 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0155 | 0.1872 | 0.7650 | 0.937 | -0.0113 | 0.1848 | 0.7478 | 0.937 |
| 0.11 | 0.5 | 1 | 1 | \(\beta_{1}\) | 5.00 | -0.0013 | 0.0917 | 0.4238 | 0.977 | -0.0007 | 0.0944 | 0.4322 | 0.980 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0089 | 0.1896 | 0.7357 | 0.933 | 0.0086 | 0.1894 | 0.7354 | 0.930 |
| 0.11 | 0.5 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.1245 | 0.1602 | 0.4706 | 0.847 | -0.0296 | 0.1018 | 0.4504 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0024 | 0.1858 | 0.7352 | 0.937 | 0.0024 | 0.1831 | 0.7321 | 0.937 |
| 0.11 | 0.8 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.0784 | 0.1302 | 0.4642 | 0.937 | 0.0294 | 0.1036 | 0.4435 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0106 | 0.1808 | 0.7327 | 0.930 | -0.0072 | 0.1802 | 0.7252 | 0.937 |
| 0.11 | 0.8 | 1 | 1 | \(\beta_{1}\) | 5.00 | 0.0059 | 0.0918 | 0.4144 | 0.970 | 0.0185 | 0.0949 | 0.4211 | 0.960 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0087 | 0.1708 | 0.7259 | 0.933 | -0.0087 | 0.1708 | 0.7256 | 0.933 |
| 0.11 | 0.8 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.0111 | 0.1009 | 0.4805 | 0.977 | -0.0110 | 0.0987 | 0.4626 | 0.973 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0111 | 0.1756 | 0.7615 | 0.957 | 0.0118 | 0.1735 | 0.7613 | 0.957 |
This presentation was created in the Reveal.js format with Quarto, using the RStudio IDE. Fonts and line colours follow UFS branding, and the background image was composited in GIMP from images generated with CoPilot (DALL-E 3).
2025/08/15 - SEP