This presentation focuses on parametric quantile regression for settings where the dependent variable is measured an unequal number of times across subjects (unbalanced longitudinal data).
It is drawn from a paper that we plan to submit this month.
Divan A. Burger\(^{1,2}\), Sean van der Merwe\(^{2}\), Emmanuel Lesaffre\(^{3,4}\)
The ACTG 315 dataset, available in the ushr R package, includes longitudinal measurements of HIV viral load (log\(_{10}\) RNA copies/mL) over time. It features data on 46 patients, with the latest measurement recorded on Day 196 after baseline (Day 0).
For interest, note that the median \(\log_{10}\)RNA on Day 0 was 4.9395.
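As a rough sketch of how this number could be reproduced in R: the data frame name actg315 and the columns Patid and Day are taken from the mixed-model output shown later, but the RNA column name below is an assumption.

```r
# Sketch: actg315 is assumed to be a data frame with one row per measurement
length(unique(actg315$Patid))      # 46 patients
max(actg315$Day)                   # last measurement day (196)
baseline <- subset(actg315, Day == 0)
median(baseline$RNA)               # baseline median log10 RNA (quoted above); column name assumed
```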
We observe two kinds of censoring:
We address these in different ways:
We build up the model step by step.
A GLM revolves around two things:
\[g(E[Y])=X\boldsymbol\beta \]
For continuous data we might assume that \(y_i \sim N\left(\mu_i, \sigma^2\right)\),
and that \(\mu_i=\mathbf{x}_i\boldsymbol\beta\) with \(g(\mu)=\mu\).
For location-scale distributions such as the Gaussian:
\[y_i \sim N\left(\mu_i, \sigma^2\right) \Leftrightarrow y_i=\mu_i+\varepsilon_i,\ \varepsilon_i\sim i.i.d.\ N\left(0, \sigma^2\right)\]
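A tiny simulated R example of the Gaussian case with identity link; a Gaussian GLM and ordinary least squares give the same fit:

```r
set.seed(1)
x <- runif(100)
y <- 2 + 3 * x + rnorm(100, sd = 0.5)            # y_i = mu_i + eps_i with mu_i = 2 + 3 x_i
fit_glm <- glm(y ~ x, family = gaussian(link = "identity"))
fit_lm  <- lm(y ~ x)
cbind(glm = coef(fit_glm), lm = coef(fit_lm))    # identical coefficient estimates
```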
Random effects help address the fact that repeated measurements on the same patient are correlated, so the residuals are not independent.
A standard linear mixed model fit for the CD4 counts yields:
Linear mixed model fit by REML ['lmerMod']
Formula: CD4 ~ Day + (1 | Patid)
Data: actg315
REML criterion at convergence: 782.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.0818 -0.5632 -0.0200 0.5566 3.6203
Random effects:
Groups Name Variance Std.Dev.
Patid (Intercept) 0.5093 0.7137
Residual 0.3594 0.5995
Number of obs: 361, groups: Patid, 46
Fixed effects:
Estimate Std. Error t value
(Intercept) 2.2692038 0.1126205 20.149
Day 0.0066982 0.0006782 9.876
Correlation of Fixed Effects:
(Intr)
Day -0.216
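For reference, a call of the following shape would produce output like the above (a sketch; the data frame name and column names are taken from the output header):

```r
library(lme4)
# Random intercept per patient; REML estimation is the lmer default
fit_cd4 <- lmer(CD4 ~ Day + (1 | Patid), data = actg315)
summary(fit_cd4)
```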
\[\mu_{ij}\left(p_0,\mathbf{v}_i,\boldsymbol\beta\right) = \log_{10}\left( P_1^{\left(i\right)}\exp\left(-\lambda_1^{\left(i\right)}t_{ij}\right)+ P_2^{\left(i\right)}\exp\left(-\lambda_2^{\left(ij\right)}t_{ij}\right) \right)\]
where \(\mu_{ij}\left(p_0,\mathbf{v}_i,\boldsymbol\beta\right)\) is the \(p_0\) quantile under either SL- or SEP-distributed errors (to be defined soon).
For patient \(i\),
\[\begin{aligned} P_1^{\left(i\right)} &= \exp\left(\beta_1 + v_{i,1}\right), & \lambda_1^{\left(i\right)} &= \beta_2 + v_{i,2}, \\ P_2^{\left(i\right)} &= \exp\left(\beta_3 + v_{i,3}\right), & \lambda_2^{\left(ij\right)} &= \beta_4 + v_{i,4}+\gamma\mathrm{cd4}_{ij} \end{aligned}\]
The fixed effects vector \(\boldsymbol\beta=\left(\beta_1,\dots,\beta_4\right)^{\top}\) sets the population-level coefficients for the \(p_{0}^{th}\) quantile model, and the terms \(v_{i,k}\) capture patient-specific deviations from those coefficients.
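To make the structure concrete, here is a small R sketch that evaluates the quantile trajectory \(\mu_{ij}\) for one patient at illustrative, made-up parameter values (none of these numbers are estimates from the model):

```r
# Illustrative values only
beta  <- c(11.6, 0.34, 6.8, 0.01)   # beta_1, ..., beta_4
v_i   <- c(0, 0, 0, 0)              # patient-specific deviations (zero = "typical" patient)
gamma <- 0.005
cd4   <- 2.5                        # CD4 covariate; held constant here, time-varying in the model
t     <- 0:56                       # days since baseline

P1      <- exp(beta[1] + v_i[1])
lambda1 <- beta[2] + v_i[2]
P2      <- exp(beta[3] + v_i[3])
lambda2 <- beta[4] + v_i[4] + gamma * cd4

mu <- log10(P1 * exp(-lambda1 * t) + P2 * exp(-lambda2 * t))
plot(t, mu, type = "l", xlab = "Day", ylab = "log10 RNA quantile")
```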
When building models, ask yourself whether you are trying to model or to predict, and then ask yourself that question again.
If you think of your audience, what would they consider the typical value for something like income? Is it any of these?
What is the explanatory structure actually related to?
The simplest quantile regression is median regression. One way to do it is to just replace the normal distribution with a Laplace distribution (double exponential - one exponential tail pointing left and one pointing right).
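As a quick frequentist illustration (not the Bayesian machinery used later): maximising a Laplace likelihood for the errors is the same as minimising the sum of absolute residuals, which is what quantreg::rq does at tau = 0.5. A sketch on simulated, heavy-tailed data:

```r
library(quantreg)
set.seed(2)
x <- runif(200)
y <- 1 + 2 * x + rt(200, df = 2)          # heavy-tailed errors
fit_med <- rq(y ~ x, tau = 0.5)           # median regression
fit_ols <- lm(y ~ x)
rbind(median_reg = coef(fit_med), least_squares = coef(fit_ols))
```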
From the Laplace distribution the logical evolution is the Skew Laplace distribution.
We will look at the definition shortly, but let’s just visualise it in two plots first and discuss what we see:
Note that lower quantiles produce right skew and upper quantiles produce left skew. This makes sense when modelling a simple sample:
But it makes less sense when modelling residuals, where upper observations could easily have right skew residuals:
The dependent values are assumed to follow one of two quantile-parameterized distributions, conditional on explanatory factors that may include both fixed and random effects. The two distributions are the skew Laplace distribution and the skew exponential power distribution.
Let \(\mu\), \(\sigma\), and \(p_0\) specify the SL distribution, where \(\mu\) is set to correspond to the \(p_0^{th}\) quantile, \(\sigma > 0\) is the scale parameter, and \(p_0 \in \left(0,1\right)\) governs the skewness (Geraci and Bottai 2007).
The SL PDF is given by:
\[f_{\text{SL}} \left(y \left| \mu, \sigma, p_0 \right.\right) = \begin{cases} \displaystyle \frac{2 p_0 \left(1 - p_0\right)}{\sigma} \exp\left(\frac{2\left(y - \mu\right)\left(1 - p_0\right)}{\sigma}\right), & \text{if } y \leq \mu, \\ \displaystyle \frac{2 p_0 \left(1 - p_0\right)}{\sigma} \exp\left(-\frac{2 p_0 \left(y - \mu\right)}{\sigma}\right), & \text{if } y > \mu. \end{cases}\]
The corresponding CDF is:
\[F_{\text{SL}} \left(y \left| \mu, \sigma, p_0 \right.\right) = \begin{cases} \displaystyle p_0 \exp\left(\frac{2\left(y - \mu\right)\left(1 - p_0\right)}{\sigma}\right), & \text{if } y \leq \mu, \\ \displaystyle 1 - \left(1 - p_0\right) \exp\left(-\frac{2 p_0 \left(y - \mu\right)}{\sigma}\right), & \text{if } y > \mu. \end{cases}\]
The SL distribution has moderately heavy tails.
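These two formulas translate directly into R (a sketch; dsl and psl are ad hoc names, not functions from an existing package), which also makes it easy to check the quantile property and to visualise the skewness:

```r
# Skew Laplace PDF and CDF as defined above (ad hoc helper functions)
dsl <- function(y, mu = 0, sigma = 1, p0 = 0.5) {
  ifelse(y <= mu,
         2 * p0 * (1 - p0) / sigma * exp( 2 * (y - mu) * (1 - p0) / sigma),
         2 * p0 * (1 - p0) / sigma * exp(-2 * p0 * (y - mu) / sigma))
}
psl <- function(y, mu = 0, sigma = 1, p0 = 0.5) {
  ifelse(y <= mu,
         p0 * exp(2 * (y - mu) * (1 - p0) / sigma),
         1 - (1 - p0) * exp(-2 * p0 * (y - mu) / sigma))
}
# Sanity checks: the density integrates to 1 and mu sits at the p0 quantile
integrate(dsl, -Inf, Inf, p0 = 0.25)$value            # ~ 1
psl(0, p0 = 0.25)                                     # ~ 0.25
# Low quantiles give right skew, high quantiles left skew (as in the plots)
curve(dsl(x, p0 = 0.1), -4, 4, ylab = "density")
curve(dsl(x, p0 = 0.9), -4, 4, add = TRUE, lty = 2)
```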
To achieve greater flexibility in both skewness and tail thickness while retaining an analytical CDF, one can adopt the SEP distribution. Let \(\mu\) denote the \(p_0^{th}\) quantile (\(0<p_{0}<1\)), \(\sigma>0\) the scale, and \(\kappa_{1},\kappa_{2}>0\) the left- and right-tail shape parameters. Following Zhu and Zinde-Walsh (2009), the PDF is
\[f_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right) = \begin{cases} \frac{1}{\sigma} \exp\left(-\frac{1}{\kappa_1}\left(\frac{\mu - y}{2p_0\sigma K_1}\right)^{\kappa_1}\right), & y \leq \mu, \\ \frac{1}{\sigma} \exp\left(-\frac{1}{\kappa_2}\left(\frac{y - \mu}{2\left(1 - p_0\right)\sigma K_2}\right)^{\kappa_2}\right), & y > \mu, \end{cases}\]
where the normalizing constants are
\[K_j = \frac{\kappa_j^{-1/\kappa_j}}{2 \Gamma\left(1 + \frac{1}{\kappa_j}\right)}, \quad j=1,2.\]
The distribution satisfies:
\[\int_{-\infty}^{\mu}f_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right)\mathrm{d}y = p_0,\]
thus defining \(\mu\) as the \(p_0\) quantile.
A corrected CDF derivation appears in the paper’s supplementary material. It fixes a scaling error in the incomplete gamma term of the expression reported by Zhu and Zinde-Walsh (2009). The resulting CDF is
\[F_{\text{SEP}}\left(y \left| \mu, \sigma, \kappa_1, \kappa_2, p_0 \right.\right) = \begin{cases} p_0 \left(1 - G\left(\frac{1}{\kappa_1}\left(\frac{\mu - y}{2 p_0 \sigma K_1}\right)^{\kappa_1}, \frac{1}{\kappa_1}\right)\right), & y \leq \mu, \\ p_0 + \left(1 - p_0\right) G\left(\frac{1}{\kappa_2}\left(\frac{y - \mu}{2\left(1 - p_0\right) \sigma K_2}\right)^{\kappa_2}, \frac{1}{\kappa_2}\right), & y > \mu, \end{cases}\]
where \(G\left(a, b\right)\) is the regularized incomplete gamma function:
\[G\left(a, b\right) = \frac{1}{\Gamma\left(b\right)} \int_{0}^{a} t^{b-1} e^{-t} \mathrm{d}t.\]
The SEP distribution reduces to a normal distribution with mean \(\mu\) and standard deviation \(\sigma/\sqrt{2\pi}\) when \(p_0 = 0.5\) and \(\kappa_1 = \kappa_2 = 2\). When \(p_0 = 0.5\) and \(\kappa_1 = \kappa_2 = 1\), it becomes the symmetric Laplace distribution. As \(\left(\kappa_1, \kappa_2\right) \to \infty\), the SEP distribution approaches a uniform distribution.
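These expressions also translate directly into R (a sketch; Kfun, dsep, and psep are ad hoc names, not from a package), so the quantile property, the corrected CDF, and the normal special case can all be checked numerically:

```r
# SEP PDF and CDF as defined above (ad hoc helper functions)
Kfun <- function(kappa) kappa^(-1 / kappa) / (2 * gamma(1 + 1 / kappa))
dsep <- function(y, mu = 0, sigma = 1, k1 = 1, k2 = 1, p0 = 0.5) {
  K1 <- Kfun(k1); K2 <- Kfun(k2)
  # pmax keeps the unused branch of ifelse from producing NaN warnings
  ifelse(y <= mu,
         exp(-(pmax(mu - y, 0) / (2 * p0 * sigma * K1))^k1 / k1) / sigma,
         exp(-(pmax(y - mu, 0) / (2 * (1 - p0) * sigma * K2))^k2 / k2) / sigma)
}
psep <- function(y, mu = 0, sigma = 1, k1 = 1, k2 = 1, p0 = 0.5) {
  K1 <- Kfun(k1); K2 <- Kfun(k2)
  # pgamma(a, shape = b) with rate 1 is the regularized incomplete gamma function G(a, b)
  ifelse(y <= mu,
         p0 * (1 - pgamma((pmax(mu - y, 0) / (2 * p0 * sigma * K1))^k1 / k1, shape = 1 / k1)),
         p0 + (1 - p0) * pgamma((pmax(y - mu, 0) / (2 * (1 - p0) * sigma * K2))^k2 / k2, shape = 1 / k2))
}
# Numerical checks of the properties quoted above
integrate(dsep, -Inf, Inf, k1 = 0.7, k2 = 2, p0 = 0.3)$value   # ~ 1
psep(0, k1 = 0.7, k2 = 2, p0 = 0.3)                            # ~ 0.3: mu is the p0 quantile
integrate(dsep, -Inf, 1.5, k1 = 0.7, k2 = 2, p0 = 0.3)$value   # agrees with psep(1.5, ...)
psep(1.5, k1 = 0.7, k2 = 2, p0 = 0.3)
all.equal(dsep(0.3, k1 = 2, k2 = 2, p0 = 0.5),
          dnorm(0.3, sd = 1 / sqrt(2 * pi)))                   # normal special case
```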
Varying \(\kappa_2\) changes the shape of the right tail independently of \(p_0\).
It is not radically different from the established skew Laplace (SL) model, but it is far more flexible.
Let’s get practical…
model {
  # Cholesky-factor parameterisation of the random-effects precision matrix
  for (r in 1:4) {
    L_chol_v_rna[r, r] ~ dt(0, 0.5, 3)T(0, )            # half-t prior, truncated at zero
    for (c in 1:(r - 1)) {
      L_chol_v_rna[r, c] ~ dnorm(0, 0.001)              # vague prior on lower triangle
    }
    for (c in (r + 1):4) {
      L_chol_v_rna[r, c] <- 0                           # upper triangle fixed at zero
    }
  }
  for (r in 1:4) {
    for (c in 1:4) {
      inv_Sigma_v_rna[r, c] <- inprod(L_chol_v_rna[r, 1:4], L_chol_v_rna[c, 1:4])
    }
  }
  # Patient-level random coefficients, centred on the fixed effects
  for (i in 1:N) {
    v_rna[i, 1:4] ~ dmnorm(beta_rna[1:4], inv_Sigma_v_rna[, ])
  }
  for (i in 1:N_total) {
    log_P1[i] <- v_rna[id[i], 1]
    lambda1[i] <- v_rna[id[i], 2]
    log_P2[i] <- v_rna[id[i], 3]
    lambda2[i] <- v_rna[id[i], 4] + gamma_rna*cd4[i]
    log(P1[i]) <- log_P1[i]
    log(P2[i]) <- log_P2[i]
    # Biexponential decay on the log10 scale: the p0 quantile of log10 RNA
    mu[i] <- log(P1[i]*exp(-lambda1[i]*time[i]) + P2[i]*exp(-lambda2[i]*time[i]))/log(10)
    delta[i] <- step(y[i] - mu[i])                      # 1 if y[i] >= mu[i], else 0
    # Skew Laplace density, split into its two branches
    pdf_left[i] <- (2*p0*(1 - p0)/sigma_rna_resid)*
      exp( 2*(y[i] - mu[i])*(1 - p0)/sigma_rna_resid)*
      (1 - delta[i])
    pdf_right[i] <- (2*p0*(1 - p0)/sigma_rna_resid)*
      exp(-2*p0*(y[i] - mu[i])/sigma_rna_resid)*
      delta[i]
    pdf[i] <- pdf_left[i] + pdf_right[i]
    # Skew Laplace CDF, used for the censored (below detection limit) records
    cdf_left[i] <- p0*exp(2*(y[i] - mu[i])*(1 - p0)/sigma_rna_resid)
    cdf_right[i] <- 1 - (1 - p0)*exp(-2*p0*(y[i] - mu[i])/sigma_rna_resid)
    cdf[i] <- cdf_left[i]*(1 - delta[i]) + cdf_right[i]*delta[i]
    # censor[i] = 1: observed value (density); censor[i] = 0: censored value (CDF)
    like[i] <- pow(pdf[i], censor[i])*pow(cdf[i], 1 - censor[i])
    # Ones trick: scale by a large constant C (supplied as data) so that p[i] < 1
    p[i] <- like[i]/C
    ones[i] ~ dbern(p[i])
  }
  for (j in 1:4) {
    beta_rna[j] ~ dnorm(0, 0.001)
  }
  gamma_rna ~ dnorm(0, 0.001)
  sigma_rna_resid ~ dt(0, 0.5, 3)T(0, )
}
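A sketch of how this model could be run from R with rjags; the file name, the constant C, the chain settings, and the iteration counts below are illustrative choices, not the settings used in the paper:

```r
library(rjags)

# Data list matching the names used in the model block above; y, time, cd4, id and
# censor are assumed to already exist as vectors in the R session
jags_data <- list(
  y = y, time = time, cd4 = cd4, id = id, censor = censor,
  N = length(unique(id)), N_total = length(y),
  p0 = 0.5,                      # quantile being modelled
  ones = rep(1, length(y)),      # "ones trick" response
  C = 10000                      # large constant so that like[i]/C stays below 1
)

# "skl_model.txt" is a placeholder file name containing the model block above
mod <- jags.model("skl_model.txt", data = jags_data, n.chains = 3)
update(mod, 5000)                # burn-in
fit <- coda.samples(mod, c("beta_rna", "gamma_rna", "sigma_rna_resid"), n.iter = 20000)
summary(fit)
```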
model {
  # Cholesky-factor parameterisation of the random-effects precision matrix
  for (r in 1:4) {
    L_chol_v_rna[r, r] ~ dt(0, 0.5, 3)T(0, )            # half-t prior, truncated at zero
    for (c in 1:(r - 1)) {
      L_chol_v_rna[r, c] ~ dnorm(0, 0.001)              # vague prior on lower triangle
    }
    for (c in (r + 1):4) {
      L_chol_v_rna[r, c] <- 0                           # upper triangle fixed at zero
    }
  }
  for (r in 1:4) {
    for (c in 1:4) {
      inv_Sigma_v_rna[r, c] <- inprod(L_chol_v_rna[r, 1:4], L_chol_v_rna[c, 1:4])
    }
  }
  # Patient-level random coefficients, centred on the fixed effects
  for (i in 1:N) {
    v_rna[i, 1:4] ~ dmnorm(beta_rna[1:4], inv_Sigma_v_rna[, ])
  }
  # Normalising constants K_1 and K_2 of the SEP distribution
  K_1 <- pow(kappa_left, -1/kappa_left)/(2*exp(loggam(1 + 1/kappa_left)))
  K_2 <- pow(kappa_right, -1/kappa_right)/(2*exp(loggam(1 + 1/kappa_right)))
  for (i in 1:N_total) {
    log_P1[i] <- v_rna[id[i], 1]
    lambda1[i] <- v_rna[id[i], 2]
    log_P2[i] <- v_rna[id[i], 3]
    lambda2[i] <- v_rna[id[i], 4] + gamma_rna*cd4[i]
    log(P1[i]) <- log_P1[i]
    log(P2[i]) <- log_P2[i]
    # Biexponential decay on the log10 scale: the p0 quantile of log10 RNA
    mu[i] <- log(P1[i]*exp(-lambda1[i]*time[i]) + P2[i]*exp(-lambda2[i]*time[i]))/log(10)
    delta[i] <- step(y[i] - mu[i])                      # 1 if y[i] >= mu[i], else 0
    left_term[i] <- pow((mu[i] - y[i])/(2*p0*sigma_rna_resid*K_1), kappa_left)
    right_term[i] <- pow((y[i] - mu[i])/(2*(1 - p0)*sigma_rna_resid*K_2), kappa_right)
    # SEP density, split into its two branches
    pdf_left[i] <- (1/sigma_rna_resid)*exp(-(1/kappa_left)*left_term[i])*(1 - delta[i])
    pdf_right[i] <- (1/sigma_rna_resid)*exp(-(1/kappa_right)*right_term[i])*delta[i]
    pdf[i] <- pdf_left[i] + pdf_right[i]
    # SEP CDF via the regularized incomplete gamma function (pgamma with rate 1)
    a_left[i] <- (1/kappa_left)*left_term[i]
    a_right[i] <- (1/kappa_right)*right_term[i]
    G_left[i] <- pgamma(a_left[i], 1/kappa_left, 1)
    G_right[i] <- pgamma(a_right[i], 1/kappa_right, 1)
    cdf_left[i] <- p0*(1 - G_left[i])
    cdf_right[i] <- p0 + (1 - p0)*G_right[i]
    cdf[i] <- cdf_left[i]*(1 - delta[i]) + cdf_right[i]*delta[i]
    # censor[i] = 1: observed value (density); censor[i] = 0: censored value (CDF)
    like[i] <- pow(pdf[i], censor[i])*pow(cdf[i], 1 - censor[i])
    # Ones trick: scale by a large constant C (supplied as data) so that p[i] < 1
    p[i] <- like[i]/C
    ones[i] ~ dbern(p[i])
  }
  for (j in 1:4) {
    beta_rna[j] ~ dnorm(0, 0.001)
  }
  gamma_rna ~ dnorm(0, 0.001)
  sigma_rna_resid ~ dt(0, 0.5, 3)T(0, )
  kappa_left ~ dt(0, 0.5, 3)T(0, )
  kappa_right ~ dt(0, 0.5, 3)T(0, )
}
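The same R driver sketched after the SKL block applies here: only the model file changes, and kappa_left and kappa_right would be added to the list of monitored parameters.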
\(p_{0}\) | \(\beta_{1}\) | \(\beta_{2}\) | \(\beta_{3}\) | \(\beta_{4}\) | \(\gamma\) | \(\sigma\) | \(\kappa_{1}\) | \(\kappa_{2}\) |
---|---|---|---|---|---|---|---|---|
0.05 | 10.96 | 0.33 | 5.88 | -0.01 | 0.01 | 0.52 | 0.35 | 1.43 |
0.25 | 11.28 | 0.33 | 6.32 | 0.00 | 0.00 | 0.33 | 0.57 | 0.83 |
0.50 | 11.63 | 0.34 | 6.82 | 0.01 | 0.00 | 0.33 | 0.91 | 0.62 |
0.75 | 12.02 | 0.35 | 7.24 | 0.01 | 0.00 | 0.48 | 1.69 | 0.49 |
0.95 | 12.38 | 0.36 | 7.49 | 0.00 | 0.00 | 0.68 | 2.98 | 0.35 |
It looks like the patients who were less sick were more affected by their CD4 counts.
Is it possible that people who feel healthier are less diligent in taking their medication?
Marginal log-likelihood (larger is better)
Here we consider Subject 8. Thanks to the use of random effects we obtain personalised predictions:
A simulation study was performed under a few scenarios similar to the data set being studied.
This allows us to confirm that the model fitting works as intended.
A model that can’t fit data meeting its assumptions is definitely not going to fit real data.
| Cen. | \(p_0\) | \(\kappa_1\) | \(\kappa_2\) | Param. | TRUE | SKL.Bias | SKL.RMSE | SKL.Len | SKL.CP | SEP.Bias | SEP.RMSE | SEP.Len | SEP.CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.5 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.1230 | 0.1610 | 0.4641 | 0.840 | 0.0254 | 0.1054 | 0.4430 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0031 | 0.1821 | 0.7487 | 0.933 | 0.0037 | 0.1808 | 0.7371 | 0.940 |
| 0.05 | 0.5 | 1 | 1 | \(\beta_{1}\) | 5.00 | -0.0075 | 0.0939 | 0.4220 | 0.960 | -0.0077 | 0.0937 | 0.4315 | 0.967 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0124 | 0.1865 | 0.7301 | 0.953 | 0.0124 | 0.1863 | 0.7312 | 0.957 |
| 0.05 | 0.5 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.1089 | 0.1514 | 0.4592 | 0.857 | -0.0110 | 0.1040 | 0.4425 | 0.973 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0149 | 0.1797 | 0.7525 | 0.957 | 0.0132 | 0.1786 | 0.7485 | 0.953 |
| 0.05 | 0.8 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.0829 | 0.1400 | 0.4586 | 0.910 | 0.0252 | 0.1054 | 0.4344 | 0.957 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0056 | 0.1631 | 0.7247 | 0.963 | 0.0052 | 0.1618 | 0.7176 | 0.953 |
| 0.05 | 0.8 | 1 | 1 | \(\beta_{1}\) | 5.00 | 0.0014 | 0.0971 | 0.4132 | 0.943 | 0.0150 | 0.0989 | 0.4199 | 0.943 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0136 | 0.1881 | 0.7340 | 0.927 | 0.0137 | 0.1879 | 0.7339 | 0.930 |
| 0.05 | 0.8 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | 0.0113 | 0.1025 | 0.4875 | 0.977 | 0.0055 | 0.0994 | 0.4650 | 0.983 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0061 | 0.1930 | 0.7459 | 0.960 | -0.0063 | 0.1911 | 0.7432 | 0.960 |
| Cen. | \(p_0\) | \(\kappa_1\) | \(\kappa_2\) | Param. | TRUE | SKL.Bias | SKL.RMSE | SKL.Len | SKL.CP | SEP.Bias | SEP.RMSE | SEP.Len | SEP.CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.11 | 0.5 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.1040 | 0.1486 | 0.4757 | 0.897 | 0.0120 | 0.1042 | 0.4543 | 0.970 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0155 | 0.1872 | 0.7650 | 0.937 | -0.0113 | 0.1848 | 0.7478 | 0.937 |
| 0.11 | 0.5 | 1 | 1 | \(\beta_{1}\) | 5.00 | -0.0013 | 0.0917 | 0.4238 | 0.977 | -0.0007 | 0.0944 | 0.4322 | 0.980 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0089 | 0.1896 | 0.7357 | 0.933 | 0.0086 | 0.1894 | 0.7354 | 0.930 |
| 0.11 | 0.5 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.1245 | 0.1602 | 0.4706 | 0.847 | -0.0296 | 0.1018 | 0.4504 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0024 | 0.1858 | 0.7352 | 0.937 | 0.0024 | 0.1831 | 0.7321 | 0.937 |
| 0.11 | 0.8 | 2 | 0.5 | \(\beta_{1}\) | 5.00 | 0.0784 | 0.1302 | 0.4642 | 0.937 | 0.0294 | 0.1036 | 0.4435 | 0.963 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0106 | 0.1808 | 0.7327 | 0.930 | -0.0072 | 0.1802 | 0.7252 | 0.937 |
| 0.11 | 0.8 | 1 | 1 | \(\beta_{1}\) | 5.00 | 0.0059 | 0.0918 | 0.4144 | 0.970 | 0.0185 | 0.0949 | 0.4211 | 0.960 |
| | | | | \(\beta_{2}\) | -0.25 | -0.0087 | 0.1708 | 0.7259 | 0.933 | -0.0087 | 0.1708 | 0.7256 | 0.933 |
| 0.11 | 0.8 | 0.5 | 2 | \(\beta_{1}\) | 5.00 | -0.0111 | 0.1009 | 0.4805 | 0.977 | -0.0110 | 0.0987 | 0.4626 | 0.973 |
| | | | | \(\beta_{2}\) | -0.25 | 0.0111 | 0.1756 | 0.7615 | 0.957 | 0.0118 | 0.1735 | 0.7613 | 0.957 |
This presentation was created in the Reveal.js format with Quarto, using the RStudio IDE. Fonts and line colours follow UFS branding, and the background image was composited in GIMP from images generated with CoPilot (DALL-E 3).
2025/08/15 - SEP