Bayesian Approaches in Actuarial Modelling

Adriaan Rowan and Sean van der Merwe

Introduction

Motivation


Outline

Short term insurance

Introduction

In this example we consider a data set of death events, with explanatory variables ALB, Gender, log_salary, Industry, Client.

Random effects

  • When the levels of a factor will be known exactly going forward, the factor is treated as a fixed effect.
  • When the observed levels are a random draw from a larger group of possible levels, and the set of levels may expand or change in future, the factor is treated as a random effect.
  • Typically, the levels are referred to as ‘subjects’ in the sense of a statistical experiment.
  • Random effects are important to incorporate in models when we have repeated sampling from each ‘subject’, especially if that sampling is unbalanced, because samples tend to be correlated within each subject.

The case of Clients

  • When a product will only ever be sold to a fixed set of clients, and we already have data for all of them, then we can treat ‘Client’ as a fixed effect.
  • Typically this is not the case though.
  • Usually the existing clients are a sample from a bigger pool of possible clients.
  • Thus, ‘Client’ should actually be a random effect in the models.
  • It would be ideal to include all explanatory variables that account for the differences between clients into the model (instead of ‘Client’) but that is rarely possible.

The case of Industries

  • Industries can typically be classified exhaustively, implying fixed effects.
    • Again, it would be better to explain all the differences between industries using the underlying causes if it were possible.
  • And yet, should experience only be available for some industries and the company wishes to expand products to other industries then it might be prudent to model industry as a random effect instead.

Example scenario

We will assume that we are currently serving Clients A, B, F, I, J, and P.

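library(rstanarm)
library(dplyr)  # provides filter()
options(mc.cores = 4)  # run the MCMC chains in parallel
# Clients we are currently serving
Clients <- c("A", "B", "F", "I", "J", "P")
Training <- full_data$Client %in% Clients
training_sample <- full_data |> filter(Training)
# Poisson mixed model with a random intercept for each client
fit1 <- stan_glmer(Target ~ ALB*log_salary*Gender + Industry + (1|Client), 
                   data = training_sample, 
                   family = poisson(link = "log"))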

Why random effects?

If we consider the client sizes:

We see that they are severely unbalanced. Since clients differ in terms of unmeasured variables, a fit that ignores the client effect will be biased towards the experience of the larger clients.

Any new client will be assumed to be like the large clients, regardless of their specific characteristics.
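
The pull towards large clients can be illustrated with a small base R sketch. The client sizes and death rates below are made-up values for illustration only, not figures from the data set, and the calculation is a simple pooled average rather than the mixed model itself.

```r
# Hypothetical client sizes and death rates (illustrative values only).
sizes <- c(A = 5000, B = 200, F = 150, I = 100, J = 80, P = 50)
rates <- c(A = 0.010, B = 0.020, F = 0.025, I = 0.030, J = 0.015, P = 0.040)

# Expected number of deaths per client under these assumed rates.
deaths <- sizes * rates

# Pooled estimate: every individual counts equally, so Client A dominates.
pooled_rate <- sum(deaths) / sum(sizes)

# Client-balanced estimate: every client counts equally.
balanced_rate <- mean(deaths / sizes)

round(c(pooled = pooled_rate, balanced = balanced_rate), 4)
```

Under these assumptions the pooled rate sits close to the rate of Client A, while the balanced rate reflects the spread across clients. A random intercept per client lets the model sit between these two extremes.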

Fit results

Frequentist fit

library(lme4)
library(lmerTest)
# Same model formula as the Bayesian fit, estimated by maximum likelihood
freq_fit <- glmer(Target ~ ALB*log_salary*Gender + Industry + (1|Client), 
                  data = training_sample, 
                  family = poisson(link = "log"))

Premiums

  • Let the premiums be set at the 60% quantile of the distribution of expected losses for each client.
  • The Bayesian approach already has an impact here, since parameter uncertainty is taken into account.
    • Common approaches assume parameters are accurately estimated, but this is not even close to true here.
  • On the next slide the expected payout density from the model is illustrated, standardised by number of individuals.
    • The “All” client represents setting a combined premium for all clients, taking their random effects into account.
  • We must account for the random effects, otherwise the premium will be biased towards the experience of the larger clients.
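
A minimal sketch of the quantile premium follows, assuming we already have posterior draws of a client's log mortality rate. The draws here are simulated stand-ins and `sum_assured` is a hypothetical value; with a fitted model they would come from the posterior instead.

```r
set.seed(42)
n_draws <- 4000
sum_assured <- 10000  # hypothetical payout per death

# Stand-in for posterior draws of the log mortality rate of one client.
log_rate_draws <- rnorm(n_draws, mean = log(0.01), sd = 0.4)

# Expected loss per individual, one value per posterior draw.
expected_loss <- exp(log_rate_draws) * sum_assured

# Premium at the 60% quantile of the expected loss distribution.
premium_bayes <- quantile(expected_loss, probs = 0.60, names = FALSE)

# Plug-in alternative that treats the rate as known at a point estimate.
premium_plugin <- exp(mean(log_rate_draws)) * sum_assured

c(bayes = premium_bayes, plugin = premium_plugin)
```

The wider the posterior, the further the 60% quantile moves above the plug-in value, which is how parameter uncertainty feeds into the premium.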

Client difference illustration

New clients come in

Client with no history

  • Now Client C arrives with no mortality data.

Observations from plot

  • We see that we should be setting a higher premium than is suggested by just the demographics (C), in order to account for the extra uncertainty from not knowing the specific peculiarities of this client (Random).
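
The extra uncertainty for a client with no history can be sketched in base R by drawing a fresh random intercept for each posterior draw. All numbers below are simulated stand-ins under assumed values, not results from the fitted model.

```r
set.seed(7)
n_draws <- 4000

# Stand-ins for posterior draws (hypothetical values).
eta_draws   <- rnorm(n_draws, mean = log(0.01), sd = 0.1)  # demographic part for Client C
sigma_draws <- abs(rnorm(n_draws, mean = 0.5, sd = 0.05))  # sd of the client intercept

# Demographics only: the client intercept is fixed at zero.
rate_demographic <- exp(eta_draws)

# Unknown client: one new intercept per draw from the random effect distribution.
b_new <- rnorm(n_draws, mean = 0, sd = sigma_draws)
rate_random <- exp(eta_draws + b_new)

# Compare the 60% quantiles that would drive the premium.
q_demographic <- quantile(rate_demographic, probs = 0.60, names = FALSE)
q_random <- quantile(rate_random, probs = 0.60, names = FALSE)

c(demographic = q_demographic, random = q_random)
```

Under these assumptions the quantile that includes the unknown client intercept is the higher of the two, in line with the observation above.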

Initial data comes in

  • Suppose we now have some mortality data for Client C and new Client K.
  • We can compare the observed mortality to the expected mortality distribution.
  • Or we can compare the observed payout to the expected payout distribution.
  • This comparison does not require refitting the model.
    • Refitting the model is almost always better though as it lets you do more.
      • Refitting the model would allow you to compare distributions, not just a point value.
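
A point comparison of this kind can be sketched as follows, assuming predictive draws of the number of deaths for the new client are available. The exposure, the rate draws and the observed count are all hypothetical values chosen for illustration.

```r
set.seed(11)
n_draws <- 4000
exposure <- 300        # hypothetical number of individuals at the new client
observed_deaths <- 7   # hypothetical observed count

# Stand-in for posterior draws of the mortality rate, including the client effect.
rate_draws <- exp(rnorm(n_draws, mean = log(0.01), sd = 0.5))

# Predictive draws of the death count under the existing model.
pred_deaths <- rpois(n_draws, lambda = exposure * rate_draws)

# Proportion of predictive draws at least as large as the observed count.
p_upper <- mean(pred_deaths >= observed_deaths)
p_upper
```

A small value of `p_upper` would suggest that the observed mortality is higher than the existing model expected for this client.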

Point comparison of deaths

Point comparison of payouts

We clearly lost money on Client C, but using the Bayesian mixed effects model reduced the loss.

Does Client A need adjustment?

  • Suppose that the premium for Client A was set at 170 per individual, using the data from Clients B, F, I, J, and P.
  • Then we observed Client A and refit the model including their data.
  • The goal now is to determine whether action is needed based on the experience.
  • One approach is to calculate the profit distribution under the current premium and then determine whether that is outside a tolerance.
    • The profit distribution has the advantage of being able to apply a non-linear profit function (without having to fit another model).

Payout distribution difference A

First we look at the distribution of the difference between the payout for an average client and Client A.

Payout distribution difference P

Second we look at the distribution of the difference between the payout for an average client and Client P.

Profit distribution for Client P

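# events: 0/1 death indicator per individual; sum_assured: payout per individual
profit_f <- function(events, sum_assured, premium = 170) {
  # non-linear premium income, less a fixed amount of 10000,
  # less the sum assured plus 200 for each event
  length(events) * premium^0.95 * 0.95 - 10000 - events %*% (sum_assured + 200)
}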
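
As a sketch of how the profit distribution could be obtained, the function can be applied to each predictive draw of the event indicators. The matrix `event_draws`, the number of lives and the sums assured below are simulated stand-ins, not model output. The function is repeated so that the example runs on its own.

```r
profit_f <- function(events, sum_assured, premium = 170) {
  length(events)*premium^0.95*0.95 - 10000 - events%*%(sum_assured+200)
}

set.seed(3)
n_lives <- 200
n_draws <- 1000
sum_assured <- rep(10000, n_lives)  # hypothetical sums assured

# Stand-in for predictive draws: one row per draw, one 0/1 indicator per individual.
event_draws <- matrix(rbinom(n_draws * n_lives, size = 1, prob = 0.01),
                      nrow = n_draws)

# Profit for each draw; as.numeric() drops the 1x1 matrix returned by %*%.
profit_draws <- apply(event_draws, 1,
                      function(ev) as.numeric(profit_f(ev, sum_assured)))

# Summaries that could be compared against a tolerance.
prob_loss <- mean(profit_draws < 0)
quantile(profit_draws, probs = c(0.05, 0.5, 0.95))
```

Because the profit function is applied to the draws directly, a different or more complicated function can be swapped in without fitting another model.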

Conclusion

The end


Thank you for your time and attention.

This presentation was created using the Reveal.js format in Quarto, using the RStudio IDE. Font and line colours follow UFS branding, and the background image was generated by Midjourney and edited in GIMP.