Additive tree latent variable models with applications to insurance loss prediction

Zhihao Wang, Yanlin Shi, Guangyuan Gao

Research output: Contribution to journal › Article › peer-review

Abstract

We consider a specific class of regression models with discrete latent variables, which are commonly used in actuarial science and other fields. When fitting these parametric regression models, separate regression functions are estimated for the observed response variable and for the latent variable. Feature engineering, variable selection and model selection become challenging due to the involvement of multiple regression functions and the latent variable. To address these challenges, we propose additive tree latent variable models. To calibrate these models, we introduce an iteratively re-weighted gradient boosting (IRGB) algorithm that combines the EM algorithm with gradient boosting. In each iteration, the IRGB algorithm trains only one weak learner in a stagewise manner. Theoretical analysis demonstrates the monotonic behavior of the likelihood under the IRGB algorithm. We further illustrate the advantages of the proposed nonparametric methods through an empirical example of motor insurance claim counts and a case study on French motor third-party liability insurance pure premiums.
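The abstract describes IRGB as alternating an EM-style re-weighting step with a single stagewise boosting update per iteration. The sketch below is a minimal, hypothetical illustration of that idea for a two-component Poisson mixture of claim counts; the mixture structure, the symmetry-breaking initialization, the cycling over components, and the tree depth and learning rate are all assumptions made for illustration, not the paper's implementation.

```python
# Hypothetical sketch of an EM + gradient-boosting loop in the spirit of the
# IRGB algorithm described in the abstract. All modeling choices below are
# illustrative assumptions, not the authors' code.
import numpy as np
from scipy.stats import poisson
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy claim-count data from a two-group population (for illustration only).
n = 2000
X = rng.uniform(-1, 1, size=(n, 3))
true_group = rng.random(n) < 0.3
mu_true = np.where(true_group, np.exp(0.8 + X[:, 0]), np.exp(-0.5 + 0.5 * X[:, 1]))
y = rng.poisson(mu_true)

K, n_iter, lr = 2, 200, 0.1
F = np.zeros((K, n))          # boosted scores per component, mu_k = exp(F_k)
pi = np.full(K, 1.0 / K)      # mixing weights of the latent classes
F[1] += 0.5                   # break symmetry between the two components

for t in range(n_iter):
    mu = np.exp(F)
    # E-step: posterior latent-class probabilities (responsibilities).
    like = pi[:, None] * poisson.pmf(y, mu)
    r = like / np.clip(like.sum(axis=0, keepdims=True), 1e-300, None)
    pi = r.mean(axis=1)
    # M-step: train exactly one weak learner, cycling over components.
    k = t % K
    grad = y - mu[k]          # Poisson pseudo-residual w.r.t. F_k
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, grad, sample_weight=r[k])  # responsibilities act as weights
    F[k] += lr * tree.predict(X)

print("estimated mixing weights:", np.round(pi, 3))
```

In this reading, the E-step responsibilities enter the boosting step only as sample weights, which is what makes the update "re-weighted", and each pass fits exactly one shallow tree rather than a full boosting run, matching the one-weak-learner-per-iteration description in the abstract.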
Original language: English
Article number: 103168
Pages (from-to): 1-10
Number of pages: 10
Journal: Insurance: Mathematics and Economics
Volume: 125
DOIs
Publication status: Published - Nov 2025

Keywords

  • Nonparametric regression
  • EM algorithm
  • Gradient boosting
  • Risk classification
