Abstract
We consider a specific class of regression models with discrete latent variables, which are commonly used in actuarial science and other fields. When fitting these parametric regression models, regression functions are estimated for both the observed response variable and the latent variable. Feature engineering, variable selection, and model selection become challenging because multiple regression functions and a latent variable are involved. To address these challenges, we propose additive tree latent variable models. To calibrate these models, we introduce an iteratively re-weighted gradient boosting (IRGB) algorithm that combines the EM algorithm with gradient boosting. In each iteration, the IRGB algorithm trains only one weak learner in a stagewise manner. Theoretical analysis demonstrates that the likelihood behaves monotonically under the IRGB algorithm. We further illustrate the advantages of the proposed nonparametric methods through an empirical example of motor insurance claim counts and a case study on French motor third-party liability insurance pure premiums.
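The abstract describes interleaving an EM-style E-step with a single stagewise weak-learner fit per iteration. The following is a minimal illustrative sketch of that idea, not the paper's actual algorithm: it uses a hypothetical two-component Poisson mixture for claim counts, a fixed-split depth-1 stump with damped Newton leaf values as the weak learner, and synthetic data. All variable names, the mixture specification, the split point, and the learning rate are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- synthetic claim-count data (hypothetical, for illustration only) ---
n = 2000
x = rng.uniform(-1, 1, n)
z = rng.random(n) < 0.5                       # unobserved latent class
mu_true = np.where(z, np.exp(0.3 + 0.8 * x), np.exp(-0.5 - 0.8 * x))
y = rng.poisson(mu_true)

def log_poisson(y, mu):
    # log Poisson pmf, dropping the log(y!) term (constant in the parameters)
    return y * np.log(mu) - mu

def stump_newton(x, grad, hess, split):
    """One depth-1 tree: damped-Newton leaf values on each side of `split`."""
    left = x <= split
    vals = np.zeros(2)
    for j, mask in enumerate((left, ~left)):
        vals[j] = grad[mask].sum() / max(hess[mask].sum(), 1e-12)
    return left, vals

# boosted log-mean functions F_k(x) for the two mixture components
F = [np.full(n, np.log(y.mean())) for _ in range(2)]
pi, lr = 0.5, 0.5
loglik = []

for it in range(50):
    mu = [np.exp(F[0]), np.exp(F[1])]
    # E-step: responsibilities of component 0 serve as observation weights
    l0 = pi * np.exp(log_poisson(y, mu[0]))
    l1 = (1 - pi) * np.exp(log_poisson(y, mu[1]))
    r = l0 / (l0 + l1)
    loglik.append(np.log(l0 + l1).sum())
    pi = r.mean()
    # M-step: one weighted stump per component, added stagewise
    for k, w in enumerate((r, 1 - r)):
        grad = w * (y - mu[k])                # weighted Poisson score wrt F_k
        hess = w * mu[k]
        left, vals = stump_newton(x, grad, hess, split=0.0)
        F[k] = F[k] + lr * np.where(left, vals[0], vals[1])
```

On this synthetic example the recorded log-likelihood improves over the iterations, loosely mirroring the monotonicity property the abstract attributes to IRGB; the sketch makes no claim of matching the paper's guarantees.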
| Original language | English |
|---|---|
| Article number | 103168 |
| Pages (from-to) | 1-10 |
| Number of pages | 10 |
| Journal | Insurance: Mathematics and Economics |
| Volume | 125 |
| DOIs | |
| Publication status | Published - Nov 2025 |
Keywords
- Nonparametric regression
- EM algorithm
- Gradient boosting
- Risk classification