Abstract
Cellwise contamination remains a challenging problem for data scientists, particularly in research fields that require the selection of sparse features. Traditional robust methods may be neither feasible nor efficient for such contaminated datasets. A robust Lasso-type cellwise regularization procedure, coined CR-Lasso, is proposed; it performs feature selection in the presence of cellwise outliers by simultaneously minimising a regression loss and a cell deviation measure. The approach is evaluated in simulation studies that compare its selection and prediction performance with those of several sparse regression methods. The results demonstrate that CR-Lasso is competitive within the considered settings. The effectiveness of the proposed method is further illustrated through an analysis of a bone mineral density dataset.
Original language | English |
---|---|
Article number | 107971 |
Pages (from-to) | 1-14 |
Number of pages | 14 |
Journal | Computational Statistics and Data Analysis |
Volume | 197 |
DOIs | |
Publication status | Published - Sept 2024 |
Bibliographical note
© 2024 The Author(s). Published by Elsevier B.V. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Cellwise contamination
- Cellwise regularization
- Feature selection
- Robust sparse regression