Robust variable selection under cellwise contamination

Su Peng*, Garth Tarr, Samuel Muller

*Corresponding author for this work

Research output: Working paper (Preprint)

Abstract

Cellwise outliers are widespread in data, and traditional robust methods may fail when applied to datasets under such contamination. We propose a variable selection procedure that uses a pairwise robust estimator to obtain an initial empirical covariance matrix among the response and potentially many predictors. Then we replace the primary design matrix and the response vector with their robust counterparts based on the estimated covariance matrix. Finally, we adopt the adaptive Lasso to obtain variable selection results. The proposed approach is robust to cellwise outliers in regular and high-dimensional settings, and empirical results show good performance in comparison with recently proposed alternative robust approaches, particularly in the challenging setting when contamination rates are high but the magnitude of outliers is moderate. Real data applications demonstrate the practical utility of the proposed method.
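The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every function name in it is hypothetical. The abstract does not name the pairwise estimator, so the sketch assumes a Gnanadesikan-Kettenring covariance with MAD scales. It also assumes eigenvalue clipping to keep the matrix positive definite, a Cholesky factor as the "robust counterparts" of the design matrix and response, and a small ridge fit for the adaptive Lasso weights.

```python
import numpy as np

def mad_scale(v):
    """Median absolute deviation, scaled for consistency at the normal."""
    return 1.4826 * np.median(np.abs(v - np.median(v)))

def pairwise_robust_cov(Z):
    """Assumed stand-in: Gnanadesikan-Kettenring pairwise covariance."""
    d = Z.shape[1]
    scales = np.array([mad_scale(Z[:, j]) for j in range(d)])
    U = (Z - np.median(Z, axis=0)) / scales  # robustly standardised columns
    R = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            s_plus = mad_scale(U[:, j] + U[:, k]) ** 2
            s_minus = mad_scale(U[:, j] - U[:, k]) ** 2
            R[j, k] = R[k, j] = (s_plus - s_minus) / (s_plus + s_minus)
    # Pairwise estimates need not be positive definite; clip eigenvalues
    # (an assumption of this sketch, not a step taken from the abstract).
    w, V = np.linalg.eigh(R)
    R = (V * np.clip(w, 1e-6, None)) @ V.T
    R = (R + R.T) / 2
    return R * np.outer(scales, scales)

def pseudo_data(S):
    """Pseudo design/response whose cross-products reproduce S."""
    L = np.linalg.cholesky(S)  # S = L @ L.T
    R = L.T                    # so S = R.T @ R
    return R[:, :-1], R[:, -1]

def soft(z, t):
    """Soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_lasso(X, y, lam, n_iter=500, ridge=1e-3):
    """Coordinate descent for 0.5*||y - X b||^2 + lam * sum_j w_j |b_j|."""
    p = X.shape[1]
    G, c = X.T @ X, X.T @ y
    beta_init = np.linalg.solve(G + ridge * np.eye(p), c)  # initial ridge fit
    w = 1.0 / (np.abs(beta_init) + 1e-8)                   # adaptive weights
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r_j = c[j] - G[j] @ beta + G[j, j] * beta[j]
            beta[j] = soft(r_j, lam * w[j]) / G[j, j]
    return beta

# Simulated data with 5% of cells shifted (illustrative values only).
rng = np.random.default_rng(0)
n, p = 300, 6
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)
Z = np.column_stack([X, y])
Z[rng.random(Z.shape) < 0.05] += 10.0  # cellwise contamination

S = pairwise_robust_cov(Z)             # step 1: robust joint covariance
X_tilde, y_tilde = pseudo_data(S)      # step 2: robust counterparts
beta_hat = adaptive_lasso(X_tilde, y_tilde, lam=0.05)  # step 3: selection
selected = np.flatnonzero(beta_hat != 0)
print("selected predictors:", selected)
```

Because the Lasso objective depends on the data only through the cross-products of the design matrix and response, any pair of pseudo-data that reproduces the robust covariance gives the same fit. The penalty level `lam=0.05` is an arbitrary choice for the demonstration and would normally be tuned.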
Original language: English
Publisher: arXiv.org
Publication status: Submitted - 2021

Publication series

Name: arXiv
