Comparison of sliced inverse regression approaches for underdetermined cases

Raphaël Coudret, Benoit Liquet, Jérôme Saracco

Research output: Contribution to journal › Article › peer-review

Abstract

Among methods to analyze high-dimensional data, sliced inverse regression (SIR) is of particular interest for non-linear relations between the dependent variable and some indices of the covariate. When the dimension of the covariate is greater than the number of observations, classical versions of SIR cannot be applied. Several extensions, such as regularized SIR (RSIR) and sparse ridge SIR (SR-SIR), have been proposed to tackle this issue, both to estimate the parameters of the underlying model and to select variables of interest. In this paper, we introduce two new estimation methods, based respectively on the QZ algorithm and on the Moore-Penrose pseudo-inverse. We also describe a new procedure for selecting the most relevant components of the covariate, which relies on a proximity criterion between submodels and the initial model. These approaches are compared with RSIR and SR-SIR in a simulation study. Finally, we apply SIR-QZ and the associated selection procedure to a genetic dataset in order to find markers that are linked to the expression of a gene. These markers are called expression quantitative trait loci (eQTL).
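
To make the pseudo-inverse idea concrete, here is a minimal sketch of a SIR estimate in which the inverse of the empirical covariance matrix is replaced by its Moore-Penrose pseudo-inverse. It is an illustration under stated assumptions, not the authors' implementation: the function name `sir_pinv`, the equal-size slicing of the sorted response, and the parameters `n_slices` and `n_dirs` are all hypothetical choices made for this example.

```python
import numpy as np

def sir_pinv(X, y, n_slices=5, n_dirs=1):
    """Illustrative SIR estimate using a pseudo-inverse (hypothetical helper).

    X: (n, p) covariate matrix, y: (n,) response.
    Returns a (p, n_dirs) matrix whose columns are unit-norm estimated directions.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)              # center the covariate
    sigma = Xc.T @ Xc / n                # empirical covariance (singular when n < p)

    # Slice the observations by sorted response values (assumed slicing scheme).
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the slice means of the centered covariate.
    m_mat = np.zeros((p, p))
    for idx in slices:
        mean_h = Xc[idx].mean(axis=0)
        m_mat += (len(idx) / n) * np.outer(mean_h, mean_h)

    # Replace the inverse of sigma by its Moore-Penrose pseudo-inverse.
    a_mat = np.linalg.pinv(sigma) @ m_mat
    vals, vecs = np.linalg.eig(a_mat)

    # Keep the eigenvectors associated with the largest eigenvalues.
    top = np.argsort(-vals.real)[:n_dirs]
    B = vecs[:, top].real
    return B / np.linalg.norm(B, axis=0)
```

When the number of observations exceeds the dimension and the covariance matrix is invertible, the pseudo-inverse coincides with the ordinary inverse, so this sketch reduces to a classical SIR estimate. In the underdetermined case it still returns a direction, but the sketch makes no claim about the quality of that direction.
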
Original language: English
Pages (from-to): 72-96
Number of pages: 25
Journal: Journal de la Société Française de Statistique
Volume: 155
Issue number: 2
Publication status: Published - 2014
Externally published: Yes

Keywords

  • dimension reduction
  • high-dimensional data
  • semiparametric regression
  • sparsity
