Sparse partial least squares with group and subgroup structure

Matthew Sutton*, Rodolphe Thiébaut, Benoît Liquet

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

Integrative analysis of high-dimensional omics datasets has been studied by many authors in recent years. By incorporating known prior relationships among the variables, these analyses have been successful in elucidating the relationships between different sets of omics data. In this article, our goal is to identify important relationships between genomic expression and cytokine data from a human immunodeficiency virus vaccine trial. We propose a flexible partial least squares technique that incorporates group and subgroup structure in the modelling process. Our new method accounts for both grouping of genetic markers (e.g., gene sets) and temporal effects. The method generalises existing sparse modelling techniques in the partial least squares methodology and establishes theoretical connections to variable selection methods for supervised and unsupervised problems. Simulation studies are performed to investigate the performance of our method relative to alternative sparse approaches. Our R package sgspls is available at https://github.com/matt-sutton/sgspls.

Original language: English
Pages (from-to): 3338-3356
Number of pages: 19
Journal: Statistics in Medicine
Volume: 37
Issue number: 23
Publication status: Published - 15 Oct 2018
Externally published: Yes

Keywords

  • feature selection
  • group variable selection
  • latent variable modelling
  • partial least squares
