TY - JOUR
T1 - Deciphering the complex
T2 - methodological overview of statistical models to derive OMICS-based biomarkers
AU - Chadeau-Hyam, Marc
AU - Campanella, Gianluca
AU - Jombart, Thibaut
AU - Bottolo, Leonardo
AU - Portengen, Lutzen
AU - Vineis, Paolo
AU - Liquet, Benoit
AU - Vermeulen, Roel C. H.
PY - 2013/8
Y1 - 2013/8
AB - Recent technological advances in molecular biology have given rise to numerous large-scale datasets whose analysis imposes serious methodological challenges mainly relating to the size and complex structure of the data. Considerable experience in analyzing such data has been gained over the past decade, mainly in genetics, from the Genome-Wide Association Study era, and more recently in transcriptomics and metabolomics. Building upon the corresponding literature, we provide here a nontechnical overview of well-established methods used to analyze OMICS data within three main types of regression-based approaches: univariate models including multiple testing correction strategies, dimension reduction techniques, and variable selection models. Our methodological description focuses on methods for which ready-to-use implementations are available. We describe the main underlying assumptions, the main features, and advantages and limitations of each of the models. This descriptive summary constitutes a useful tool for driving methodological choices while analyzing OMICS data, especially in environmental epidemiology, where the emergence of the exposome concept clearly calls for unified methods to analyze marginally and jointly complex exposure and OMICS datasets. Environ. Mol. Mutagen. 54:542-557, 2013.
KW - OMICS data
KW - biomarkers
KW - statistical review
UR - http://www.scopus.com/inward/record.url?scp=84882254552&partnerID=8YFLogxK
U2 - 10.1002/em.21797
DO - 10.1002/em.21797
M3 - Review article
C2 - 23918146
AN - SCOPUS:84882254552
SN - 0893-6692
VL - 54
SP - 542
EP - 557
JO - Environmental and Molecular Mutagenesis
JF - Environmental and Molecular Mutagenesis
IS - 7
ER -