TY - GEN
T1 - Prediction of coronary artery disease risk using genetic and phenotypic variables
AU - Sng, Letitia M.F.
AU - Sharma, Reevanshi
AU - Bagot, Sam
AU - Bauer, Denis C.
AU - Twine, Natalie A.
N1 - Copyright the Author(s) 2024. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2024
Y1 - 2024
N2 - Coronary artery disease (CAD) has the highest disease burden worldwide. To manage this burden, predictive models are required to screen patients for preventative treatment. A range of variables have been explored for their capacity to predict disease, including phenotypic (age, sex, BMI and smoking status), medical imaging (carotid artery thickness) and genotypic variables. We use machine learning models and the UK Biobank cohort to measure the prediction capacity of these three variable categories, both in combination and in isolation. We demonstrate that phenotypic variables from the Framingham risk score have the best prediction capacity, although a combination of phenotypic, medical imaging and genotypic variables delivers the most specific models. Furthermore, we demonstrate that Variant Spark, a random-forest-based GWAS platform, performs effective feature selection for SNP-based genotype variables, identifying 115 SNPs significantly associated with the CAD phenotype.
AB - Coronary artery disease (CAD) has the highest disease burden worldwide. To manage this burden, predictive models are required to screen patients for preventative treatment. A range of variables have been explored for their capacity to predict disease, including phenotypic (age, sex, BMI and smoking status), medical imaging (carotid artery thickness) and genotypic variables. We use machine learning models and the UK Biobank cohort to measure the prediction capacity of these three variable categories, both in combination and in isolation. We demonstrate that phenotypic variables from the Framingham risk score have the best prediction capacity, although a combination of phenotypic, medical imaging and genotypic variables delivers the most specific models. Furthermore, we demonstrate that Variant Spark, a random-forest-based GWAS platform, performs effective feature selection for SNP-based genotype variables, identifying 115 SNPs significantly associated with the CAD phenotype.
KW - Cardiovascular disease
KW - disease risk prediction
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85183586561&partnerID=8YFLogxK
U2 - 10.3233/SHTI231119
DO - 10.3233/SHTI231119
M3 - Conference proceeding contribution
C2 - 38269969
AN - SCOPUS:85183586561
SN - 9781643684567
VL - 310
T3 - Studies in Health Technology and Informatics
SP - 1021
EP - 1025
BT - MEDINFO 2023 - The Future is Accessible
A2 - Bichel-Findlay, Jen
A2 - Otero, Paula
A2 - Scott, Philip
A2 - Huesing, Elaine
PB - IOS Press
CY - Amsterdam
T2 - 19th World Congress on Medical and Health Informatics, MedInfo 2023
Y2 - 8 July 2023 through 12 July 2023
ER -