TY - JOUR
T1 - Optic disc classification by deep learning versus expert neuro-ophthalmologists
AU - Biousse, Valérie
AU - Newman, Nancy J.
AU - Najjar, Raymond P.
AU - Vasseneix, Caroline
AU - Xu, Xinxing
AU - Ting, Daniel S.
AU - Milea, Léonard B.
AU - Hwang, Jeong Min
AU - Kim, Dong Hyun
AU - Yang, Hee Kyung
AU - Hamann, Steffen
AU - Chen, John J.
AU - Liu, Yong
AU - Wong, Tien Yin
AU - Milea, Dan
AU - BONSAI (Brain and Optic Nerve Study with Artificial Intelligence) Group
AU - Rondé-Courbis, Barnabé
AU - Gohier, Philippe
AU - Miller, Neil
AU - Padungkiatsagul, Tanyatuth
AU - Poonyathalang, Anuchit
AU - Suwan, Yanin
AU - Vanikieti, Kavin
AU - Amore, Giulia
AU - Barboni, Piero
AU - Carbonelli, Michele
AU - Carelli, Valerio
AU - La Morgia, Chiara
AU - Romagnoli, Martina
AU - Rougier, Marie Bénédicte
AU - Ambika, Selvakumar
AU - Komma, Swetha
AU - Fonseca, Pedro
AU - Raimundo, Miguel
AU - Karlesand, Isabelle
AU - Lagrèze, Wolf Alexander
AU - Sanda, Nicolae
AU - Thumann, Gabriele
AU - Aptel, Florent
AU - Chiquet, Christophe
AU - Liu, Kaiqun
AU - Yang, Hui
AU - Chan, Carmen K. M.
AU - Chan, Noel C. Y.
AU - Cheung, Carol Y.
AU - Chau Tran, Thi Ha
AU - Acheson, James
AU - Habib, Maged S.
AU - Jurkute, Neringa
AU - Yu-Wai-Man, Patrick
AU - Kho, Richard
AU - Jonas, Jost B.
AU - Sabbagh, Nouran
AU - Vignal-Clermont, Catherine
AU - Hage, Rabih
AU - Khanna, Raoul K.
AU - Aung, Tin
AU - Cheng, Ching Yu
AU - Lamoureux, Ecosse
AU - Loo, Jing Liang
AU - Singhal, Shweta
AU - Tow, Sharon
AU - Jiang, Zhubo
AU - Fraser, Clare L.
AU - Mejico, Luis J.
AU - Fard, Masoud Aghsaei
PY - 2020/10/1
Y1 - 2020/10/1
AB - Objective: To compare the diagnostic performance of an artificial intelligence deep learning system with that of expert neuro-ophthalmologists in classifying optic disc appearance. Methods: The deep learning system was previously trained and validated on 14,341 ocular fundus photographs from 19 international centers. The performance of the system was evaluated on 800 new fundus photographs (400 normal optic discs, 201 papilledema [disc edema from elevated intracranial pressure], 199 other optic disc abnormalities) and compared with that of 2 expert neuro-ophthalmologists who independently reviewed the same randomly presented images without clinical information. Area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity were calculated. Results: The system correctly classified 678 of 800 (84.7%) photographs, compared with 675 of 800 (84.4%) for Expert 1 and 641 of 800 (80.1%) for Expert 2. The system yielded areas under the receiver operating characteristic curve of 0.97 (95% confidence interval [CI] = 0.96–0.98), 0.96 (95% CI = 0.94–0.97), and 0.89 (95% CI = 0.87–0.92) for the detection of normal discs, papilledema, and other disc abnormalities, respectively. The accuracy, sensitivity, and specificity of the system's classification of optic discs were similar to or better than those of the 2 experts. Intergrader agreement at the eye level was 0.71 (95% CI = 0.67–0.76) between Expert 1 and Expert 2, 0.72 (95% CI = 0.68–0.76) between the system and Expert 1, and 0.65 (95% CI = 0.61–0.70) between the system and Expert 2. Interpretation: The performance of this deep learning system at classifying optic disc abnormalities was at least as good as that of 2 expert neuro-ophthalmologists. Future prospective studies are needed to validate this system as a diagnostic aid in relevant clinical settings. ANN NEUROL 2020;88:785–795.
UR - http://www.scopus.com/inward/record.url?scp=85089082629&partnerID=8YFLogxK
U2 - 10.1002/ana.25839
DO - 10.1002/ana.25839
M3 - Article
C2 - 32621348
AN - SCOPUS:85089082629
VL - 88
SP - 785
EP - 795
JO - Annals of Neurology
JF - Annals of Neurology
SN - 0364-5134
IS - 4
ER -