TY - JOUR
T1 - Identifying clinical study types from PubMed metadata
T2 - 15th World Congress on Health and Biomedical Informatics, MEDINFO 2015
AU - Dunn, Adam G.
AU - Arachi, Diana
AU - Bourgeois, Florence T.
N1 - Copyright the Publisher 2015. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2015
Y1 - 2015
N2 - We examined a process for automating the classification of articles in MEDLINE aimed at minimising manual effort without sacrificing accuracy. From 22,808 articles pertaining to 19 antidepressants, 1000 were randomly selected and manually labelled according to article type (including randomised controlled trials, editorials, etc.). We applied a machine learning approach termed 'active learning', where the learner (machine) selects the order in which the user (human) labels examples. Via simulation, we determined the number of articles a user needed to label to produce a classifier with at least 95% recall and 90% precision in three scenarios related to evidence synthesis. We found that the active learning process reduced the number of training instances required by 70%, 19%, and 14% in the three scenarios. The results show that the active learning method may be used in some scenarios to produce accurate classifiers that meet the needs of evidence synthesis tasks and reduce manual effort.
AB - We examined a process for automating the classification of articles in MEDLINE aimed at minimising manual effort without sacrificing accuracy. From 22,808 articles pertaining to 19 antidepressants, 1000 were randomly selected and manually labelled according to article type (including randomised controlled trials, editorials, etc.). We applied a machine learning approach termed 'active learning', where the learner (machine) selects the order in which the user (human) labels examples. Via simulation, we determined the number of articles a user needed to label to produce a classifier with at least 95% recall and 90% precision in three scenarios related to evidence synthesis. We found that the active learning process reduced the number of training instances required by 70%, 19%, and 14% in the three scenarios. The results show that the active learning method may be used in some scenarios to produce accurate classifiers that meet the needs of evidence synthesis tasks and reduce manual effort.
UR - http://www.scopus.com/inward/record.url?scp=84952043168&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-564-7-867
DO - 10.3233/978-1-61499-564-7-867
M3 - Conference paper
C2 - 26262175
AN - SCOPUS:84952043168
VL - 216
SP - 867
EP - 871
JO - Studies in Health Technology and Informatics
JF - Studies in Health Technology and Informatics
SN - 0926-9630
Y2 - 19 August 2015 through 23 August 2015
ER -