TY - JOUR
T1 - Automated classification of primary care patient safety incident report content and severity using supervised machine learning (ML) approaches
AU - Evans, Huw Prosser
AU - Anastasiou, Athanasios
AU - Edwards, Adrian
AU - Hibbert, Peter
AU - Makeham, Meredith
AU - Luz, Saturnino
AU - Sheikh, Aziz
AU - Donaldson, Liam
AU - Carson-Stevens, Andrew
PY - 2020/12
Y1 - 2020/12
AB - Learning from patient safety incident reports is a vital part of improving healthcare. However, the volume of reports and their largely free-text nature pose a major analytic challenge. The objective of this study was to test whether the free text of patient safety incident reports can be classified automatically to determine incident type and severity of harm outcome. Primary care patient safety incident reports (n=31333) previously expert-categorised by clinicians (training data) were processed using J48, SVM and Naïve Bayes classifiers. The SVM classifier scored highest for both incident type (AUROC, 0.891) and severity of harm (AUROC, 0.708). Incident reports involving deaths were the most easily classified, with 72.82% of these reports correctly identified. In conclusion, supervised ML can be used to classify patient safety incident report categories. The severity classifier, whilst not accurate enough to replace manual processing, could provide a valuable screening tool for this critical aspect of patient safety.
KW - incident reporting
KW - machine learning
KW - natural language processing
KW - patient safety
KW - quality improvement
UR - http://www.scopus.com/inward/record.url?scp=85062702543&partnerID=8YFLogxK
U2 - 10.1177/1460458219833102
DO - 10.1177/1460458219833102
M3 - Article
C2 - 30843455
AN - SCOPUS:85062702543
SN - 1460-4582
VL - 26
SP - 3123
EP - 3139
JO - Health Informatics Journal
JF - Health Informatics Journal
IS - 4
ER -