Background: Evidence-based medicine practice requires medical practitioners to rely on the best available evidence, in addition to their expertise, when making clinical decisions. The medical domain boasts a large amount of published medical research data, indexed in various medical databases such as MEDLINE. As the size of this data grows, practitioners increasingly face the problem of information overload, and past research has established the time-associated obstacles faced by evidence-based medicine practitioners. In this paper, we focus on the problem of automatic text summarisation to help practitioners quickly find query-focused information from relevant documents.
Methods: We utilise an annotated corpus that is specialised for the task of evidence-based summarisation of text. In contrast to past summarisation approaches, which mostly rely on surface level features to identify salient pieces of texts that form the summaries, our approach focuses on the use of corpus-based statistics, and domain-specific lexical knowledge for the identification of summary contents. We also apply a target-sentence-specific summarisation technique that reduces the problem of underfitting that persists in generic summarisation models.
Results: In automatic evaluations run over a large number of annotated summaries, our extractive summarisation technique statistically outperforms various baseline and benchmark summarisation models with a percentile rank of 96.8%. A manual evaluation shows that our extractive summarisation approach is capable of selecting content with high recall and precision, and may thus be used to generate bottom-line answers to practitioners' queries.
Conclusions: Our research shows that the incorporation of specialised data and domain-specific knowledge can significantly improve text summarisation performance in the medical domain. Due to the vast amounts of medical text available, and the high growth of this form of data, we suspect that such summarisation techniques will address the time-related obstacles associated with evidence-based medicine.
- automatic text summarisation
- query-focused text summarisation
- medical text processing
- evidence-based medicine