Abstract
We perform a quantitative analysis of data in a corpus that specialises in summarisation for Evidence Based Medicine (EBM). The intent of the analysis is to discover possible directions for performing automatic evidence-based summarisation. Our analysis attempts to ascertain the extent to which good, evidence-based, multi-document summaries can be obtained from individual single-document summaries of the source texts. We define a set of scores, which we call coverage scores, to estimate the degree of information overlap between the multi-document summaries and source texts of various granularities. Based on our analysis, using several variants of the coverage scores, and the results of a simple task-oriented evaluation, we argue that approaches for the automatic generation of evidence-based, bottom-line, multi-document summaries may benefit from a two-step approach: in the first step, content-rich, single-document, query-focused summaries are generated; in the second step, the information from the individual summaries is synthesised.
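The abstract does not give the definition of the coverage scores. As a rough illustration of the general idea of measuring information overlap between a summary and a source text, the sketch below computes a simple vocabulary-overlap ratio. The function names `tokens` and `coverage`, the tokenisation, and the formula itself are assumptions made for illustration only and should not be read as the scores used in the paper.

```python
import re


def tokens(text):
    """Split text into lowercase alphanumeric tokens (a deliberately naive tokeniser)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def coverage(summary, source):
    """Hypothetical coverage score: the fraction of distinct summary
    tokens that also occur in the source text. Returns 0.0 for an
    empty summary. This is an illustrative stand-in, not the paper's
    definition."""
    summary_vocab = set(tokens(summary))
    if not summary_vocab:
        return 0.0
    source_vocab = set(tokens(source))
    return len(summary_vocab & source_vocab) / len(summary_vocab)


# Example: compare one summary against sources of different granularity,
# here a full text and a shorter single-document summary of it.
full_text = "Aspirin reduces the risk of stroke in adults with prior events."
single_doc_summary = "Aspirin reduces stroke risk."
multi_doc_summary = "Aspirin reduces risk of stroke."

score_full = coverage(multi_doc_summary, full_text)
score_single = coverage(multi_doc_summary, single_doc_summary)
```

Comparing such a score across source granularities (full texts versus single-document summaries) is one way to picture the kind of question the analysis asks, under the assumptions stated above.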
Original language | English |
---|---|
Pages (from-to) | 79-87 |
Number of pages | 9 |
Journal | Proceedings of the Australasian Language Technology Association Workshop 2012 : ALTA 2012 |
Volume | 10 |
Publication status | Published - 2012 |
Event | Australasian Language Technology Workshop (10th : 2012) - Dunedin, New Zealand. Duration: 4 Dec 2012 → 6 Dec 2012 |