We perform a quantitative analysis of data in a corpus that specialises in summarisation for Evidence Based Medicine (EBM). The intent of the analysis is to discover possible directions for performing automatic evidence-based summarisation. Our analysis attempts to ascertain the extent to which good, evidence-based, multi-document summaries can be obtained from individual single-document summaries of the source texts. We define a set of scores, which we call coverage scores, to estimate the degree of information overlap between the multi-document summaries and source texts of various granularities. Based on our analysis, using several variants of the coverage scores, and the results of a simple task-oriented evaluation, we argue that approaches for the automatic generation of evidence-based, bottom-line, multi-document summaries may benefit from a two-step approach: in the first step, content-rich, single-document, query-focused summaries are generated; in the second step, the information from the individual summaries is synthesised.
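The abstract does not state how the coverage scores are computed. As a rough illustration of the general idea of information overlap, the sketch below measures the fraction of distinct words in a multi-document summary that also appear in a source text. The function names, the tokenisation, the unigram-overlap formula, and the example sentences are all assumptions made for illustration and are not taken from the paper.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens (assumed tokenisation)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def coverage(summary, source):
    """Fraction of distinct summary tokens that also occur in the source.

    Hypothetical unigram-overlap score; not the paper's definition.
    """
    summary_tokens = tokens(summary)
    if not summary_tokens:
        return 0.0
    return len(summary_tokens & tokens(source)) / len(summary_tokens)


# Invented example texts, for illustration only.
bottom_line = "Aspirin reduces the risk of stroke"
single_doc_summary = "In this trial aspirin reduced stroke risk"
print(coverage(bottom_line, single_doc_summary))
```

Under this assumed score, the same function could be applied to sources of different granularity (a full abstract, a single-document summary, or a single sentence) to compare how much of the bottom-line summary each one covers.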
|Number of pages||9|
|Journal||Proceedings of the Australasian Language Technology Association Workshop 2012 : ALTA 2012|
|Publication status||Published - 2012|
|Event||Australasian Language Technology Workshop (10th : 2012) - Dunedin, New Zealand|
Duration: 4 Dec 2012 → 6 Dec 2012