In this paper we show that information from citing papers can improve extractive summarisation of medical publications, especially when little text is available for development. We used the data of the TAC 2014 biomedical summarisation task. We report several methods for finding the reference paper sentences that best match the citation text in the citing papers ("citances"). Methods that incorporate lexical domain information from UMLS, and methods that use extended training data, performed best. We then used these ranked sentences to perform extractive summarisation and observed a marked improvement in ROUGE-L scores compared with methods that do not use information from citing papers.
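The pipeline described above (rank reference-paper sentences against citances, then build an extractive summary from the top-ranked sentences) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes plain tf-idf cosine similarity with no UMLS information, scores each sentence by its best-matching citance, and keeps the top `n` sentences in document order. All function names (`tokenize`, `rank_sentences`, `summarise`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(token_lists):
    """Build one sparse tf-idf vector (dict) per token list, with smoothed idf."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(reference_sentences, citances):
    """Return (sentence index, score) pairs, best match first.

    Assumption: a sentence's score is its highest similarity to any citance.
    """
    docs = [tokenize(s) for s in reference_sentences + citances]
    vecs = tfidf_vectors(docs)
    k = len(reference_sentences)
    sent_vecs, cit_vecs = vecs[:k], vecs[k:]
    scores = [
        max((cosine(sv, cv) for cv in cit_vecs), default=0.0)
        for sv in sent_vecs
    ]
    order = sorted(range(k), key=lambda i: (-scores[i], i))
    return [(i, scores[i]) for i in order]


def summarise(reference_sentences, citances, n=2):
    """Pick the n best-ranked sentences and return them in document order."""
    ranked = rank_sentences(reference_sentences, citances)
    chosen = sorted(i for i, _ in ranked[:n])
    return [reference_sentences[i] for i in chosen]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    reference = [
        "The gene BRCA1 regulates DNA repair in breast tissue.",
        "Samples are stored at minus eighty degrees.",
        "Mutations in BRCA1 increase cancer risk.",
    ]
    citing = ["BRCA1 mutations were shown to increase cancer risk."]
    for sentence in summarise(reference, citing, n=2):
        print(sentence)
```

Taking the maximum over citances favours sentences that one citing paper targets precisely; averaging over citances would instead favour sentences that many citing papers touch on. Either choice is an assumption here, since the abstract does not say how scores are combined.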
|Number of pages|9|
|Journal|Proceedings of Australasian Language Technology Association Workshop 2014 : ALTA 2014|
|Publication status|Published - 2014|
|Event|Australasian Language Technology Association Workshop (12th : 2014) - Melbourne, Australia. Duration: 26 Nov 2014 → 28 Nov 2014|