Experiments are reported that investigate the effect of various source document representations on the accuracy of the sentence extraction phase of a multi-document summarisation task. A novel representation is introduced based on generic relation extraction (GRE), which aims to build systems for relation identification and characterisation that can be transferred across domains and tasks without modification of model parameters. Results demonstrate performance that is significantly higher than a non-trivial baseline that uses tf*idf -weighted words and at least as good as a comparable but less general approach from the literature. Analysis shows that the representations compared are complementary, suggesting that extraction performance could be further improved through system combination.
|Title of host publication||EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009|
|Number of pages||10|
|Publication status||Published - 2009|
|Event||2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009 - Singapore, Singapore|
Duration: 6 Aug 2009 → 7 Aug 2009
|Other||2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009|
|Period||6/08/09 → 7/08/09|