Abstract
We examine the problem of content selection in statistical novel sentence generation. Our approach models the processes performed by professional editors when incorporating material from additional sentences to support some initially chosen key summary sentence, a process we refer to as Sentence Augmentation. We propose and evaluate a method called "Seed and Grow" for selecting such auxiliary information. Additionally, we argue that this can be performed using schemata, as represented by word-pair co-occurrences, and demonstrate its use in statistical summary sentence generation. Evaluation results are supportive, indicating that a schemata model significantly improves over the baseline.
Original language | English |
---|---|
Title of host publication | 2008 Conference on Empirical Methods in Natural Language Processing |
Subtitle of host publication | Proceedings of the Conference |
Editors | Mirella Lapata, Hwee Tou Ng |
Place of Publication | Stroudsburg, PA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 543-552 |
Number of pages | 10 |
Publication status | Published - 2008 |
Event | 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation - Honolulu, HI, United States; Duration: 25 Oct 2008 → 27 Oct 2008 |