Abstract
We describe a set of experiments using a wide range of machine learning techniques for the task of predicting the rhetorical status of sentences. The research is part of a text summarisation project for the legal domain for which we use a new corpus of judgments of the UK House of Lords. We present experimental results for classification according to a rhetorical scheme indicating a sentence's contribution to the overall argumentative structure of the legal judgments using four learning algorithms from the Weka package (C4.5, naïve Bayes, Winnow and SVMs). We also report results using maximum entropy models both in a standard classification framework and in a sequence labelling framework. The SVM classifier and the maximum entropy sequence tagger yield the most promising results.
Original language | English |
---|---|
Title of host publication | Applied Computing 2005 - Proceedings of the 20th Annual ACM Symposium on Applied Computing |
Editors | Hisham Haddad, Lorie M Liebrock, Andrea Omicini, Roger L Wainwright |
Pages | 292-296 |
Number of pages | 5 |
Publication status | Published - 2005 |
Event | 20th Annual ACM Symposium on Applied Computing - Santa Fe, NM, United States. Duration: 13 Mar 2005 → 17 Mar 2005 |
Other
Other | 20th Annual ACM Symposium on Applied Computing |
---|---|
Country/Territory | United States |
City | Santa Fe, NM |
Period | 13/03/05 → 17/03/05 |
Keywords
- Artificial intelligence
- Automatic summarisation
- Discourse
- Law
- Natural language