Abstract
In our contribution to the ALTA 2012 shared task, we experimented with cluster-based features for sentence classification. In a first stage, we clustered the documents according to their distribution of sentence labels. We then used the resulting cluster information as a feature in standard classifiers. The cluster-based feature improved the results for Naive Bayes classifiers but not for better-informed classifiers such as MaxEnt or Logistic Regression.
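The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not name a clustering algorithm, a label inventory, or a feature layout, so the label set `LABELS`, the small deterministic k-means, and the `doc_cluster` feature name are all assumptions made for the example. It also covers only labelled documents; how a cluster is assigned to an unlabelled document is not stated in the abstract and is left out.

```python
from collections import Counter

# Hypothetical label set, chosen only for illustration.
LABELS = ["background", "outcome", "other"]


def label_distribution(sentence_labels):
    """Relative frequency of each label in one document (assumes a non-empty document)."""
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    return [counts[label] / total for label in LABELS]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Tiny deterministic k-means: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


def add_cluster_feature(documents, k=2):
    """Stage 1: cluster documents by label distribution.
    Stage 2: attach the document's cluster id to each sentence as a feature.

    `documents` is a list of documents, each a list of (sentence_text, label) pairs.
    """
    distributions = [label_distribution([lab for _, lab in doc]) for doc in documents]
    clusters = kmeans(distributions, k)
    return [
        [({"text": text, "doc_cluster": cluster}, label) for text, label in doc]
        for doc, cluster in zip(documents, clusters)
    ]
```

The resulting `(features, label)` pairs could then be passed to any standard classifier alongside the usual lexical features; the cluster id acts as a single extra categorical feature per sentence.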
Original language | English |
---|---|
Pages (from-to) | 139-142 |
Number of pages | 4 |
Journal | Proceedings of the Australasian Language Technology Association Workshop 2012 : ALTA 2012 |
Volume | 10 |
Publication status | Published - 2012 |
Event | Australasian Language Technology Workshop (10th : 2012) - Dunedin, New Zealand; Duration: 4 Dec 2012 → 6 Dec 2012 |