In our contribution to the ALTA 2012 shared task, we experimented with the use of cluster-based features for sentence classification. In the first stage, we clustered the documents according to their distribution of sentence labels. We then used the resulting cluster information as a feature in standard classifiers. The cluster-based feature improved the results for Naive Bayes classifiers but not for better-informed classifiers such as MaxEnt or Logistic Regression.
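The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clustering algorithm (plain k-means), the number of clusters, the toy label set, and every function and feature name below are assumptions made for illustration. The sketch covers only the training side, where gold sentence labels are available; the abstract does not say how a cluster is assigned to an unlabelled document, so that step is left out.

```python
from collections import Counter


def label_distribution(sentence_labels, label_set):
    """Relative frequency of each label among one document's sentences."""
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    return [counts[label] / total for label in label_set]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain k-means (an assumption); the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        assignments = [nearest(p, centroids) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


# Toy training documents (hypothetical): each is a list of gold sentence labels.
label_set = ["background", "outcome", "other"]
documents = [
    ["background", "background", "outcome"],
    ["other", "other", "other"],
    ["background", "outcome", "outcome"],
    ["other", "other", "background"],
]

# Stage 1: cluster documents by their distribution of sentence labels.
points = [label_distribution(doc, label_set) for doc in documents]
centroids, cluster_ids = kmeans(points, k=2)

# Stage 2: attach the document's cluster ID to every sentence's feature dict,
# alongside whatever other features the classifier uses (here only position).
features = [
    {"position": i, "cluster": cluster_ids[d]}
    for d, doc in enumerate(documents)
    for i, _ in enumerate(doc)
]
```

The resulting feature dictionaries could then be passed to any standard classifier; the cluster ID is simply one more categorical feature per sentence.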
|Number of pages||4|
|Journal||Proceedings of the Australasian Language Technology Association Workshop 2012 : ALTA 2012|
|Publication status||Published - 2012|
|Event||Australasian Language Technology Workshop (10th : 2012) - Dunedin, New Zealand (4 Dec 2012 → 6 Dec 2012)|