Automatically judging sentences for their grammaticality is potentially useful for several purposes — evaluating language technology systems, assessing language competence of second or foreign language learners, and so on. Previous work has examined parser ‘byproducts’, in particular parse probabilities, to distinguish grammatical sentences from ungrammatical ones. The aim of the present paper is to examine whether the primary output of a parser, which we characterise via CFG production rules embodied in a parse, contains useful information for sentence grammaticality classification; and also to examine which feature selection metrics are most useful in this task. Our results show that using gold standard production rules alone can improve over using parse probabilities alone. Combining parser-produced production rules with parse probabilities further produces an improvement of 1.6% on average in the overall classification accuracy.
|Number of pages||9|
|Journal||Proceedings of the Australasian Language Technology Association Workshop 2010|
|Publication status||Published - 2010|
|Event||Australasian Language Technology Association Workshop - Melbourne|
Duration: 9 Dec 2010 → 10 Dec 2010