Abstract
We examine several ensemble methods, including an oracle, to estimate the upper limit of classification accuracy for Native Language Identification (NLI). The oracle outperforms state-of-the-art systems by over 10%, and the results indicate that for many misclassified texts the correct class label receives a significant portion of the ensemble votes, often being the runner-up. We also present a pilot study of human performance for NLI, the first such experiment. While some participants achieve modest results on our simplified setup with 5 L1s, they do not outperform our NLI system, and this performance gap is likely to widen on the standard NLI setup.
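The abstract does not spell out how the oracle is computed. The sketch below shows one common reading, stated here as an assumption rather than the authors' exact procedure: the oracle counts a text as correct whenever at least one ensemble member predicts the gold label, which gives an upper bound on what any combination of those members could achieve. The function names, the toy votes, and the L1 codes are all hypothetical and serve only as illustration.

```python
from collections import Counter

def plurality_vote(votes):
    """Return the label with the most votes (ties go to the label seen first)."""
    return Counter(votes).most_common(1)[0][0]

def oracle_correct(votes, gold):
    """Assumed oracle rule: correct if any ensemble member predicted the gold label."""
    return gold in votes

def gold_is_runner_up(votes, gold):
    """True when the gold label has the second-highest vote count."""
    ranked = [label for label, _ in Counter(votes).most_common()]
    return len(ranked) > 1 and ranked[1] == gold

def evaluate(ensemble_votes, gold_labels):
    """Return (plurality accuracy, oracle accuracy) over all texts."""
    n = len(gold_labels)
    pairs = list(zip(ensemble_votes, gold_labels))
    vote_acc = sum(plurality_vote(v) == g for v, g in pairs) / n
    oracle_acc = sum(oracle_correct(v, g) for v, g in pairs) / n
    return vote_acc, oracle_acc

# Hypothetical toy data: three classifiers voting on four texts.
ensemble_votes = [
    ["ARA", "ARA", "CHI"],
    ["CHI", "FRE", "FRE"],
    ["GER", "GER", "GER"],
    ["FRE", "ARA", "ARA"],
]
gold_labels = ["ARA", "CHI", "GER", "HIN"]

vote_acc, oracle_acc = evaluate(ensemble_votes, gold_labels)
```

In this toy data the second text is misclassified by plurality vote, yet its gold label is the runner-up, which is the kind of case the abstract describes. The gap between the two accuracies shows how much headroom the ensemble members leave for a better combination method.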
Original language | English |
---|---|
Title of host publication | NAACL HLT 2015 |
Subtitle of host publication | The Tenth Workshop on Innovative Use of NLP for Building Educational Applications : proceedings of the workshop |
Place of Publication | Red Hook, New York |
Publisher | The Association for Computational Linguistics |
Pages | 172-178 |
Number of pages | 7 |
ISBN (Print) | 9781941643358 |
Publication status | Published - 2015 |
Event | Workshop on Innovative Use of NLP for Building Educational Applications (10th : 2015), Denver, CO. Duration: 4 Jun 2015 → 4 Jun 2015 |
Workshop
Workshop | Workshop on Innovative Use of NLP for Building Educational Applications (10th : 2015) |
---|---|
City | Denver, CO |
Period | 4/06/15 → 4/06/15 |