Native Language Identification with classifier stacking and ensembles

Research output: Contribution to journal > Article > Research > peer-review

Abstract

Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.

Language: English
Pages: 403-446
Number of pages: 44
Journal: Computational Linguistics
Volume: 44
Issue number: 3
DOI: 10.1162/COLI_a_00323
Publication status: Published - Sep 2018

Bibliographical note

Copyright the Association for Computational Linguistics 2018. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Cite this

@article{d201203a5267422d9ca1b84c88bdce92,
title = "Native Language Identification with classifier stacking and ensembles",
abstract = "Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.",
author = "Shervin Malmasi and Mark Dras",
note = "Copyright the Association for Computational Linguistics 2018. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.",
year = "2018",
month = "9",
doi = "10.1162/COLI_a_00323",
language = "English",
volume = "44",
pages = "403--446",
journal = "Computational Linguistics",
issn = "0891-2017",
publisher = "MIT Press Journals",
number = "3"
}

Native Language Identification with classifier stacking and ensembles. / Malmasi, Shervin; Dras, Mark.

In: Computational Linguistics, Vol. 44, No. 3, 09.2018, p. 403-446.

TY - JOUR
T1 - Native Language Identification with classifier stacking and ensembles
AU - Malmasi, Shervin
AU - Dras, Mark
N1 - Copyright the Association for Computational Linguistics 2018. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2018/9
Y1 - 2018/9
N2 - Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.
AB - Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.
UR - http://www.scopus.com/inward/record.url?scp=85053933173&partnerID=8YFLogxK
U2 - 10.1162/COLI_a_00323
DO - 10.1162/COLI_a_00323
M3 - Article
VL - 44
SP - 403
EP - 446
JO - Computational Linguistics
T2 - Computational Linguistics
JF - Computational Linguistics
SN - 0891-2017
IS - 3
ER -