Chinese Native Language Identification

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; peer-review

38 Citations (Scopus)

Abstract

We present the first application of Native Language Identification (NLI) to non-English data. Motivated by theories of language transfer, NLI is the task of identifying a writer's native language (L1) based on their writings in a second language (the L2). An NLI system was applied to Chinese learner texts using topic-independent syntactic models to assess their accuracy. We find that models using part-of-speech tags, context-free grammar production rules and function words are highly effective, achieving a maximum accuracy of 71%. Interestingly, we also find that when applied to equivalent English data, the model performance is almost identical. This finding suggests a systematic pattern of cross-linguistic transfer may exist, where the degree of transfer is independent of the L1 and L2.
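
As a rough illustration of one of the feature types named in the abstract (function words), the following is a minimal sketch of NLI as text classification. It is not the authors' system: the function-word list, the toy English sentences, the labels `L1-A` and `L1-B`, and the nearest-centroid classifier are all assumptions chosen to keep the example small and dependency-free. The paper's part-of-speech and grammar-production features would need a tagger and a parser, which are omitted here.

```python
from collections import Counter
import math

# Hypothetical, tiny function-word list for illustration only; a real system
# would use a much larger list for the L2 being studied.
FUNCTION_WORDS = ["the", "a", "of", "in", "to", "is"]

def featurize(text):
    """Map a text to the relative frequency of each function word."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(labelled_texts):
    """Build one centroid per L1 label from (label, text) pairs."""
    by_label = {}
    for label, text in labelled_texts:
        by_label.setdefault(label, []).append(featurize(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(model, text):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = featurize(text)
    return min(model, key=lambda label: math.dist(vec, model[label]))

# Invented toy data: two made-up L1 groups that differ in function-word use.
TRAIN = [
    ("L1-A", "the cat is in the house of the king"),
    ("L1-A", "the man is to go to the shop"),
    ("L1-B", "cat sits house king"),
    ("L1-B", "man goes shop today"),
]

model = train(TRAIN)
print(predict(model, "the dog is in the garden"))
print(predict(model, "dog runs garden"))
```

Because the features ignore content words, a classifier of this kind is topic-independent in the sense the abstract describes: it separates texts by how function words are distributed, not by what the texts are about.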
Original language: English
Title of host publication: EACL 2014
Subtitle of host publication: 14th Conference of the European Chapter of the Association for Computational Linguistics: proceedings of the conference
Place of Publication: Stroudsburg, PA, USA
Publisher: Association for Computational Linguistics
Pages: 95-99
Number of pages: 5
ISBN (Print): 9781937284787
Publication status: Published - 2014
Event: Conference of the European Chapter of the Association for Computational Linguistics (14th : 2014) - Gothenburg, Sweden
Duration: 26 Apr 2014 to 30 Apr 2014

Conference

Conference: Conference of the European Chapter of the Association for Computational Linguistics (14th : 2014)
City: Gothenburg, Sweden
Period: 26/04/14 to 30/04/14