Local n-grams for author identification: notebook for PAN at CLEF 2013

Robert Layton, Paul Watters, Richard Dazeley

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

1 Citation (Scopus)

Abstract

Our approach to the author identification task applies existing authorship attribution methods based on local n-grams (LNG) and combines them in a weighted ensemble. The approach placed third in this year's competition, using a relatively simple scheme that weights by training set accuracy. LNG models create profiles, each consisting of a list of character n-grams that best represent a particular author's writing. The weighted ensemble improved upon the accuracy of the method without reducing the speed of the algorithm; the submitted solution was not only near the top of the leaderboard in terms of accuracy but was also one of the faster algorithms submitted.
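
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' code. It assumes a profile is the set of an author's most frequent character n-grams, that similarity is simple profile overlap, and that each ensemble member votes with a weight equal to its training set accuracy. All function names and parameters (build_profile, predict, weighted_vote, n, size) are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(text, n):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profile(texts, n, size):
    """Assumed profile: the `size` most frequent character n-grams in texts."""
    counts = Counter()
    for t in texts:
        counts.update(ngrams(t, n))
    return {gram for gram, _ in counts.most_common(size)}


def predict(profiles, text, n, size):
    """Pick the author whose profile shares the most n-grams with the text.

    Overlap counting is an assumed similarity measure, chosen for brevity.
    """
    doc = build_profile([text], n, size)
    return max(sorted(profiles), key=lambda a: len(profiles[a] & doc))


def weighted_vote(predictions, weights):
    """Sum each model's weight onto its predicted author; return the top one."""
    scores = defaultdict(float)
    for author, w in zip(predictions, weights):
        scores[author] += w
    return max(sorted(scores), key=lambda a: scores[a])


def ensemble_predict(train, text, ns=(2, 3), size=20):
    """One model per n-gram length, each weighted by its training accuracy."""
    labelled = [(a, t) for a, ts in train.items() for t in ts]
    predictions, weights = [], []
    for n in ns:
        profiles = {a: build_profile(ts, n, size) for a, ts in train.items()}
        correct = sum(predict(profiles, t, n, size) == a for a, t in labelled)
        weights.append(correct / len(labelled))
        predictions.append(predict(profiles, text, n, size))
    return weighted_vote(predictions, weights)
```

In this sketch each ensemble member is cheap to evaluate (set intersections over small profiles), so adding members costs little extra time, which is consistent in spirit with the speed claim above. The toy data one might pass as train is a dictionary mapping author names to lists of known texts.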

Original language: English
Title of host publication: Proceedings of the CLEF 2013 conference
Editors: Pamela Forner, Roberto Navigli, Dan Tufis, Nicola Ferro
Publisher: CEUR Workshop Proceedings
Number of pages: 4
Publication status: Published - 2013
Externally published: Yes
Event: 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain
Duration: 23 Sept 2013 to 26 Sept 2013

Publication series

Name: CEUR Workshop Proceedings
Publisher: RWTH Aachen University
Volume: 1179
ISSN (Print): 1613-0073

Conference

Conference: 2013 Cross Language Evaluation Forum Conference, CLEF 2013
Country/Territory: Spain
City: Valencia
Period: 23/09/13 to 26/09/13
