Authorship attribution of IRC messages using inverse author frequency

Robert Layton, Stephen McCombie, Paul Watters

    Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

    17 Citations (Scopus)

    Abstract

    Internet Relay Chat (IRC) is a useful and relatively simple protocol for text-based chat online, used in a variety of areas online such as for discussion and technical support. IRC is also used for cybercrime, with online rooms selling stolen credit card details, botnet access and malware. The reasons for the use of IRC in cybercrime include the widespread adoption and ease of use, but also focus around the anonymity granted by the protocol, allowing users to hide behind aliases that can be changed regularly. In this research, we apply authorship analysis techniques to be able to attribute chat messages to known aliases. A preliminary experiment shows that this application is very difficult, due to the short messages and repeated information. To improve the accuracy, we apply inverse-author-frequency (iaf) weighting, which gives higher weights to features used by fewer authors. This research is the first time that iaf has been applied to character n-gram models, previously being applied to word-based models of authorship. We find that this improves the accuracy significantly for the RLP method and provides a platform for successful applications of authorship analysis in the future. Overall, the method achieves accuracies of over 55% in a very difficult application domain.
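
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form used here, iaf(g) = log(N / af(g)) with N authors and af(g) the number of authors whose messages contain n-gram g, is an assumption borrowed from inverse document frequency. The function names, the trigram size and the toy messages are all hypothetical, and the RLP method itself is not reproduced.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a message."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def author_profiles(messages_by_author, n=3):
    """Map each author to a Counter of n-gram counts over all their messages."""
    profiles = {}
    for author, messages in messages_by_author.items():
        counts = Counter()
        for msg in messages:
            counts.update(char_ngrams(msg, n))
        profiles[author] = counts
    return profiles


def iaf_weights(profiles):
    """Assumed form: iaf(g) = log(N / af(g)), where af(g) counts authors using g."""
    n_authors = len(profiles)
    author_freq = Counter()
    for counts in profiles.values():
        author_freq.update(counts.keys())  # each author counts once per n-gram
    return {g: math.log(n_authors / af) for g, af in author_freq.items()}


def weighted_profile(counts, weights):
    """Scale an author's relative n-gram frequencies by the iaf weights."""
    total = sum(counts.values())
    return {g: (c / total) * weights[g] for g, c in counts.items()}


# Toy data (hypothetical aliases and messages).
messages = {
    "alice": ["hello there", "hello again"],
    "bob": ["hey there", "yo"],
}
profiles = author_profiles(messages)
weights = iaf_weights(profiles)
alice_weighted = weighted_profile(profiles["alice"], weights)
```

Under this assumed formula, an n-gram used by every author (such as "the" in the toy data) receives weight zero, while an n-gram used by a single author receives the largest weight, which matches the abstract's description of favouring features used by fewer authors.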

    Original language: English
    Title of host publication: Proceedings - 2012 3rd Cybercrime and Trustworthy Computing Workshop, CTC 2012
    Place of Publication: Piscataway, NJ
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Pages: 7-13
    Number of pages: 7
    ISBN (Electronic): 9780769549408
    ISBN (Print): 9781467364607
    Publication status: Published - 2013
    Event: 2012 3rd Cybercrime and Trustworthy Computing Workshop, CTC 2012 - Ballarat, VIC, Australia
    Duration: 29 Oct 2012 to 30 Oct 2012

    Other

    Other: 2012 3rd Cybercrime and Trustworthy Computing Workshop, CTC 2012
    Country: Australia
    City: Ballarat, VIC
    Period: 29/10/12 to 30/10/12
