Challenges in discriminating profanity from hate speech

Shervin Malmasi, Marcos Zampieri

Research output: Contribution to journal (Article, Research, peer-review)

Abstract

In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of 80% accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
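Two of the ingredients named in the abstract, skip-gram features and classifier ensembles, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `skip_bigrams` and `plurality_vote`, the `max_skip` parameter, the tie-breaking rule and the label strings are all assumptions made for illustration, and plurality voting is only one simple way of combining classifiers (the stacked generalisation mentioned in the abstract trains a further classifier on the base outputs instead).

```python
from collections import Counter
from typing import List, Sequence, Tuple


def skip_bigrams(tokens: Sequence[str], max_skip: int) -> List[Tuple[str, str]]:
    """Return ordered word pairs with at most `max_skip` tokens between them.

    With max_skip=0 this reduces to ordinary adjacent bigrams.
    """
    pairs = []
    for i, left in enumerate(tokens):
        # The right-hand word may sit 1 .. max_skip + 1 positions ahead.
        for j in range(i + 1, min(i + max_skip + 2, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs


def plurality_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the most base classifiers.

    Ties go to the tied label that appears first in `predictions`
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label always has the top count")


if __name__ == "__main__":
    # Hypothetical tokenised post and hypothetical class labels.
    print(skip_bigrams(["you", "are", "so", "annoying"], max_skip=1))
    print(plurality_vote(["hate", "offensive", "hate"]))
```

Skip-grams let a pair such as ("you", "annoying") become a feature even when other words intervene, which contiguous n-grams cannot capture.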

Language: English
Pages: 187-202
Number of pages: 16
Journal: Journal of Experimental and Theoretical Artificial Intelligence
Volume: 30
Issue number: 2
DOI: 10.1080/0952813X.2017.1409284
Publication status: Published - 4 Mar 2018
Externally published: Yes

Keywords

  • bullying
  • classifier ensembles
  • hate speech
  • social media
  • text classification
  • Twitter

Cite this

@article{1a6d863564614a32a4381ca36db32ae9,
title = "Challenges in discriminating profanity from hate speech",
abstract = "In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of 80{\%} accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.",
keywords = "bullying, classifier ensembles, Hate speech, social media, text classification, twitter",
author = "Shervin Malmasi and Marcos Zampieri",
year = "2018",
month = "3",
day = "4",
doi = "10.1080/0952813X.2017.1409284",
language = "English",
volume = "30",
pages = "187--202",
journal = "Journal of Experimental and Theoretical Artificial Intelligence",
issn = "0952-813X",
publisher = "Taylor \& Francis",
number = "2",
}

Challenges in discriminating profanity from hate speech. / Malmasi, Shervin; Zampieri, Marcos.

In: Journal of Experimental and Theoretical Artificial Intelligence, Vol. 30, No. 2, 04.03.2018, p. 187-202.

TY - JOUR

T1 - Challenges in discriminating profanity from hate speech

AU - Malmasi, Shervin

AU - Zampieri, Marcos

PY - 2018/3/4

Y1 - 2018/3/4

AB - In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of 80% accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.

KW - bullying

KW - classifier ensembles

KW - Hate speech

KW - social media

KW - text classification

KW - twitter

UR - http://www.scopus.com/inward/record.url?scp=85038080650&partnerID=8YFLogxK

U2 - 10.1080/0952813X.2017.1409284

DO - 10.1080/0952813X.2017.1409284

M3 - Article

VL - 30

SP - 187

EP - 202

JO - Journal of Experimental and Theoretical Artificial Intelligence

T2 - Journal of Experimental and Theoretical Artificial Intelligence

JF - Journal of Experimental and Theoretical Artificial Intelligence

SN - 0952-813X

IS - 2

ER -