NEHATE: large-scale annotated data shedding light on hate speech in Nepali local election discourse

Surendrabikram Thapa*, Kritesh Rauniyar, Shuvam Shiwakoti, Sweta Poudel, Usman Naseem, Mehwish Nasim

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

25 Citations (Scopus)
129 Downloads (Pure)

Abstract

The use of social media during election campaigns has become increasingly popular. However, the unbridled nature of online discourse can lead to the propagation of hate speech, which has far-reaching implications for the democratic process. Natural Language Processing (NLP) techniques are being used to counteract the spread of hate speech and promote healthy online discourse. Despite the increasing need for NLP techniques to combat hate speech, research on low-resource languages such as Nepali is limited, posing a challenge to the realization of the United Nations' Leave No One Behind principle, which calls for inclusive development that benefits all individuals and communities, regardless of their backgrounds or circumstances. To bridge this gap, we introduce NEHATE, a large-scale manually annotated dataset of hate speech and its targets in Nepali local election discourse. The dataset comprises 13,505 tweets, annotated for hate speech with further sub-categorization of hate speech into targets such as community, individual, and organization. Benchmarking of the dataset with various algorithms has shown potential for performance improvement. We have made the dataset publicly available at https://github.com/shucoll/NEHate to promote further research and development, while also contributing to the UN SDGs aimed at fostering peaceful, inclusive societies, and justice and strong institutions.

Original language: English
Title of host publication: ECAI 2023
Editors: Kobi Gal, Ann Nowé, Grzegorz J. Nalepa, Roy Fairstein, Roxana Rădulescu
Place of Publication: Amsterdam
Publisher: IOS Press
Pages: 2346-2353
Number of pages: 8
ISBN (Electronic): 9781643684376
ISBN (Print): 9781643684369
Publication status: Published - 2023
Externally published: Yes
Event: 26th European Conference on Artificial Intelligence, ECAI 2023 - Krakow, Poland
Duration: 30 Sept 2023 to 4 Oct 2023

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 372
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 26th European Conference on Artificial Intelligence, ECAI 2023
Country/Territory: Poland
City: Krakow
Period: 30/09/23 to 4/10/23

Bibliographical note

Copyright the Author(s) 2023. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
