Abstract
Hate speech consists of content (e.g. text, audio, image) that expresses derogatory sentiments and hatred toward certain people or groups of individuals. The internet, particularly social media and microblogging sites, has become an increasingly popular platform for expressing ideas and opinions. Hate speech is prevalent in both offline and online media, and a substantial proportion of it is presented in different modalities (e.g. text, image, video). Taking into account that hate speech spreads quickly during political events, we present a novel multimodal dataset composed of 5680 text-image pairs of tweet data related to the Russia-Ukraine war, annotated with a binary class: "hate" or "no-hate". The baseline results show that multimodal resources are relevant for leveraging the hateful information from different types of data. The baselines and dataset provided in this paper may support research on multimodal hate speech, particularly during serious conflicts such as wars.
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) |
Place of Publication | Stroudsburg |
Publisher | Association for Computational Linguistics |
Pages | 1-6 |
Number of pages | 6 |
ISBN (Electronic) | 9781959429050 |
Publication status | Published - 2022 |
Externally published | Yes |
Event | 5th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, CASE 2022 - Abu Dhabi, United Arab Emirates. Duration: 7 Dec 2022 → 8 Dec 2022 |
Conference
Conference | 5th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, CASE 2022 |
---|---|
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 7/12/22 → 8/12/22 |