Cross platform multimodal retrieval augmented distillation for code-switched content understanding

Surendrabikram Thapa, Hariram Veeramani, Imran Razzak, Roy Ka Wei Lee, Usman Naseem

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

In the era of digital communication and social media, multimodal content such as code-switched memes has become a ubiquitous form of expression. This phenomenon is especially significant for low-resource languages like Nepali, where the need for sentiment analysis and hate speech detection remains unmet due to the lack of publicly available datasets. To address this gap, we provide ENeMeme, an annotated dataset of 4,211 code-switched Nepali-English memes for sentiment and hate speech. While previous state-of-the-art methods for meme analysis focus primarily on high-resource languages, they fail to perform well on low-resource languages. To bridge this gap, our paper also builds on existing literature to adapt a novel multimodal model, MM-RAD, designed to understand code-switched Nepali-English memes by leveraging both textual and visual content. The model's effectiveness is analyzed across various retrieval platforms. Our proposed MM-RAD demonstrates superior performance in sentiment analysis and hate speech detection compared to individual baseline models. The dataset is available at https://github.com/therealthapa/crossplatform.

Original language: English
Title of host publication: WWW Companion '25
Subtitle of host publication: Companion Proceedings of the ACM Web Conference 2025
Place of publication: New York
Publisher: Association for Computing Machinery
Pages: 2042-2051
Number of pages: 10
ISBN (Electronic): 9798400713316
DOIs
Publication status: Published - 2025
Event: 34th ACM Web Conference, WWW Companion 2025 - Sydney, Australia
Duration: 28 Apr 2025 - 2 May 2025

Conference

Conference: 34th ACM Web Conference, WWW Companion 2025
Country/Territory: Australia
City: Sydney
Period: 28/04/25 - 2/05/25

Bibliographical note

Copyright the Author(s) 2025. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Alternative title of the host publication: "WWW '25: Companion Proceedings of the ACM on Web Conference 2025"; "Companion Proceedings of the ACM Web Conference 2025 (WWW Companion '25), April 28-May 2, 2025, Sydney, NSW, Australia"

Keywords

  • Multimodal Processing
  • Meme Analysis
  • Code-Switched Languages
  • Natural Language Processing

