Tanz-indicator: a novel framework for detection of Perso-Arabic-Scripted Urdu sarcastic opinions

Shabana Gul, Rafi Ullah Khan, Mohib Ullah, Roman Aftab, Abdul Waheed, Tsu-Yang Wu

Research output: Contribution to journal › Article › peer-review

Abstract

Automatic sarcasm detection in textual data is a crucial task in sentiment analysis. The problem is complex because sarcastic comments usually convey the opposite of their literal meaning and depend on context. Detecting sarcasm in comments written in Perso-Arabic-scripted Urdu is even more challenging because few online linguistic resources are available. In this research, we propose Tanz-Indicator, a lexicon-based framework for detecting sarcasm in user comments posted in Perso-Arabic-scripted Urdu. We use a lexicon of over 3000 sarcastic tweets and 100 sarcastic features for experimentation. We also train two machine learning models on the same data to compare the performance of the lexicon-based and machine learning-based approaches. The results show that the lexicon-based model correctly identified 48.5% sarcastic and 23.5% nonsarcastic tweets, with a recall of 69.6% and a precision of 87.9%. The Naïve Bayes and SVM-based machine learning models achieved recall of 20.1% and 24.4% and overall accuracy of 65.2% and 60.1%, respectively.
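
The reported figures for the lexicon-based model can be cross-checked with a little arithmetic. The sketch below is not taken from the paper. It assumes that 48.5% and 23.5% are shares of all evaluated tweets (true positives and true negatives), which the abstract does not state explicitly. Under that assumption, the false-negative and false-positive shares follow from the definitions of recall and precision, and the four shares should sum to roughly 1. The function name `confusion_shares` is hypothetical, and the accuracy and F1 values it yields are derived quantities, not results reported by the authors.

```python
def confusion_shares(tp_share, tn_share, recall, precision):
    """Recover FN and FP shares from the TP share, recall and precision.

    Assumption (not stated in the abstract): tp_share and tn_share are
    fractions of the whole evaluation set.
    """
    # recall = TP / (TP + FN)  =>  FN = TP / recall - TP
    fn_share = tp_share / recall - tp_share
    # precision = TP / (TP + FP)  =>  FP = TP / precision - TP
    fp_share = tp_share / precision - tp_share
    return {"TP": tp_share, "TN": tn_share, "FN": fn_share, "FP": fp_share}


# Figures quoted in the abstract for the lexicon-based model.
shares = confusion_shares(tp_share=0.485, tn_share=0.235,
                          recall=0.696, precision=0.879)

total = sum(shares.values())              # close to 1 if the reading is consistent
accuracy = shares["TP"] + shares["TN"]    # derived, not reported in the abstract
f1 = 2 * 0.879 * 0.696 / (0.879 + 0.696)  # harmonic mean of precision and recall

print({k: round(v, 3) for k, v in shares.items()})
print(round(total, 3), round(accuracy, 3), round(f1, 3))
```

If the shares sum to approximately 1, the assumed reading of the percentages is at least internally consistent with the stated recall and precision. It does not confirm how the authors computed them.
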
Original language: English
Article number: 9151890
Pages (from-to): 1-9
Number of pages: 9
Journal: Wireless Communications and Mobile Computing
Volume: 2022
DOIs
Publication status: Published - 2022
Externally published: Yes

Bibliographical note

Copyright the Author(s) 2022. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
