Which side are you on? Investigating politico-economic bias in Nepali language models

Surendrabikram Thapa, Kritesh Rauniyar, Ehsan Barkhordar, Hariram Veeramani, Usman Naseem

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

Abstract

Language models are trained on vast datasets sourced from the internet, which inevitably contain biases that reflect societal norms, stereotypes, and political inclinations. These biases can manifest in model outputs, influencing a wide range of applications. While there has been extensive research on bias detection and mitigation in large language models (LLMs) for widely spoken languages like English, there is a significant gap when it comes to low-resource languages such as Nepali. This paper addresses this gap by investigating the political and economic biases present in five fill-mask models and eleven generative models trained for the Nepali language. To assess these biases, we translated the Political Compass Test (PCT) into Nepali and evaluated the models’ outputs along social and economic axes. Our findings reveal distinct biases across models, with small LMs showing a right-leaning economic bias, while larger models exhibit more complex political orientations, including left-libertarian tendencies. This study emphasizes the importance of addressing biases in low-resource languages to promote fairness and inclusivity in AI-driven technologies. Our work provides a foundation for future research on bias detection and mitigation in underrepresented languages like Nepali, contributing to the broader goal of creating more ethical AI systems.

Original language: English
Title of host publication: ALTA 2024
Subtitle of host publication: Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association
Editors: Tim Baldwin, Sergio José Rodríguez Méndez, Nicholas Kuo
Place of Publication: Kerrville, TX
Publisher: Association for Computational Linguistics (ACL)
Pages: 104-117
Number of pages: 14
Publication status: Published - Dec 2024
Event: Annual Workshop of the Australasian Language Technology Association (22nd : 2024) - Canberra, Australia
Duration: 2 Dec 2024 to 4 Dec 2024

Conference

Conference: Annual Workshop of the Australasian Language Technology Association (22nd : 2024)
Abbreviated title: ALTA 2024
Country/Territory: Australia
City: Canberra
Period: 2/12/24 to 4/12/24

Bibliographical note

Copyright 2024 Association for Computational Linguistics. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
