Contrastive-based removal of negative information in multimodal emotion analysis

Rui Wang, Yaoyang Wang, Erik Cambria, Xuhui Fan, Xiaohan Yu, Yao Huang, Xiaosong E, Xianxun Zhu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Multimodal sentiment analysis bridges the communication gap between humans and machines by accurately recognizing human emotions. However, existing approaches often focus on synchronizing multimodal data to enhance accuracy, overlooking the critical role of negative information. Negative information refers to noise or content that is inconsistent with the primary emotion label, such as conflicting emotional expressions across modalities and noisy data elements. These issues can significantly compromise the effectiveness of sentiment analysis systems. To address this challenge, we propose a novel method based on contrastive learning that removes irrelevant features within individual modalities, aiming to eliminate negative information in speech, text, and image data. Additionally, we design an enhanced multi-head attention mechanism that integrates the cleansed features into a unified representation for emotion analysis. Experimental evaluations on the CMU-MOSI and CMU-MOSEI datasets demonstrate that our method significantly outperforms existing approaches in sentiment analysis tasks. This method not only improves accuracy but also improves robustness to the diverse and noisy nature of real-world data. The relevant code is available at https://github.com/YaoYangWang/MECAM.
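The abstract outlines two components: a contrastive objective that suppresses negative (label-inconsistent) information within each modality, and a multi-head attention module that fuses the cleansed features. The snippet below is a minimal, hypothetical sketch of that kind of pipeline in PyTorch, not the authors' released MECAM code: the module name `MultimodalFusion`, the function `supervised_contrastive_loss`, the feature dimensions, and the toy training step are all illustrative assumptions.

```python
# Hypothetical sketch: per-modality features are regularized with a supervised
# contrastive loss (pull together samples sharing a sentiment label, push apart
# the rest, which discourages label-inconsistent "negative" information), then
# fused with multi-head attention. Dimensions and names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def supervised_contrastive_loss(feats, labels, temperature=0.1):
    """Contrast samples by sentiment label: same-label pairs are positives."""
    feats = F.normalize(feats, dim=-1)
    sim = feats @ feats.t() / temperature                       # (B, B) similarities
    eye = torch.eye(feats.size(0), dtype=torch.bool, device=feats.device)
    sim = sim.masked_fill(eye, -1e9)                            # exclude self-pairs
    mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float().masked_fill(eye, 0)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_count = mask.sum(1).clamp(min=1)
    return -(mask * log_prob).sum(1).div(pos_count).mean()


class MultimodalFusion(nn.Module):
    """Project text/audio/vision features, then fuse them with multi-head attention."""

    def __init__(self, dims=(300, 74, 35), d_model=128, num_heads=4):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(d, d_model) for d in dims])
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)                       # sentiment score

    def forward(self, text, audio, vision):
        # Treat the three projected modality vectors as a length-3 sequence.
        tokens = torch.stack(
            [p(x) for p, x in zip(self.proj, (text, audio, vision))], dim=1
        )
        fused, _ = self.attn(tokens, tokens, tokens)            # (B, 3, d_model)
        pooled = fused.mean(dim=1)                              # (B, d_model)
        return self.head(pooled), pooled


if __name__ == "__main__":
    B = 8
    text, audio, vision = torch.randn(B, 300), torch.randn(B, 74), torch.randn(B, 35)
    labels = torch.randint(0, 2, (B,))                          # coarse sentiment labels
    model = MultimodalFusion()
    pred, pooled = model(text, audio, vision)
    loss = F.mse_loss(pred.squeeze(-1), labels.float()) \
        + supervised_contrastive_loss(pooled, labels)
    loss.backward()
    print(f"toy combined loss: {loss.item():.3f}")
```

In this toy setup the contrastive term acts on the fused representation; the paper applies its removal step within single modalities before fusion, so a faithful implementation would compute such a term per modality.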

[Graphical abstract]

Original language: English
Article number: 107
Pages (from-to): 1-16
Number of pages: 16
Journal: Cognitive Computation
Volume: 17
Issue number: 3
DOIs:
Publication status: Published - Jun 2025

Keywords

  • Multimodal sentiment analysis
  • Contrastive learning
  • Attention mechanisms
