TY - GEN
T1 - Balancing user-item structure and interaction with large language models and optimal transport for multimedia recommendation
AU - Li, Haodong
AU - Qi, Lianyong
AU - Liu, Weiming
AU - Xu, Xiaolong
AU - Dou, Wanchun
AU - Cao, Yang
AU - Zhang, Xuyun
AU - Beheshti, Amin
AU - Zhou, Xiaokang
PY - 2025
Y1 - 2025
AB - The rapid growth of multimedia content has driven the development of recommender systems. Most previous work focuses on uncovering latent relationships among items to learn better representations. However, this approach does not sufficiently account for user affinities, potentially leading to an imbalance in the structure modeling of users and items. Moreover, the sparsity and imbalance of user-item interactions further hinder effective representation learning. To address these challenges, we propose a framework called BLAST, which BaLances structures and interActions via large language modelS and optimal Transport for multimodal recommendation. Specifically, we utilize large language models to summarize side information and generate user profiles. Based on these profiles, we design an intra- and inter-entity structure balancing module to capture item-item and user-user relationships, integrating these affinities into the final representations. Furthermore, we impose constraints on negative sample selection, augment the training data with false negative items and the optimal transport algorithm, thereby leading to smoother interactions. We evaluate BLAST on three real-world datasets, and the results demonstrate that our method significantly outperforms state-of-the-art baselines, which validates the superiority and effectiveness of BLAST.
UR - https://www.scopus.com/pages/publications/105021835131
U2 - 10.24963/ijcai.2025/337
DO - 10.24963/ijcai.2025/337
M3 - Conference proceeding contribution
AN - SCOPUS:105021835131
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 3027
EP - 3035
BT - IJCAI 2025
A2 - Kwok, James
PB - International Joint Conferences on Artificial Intelligence Organization
CY - Montreal
T2 - 34th International Joint Conference on Artificial Intelligence, IJCAI 2025
Y2 - 16 August 2025 through 22 August 2025
ER -