TY - GEN
T1 - Evaluating user experience in Conversational Recommender Systems
T2 - 37th Australian Conference on Human-Computer Interaction, OZCHI 2025
AU - Mahmud, Raj
AU - Wu, Yufeng
AU - Bin Sawad, Abdullah
AU - Berkovsky, Shlomo
AU - Prasad, Mukesh
AU - Kocaballi, A. Baki
N1 - Copyright the Author(s) 2025. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2025/11/28
Y1 - 2025/11/28
N2 - Conversational Recommender Systems (CRSs) are receiving growing research attention across domains, yet their user experience (UX) evaluation remains limited. Existing reviews largely overlook empirical UX studies, particularly in adaptive and large language model (LLM)-based CRSs. To address this gap, we conducted a systematic review following PRISMA guidelines, synthesising 23 empirical studies published between 2017 and 2025. We analysed how UX has been conceptualised, measured, and shaped by domain, adaptivity, and LLM use. Our findings reveal persistent limitations: post hoc surveys dominate, turn-level affective UX constructs are rarely assessed, and adaptive behaviours are seldom linked to UX outcomes. LLM-based CRSs introduce further challenges, including epistemic opacity and verbosity, yet evaluations infrequently address these issues. We contribute a structured synthesis of UX metrics, a comparative analysis of adaptive and non-adaptive systems, and a forward-looking agenda for LLM-aware UX evaluation. These findings support the development of more transparent, engaging, and user-centred CRS evaluation practices.
AB - Conversational Recommender Systems (CRSs) are receiving growing research attention across domains, yet their user experience (UX) evaluation remains limited. Existing reviews largely overlook empirical UX studies, particularly in adaptive and large language model (LLM)-based CRSs. To address this gap, we conducted a systematic review following PRISMA guidelines, synthesising 23 empirical studies published between 2017 and 2025. We analysed how UX has been conceptualised, measured, and shaped by domain, adaptivity, and LLM use. Our findings reveal persistent limitations: post hoc surveys dominate, turn-level affective UX constructs are rarely assessed, and adaptive behaviours are seldom linked to UX outcomes. LLM-based CRSs introduce further challenges, including epistemic opacity and verbosity, yet evaluations infrequently address these issues. We contribute a structured synthesis of UX metrics, a comparative analysis of adaptive and non-adaptive systems, and a forward-looking agenda for LLM-aware UX evaluation. These findings support the development of more transparent, engaging, and user-centred CRS evaluation practices.
KW - Conversational Recommender Systems
KW - Systematic Review
KW - User Experience Evaluation
UR - https://www.scopus.com/pages/publications/105024849876
U2 - 10.1145/3764687.3764714
DO - 10.1145/3764687.3764714
M3 - Conference proceeding contribution
AN - SCOPUS:105024849876
T3 - OZCHI: Computer-Human Interaction of Australia - Proceedings
SP - 81
EP - 93
BT - OZCHI 2025
A2 - Fredericks, Joel
A2 - Yoo, Soojeong
A2 - Minh Tran, Tram Thi
A2 - Pantidi, Nadia
A2 - Hoang, Thuong
A2 - Hoggenmueller, Marius
A2 - Caldwell, Glenda
A2 - Tag, Benjamin
A2 - Andres, Josh
A2 - Davis, Hilary
A2 - Boden, Marie
A2 - Zhu, Howe
A2 - Harman, Joel
A2 - Rahman, Jessica
PB - Association for Computing Machinery, Inc
CY - Sydney
Y2 - 29 November 2025 through 3 December 2025
ER -