TY - JOUR
T1 - Fine-tuning large language model based explainable recommendation with explainable quality reward
AU - Yang, Mengyuan
AU - Zhu, Mengying
AU - Wang, Yan
AU - Chen, Linxun
AU - Zhao, Yilei
AU - Wang, Xiuyuan
AU - Han, Bing
AU - Zheng, Xiaolin
AU - Yin, Jianwei
PY - 2024
Y1 - 2024
N2 - Large language model-based explainable recommendation (LLM-based ER) systems can provide remarkable human-like explanations and have widely received attention from researchers. However, the original LLM-based ER systems face three low-quality problems in their generated explanations, i.e., lack of personalization, inconsistency, and questionable explanation data. To address these problems, we propose a novel LLM-based ER model denoted as LLM2ER to serve as a backbone and devise two innovative explainable quality reward models for fine-tuning such a backbone in a reinforcement learning paradigm, ultimately yielding a fine-tuned model denoted as LLM2ER-EQR, which can provide high-quality explanations. LLM2ER-EQR can generate personalized, informative, and consistent high-quality explanations learned from questionable-quality explanation datasets. Extensive experiments conducted on three real-world datasets demonstrate that our model can generate fluent, diverse, informative, and highly personalized explanations.
AB - Large language model-based explainable recommendation (LLM-based ER) systems can provide remarkable human-like explanations and have widely received attention from researchers. However, the original LLM-based ER systems face three low-quality problems in their generated explanations, i.e., lack of personalization, inconsistency, and questionable explanation data. To address these problems, we propose a novel LLM-based ER model denoted as LLM2ER to serve as a backbone and devise two innovative explainable quality reward models for fine-tuning such a backbone in a reinforcement learning paradigm, ultimately yielding a fine-tuned model denoted as LLM2ER-EQR, which can provide high-quality explanations. LLM2ER-EQR can generate personalized, informative, and consistent high-quality explanations learned from questionable-quality explanation datasets. Extensive experiments conducted on three real-world datasets demonstrate that our model can generate fluent, diverse, informative, and highly personalized explanations.
UR - http://www.scopus.com/inward/record.url?scp=85189635434&partnerID=8YFLogxK
U2 - 10.1609/aaai.v38i8.28777
DO - 10.1609/aaai.v38i8.28777
M3 - Conference paper
AN - SCOPUS:85189635434
SN - 2374-3468
VL - 38
SP - 9250
EP - 9259
JO - Proceedings of the AAAI Conference on Artificial Intelligence
JF - Proceedings of the AAAI Conference on Artificial Intelligence
IS - 8
T2 - 38th AAAI Conference on Artificial Intelligence, AAAI 2024
Y2 - 20 February 2024 through 27 February 2024
ER -