
Diffusion policies for risk-averse behavior modeling in offline reinforcement learning

Xiaocong Chen*, Siyu Wang, Tong Yu, Lina Yao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data. A central concern in this context is ensuring the safety of the learned policy by quantifying uncertainties associated with various actions and environmental stochasticity. Traditional approaches primarily emphasize mitigating epistemic uncertainty by learning risk-averse policies, often overlooking environmental stochasticity. In this study, we propose an uncertainty-aware distributional offline RL method to simultaneously address both epistemic uncertainty and environmental stochasticity. We propose a model-free offline RL algorithm capable of learning risk-averse policies and characterizing the entire distribution of discounted cumulative rewards, as opposed to merely maximizing the expected value of accumulated discounted returns. Our method is rigorously evaluated through comprehensive experiments in both risk-sensitive and risk-neutral benchmarks, demonstrating its superior performance.

Original language: English
Title of host publication: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subtitle of host publication: conference proceedings
Editors: Christian Laugier, Alessandro Renzaglia, Nikolay Atanasov, Stan Birchfield, Grzegorz Cielniak, Leonardo De Mattos, Laura Fiorini, Philippe Giguère, Kenji Hashimoto, Javier Ibanez-Guzman, Tetsushi Kamegawa, Jinoh Lee, Giuseppe Loianno, Kevin Luck, Hisataka Maruyama, Philippe Martinet, Hadi Moradi, Urbano Nunes, Julien Pettre, Alberto Pretto, Tommaso Ranzani, Arne Rönnau, Silvia Rossi, Elliott Rouse, Fabio Ruggiero, Olivier Simonin, Danwei Wang, Ming Yang, Eiichi Yoshida, Huijing Zhao
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 567-574
Number of pages: 8
ISBN (Electronic): 9798331543938
ISBN (Print): 9798331543945
Publication status: Published - 2025
Event: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025 - Hangzhou, China
Duration: 19 Oct 2025 to 25 Oct 2025

Publication series

ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
Country/Territory: China
City: Hangzhou
Period: 19/10/25 to 25/10/25
