Maximum entropy reinforcement learning with evolution strategies

Longxiang Shi, Shijian Li*, Qian Zheng, Longbing Cao, Long Yang, Gang Pan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

5 Citations (Scopus)

Abstract

Evolution strategies (ES) have recently attracted attention for solving challenging tasks with low computational cost and high scalability. However, ES-based reinforcement learning (RL) methods are well known to suffer from low stability: without careful design, they are sensitive to local optima and unstable during learning. There is therefore an urgent need to improve the stability of ES methods on RL problems. In this paper, we propose a simple yet efficient ES method that stabilizes learning. Specifically, we propose a framework that incorporates maximum entropy reinforcement learning into evolution strategies, and we derive an efficient entropy calculation method for linear policies. Based on this framework, we further present a practical algorithm, maximum entropy evolution policy search, which is efficient and stable for policy search in continuous control. Our algorithm shows high stability across different random seeds and achieves performance comparable to some existing derivative-free RL methods on several well-known MuJoCo robotic control benchmark tasks.
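The general idea of pairing an ES update with an entropy bonus can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details are not given in the abstract. It assumes a diagonal Gaussian action distribution (whose entropy has a closed form), a toy one-step reward in place of a MuJoCo task, and a standard antithetic ES gradient estimator. All names (`gaussian_entropy`, `objective`, `es_gradient`, `run_demo`) and all hyperparameter values are hypothetical.

```python
import math
import random


def gaussian_entropy(log_std):
    """Differential entropy of a diagonal Gaussian given its log standard deviations."""
    d = len(log_std)
    return 0.5 * d * math.log(2.0 * math.pi * math.e) + sum(log_std)


def objective(theta, target, alpha):
    """Toy stand-in (an assumption) for 'expected return + alpha * policy entropy'."""
    d = len(target)
    mean, log_std = theta[:d], theta[d:]
    # Closed form of E[-(a - target)^2] for a ~ N(mean, diag(std^2)).
    expected_reward = -sum((m - t) ** 2 for m, t in zip(mean, target))
    expected_reward -= sum(math.exp(2.0 * s) for s in log_std)
    return expected_reward + alpha * gaussian_entropy(log_std)


def es_gradient(f, theta, sigma, n_pairs, rng):
    """Antithetic ES estimate of the gradient of f at theta (derivative-free)."""
    grad = [0.0] * len(theta)
    for _ in range(n_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = f([t + sigma * e for t, e in zip(theta, eps)])
        minus = f([t - sigma * e for t, e in zip(theta, eps)])
        for i, e in enumerate(eps):
            grad[i] += (plus - minus) * e
    return [g / (2.0 * sigma * n_pairs) for g in grad]


def run_demo(steps=300, sigma=0.1, lr=0.05, n_pairs=20, alpha=0.5, seed=0):
    """Run gradient ascent on the entropy-regularized toy objective."""
    rng = random.Random(seed)
    target = [1.0, -1.0]
    theta = [0.0] * (2 * len(target))  # [mean..., log_std...]

    def f(th):
        return objective(th, target, alpha)

    initial_value = f(theta)
    for _ in range(steps):
        grad = es_gradient(f, theta, sigma, n_pairs, rng)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return initial_value, f(theta), theta


if __name__ == "__main__":
    before, after, theta = run_demo()
    print("objective before:", before)
    print("objective after:", after)
    print("theta:", theta)
```

In this toy setting the entropy term keeps the action variance from collapsing to zero: the regularized optimum has per-dimension variance `alpha / 2` rather than a deterministic policy. Whether the paper's method behaves the same way on real control tasks cannot be inferred from this sketch.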

Original language: English
Title of host publication: 2020 International Joint Conference on Neural Networks (IJCNN)
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Electronic): 9781728169262
ISBN (Print): 9781728169279
Publication status: Published - 2020
Externally published: Yes
Event: 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Virtual, Glasgow, United Kingdom
Duration: 19 Jul 2020 → 24 Jul 2020

Publication series

Name
ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2020 International Joint Conference on Neural Networks, IJCNN 2020
Country/Territory: United Kingdom
City: Virtual, Glasgow
Period: 19/07/20 → 24/07/20
