Computer Science > Machine Learning

arXiv:2408.09974 (cs)

[Submitted on 19 Aug 2024]

Title: The Exploration-Exploitation Dilemma Revisited: An Entropy Perspective

Authors: Renye Yan, Yaozhong Gan, You Wu, Ling Liang, Junliang Xing, Yimao Cai, Ru Huang

Abstract: The imbalance between exploration and exploitation has long been a significant challenge in reinforcement learning. In policy optimization, excessive reliance on exploration reduces learning efficiency, while over-dependence on exploitation can trap agents in local optima. This paper revisits the exploration-exploitation dilemma from the perspective of entropy by revealing the relationship between entropy and the dynamic adaptive process of exploration and exploitation. Based on this theoretical insight, we establish an end-to-end adaptive framework called AdaZero, which automatically determines whether to explore or to exploit, as well as the balance of strength between the two. Experiments show that AdaZero significantly outperforms baseline models across various Atari and MuJoCo environments with only a single setting. In particular, in the challenging Montezuma environment, AdaZero boosts the final returns by up to fifteen times. Moreover, we conduct a series of visualization analyses to reveal the dynamics of our self-adaptive mechanism, demonstrating how entropy reflects and changes with the agent's performance and adaptive process.

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2408.09974 [cs.LG]
	(or arXiv:2408.09974v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2408.09974

Submission history

From: Yaozhong Gan
[v1] Mon, 19 Aug 2024 13:21:46 UTC (6,612 KB)
