Showing 1–4 of 4 results for author: Panahi, P M

  1. arXiv:2407.09702  [pdf, other]

    cs.LG cs.AI

    Investigating the Interplay of Prioritized Replay and Generalization

    Authors: Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White

    Abstract: Experience replay, the reuse of past data to improve sample efficiency, is ubiquitous in reinforcement learning. Though a variety of smart sampling schemes have been introduced to improve performance, uniform sampling by far remains the most common approach. One exception is Prioritized Experience Replay (PER), where sampling is done proportionally to TD errors, inspired by the success of prioriti…

    Submitted 19 October, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: Published in the Reinforcement Learning Conference 2024

  2. arXiv:2406.01562  [pdf, other]

    cs.LG cs.AI

    A New View on Planning in Online Reinforcement Learning

    Authors: Kevin Roice, Parham Mohammad Panahi, Scott M. Jordan, Adam White, Martha White

    Abstract: This paper investigates a new approach to model-based reinforcement learning using background planning: mixing (approximate) dynamic programming updates and model-free updates, similar to the Dyna architecture. Background planning with learned models is often worse than model-free alternatives, such as Double DQN, even though the former uses significantly more memory and computation. The fundament…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Published in the Planning and Reinforcement Learning Workshop at ICAPS 2024. arXiv admin note: text overlap with arXiv:2206.02902

  3. arXiv:2404.02113  [pdf, other]

    cs.LG

    K-percent Evaluation for Lifelong RL

    Authors: Golnaz Mesbahi, Parham Mohammad Panahi, Olya Mastikhina, Martha White, Adam White

    Abstract: In continual or lifelong reinforcement learning, access to the environment should be limited. If we aspire to design algorithms that can run for long periods, continually adapting to new, unexpected situations, then we must be willing to deploy our agents without tuning their hyperparameters over the agent's entire lifetime. The standard practice in deep RL, and even continual RL, is to assume unf…

    Submitted 25 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  4. arXiv:2206.02902  [pdf, other]

    cs.LG cs.AI

    Goal-Space Planning with Subgoal Models

    Authors: Chunlok Lo, Kevin Roice, Parham Mohammad Panahi, Scott Jordan, Adam White, Gabor Mihucz, Farzane Aminmansour, Martha White

    Abstract: This paper investigates a new approach to model-based reinforcement learning using background planning: mixing (approximate) dynamic programming updates and model-free updates, similar to the Dyna architecture. Background planning with learned models is often worse than model-free alternatives, such as Double DQN, even though the former uses significantly more memory and computation. The fundament…

    Submitted 27 February, 2024; v1 submitted 6 June, 2022; originally announced June 2022.