Showing 1–3 of 3 results for author: Kostas, J E

  1. arXiv:2305.09838  [pdf, other]

    cs.LG cs.AI

    Coagent Networks: Generalized and Scaled

    Authors: James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

    Abstract: Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpropagation-based deep learning (BDL) that overcomes some of backpropagation's main limitations. For example, coagent networks can compute different par…

    Submitted 16 May, 2023; originally announced May 2023.

  2. arXiv:2112.05812  [pdf, other]

    cs.LG

    Edge-Compatible Reinforcement Learning for Recommendations

    Authors: James E. Kostas, Philip S. Thomas, Georgios Theocharous

    Abstract: Most reinforcement learning (RL) recommendation systems designed for edge computing must either synchronize during recommendation selection or depend on an unprincipled patchwork collection of algorithms. In this work, we build on asynchronous coagent policy gradient algorithms [Kostas et al., 2020] to propose a principled solution to this problem. The class of algorithms that we propose…

    Submitted 10 August, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

  3. arXiv:1902.05650  [pdf, other]

    cs.LG stat.ML

    Asynchronous Coagent Networks

    Authors: James E. Kostas, Chris Nota, Philip S. Thomas

    Abstract: Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hiera…

    Submitted 10 August, 2020; v1 submitted 14 February, 2019; originally announced February 2019.

    Comments: Updated version