Recovering generalized homology from Floer homology: the complex oriented case

Authors: Laurent Côté, Yusuf Barış Kartal

Abstract: We associate an invariant called the completed Tate cohomology to a filtered circle equivariant spectrum and a complex oriented cohomology theory. We show that when the filtered spectrum is the spectral symplectic cohomology of a Liouville manifold, this invariant depends only on the stable homotopy type of the underlying manifold. We make explicit computations for several complex oriented cohomology theories, including Eilenberg-MacLane spectra, Morava K-theories, their integral counterparts, and complex K-theory. We show that the result for Eilenberg-MacLane spectra depends only on the rational homology, and we use the computations for Morava K-theory to recover the integral homology (as an ungraded group). In a different direction, we use the completed Tate cohomology computations for the complex K-theory to recover the complex K-theory of the underlying manifold from its equivariant filtered Floer homotopy type. A key Floer theoretic input is the computation of local equivariant Floer theory near the orbit of an autonomous Hamiltonian, which may be of independent interest from the perspective of dynamics.

Submitted 13 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: 25 pages, 1 Figure. Minor changes. Comments are welcome

arXiv:2309.15089 [pdf, other]

Equivariant Floer homotopy via Morse-Bott theory

Authors: Laurent Côté, Yusuf Barış Kartal

Abstract: We generalize the Cohen-Jones-Segal construction to the Morse-Bott setting. In other words, we define framings for Morse-Bott analogues of flow categories and associate a stable homotopy type to this data. We use this to recover the stable homotopy type of a closed manifold from Morse-Bott theory, and the stable equivariant homotopy type of a closed manifold with the action of a compact Lie group from Morse theory. We use this machinery in Floer theory to construct a genuine circle equivariant model for symplectic cohomology with coefficients in the sphere spectrum. Using the formalism of relative modules, we define equivariant maps to (Thom spectra over) the free loop space of exact, compact Lagrangians. We prove that this map is an equivalence of Borel equivariant spectra when the Lagrangian is the zero section of a cotangent bundle -- an equivariant Viterbo isomorphism theorem over the sphere spectrum.

Submitted 6 March, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 51 pages, 2 figures. Comments are welcome

arXiv:2203.07655 [pdf, ps, other]

Joint Time-Vertex Fractional Fourier Transform

Authors: Tuna Alikaşifoğlu, Bünyamin Kartal, Eray Özgünay, Aykut Koç

Abstract: Graph signal processing (GSP) facilitates the analysis of high-dimensional data on non-Euclidean domains by utilizing graph signals defined on graph vertices. In addition to static data, each vertex can provide continuous time-series signals, transforming graph signals into time-series signals on each vertex. The joint time-vertex Fourier transform (JFT) framework offers spectral analysis capabilities to analyze these joint time-vertex signals. Analogous to the fractional Fourier transform (FRT) extending the ordinary Fourier transform (FT), we introduce the joint time-vertex fractional Fourier transform (JFRT) as a generalization of JFT. The JFRT enables fractional analysis for joint time-vertex processing by extending Fourier analysis to fractional orders in both temporal and vertex domains. We theoretically demonstrate that JFRT generalizes JFT and maintains properties such as index additivity, reversibility, reduction to identity, and unitarity for specific graph topologies. Additionally, we derive Tikhonov regularization-based denoising in the JFRT domain, ensuring robust and well-behaved solutions. Comprehensive numerical experiments on synthetic and real-world datasets highlight the effectiveness of JFRT in denoising and clustering tasks, where it outperforms state-of-the-art approaches.

Submitted 10 July, 2024; v1 submitted 15 March, 2022; originally announced March 2022.

Comments: 30 pages, 8 figures

arXiv:2109.12256 [pdf, other]

Algebraic sheaves of Floer homology groups via algebraic torus actions on the Fukaya category

Authors: Yusuf Barış Kartal

Abstract: Let $(M,ω_M)$ be a monotone or negatively monotone symplectic manifold, or a Weinstein manifold. One can construct an "action" of $H^1(M,\mathbb{G}_m)$ on the Fukaya category (wrapped Fukaya category in the exact case) that reflects the action of $Symp^0(M,ω_M)$ on the set of Lagrangian branes. A priori this action is only analytic. The purpose of this work is to show the algebraicity of this action under some assumptions. We use this to prove a tameness result for the sheaf of Lagrangian Floer homology groups obtained by moving one of the Lagrangians via global symplectic isotopies. We also show the algebraicity of the locus of $z\in H^1(M,\mathbb{G}_m)$ that fix a Lagrangian brane in the Fukaya category. The latter has applications to Lagrangian flux. Finally, we prove a statement in mirror symmetry: in the Weinstein case, assume that $M$ is mirror to an affine or projective variety $X$, that there exists an exact Lagrangian torus $L\subset M$ such that $H^1(M)\to H^1(L)$ is surjective, and that $L$ is sent to a smooth point $x\in X$ under the mirror equivalence. Then we construct a Zariski chart of $X$ containing $x$, that is isomorphic to $H^1(L,\mathbb{G}_m)$, and such that other points of this chart correspond to non-exact deformations of $L$ (possibly equipped with unitary local systems). In particular, this implies rationality of the irreducible component containing $x$; however, it is stronger. Under our assumptions, one can construct an algebraic action of $H^1(M,\mathbb{G}_m)$, namely the action by non-unitary local systems.
By combining techniques from family Floer homology and non-commutative geometry, we prove that this action coincides with the geometric action mentioned in the first paragraph. We use this to deduce the theorems above.

Submitted 24 September, 2021; originally announced September 2021.

Comments: 45 pages, 4 figures. Comments are welcome!

arXiv:2109.11094 [pdf, other]

PredictionNet: Real-Time Joint Probabilistic Traffic Prediction for Planning, Control, and Simulation

Authors: Alexey Kamenev, Lirui Wang, Ollin Boer Bohan, Ishwar Kulkarni, Bilal Kartal, Artem Molchanov, Stan Birchfield, David Nistér, Nikolai Smolyanskiy

Abstract: Predicting the future motion of traffic agents is crucial for safe and efficient autonomous driving. To this end, we present PredictionNet, a deep neural network (DNN) that predicts the motion of all surrounding traffic agents together with the ego-vehicle's motion. All predictions are probabilistic and are represented in a simple top-down rasterization that allows an arbitrary number of agents. Conditioned on a multi-layer map with lane information, the network outputs future positions, velocities, and backtrace vectors jointly for all agents including the ego-vehicle in a single pass. Trajectories are then extracted from the output. The network can be used to simulate realistic traffic, and it produces competitive results on popular benchmarks. More importantly, it has been used to successfully control a real-world vehicle for hundreds of kilometers, by combining it with a motion planning/control subsystem. The network runs faster than real-time on an embedded GPU, and the system shows good generalization (across sensory modalities and locations) due to the choice of input representation. Furthermore, we demonstrate that by extending the DNN with reinforcement learning (RL), it can better handle rare or unsafe events like aggressive maneuvers and crashes.

Submitted 19 May, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

Comments: 7 pages, 7 figures, accepted to ICRA 2022 conference, for associated video file, see https://youtu.be/C7Nb3DRjFP0

MSC Class: 68T07 (Primary) 68T37; 68T40 (Secondary)

ACM Class: I.2.9; I.2.6; I.6

arXiv:2108.05938 [pdf, other]

Categorical action filtrations via localization and the growth as a symplectic invariant

Authors: Laurent Côté, Yusuf Barış Kartal

Abstract: We develop a purely categorical theory of action filtrations and their associated growth invariants. When specialized to categories of geometric interest, such as the wrapped Fukaya category of a Weinstein manifold, and the bounded derived category of coherent sheaves on a smooth algebraic variety, our categorical action filtrations essentially recover previously studied filtrations of geometric origin. Our approach is built around the notion of a smooth categorical compactification. We prove that a smooth categorical compactification induces well-defined growth invariants, which are moreover preserved under zig-zags of such compactifications. The technical heart of the paper is a method for computing these growth invariants in terms of the growth of certain colimits of (bi)modules. In practice, such colimits arise in both geometric settings of interest. The main applications are: (1) A "quantitative" refinement of homological mirror symmetry, which relates the growth of the Reeb-length filtration on the symplectic geometry side with the growth of filtrations on the algebraic geometry side defined by the order of pole at infinity (often these can be expressed in terms of the dimension of the support of sheaves). (2) A proof that the Reeb-length growth of symplectic cohomology and wrapped Floer cohomology on a Weinstein manifold are at most exponential. (3) Lower bounds for the entropy and polynomial entropy of certain natural endofunctors acting on Fukaya categories.

Submitted 18 May, 2023; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: 45 pages, 3 figures

MSC Class: 2010 Mathematics Subject Classification. Primary 53D37; Secondary 18E30; 14F05; 53D40

arXiv:2008.08566 [pdf, other]

Iterations of symplectomorphisms and p-adic analytic actions on Fukaya category

Authors: Yusuf Barış Kartal

Abstract: Inspired by the work of Bell on the dynamical Mordell-Lang conjecture, and by family Floer homology, we construct $p$-adic analytic families of bimodules on the Fukaya category of a monotone or negatively monotone symplectic manifold, interpolating the bimodules corresponding to iterates of a symplectomorphism $φ$ isotopic to the identity. We consider this family as a $p$-adic analytic action on the Fukaya category. Using this, we deduce that the ranks of Floer homology groups $HF(φ^k(L),L';Λ)$ are constant in $k\in\mathbb{Z}$, with finitely many possible exceptions. We also prove an analogous result without the monotonicity assumption for generic $φ$ isotopic to the identity by showing how to construct a $p$-adic analytic action in this case.

Submitted 25 September, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

Comments: Real version is removed and some comments are added to the introduction. 48 pages, 7 figures. Comments are welcome

arXiv:2004.00600 [pdf, other]

Work in Progress: Temporally Extended Auxiliary Tasks

Authors: Craig Sherstan, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

Abstract: Predictive auxiliary tasks have been shown to improve performance in numerous reinforcement learning works; however, this effect is still not well understood. The primary purpose of the work presented here is to investigate the impact that an auxiliary task's prediction timescale has on the agent's policy performance. We consider auxiliary tasks which learn to make on-policy predictions using temporal difference learning. We test the impact of prediction timescale using a specific form of auxiliary task in which the input image is used as the prediction target, which we refer to as temporal difference autoencoders (TD-AE). We empirically evaluate the effect of TD-AE on the A2C algorithm in the VizDoom environment using different prediction timescales. While we do not observe a clear relationship between the prediction timescale and performance, we make the following observations: 1) using auxiliary tasks allows us to reduce the trajectory length of the A2C algorithm, 2) in some cases temporally extended TD-AE performs better than a straight autoencoder, 3) performance with auxiliary tasks is sensitive to the weight placed on the auxiliary loss, 4) despite this sensitivity, auxiliary tasks improved performance without extensive hyper-parameter tuning. Our overall conclusions are that TD-AE increases the robustness of the A2C algorithm to the trajectory length and while promising, further study is required to fully understand the relationship between auxiliary task prediction timescale and the agent's performance.

Submitted 16 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: Accepted for the Adaptive and Learning Agents (ALA) Workshop at AAMAS 2020

arXiv:1907.11788 [pdf, other]

On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman

Authors: Chao Gao, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

Abstract: How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL --- past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the environment's complexity. In this paper, we illuminate reasons behind this failure by providing a thorough analysis on the hardness of random exploration in Pommerman. While model-free random exploration is typically futile, we develop a model-based automatic reasoning module that can be used for safer exploration by pruning actions that will surely lead the agent to death. We empirically demonstrate that this module can significantly improve learning.

Submitted 26 July, 2019; originally announced July 2019.

Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2019

arXiv:1907.11703 [pdf, other]

Action Guidance with MCTS for Deep Reinforcement Learning

Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

Abstract: Deep reinforcement learning has achieved great successes in recent years; however, one main challenge is the sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently-proposed multi-agent benchmark of Pommerman. We propose a new framework where even a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.

Submitted 25 July, 2019; originally announced July 2019.

Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with arXiv:1904.05759, arXiv:1812.00045

arXiv:1907.10827 [pdf, other]

Terminal Prediction as an Auxiliary Task for Deep Reinforcement Learning

Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

Abstract: Deep reinforcement learning has achieved great successes in recent years, but there are still open challenges, such as convergence to locally optimal policies and sample inefficiency. In this paper, we contribute a novel self-supervised auxiliary task, i.e., Terminal Prediction (TP), estimating temporal closeness to terminal states for episodic tasks. The intuition is to help representation learning by letting the agent predict how close it is to a terminal state, while learning its control policy. Although TP could be integrated with multiple algorithms, this paper focuses on Asynchronous Advantage Actor-Critic (A3C) and on demonstrating the advantages of A3C-TP. Our extensive evaluation includes: a set of Atari games, the BipedalWalker domain, and a mini version of the recently proposed multi-agent Pommerman game. Our results on Atari games and the BipedalWalker domain suggest that A3C-TP outperforms standard A3C in most of the tested domains and in others it has similar performance. In Pommerman, our proposed method provides significant improvement both in learning efficiency and converging to better policies against different opponents.

Submitted 24 July, 2019; originally announced July 2019.

Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: text overlap with arXiv:1812.00045

arXiv:1907.09597 [pdf, other]

Agent Modeling as Auxiliary Task for Deep Reinforcement Learning

Authors: Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

Abstract: In this paper we explore how actor-critic methods in deep reinforcement learning, in particular Asynchronous Advantage Actor-Critic (A3C), can be extended with agent modeling. Inspired by recent works on representation learning and multiagent deep reinforcement learning, we propose two architectures to perform agent modeling: the first one based on parameter sharing, and the second one based on agent policy features. Both architectures aim to learn other agents' policies as auxiliary tasks, besides the standard actor (policy) and critic (values). We performed experiments in both cooperative and competitive domains. The former is a problem of coordinated multiagent object transportation and the latter is a two-player mini version of the Pommerman game. Our results show that the proposed architectures stabilize learning and outperform the standard A3C architecture when learning a best response in terms of expected rewards.

Submitted 22 July, 2019; originally announced July 2019.

Comments: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19)

arXiv:1907.01156 [pdf, other]

doi 10.2140/gt.2021.25.1551

Distinguishing open symplectic mapping tori via their wrapped Fukaya categories

Authors: Yusuf Barış Kartal

Abstract: In this paper, we present partial results towards a classification of symplectic mapping tori using dynamical properties of wrapped Fukaya categories. More precisely, we construct a symplectic manifold $T_φ$ associated to a Weinstein domain $M$, and an exact, compactly supported symplectomorphism $φ$. $T_φ$ is another Weinstein domain and its contact boundary is independent of $φ$. In this paper, we distinguish $T_φ$ from $T_{1_M}$, under certain assumptions (Theorem 1.1). As an application, we obtain pairs of diffeomorphic Weinstein domains with the same contact boundary and whose symplectic cohomology groups are the same, as vector spaces, but that are different as Liouville domains. To our knowledge, this is the first example of such pairs that can be distinguished by their wrapped Fukaya category. Previously, we have suggested a categorical model $M_φ$ for the wrapped Fukaya category $\mathcal{W}(T_φ)$, and we have distinguished $M_φ$ from the mapping torus category of the identity. In this paper, we prove $\mathcal{W}(T_φ)$ and $M_φ$ are derived equivalent (Theorem 1.9); hence, deducing the promised Theorem 1.1. Theorem 1.9 is of independent interest as it preludes an algebraic description of wrapped Fukaya categories of locally trivial symplectic fibrations as twisted tensor products.

Submitted 10 July, 2021; v1 submitted 2 July, 2019; originally announced July 2019.

Comments: Published at Geometry & Topology. 66 pages, 24 figures

MSC Class: 53D37 (Primary); 16E45; 18G99; 53D40 (Secondary)

Journal ref: Geom. Topol. 25 (2021) 1551-1630

arXiv:1905.01360 [pdf, other]

Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition

Authors: Chao Gao, Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

Abstract: The Pommerman Team Environment is a recently proposed benchmark which involves a multi-agent domain with challenges such as partial observability, decentralized execution (without communication), and very sparse and delayed rewards. The inaugural Pommerman Team Competition held at NeurIPS 2018 hosted 25 participants who submitted a team of 2 agents. Our submission nn_team_skynet955_skynet955 won 2nd place in the "learning agents" category. Our team is composed of 2 neural networks trained with state of the art deep reinforcement learning algorithms and makes use of concepts like reward shaping, curriculum learning, and an automatic reasoning module for action pruning. Here, we describe these elements and additionally we present a collection of open-sourced agents that can be used for training and testing in the Pommerman environment. Code available at: https://github.com/BorealisAI/pommerman-baseline

Submitted 20 April, 2019; originally announced May 2019.

Comments: 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making

arXiv:1904.05759 [pdf, other]

Safer Deep RL with Shallow MCTS: A Case Study in Pommerman

Authors: Bilal Kartal, Pablo Hernandez-Leal, Chao Gao, Matthew E. Taylor

Abstract: Safe reinforcement learning has many variants and it is still an open research problem. Here, we focus on how to use action guidance by means of a non-expert demonstrator to avoid catastrophic events in a domain with sparse, delayed, and deceptive rewards: the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for reinforcement learning (RL) --- past work has shown that model-free RL algorithms fail to achieve significant learning. In this paper, we shed light on the reasons behind this failure by exemplifying and analyzing the high rate of catastrophic events (i.e., suicides) that happen under random exploration in this domain. While model-free random exploration is typically futile, we propose a new framework where even a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated into asynchronous distributed deep reinforcement learning methods. Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.

Submitted 10 April, 2019; originally announced April 2019.

Comments: Adaptive Learning Agents (ALA) Workshop at AAMAS 2019. arXiv admin note: substantial text overlap with arXiv:1812.00045

arXiv:1812.00045 [pdf, other]

Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL

Authors: Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor

Abstract: Deep reinforcement learning (DRL) has achieved great successes in recent years with the help of novel methods and higher compute power. However, there are still several challenges to be addressed such as convergence to locally optimal policies and long training times. In this paper, firstly, we augment the Asynchronous Advantage Actor-Critic (A3C) method with a novel self-supervised auxiliary task, i.e., Terminal Prediction, measuring temporal closeness to terminal states, namely A3C-TP. Secondly, we propose a new framework where planning algorithms such as Monte Carlo tree search or other sources of (simulated) demonstrators can be integrated into asynchronous distributed DRL methods. Compared to vanilla A3C, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.

Submitted 30 November, 2018; originally announced December 2018.

Comments: 9 pages, 6 figures, To appear at AAAI-19 Workshop on Reinforcement Learning in Games

arXiv:1810.05587 [pdf, other]

doi 10.1007/s10458-019-09421-1

A Survey and Critique of Multiagent Deep Reinforcement Learning

Authors: Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

Abstract: Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be addressed. The primary goal of this article is to provide a clear overview of current multiagent deep reinforcement learning (MDRL) literature. Additionally, we complement the overview with a broader analysis: (i) we revisit previous key components, originally presented in MAL and RL, and highlight how they have been adapted to multiagent deep reinforcement learning settings. (ii) We provide general guidelines to new practitioners in the area: describing lessons learned from MDRL works, pointing to recent benchmarks, and outlining open avenues of research. (iii) We take a more critical tone raising practical challenges of MDRL (e.g., implementation and computational demands). We expect this article will help unify and motivate future research to take advantage of the abundant literature that exists (e.g., RL and MAL) in a joint effort to promote fruitful research in the multiagent community.

Submitted 30 August, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

Comments: Under review since Oct 2018. Earlier versions of this work had the title: "Is multiagent deep reinforcement learning the answer or the question? A brief survey"

arXiv:1809.04046 [pdf, other]

Dynamical invariants of mapping torus categories

Authors: Yusuf Barış Kartal

Abstract: This paper describes constructions in homological algebra that are part of a strategy whose goal is to understand and classify symplectic mapping tori. More precisely, given a dg category and an auto-equivalence, satisfying certain assumptions, we introduce a category $M_φ$, called the mapping torus category, that describes the wrapped Fukaya category of an open symplectic mapping torus. Then we define a family of bimodules on a natural deformation of $M_φ$, uniquely characterize it and using this, we distinguish $M_φ$ from the mapping torus category of the identity. The equivalence of $M_φ$ with the wrapped Fukaya category is proven in a different paper (arXiv:1907.01156).

Submitted 10 July, 2021; v1 submitted 11 September, 2018; originally announced September 2018.

Comments: Accepted for publication in Advances in Mathematics, 83 pages, 6 figures

Showing 1–18 of 18 results for author: Kartal, B