-
Study of charging-up of PCB planes for neutrino experiment readout
Authors:
B. Baibussinov,
M. Bettini,
F. Fabris,
A. Guglielmi,
S. Marchini,
G. Meng,
M. Nicoletto,
F. Pietropaolo,
G. Rampazzo,
R. Triozzi,
F. Varanini
Abstract:
The use of double-faced, metallized, perforated PCB planes, segmented into strips for the anodic read-out of ionization signals, is emerging as a promising technology for liquid argon TPCs used in large-volume detectors. As a proof of concept, a prototype liquid argon TPC hosting this new anode configuration, based on single-side perforated PCB planes, has been constructed and exposed to cosmic rays at LNL in Italy. Tests were performed with both the metallized and insulating sides of the anode facing the drift volume, providing the first evidence of the focusing effect on drift electron trajectories through the PCB holes due to charge accumulation on the insulator surface.
Submitted 26 July, 2024;
originally announced July 2024.
-
Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Authors:
Matteo Bettini,
Ryan Kortvelesy,
Amanda Prorok
Abstract:
The study of behavioral diversity in Multi-Agent Reinforcement Learning (MARL) is a nascent yet promising field. In this context, the present work deals with the question of how to control the diversity of a multi-agent system. With no existing approaches to control diversity to a set value, current solutions focus on blindly promoting it via intrinsic rewards or additional loss functions, effectively changing the learning objective and lacking a principled measure for it. To address this, we introduce Diversity Control (DiCo), a method able to control diversity to an exact value of a given metric by representing policies as the sum of a parameter-shared component and dynamically scaled per-agent components. By applying constraints directly to the policy architecture, DiCo leaves the learning objective unchanged, enabling its applicability to any actor-critic MARL algorithm. We theoretically prove that DiCo achieves the desired diversity, and we provide several experiments, both in cooperative and competitive tasks, that show how DiCo can be employed as a novel paradigm to increase performance and sample efficiency in MARL. Multimedia results are available on the paper's website: https://sites.google.com/view/dico-marl.
Submitted 23 May, 2024;
originally announced May 2024.
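As a rough illustration of the DiCo idea described in the abstract above (policies represented as a parameter-shared component plus dynamically scaled per-agent components), the following Python sketch rescales per-agent deviations so that a diversity metric reaches a chosen value. It is a minimal sketch under stated assumptions, not the authors' implementation: the metric used here (mean pairwise Euclidean distance between the agents' outputs for a single observation) is a simplified stand-in, the scaling rule (target divided by current diversity) is an assumption, and all function names are hypothetical.

```python
import math
from itertools import combinations


def mean_pairwise_distance(outputs):
    """Mean Euclidean distance between the outputs of every pair of agents.

    Simplified stand-in for a behavioral diversity metric (assumption).
    """
    pairs = list(combinations(outputs, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def diversity_controlled_actions(shared, per_agent, target):
    """Return one action per agent: shared + scale * per-agent deviation.

    `shared` is the output of the parameter-shared component, `per_agent`
    holds each agent's own deviation, and `target` is the desired diversity.
    Assumes the deviations are not all identical (non-zero current diversity).
    """
    current = mean_pairwise_distance(per_agent)
    scale = target / current  # dynamic scaling factor (assumed rule)
    return [
        [s + scale * d for s, d in zip(shared, deviation)]
        for deviation in per_agent
    ]


# Hypothetical outputs for three agents acting in a 2D action space.
shared_output = [0.5, -0.2]
deviations = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
actions = diversity_controlled_actions(shared_output, deviations, target=0.3)
achieved = mean_pairwise_distance(actions)  # equals the target up to rounding
```

Because the shared component cancels in every pairwise difference, scaling the deviations scales this metric linearly, which is why the target is met without touching any loss or reward term in this sketch.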
-
The Cambridge RoboMaster: An Agile Multi-Robot Research Platform
Authors:
Jan Blumenkamp,
Ajay Shankar,
Matteo Bettini,
Joshua Bird,
Amanda Prorok
Abstract:
Compact robotic platforms with powerful compute and actuation capabilities are key enablers for practical, real-world deployments of multi-agent research. This article introduces a tightly integrated hardware, control, and simulation software stack on a fleet of holonomic ground robot platforms designed with this motivation. Our robots, a fleet of customised DJI Robomaster S1 vehicles, offer a balance between small robots that do not possess sufficient compute or actuation capabilities and larger robots that are unsuitable for indoor multi-robot tests. They run a modular ROS2-based optimal estimation and control stack for full onboard autonomy, contain ad-hoc peer-to-peer communication infrastructure, and can zero-shot run multi-agent reinforcement learning (MARL) policies trained in our vectorized multi-agent simulation framework. We present an in-depth review of other platforms currently available, showcase new experimental validation of our system's capabilities, and introduce case studies that highlight the versatility and reliability of our system as a testbed for a wide range of research demonstrations. Our system as well as supplementary material is available online: https://proroklab.github.io/cambridge-robomaster
Submitted 3 May, 2024;
originally announced May 2024.
-
BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
Authors:
Matteo Bettini,
Amanda Prorok,
Vincent Moens
Abstract:
The field of Multi-Agent Reinforcement Learning (MARL) is currently facing a reproducibility crisis. While solutions for standardized reporting have been proposed to address the issue, we still lack a benchmarking tool that enables standardization and reproducibility, while leveraging cutting-edge Reinforcement Learning (RL) implementations. In this paper, we introduce BenchMARL, the first MARL training library created to enable standardized benchmarking across different algorithms, models, and environments. BenchMARL uses TorchRL as its backend, granting it high performance and maintained state-of-the-art implementations while addressing the broad community of MARL PyTorch users. Its design enables systematic configuration and reporting, thus allowing users to create and run complex benchmarks from simple one-line inputs. BenchMARL is open-sourced on GitHub: https://github.com/facebookresearch/BenchMARL
Submitted 5 July, 2024; v1 submitted 3 December, 2023;
originally announced December 2023.
-
TorchRL: A data-driven decision-making library for PyTorch
Authors:
Albert Bou,
Matteo Bettini,
Sebastian Dittert,
Vikash Kumar,
Shagun Sodhani,
Xiaomeng Yang,
Gianni De Fabritiis,
Vincent Moens
Abstract:
PyTorch has ascended as a premier machine learning framework, yet it lacks a native and comprehensive library for decision and control tasks suitable for large development teams dealing with complex real-world data and environments. To address this issue, we propose TorchRL, a generalistic control library for PyTorch that provides well-integrated, yet standalone components. We introduce a new and flexible PyTorch primitive, the TensorDict, which facilitates streamlined algorithm development across the many branches of Reinforcement Learning (RL) and control. We provide a detailed description of the building blocks and an extensive overview of the library across domains and tasks. Finally, we experimentally demonstrate its reliability and flexibility and show comparative benchmarks to demonstrate its computational efficiency. TorchRL fosters long-term support and is publicly available on GitHub for greater reproducibility and collaboration within the research community.
Submitted 27 November, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
System Neural Diversity: Measuring Behavioral Heterogeneity in Multi-Agent Learning
Authors:
Matteo Bettini,
Ajay Shankar,
Amanda Prorok
Abstract:
Evolutionary science provides evidence that diversity confers resilience in natural systems. Yet, traditional multi-agent reinforcement learning techniques commonly enforce homogeneity to increase training sample efficiency. When a system of learning agents is not constrained to homogeneous policies, individuals may develop diverse behaviors, resulting in emergent complementarity that benefits the system. Despite this, there is a surprising lack of tools that quantify behavioral diversity. Such techniques would pave the way towards understanding the impact of diversity in collective artificial intelligence and enabling its control. In this paper, we introduce System Neural Diversity (SND): a measure of behavioral heterogeneity in multi-agent systems. We discuss and prove its theoretical properties, and compare it with alternate, state-of-the-art behavioral diversity metrics used in the robotics domain. Through simulations of a variety of cooperative multi-robot tasks, we show how our metric constitutes an important tool that enables measurement and control of behavioral heterogeneity. In dynamic tasks, where the problem is affected by repeated disturbances during training, we show that SND allows us to measure latent resilience skills acquired by the agents, while other proxies, such as task performance (reward), fail to. Finally, we show how the metric can be employed to control diversity, allowing us to enforce a desired heterogeneity set-point or range. We demonstrate how this paradigm can be used to bootstrap the exploration phase, finding optimal policies faster, thus enabling novel and more efficient MARL paradigms.
Submitted 10 September, 2024; v1 submitted 3 May, 2023;
originally announced May 2023.
-
POPGym: Benchmarking Partially Observable Reinforcement Learning
Authors:
Steven Morad,
Ryan Kortvelesy,
Matteo Bettini,
Stephan Liwicki,
Amanda Prorok
Abstract:
Real world applications of Reinforcement Learning (RL) are often partially observable, thus requiring memory. Despite this, partial observability is still largely ignored by contemporary RL benchmarks and libraries. We introduce Partially Observable Process Gym (POPGym), a two-part library containing (1) a diverse collection of 15 partially observable environments, each with multiple difficulties and (2) implementations of 13 memory model baselines -- the most in a single RL library. Existing partially observable benchmarks tend to fixate on 3D visual navigation, which is computationally expensive and only one type of POMDP. In contrast, POPGym environments are diverse, produce smaller observations, use less memory, and often converge within two hours of training on a consumer-grade GPU. We implement our high-level memory API and memory baselines on top of the popular RLlib framework, providing plug-and-play compatibility with various training algorithms, exploration strategies, and distributed training paradigms. Using POPGym, we execute the largest comparison across RL memory models to date. POPGym is available at https://github.com/proroklab/popgym.
Submitted 3 March, 2023;
originally announced March 2023.
-
Heterogeneous Multi-Robot Reinforcement Learning
Authors:
Matteo Bettini,
Ajay Shankar,
Amanda Prorok
Abstract:
Cooperative multi-robot tasks can benefit from heterogeneity in the robots' physical and behavioral traits. In spite of this, traditional Multi-Agent Reinforcement Learning (MARL) frameworks lack the ability to explicitly accommodate policy heterogeneity, and typically constrain agents to share neural network parameters. This enforced homogeneity limits application in cases where the tasks benefit from heterogeneous behaviors. In this paper, we crystallize the role of heterogeneity in MARL policies. Towards this end, we introduce Heterogeneous Graph Neural Network Proximal Policy Optimization (HetGPPO), a paradigm for training heterogeneous MARL policies that leverages a Graph Neural Network for differentiable inter-agent communication. HetGPPO allows communicating agents to learn heterogeneous behaviors while enabling fully decentralized training in partially observable environments. We complement this with a taxonomical overview that exposes more heterogeneity classes than previously identified. To motivate the need for our model, we present a characterization of techniques that homogeneous models can leverage to emulate heterogeneous behavior, and show how this "apparent heterogeneity" is brittle in real-world conditions. Through simulations and real-world experiments, we show that: (i) when homogeneous methods fail due to strong heterogeneous requirements, HetGPPO succeeds, and, (ii) when homogeneous methods are able to learn apparently heterogeneous behaviors, HetGPPO achieves higher resilience to both training and deployment noise.
Submitted 17 January, 2023;
originally announced January 2023.
-
On the properties of path additions for traffic routing
Authors:
Matteo Bettini,
Amanda Prorok
Abstract:
In this paper, we investigate the impact of path additions to transport networks with optimised traffic routing. In particular, we study the behaviour of total travel time, and consider both self-interested routing paradigms, such as User Equilibrium (UE) routing, as well as cooperative paradigms, such as classic Multi-Commodity (MC) network flow and System Optimal (SO) routing. We provide a formal framework for designing transport networks through iterative path additions, introducing the concepts of trip spanning tree and trip path graph. Using this formalisation, we prove multiple properties of the objective function for transport network design. Since the underlying routing problem is NP-Hard, we investigate properties that provide guarantees in approximate algorithm design. Firstly, while Braess' paradox has shown that total travel time is not monotonic non-increasing with respect to path additions under self-interested routing (UE), we prove that, instead, monotonicity holds for cooperative routing (MC and SO). This result has the important implication that cooperative agents make the best use of redundant infrastructure. Secondly, we prove via a counterexample that the intuitive statement "adding a path to a transport network always grants greater or equal benefit to users than adding it to a superset of that network" is false. In other words, we prove that, for all the routing formulations studied, total travel time is not supermodular with respect to path additions. While this counter-intuitive result yields a hardness property for algorithm design, we provide particular instances where, instead, the property of supermodularity holds. Our study on monotonicity and supermodularity of total travel time with respect to path additions provides formal proofs and scenarios that constitute important insights for transport network designers.
Submitted 10 July, 2022;
originally announced July 2022.
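The non-monotonicity under User Equilibrium routing mentioned in the abstract above (Braess' paradox) can be checked numerically on the classic textbook network. The sketch below is not taken from the paper: the network, the 4000 drivers, the link costs, and all function names are illustrative assumptions. Two links have a congestion-dependent cost (load / 100 minutes), two have a fixed cost of 45 minutes, and the added path is a zero-cost shortcut between the intermediate nodes A and B.

```python
def route_times(flows, scale=100.0, fixed=45.0):
    """Travel time of each route, given the number of drivers per route.

    Routes: 'top' = start->A->end, 'bottom' = start->B->end,
    'shortcut' = start->A->B->end using the added zero-cost link A->B.
    """
    top, bottom, shortcut = flows["top"], flows["bottom"], flows["shortcut"]
    start_a = (top + shortcut) / scale  # congestible link start->A
    b_end = (bottom + shortcut) / scale  # congestible link B->end
    return {
        "top": start_a + fixed,
        "bottom": fixed + b_end,
        "shortcut": start_a + b_end,
    }


def is_user_equilibrium(flows, allowed):
    """True if every used route is as fast as the fastest allowed route."""
    times = route_times(flows)
    best = min(times[r] for r in allowed)
    return all(flows[r] == 0 or times[r] <= best + 1e-9 for r in allowed)


def total_travel_time(flows):
    """Sum of travel times over all drivers."""
    times = route_times(flows)
    return sum(flows[r] * times[r] for r in flows)


# Equilibrium before the path addition: drivers split evenly.
no_shortcut = {"top": 2000, "bottom": 2000, "shortcut": 0}
# Equilibrium after the path addition: everyone takes the shortcut.
with_shortcut = {"top": 0, "bottom": 0, "shortcut": 4000}

before = total_travel_time(no_shortcut)   # 4000 drivers * 65 minutes
after = total_travel_time(with_shortcut)  # 4000 drivers * 80 minutes
```

In this example each self-interested driver ends up with 80 minutes instead of 65 once the shortcut exists. A cooperative router could always ignore the new link and keep the earlier assignment, so its total travel time cannot get worse, which is consistent with the monotonicity result stated for cooperative routing.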
-
VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning
Authors:
Matteo Bettini,
Ryan Kortvelesy,
Jan Blumenkamp,
Amanda Prorok
Abstract:
While many multi-robot coordination problems can be solved optimally by exact algorithms, solutions are often not scalable in the number of robots. Multi-Agent Reinforcement Learning (MARL) is gaining increasing attention in the robotics community as a promising solution to tackle such problems. Nevertheless, we still lack the tools that allow us to quickly and efficiently find solutions to large-scale collective learning tasks. In this work, we introduce the Vectorized Multi-Agent Simulator (VMAS). VMAS is an open-source framework designed for efficient MARL benchmarking. It comprises a vectorized 2D physics engine written in PyTorch and a set of twelve challenging multi-robot scenarios. Additional scenarios can be implemented through a simple and modular interface. We demonstrate how vectorization enables parallel simulation on accelerated hardware without added complexity. When comparing VMAS to OpenAI MPE, we show how MPE's execution time increases linearly in the number of simulations while VMAS is able to execute 30,000 parallel simulations in under 10 s, making it more than 100x faster. Using VMAS's RLlib interface, we benchmark our multi-robot scenarios using various Proximal Policy Optimization (PPO)-based MARL algorithms. VMAS's scenarios prove challenging in orthogonal ways for state-of-the-art MARL algorithms. The VMAS framework is available at https://github.com/proroklab/VectorizedMultiAgentSimulator. A video of VMAS scenarios and experiments is available at https://youtu.be/aaDRYfiesAY.
Submitted 17 September, 2022; v1 submitted 7 July, 2022;
originally announced July 2022.
-
Overhaul and Installation of the ICARUS-T600 Liquid Argon TPC Electronics for the FNAL Short Baseline Neutrino Program
Authors:
L. Bagby,
B. Baibussinov,
B. Behera,
V. Bellini,
R. Benocci,
M. Betancourt,
M. Bettini,
M. Bonesini,
T. Boone,
A. Braggiotti,
J. D. Brown,
H. Budd,
F. Calaon,
L. Castellani,
S. Centro,
A. G. Cocco,
M. Convery,
F. Fabris,
A. Falcone,
C. Farnese,
A. Fava,
F. Fichera,
M. Giarin,
D. Gibin,
A. Guglielmi, et al. (39 additional authors not shown)
Abstract:
The ICARUS T600 liquid argon (LAr) time projection chamber (TPC) underwent a major overhaul at CERN in 2016-2017 to prepare for the operation at FNAL in the Short Baseline Neutrino (SBN) program. This included a major upgrade of the photo-multiplier system and of the TPC wire read-out electronics. The full TPC wire read-out electronics together with the new wire biasing and interconnection scheme are described. The design of a new signal feed-through flange is also a fundamental piece of this overhaul whose major feature is the integration of all electronics components onto the signal flange. Initial functionality tests of the full TPC electronics chain installed in the T600 detector at FNAL are also described.
Submitted 25 November, 2020; v1 submitted 5 October, 2020;
originally announced October 2020.