
Showing 1–50 of 201 results for author: Stone, P

  1. arXiv:2410.11251  [pdf, other]

    cs.LG cs.RO

    Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning

    Authors: Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martín-Martín

    Abstract: A hallmark of intelligent agents is the ability to learn reusable skills purely from unsupervised interaction with the environment. However, existing unsupervised skill discovery methods often learn entangled skills where one skill variable simultaneously influences many entities in the environment, making downstream skill chaining extremely challenging. We propose Disentangled Unsupervised Skill…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: NeurIPS2024

  2. arXiv:2410.09754  [pdf, other]

    cs.LG cs.AI

    SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

    Authors: Hojoon Lee, Dongyoon Hwang, Donghu Kim, Hyunseung Kim, Jun Jet Tai, Kaushik Subramanian, Peter R. Wurman, Jaegul Choo, Peter Stone, Takuma Seno

    Abstract: Recent advances in CV and NLP have been largely driven by scaling up the number of network parameters, despite traditional theories suggesting that larger networks are prone to overfitting. These large networks avoid overfitting by integrating components that induce a simplicity bias, guiding models toward simple and generalizable solutions. However, in deep RL, designing and scaling up networks h…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: preprint

  3. arXiv:2410.05828  [pdf, other]

    cs.RO

    Effort Allocation for Deadline-Aware Task and Motion Planning: A Metareasoning Approach

    Authors: Yoonchang Sung, Shahaf S. Shperberg, Qi Wang, Peter Stone

    Abstract: In robot planning, tasks can often be achieved through multiple options, each consisting of several actions. This work specifically addresses deadline constraints in task and motion planning, aiming to find a plan that can be executed within the deadline despite uncertain planning and execution times. We propose an effort allocation problem, formulated as a Markov decision process (MDP), to find s…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 48 pages, 6 figures

  4. arXiv:2410.03016  [pdf, other]

    cs.LG

    Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory

    Authors: Alexander Levine, Peter Stone, Amy Zhang

    Abstract: In order to train agents that can quickly adapt to new objectives or reward functions, efficient unsupervised representation learning in sequential decision-making environments can be important. Frameworks such as the Exogenous Block Markov Decision Process (Ex-BMDP) have been proposed to formalize this representation-learning problem (Efroni et al., 2022b). In the Ex-BMDP framework, the agent's h…

    Submitted 3 October, 2024; originally announced October 2024.

  5. arXiv:2410.00868  [pdf, other]

    cs.LG

    Fine-Grained Gradient Restriction: A Simple Approach for Mitigating Catastrophic Forgetting

    Authors: Bo Liu, Mao Ye, Peter Stone, Qiang Liu

    Abstract: A fundamental challenge in continual learning is to balance the trade-off between learning new tasks and remembering the previously acquired knowledge. Gradient Episodic Memory (GEM) achieves this balance by utilizing a subset of past training samples to restrict the update direction of the model parameters. In this work, we start by analyzing an often overlooked hyper-parameter in GEM, the memory…

    Submitted 1 October, 2024; originally announced October 2024.

  6. arXiv:2409.19816  [pdf, other]

    cs.RO cs.AI

    Grounded Curriculum Learning

    Authors: Linji Wang, Zifan Xu, Peter Stone, Xuesu Xiao

    Abstract: The high cost of real-world data for robotics Reinforcement Learning (RL) leads to the wide usage of simulators. Despite extensive work on building better dynamics models for simulators to match with the real world, there is another, often-overlooked mismatch between simulations and the real world, namely the distribution of available training tasks. Such a mismatch is further exacerbated by exist…

    Submitted 29 September, 2024; originally announced September 2024.

    Comments: 8 pages, 4 figures

  7. arXiv:2409.16578  [pdf, other]

    cs.RO cs.CV cs.LG

    FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning

    Authors: Jiaheng Hu, Rose Hendrix, Ali Farhadi, Aniruddha Kembhavi, Roberto Martin-Martin, Peter Stone, Kuo-Hao Zeng, Kiana Ehsani

    Abstract: In recent years, the Robotics field has initiated several efforts toward building generalist robot policies through large-scale multi-task Behavior Cloning. However, direct deployments of these policies have led to unsatisfactory performance, where the policy struggles with unseen states and tasks. How can we break through the performance plateau of these models and elevate their capabilities to n…

    Submitted 30 September, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

  8. arXiv:2409.16012  [pdf, other]

    cs.RO

    PRESTO: Fast motion planning using diffusion models based on key-configuration environment representation

    Authors: Mingyo Seo, Yoonyoung Cho, Yoonchang Sung, Peter Stone, Yuke Zhu, Beomjoon Kim

    Abstract: We introduce a learning-guided motion planning framework that provides initial seed trajectories using a diffusion model for trajectory optimization. Given a workspace, our method approximates the configuration space (C-space) obstacles through a key-configuration representation that consists of a sparse set of task-related key configurations, and uses this as an input to the diffusion model. The…

    Submitted 24 September, 2024; originally announced September 2024.

    Comments: Submitted to ICRA 2025

  9. arXiv:2408.03539  [pdf, other]

    cs.RO cs.LG

    Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes

    Authors: Chen Tang, Ben Abbatematteo, Jiaheng Hu, Rohan Chandra, Roberto Martín-Martín, Peter Stone

    Abstract: Reinforcement learning (RL), particularly its combination with deep neural networks referred to as deep RL (DRL), has shown tremendous promise across a wide range of applications, suggesting its potential for enabling the development of sophisticated robotic behaviors. Robotics problems, however, pose fundamental difficulties for the application of RL, stemming from the complexity and cost of inte…

    Submitted 16 September, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: The first three authors contributed equally. Accepted to Annual Review of Control, Robotics, and Autonomous Systems

  10. arXiv:2407.14207  [pdf, other]

    cs.LG

    Longhorn: State Space Models are Amortized Online Learners

    Authors: Bo Liu, Rui Wang, Lemeng Wu, Yihao Feng, Peter Stone, Qiang Liu

    Abstract: Modern large language models are built on sequence modeling via next-token prediction. While the Transformer remains the dominant architecture for sequence modeling, its quadratic decoding complexity in sequence length poses a major limitation. State-space models (SSMs) present a competitive alternative, offering linear decoding efficiency while maintaining parallelism during training. However, mo…

    Submitted 2 October, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  11. arXiv:2407.01862  [pdf, other]

    cs.RO

    Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The 3rd BARN Challenge at ICRA 2024

    Authors: Xuesu Xiao, Zifan Xu, Aniket Datar, Garrett Warnell, Peter Stone, Joshua Julian Damanik, Jaewon Jung, Chala Adane Deresa, Than Duc Huy, Chen Jinyu, Chen Yichen, Joshua Adrian Cahyono, Jingda Wu, Longfei Mo, Mingyang Lv, Bowen Lan, Qingyang Meng, Weizhi Tao, Li Cheng

    Abstract: The 3rd BARN (Benchmark Autonomous Robot Navigation) Challenge took place at the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024) in Yokohama, Japan and continued to evaluate the performance of state-of-the-art autonomous ground navigation systems in highly constrained environments. Similar to the trend in The 1st and 2nd BARN Challenge at ICRA 2022 and 2023 in Philadelphi…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.03205

  12. arXiv:2406.16258  [pdf, other]

    cs.RO cs.AI cs.LG

    MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention

    Authors: Yuxin Chen, Chen Tang, Chenran Li, Ran Tian, Peter Stone, Masayoshi Tomizuka, Wei Zhan

    Abstract: Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thu…

    Submitted 23 June, 2024; originally announced June 2024.

    ACM Class: I.2.6; I.2.9

  13. arXiv:2406.12563  [pdf, other]

    cs.LG cs.CV cs.RO

    A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo

    Authors: Miguel Vasco, Takuma Seno, Kenta Kawamoto, Kaushik Subramanian, Peter R. Wurman, Peter Stone

    Abstract: Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Turismo. However, this agent relied on global features that require instrumentation external to the car. This paper introduces,…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted at the Reinforcement Learning Conference (RLC) 2024

  14. arXiv:2405.20321  [pdf, other]

    cs.RO cs.CV cs.LG

    Vision-based Manipulation from Single Human Video with Open-World Object Graphs

    Authors: Yifeng Zhu, Arisrei Lim, Peter Stone, Yuke Zhu

    Abstract: We present an object-centric approach to empower robots to learn vision-based manipulation skills from human videos. We investigate the problem of imitating robot manipulation from a single human video in the open-world setting, where a robot must learn to manipulate novel objects from one video demonstration. We introduce ORION, an algorithm that tackles the problem by extracting an object-centri…

    Submitted 30 May, 2024; originally announced May 2024.

  15. arXiv:2405.16439  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA

    Towards Imitation Learning in Real World Unstructured Social Mini-Games in Pedestrian Crowds

    Authors: Rohan Chandra, Haresh Karnan, Negar Mehr, Peter Stone, Joydeep Biswas

    Abstract: Imitation Learning (IL) strategies are used to generate policies for robot motion planning and navigation by learning from human trajectories. Recently, there has been a lot of excitement in applying IL in social interactions arising in urban environments such as university campuses, restaurants, grocery stores, and hospitals. However, obtaining numerous expert demonstrations in social settings mi…

    Submitted 26 May, 2024; originally announced May 2024.

  16. arXiv:2405.03113  [pdf, other]

    cs.RO cs.AI

    Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning

    Authors: Caleb Chuck, Carl Qi, Michael J. Munje, Shuozhe Li, Max Rudolph, Chang Shi, Siddhant Agarwal, Harshit Sikchi, Abhinav Peri, Sarthak Dayal, Evan Kuo, Kavan Mehta, Anthony Wang, Peter Stone, Amy Zhang, Scott Niekum

    Abstract: Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy tasks like…

    Submitted 5 May, 2024; originally announced May 2024.

  17. arXiv:2404.18798  [pdf, other]

    cs.MA

    Multi-Agent Synchronization Tasks

    Authors: Rolando Fernandez, Garrett Warnell, Derrik E. Asher, Peter Stone

    Abstract: In multi-agent reinforcement learning (MARL), coordination plays a crucial role in enhancing agents' performance beyond what they could achieve through cooperation alone. The interdependence of agents' actions, coupled with the need for communication, leads to a domain where effective coordination is crucial. In this paper, we introduce and define $\textit{Multi-Agent Synchronization Tasks}$ (MSTs…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Adaptive Learning Agents Workshop at AAMAS 2024

  18. arXiv:2404.10740  [pdf, other]

    cs.AI

    N-Agent Ad Hoc Teamwork

    Authors: Caroline Wang, Arrasy Rahman, Ishan Durugkar, Elad Liebman, Peter Stone

    Abstract: Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings. In standard fully cooperative multi-agent reinforcement learning, the learning algorithm controls $\textit{all}$ agents in the scenario, while in ad hoc teamwork, the learning algorithm usually assumes control over only a $\textit{single}$ agent in the scenario. However, many cooperative settin…

    Submitted 4 October, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    ACM Class: I.2.11; I.2.1; I.2.6; I.2.8

  19. arXiv:2404.04750  [pdf]

    cs.CY

    Now, Later, and Lasting: Ten Priorities for AI Research, Policy, and Practice

    Authors: Eric Horvitz, Vincent Conitzer, Sheila McIlraith, Peter Stone

    Abstract: Advances in artificial intelligence (AI) will transform many aspects of our lives and society, bringing immense opportunities but also posing significant risks and challenges. The next several decades may well be a turning point for humanity, comparable to the industrial revolution. We write to share a set of recommendations for moving forward from the perspective of the founder and leaders of the…

    Submitted 20 April, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: Four pages. To appear in Communications of the Association for Computing Machinery (CACM), June 2024

  20. arXiv:2403.17231  [pdf, other]

    cs.RO cs.LG

    Dyna-LfLH: Learning Agile Navigation in Dynamic Environments from Learned Hallucination

    Authors: Saad Abdul Ghani, Zizhao Wang, Peter Stone, Xuesu Xiao

    Abstract: This paper presents a self-supervised learning method to safely learn a motion planner for ground robots to navigate environments with dense and dynamic obstacles. When facing highly-cluttered, fast-moving, hard-to-predict obstacles, classical motion planners may not be able to keep up with limited onboard computation. For learning-based planners, high-quality demonstrations are difficult to acqui…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Submitted to International Conference on Intelligent Robots and Systems (IROS) 2024

  21. arXiv:2403.11940  [pdf, other]

    cs.LG eess.SY

    Multistep Inverse Is Not All You Need

    Authors: Alexander Levine, Peter Stone, Amy Zhang

    Abstract: In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. However, the controllable dynamics of the system are often far simpler than the dynamics of the raw observations. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. In this work, we consider the…

    Submitted 6 September, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: RLC 2024

  22. arXiv:2403.07869  [pdf, other]

    cs.RO cs.AI cs.LG

    TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation

    Authors: Shivin Dass, Wensi Ai, Yuqian Jiang, Samik Singh, Jiaheng Hu, Ruohan Zhang, Peter Stone, Ben Abbatematteo, Roberto Martín-Martín

    Abstract: A critical bottleneck limiting imitation learning in robotics is the lack of data. This problem is more severe in mobile manipulation, where collecting demonstrations is harder than in stationary manipulation due to the lack of available and easy-to-use teleoperation interfaces. In this work, we demonstrate TeleMoMa, a general and modular interface for whole-body teleoperation of mobile manipulato…

    Submitted 21 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  23. arXiv:2403.03848  [pdf, other]

    cs.RO cs.LG

    Dexterous Legged Locomotion in Confined 3D Spaces with Reinforcement Learning

    Authors: Zifan Xu, Amir Hossain Raj, Xuesu Xiao, Peter Stone

    Abstract: Recent advances of locomotion controllers utilizing deep reinforcement learning (RL) have yielded impressive results in terms of achieving rapid and robust locomotion across challenging terrain, such as rugged rocks, non-rigid ground, and slippery surfaces. However, while these controllers primarily address challenges underneath the robot, relatively little research has investigated legged mobilit…

    Submitted 6 March, 2024; originally announced March 2024.

  24. arXiv:2403.01636  [pdf, other]

    stat.ML cs.LG

    Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

    Authors: Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

    Abstract: Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for its wide applications in many important Reinforcement Learning (RL) tasks. However, while recent advancements in MTRL theory have focused on the improved statistical efficiency by assuming a shared structure across tasks, exploration--a crucial aspect of RL--has been largely overlooked. This paper addresses thi…

    Submitted 5 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  25. arXiv:2401.12497  [pdf, other]

    cs.AI cs.LG cs.RO

    Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

    Authors: Zizhao Wang, Caroline Wang, Xuesu Xiao, Yuke Zhu, Peter Stone

    Abstract: Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from relatively little experience and the ability to learn policies that generalize to a range of problem specifications. In factored state spaces, one approach towards achieving both goals is to learn state abstractions, which only keep the necessary variables for learning the tasks at hand. This paper introduces Ca…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted at AAAI24

    ACM Class: I.2.9; I.2.8; I.2.6

  26. arXiv:2401.02576  [pdf, other]

    cs.LG cs.AI cs.NE

    t-DGR: A Trajectory-Based Deep Generative Replay Method for Continual Learning in Decision Making

    Authors: William Yue, Bo Liu, Peter Stone

    Abstract: Deep generative replay has emerged as a promising approach for continual learning in decision-making tasks. This approach addresses the problem of catastrophic forgetting by leveraging the generation of trajectories from previously encountered tasks to augment the current dataset. However, existing deep generative replay methods for continual learning rely on autoregressive models, which suffer fr…

    Submitted 17 June, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: Published at 3rd Conference on Lifelong Learning Agents (CoLLAs), 2024

  27. arXiv:2312.04684  [pdf, other]

    cs.CL cs.AI

    Latent Skill Discovery for Chain-of-Thought Reasoning

    Authors: Zifan Xu, Haozhu Wang, Dmitriy Bespalov, Xuan Wang, Peter Stone, Yanjun Qi

    Abstract: Chain-of-thought (CoT) prompting is a popular in-context learning (ICL) approach for large language models (LLMs), especially when tackling complex reasoning tasks. Traditional ICL approaches construct prompts using examples that contain questions similar to the input question. However, CoT prompting, which includes crucial intermediate reasoning steps (rationales) within its examples, necessitate…

    Submitted 21 October, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Journal ref: Findings of Empirical Methods in Natural Language Processing 2024

  28. arXiv:2311.08783  [pdf, other]

    cs.RO cs.AI

    ICRA Roboethics Challenge 2023: Intelligent Disobedience in an Elderly Care Home

    Authors: Sveta Paster, Kantwon Rogers, Gordon Briggs, Peter Stone, Reuth Mirsky

    Abstract: With the projected surge in the elderly population, service robots offer a promising avenue to enhance their well-being in elderly care homes. Such robots will encounter complex scenarios which will require them to perform decisions with ethical consequences. In this report, we propose to leverage the Intelligent Disobedience framework in order to give the robot the ability to perform a deliberati…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: This report is part of ICRA roboethics competition : https://competition.raiselab.ca/competition-details-2023_1/ethics-challenge/submitted-proposals/submission-1

  29. arXiv:2311.00785  [pdf, other]

    cs.RO

    Exploring the Cost of Interruptions in Human-Robot Teaming

    Authors: Swathi Mannem, William Macke, Peter Stone, Reuth Mirsky

    Abstract: Productive and efficient human-robot teaming is a highly desirable ability in service robots, yet there is a fundamental trade-off that a robot needs to consider in such tasks. On the one hand, gaining information from communication with teammates can help individual planning. On the other hand, such communication comes at the cost of distracting teammates from efficiently completing their goals,…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Preprint of a paper accepted for publication in Humanoids 2023 (https://2023.ieee-humanoids.org/)

  30. arXiv:2310.14386  [pdf, other]

    cs.RO cs.CV cs.LG

    Learning Generalizable Manipulation Policies with Object-Centric 3D Representations

    Authors: Yifeng Zhu, Zhenyu Jiang, Peter Stone, Yuke Zhu

    Abstract: We introduce GROOT, an imitation learning method for learning robust policies with object-centric and 3D priors. GROOT builds policies that generalize beyond their initial training conditions for vision-based manipulation. It constructs object-centric 3D representations that are robust toward background changes and camera views and reason over these representations using a transformer-based policy…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted at the 7th Annual Conference on Robot Learning (CoRL), 2023 in Atlanta, US

  31. arXiv:2310.08702  [pdf, other]

    cs.LG cs.AI cs.RO

    ELDEN: Exploration via Local Dependencies

    Authors: Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martin-Martin

    Abstract: Tasks with large state space and sparse rewards present a longstanding challenge to reinforcement learning. In these tasks, an agent needs to explore the state space efficiently until it finds a reward. To deal with this problem, the community has proposed to augment the reward function with intrinsic reward, a bonus signal that encourages the agent to visit interesting states. In this work, we pr…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  32. arXiv:2310.06794  [pdf, other]

    cs.LG cs.AI cs.RO

    $f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences

    Authors: Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang

    Abstract: Goal-Conditioned Reinforcement Learning (RL) problems often have access to sparse rewards where the agent receives a reward signal only when it has achieved the goal, making policy optimization a difficult problem. Several works augment this sparse reward with a learned dense reward function, but this can lead to sub-optimal policies if the reward is misaligned. Moreover, recent works have demonst…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023

  33. arXiv:2310.06303  [pdf, other]

    cs.RO cs.AI

    Dobby: A Conversational Service Robot Driven by GPT-4

    Authors: Carson Stark, Bohkyung Chun, Casey Charleston, Varsha Ravi, Luis Pabon, Surya Sunkari, Tarun Mohan, Peter Stone, Justin Hart

    Abstract: This work introduces a robotics platform which embeds a conversational AI agent in an embodied system for natural language understanding and intelligent decision-making for service tasks; integrating task planning and human-like conversation. The agent is derived from a large language model, which has learned from a vast corpus of general knowledge. In addition to generating dialogue, this agent c…

    Submitted 10 October, 2023; originally announced October 2023.

  34. arXiv:2310.02456  [pdf, other]

    cs.LG cs.AI

    Learning Optimal Advantage from Preferences and Mistaking it for Reward

    Authors: W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth, Anca Dragan, Peter Stone, Scott Niekum

    Abstract: We consider algorithms for learning reward functions from human preferences over pairs of trajectory segments, as used in reinforcement learning from human feedback (RLHF). Most recent work assumes that human preferences are generated based only upon the reward accrued within those segments, or their partial return. Recent work casts doubt on the validity of this assumption, proposing an alternati…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 8 pages (16 pages with references and appendix), 11 figures

    ACM Class: I.2.6; I.2.8

  35. arXiv:2309.15302  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    STERLING: Self-Supervised Terrain Representation Learning from Unconstrained Robot Experience

    Authors: Haresh Karnan, Elvin Yang, Daniel Farkash, Garrett Warnell, Joydeep Biswas, Peter Stone

    Abstract: Terrain awareness, i.e., the ability to identify and distinguish different types of terrain, is a critical ability that robots must have to succeed at autonomous off-road navigation. Current approaches that provide robots with this awareness either rely on labeled data which is expensive to collect, engineered features and cost functions that may not generalize, or expert human demonstrations whic…

    Submitted 20 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Project website: https://hareshkarnan.github.io/sterling/

    Journal ref: Conference on Robot Learning (CoRL 2023)

  36. arXiv:2309.13466  [pdf, other]

    cs.RO

    Rethinking Social Robot Navigation: Leveraging the Best of Two Worlds

    Authors: Amir Hossain Raj, Zichao Hu, Haresh Karnan, Rohan Chandra, Amirreza Payandeh, Luisa Mao, Peter Stone, Joydeep Biswas, Xuesu Xiao

    Abstract: Empowering robots to navigate in a socially compliant manner is essential for the acceptance of robots moving in human-inhabited environments. Previously, roboticists have developed geometric navigation systems with decades of empirical validation to achieve safety and efficiency. However, the many complex factors of social compliance make geometric navigation systems hard to adapt to social situa…

    Submitted 9 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: 8 pages, 6 figures, ICRA-2024

  37. arXiv:2309.09912  [pdf, other]

    cs.RO cs.AI cs.LG

    Wait, That Feels Familiar: Learning to Extrapolate Human Preferences for Preference Aligned Path Planning

    Authors: Haresh Karnan, Elvin Yang, Garrett Warnell, Joydeep Biswas, Peter Stone

    Abstract: Autonomous mobility tasks such as lastmile delivery require reasoning about operator indicated preferences over terrains on which the robot should navigate to ensure both robot safety and mission success. However, coping with out of distribution data from novel terrains or appearance changes due to lighting variations remains a fundamental problem in visual terrain adaptive navigation. Existing so…

    Submitted 18 September, 2023; originally announced September 2023.

    Journal ref: Under Submission to ICRA 2024

  38. arXiv:2309.08897  [pdf, other]

    cs.RO

    Asynchronous Task Plan Refinement for Multi-Robot Task and Motion Planning

    Authors: Yoonchang Sung, Rahul Shome, Peter Stone

    Abstract: This paper explores general multi-robot task and motion planning, where multiple robots in close proximity manipulate objects while satisfying constraints and a given goal. In particular, we formulate the plan refinement problem--which, given a task plan, finds valid assignments of variables corresponding to solution trajectories--as a hybrid constraint satisfaction problem. The proposed algorithm…

    Submitted 16 September, 2023; originally announced September 2023.

  39. arXiv:2308.14269  [pdf, other]

    cs.AI

    Utilizing Mood-Inducing Background Music in Human-Robot Interaction

    Authors: Elad Liebman, Peter Stone

    Abstract: Past research has clearly established that music can affect mood and that mood affects emotional and cognitive processing, and thus decision-making. It follows that if a robot interacting with a person needs to predict the person's behavior, knowledge of the music the person is listening to when acting is a potentially relevant feature. To date, however, there has not been any concrete evidence th…

    Submitted 27 August, 2023; originally announced August 2023.

  40. arXiv:2308.10966  [pdf, other]

    cs.RO cs.GT cs.MA eess.SY

    Deadlock-free, Safe, and Decentralized Multi-Robot Navigation in Social Mini-Games via Discrete-Time Control Barrier Functions

    Authors: Rohan Chandra, Vrushabh Zinage, Efstathios Bakolas, Peter Stone, Joydeep Biswas

    Abstract: We present an approach to ensure safe and deadlock-free navigation for decentralized multi-robot systems operating in constrained environments, including doorways and intersections. Although many solutions have been proposed that ensure safety and resolve deadlocks, optimally preventing deadlocks in a minimally invasive and decentralized fashion remains an open problem. We first formalize the obje…

    Submitted 8 February, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: major update since last revision

  41. arXiv:2308.09595  [pdf, other]

    cs.AI

    Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents

    Authors: Arrasy Rahman, Jiaxun Cui, Peter Stone

    Abstract: Robustly cooperating with unseen agents and human partners presents significant challenges due to the diverse cooperative conventions these partners may adopt. Existing Ad Hoc Teamwork (AHT) methods address this challenge by training an agent with a population of diverse teammate policies obtained through maximizing specific diversity metrics. However, prior heuristic-based diversity metrics do no…

    Submitted 2 January, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted at AAAI-24 conference

  42. arXiv:2308.03205  [pdf, other]

    cs.RO

    Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The 2nd BARN Challenge at ICRA 2023

    Authors: Xuesu Xiao, Zifan Xu, Garrett Warnell, Peter Stone, Ferran Gebelli Guinjoan, Romulo T. Rodrigues, Herman Bruyninckx, Hanjaya Mandala, Guilherme Christmann, Jose Luis Blanco-Claraco, Shravan Somashekara Rai

    Abstract: The 2nd BARN (Benchmark Autonomous Robot Navigation) Challenge took place at the 2023 IEEE International Conference on Robotics and Automation (ICRA 2023) in London, UK and continued to evaluate the performance of state-of-the-art autonomous ground navigation systems in highly constrained environments. Compared to The 1st BARN Challenge at ICRA 2022 in Philadelphia, the competition has grown signi…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: text overlap with arXiv:2208.10473

  43. arXiv:2307.11889  [pdf, other]

    cs.RO

    Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning

    Authors: Xiaohan Zhang, Yifeng Zhu, Yan Ding, Yuqian Jiang, Yuke Zhu, Peter Stone, Shiqi Zhang

    Abstract: In existing task and motion planning (TAMP) research, it is a common assumption that experts manually specify the state space for task-level planning. A well-developed state space enables the desirable distribution of limited computational resources between task planning and motion planning. However, developing such task-level state spaces can be non-trivial in practice. In this paper, we consider…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: To be published in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

  44. arXiv:2307.08593  [pdf, other]

    physics.acc-ph cs.LG hep-ex nucl-ex nucl-th

    Artificial Intelligence for the Electron Ion Collider (AI4EIC)

    Authors: C. Allaire, R. Ammendola, E. -C. Aschenauer, M. Balandat, M. Battaglieri, J. Bernauer, M. Bondì, N. Branson, T. Britton, A. Butter, I. Chahrour, P. Chatagnon, E. Cisbani, E. W. Cline, S. Dash, C. Dean, W. Deconinck, A. Deshpande, M. Diefenthaler, R. Ent, C. Fanelli, M. Finger, M. Finger, Jr., E. Fol, S. Furletov, et al. (70 additional authors not shown)

    Abstract: The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 27 pages, 11 figures, AI4EIC workshop, tutorials and hackathon

  45. arXiv:2306.16740  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Principles and Guidelines for Evaluating Social Robot Navigation Algorithms

    Authors: Anthony Francis, Claudia Pérez-D'Arpino, Chengshu Li, Fei Xia, Alexandre Alahi, Rachid Alami, Aniket Bera, Abhijat Biswas, Joydeep Biswas, Rohan Chandra, Hao-Tien Lewis Chiang, Michael Everett, Sehoon Ha, Justin Hart, Jonathan P. How, Haresh Karnan, Tsang-Wei Edward Lee, Luis J. Manso, Reuth Mirksy, Sören Pirk, Phani Teja Singamaneni, Peter Stone, Ada V. Taylor, Peter Trautman, Nathan Tsoi , et al. (6 additional authors not shown)

    Abstract: A major challenge to deploying robots widely is navigation in human-populated environments, commonly referred to as social robot navigation. While the field of social navigation has advanced tremendously in recent years, the fair evaluation of algorithms that tackle social navigation remains hard because it involves not just robotic agents moving in static environments but also dynamic human agent…

    Submitted 19 September, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 42 pages, 11 figures, 6 tables

    ACM Class: I.2.9

  46. arXiv:2306.07372  [pdf, other]

    cs.LG cs.AI cs.GT

    Composing Efficient, Robust Tests for Policy Selection

    Authors: Dustin Morrill, Thomas J. Walsh, Daniel Hernandez, Peter R. Wurman, Peter Stone

    Abstract: Modern reinforcement learning systems produce many high-quality policies throughout the learning process. However, to choose which policy to actually deploy in the real world, they must be tested under an intractable number of environmental conditions. We introduce RPOSST, an algorithm to select a small set of test cases from a larger pool based on a relatively small number of sample evaluations.…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 26 pages, 13 figures. To appear in Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2023)

    ACM Class: B.8.1; I.2.6

  47. arXiv:2306.03792  [pdf, other]

    cs.LG

    FAMO: Fast Adaptive Multitask Optimization

    Authors: Bo Liu, Yihao Feng, Peter Stone, Qiang Liu

    Abstract: One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). However, in practice, applying gradient descent (GD) on the average loss across all tasks may yield poor multitask performance due to severe under-optimization of certain tasks. Previous approaches that manipulate task gradients for a more bal…

    Submitted 29 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  48. arXiv:2306.03310  [pdf, other]

    cs.AI

    LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning

    Authors: Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, Peter Stone

    Abstract: Lifelong learning offers a promising paradigm of building a generalist agent that learns and adapts over its lifespan. Unlike traditional lifelong learning problems in image and text domains, which primarily involve the transfer of declarative knowledge of entities and concepts, lifelong learning in decision-making (LLDM) also necessitates the transfer of procedural knowledge, such as actions and…

    Submitted 14 October, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

  49. arXiv:2305.10395  [pdf, other]

    cs.RO

    Motion Planning (In)feasibility Detection using a Prior Roadmap via Path and Cut Search

    Authors: Yoonchang Sung, Peter Stone

    Abstract: Motion planning seeks a collision-free path in a configuration space (C-space), representing all possible robot configurations in the environment. As it is challenging to construct a C-space explicitly for a high-dimensional robot, we generally build a graph structure called a roadmap, a discrete approximation of a complex continuous C-space, to reason about connectivity. Checking collision-free c…

    Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: 18 pages, 19 figures, Published in Robotics: Science and Systems (RSS), 2023

  50. arXiv:2305.04866  [pdf, other]

    cs.RO cs.AI cs.LG

    Causal Policy Gradient for Whole-Body Mobile Manipulation

    Authors: Jiaheng Hu, Peter Stone, Roberto Martín-Martín

    Abstract: Developing the next generation of household robot helpers requires combining locomotion and interaction capabilities, which is generally referred to as mobile manipulation (MoMa). MoMa tasks are difficult due to the large action space of the robot and the common multi-objective nature of the task, e.g., efficiently reaching a goal while avoiding obstacles. Current approaches often segregate tasks…

    Submitted 28 September, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Journal ref: Robotics: science and systems. 2023