-
Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
Authors:
Jianxin Bi,
Kelvin Lim,
Kaiqi Chen,
Yifei Huang,
Harold Soh
Abstract:
Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficie…
▽ More
Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.
△ Less
Submitted 9 October, 2024;
originally announced October 2024.
-
Stochastic Bandits for Egalitarian Assignment
Authors:
Eugene Lim,
Vincent Y. F. Tan,
Harold Soh
Abstract:
We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its a…
▽ More
We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.
△ Less
Submitted 8 October, 2024;
originally announced October 2024.
-
Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks
Authors:
Zeyu Feng,
Hao Luan,
Kevin Yuchen Ma,
Harold Soh
Abstract:
Safe and successful deployment of robots requires not only the ability to generate complex plans but also the capacity to frequently replan and correct execution errors. This paper addresses the challenge of long-horizon trajectory planning under temporally extended objectives in a receding horizon manner. To this end, we propose DOPPLER, a data-driven hierarchical framework that generates and upd…
▽ More
Safe and successful deployment of robots requires not only the ability to generate complex plans but also the capacity to frequently replan and correct execution errors. This paper addresses the challenge of long-horizon trajectory planning under temporally extended objectives in a receding horizon manner. To this end, we propose DOPPLER, a data-driven hierarchical framework that generates and updates plans based on instruction specified by linear temporal logic (LTL). Our method decomposes temporal tasks into chain of options with hierarchical reinforcement learning from offline non-expert datasets. It leverages diffusion models to generate options with low-level actions. We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion generated options, leading to more efficient querying. Experiments on robot navigation and manipulation tasks demonstrate that DOPPLER can generate sequences of trajectories that progressively satisfy the specified formulae for obstacle avoidance and sequential visitation. Demonstration videos are available online at: https://philiptheother.github.io/doppler/.
△ Less
Submitted 3 October, 2024;
originally announced October 2024.
-
Arena 4.0: A Comprehensive ROS2 Development and Benchmarking Platform for Human-centric Navigation Using Generative-Model-based Environment Generation
Authors:
Volodymyr Shcherbyna1,
Linh Kästner,
Diego Diaz,
Huu Giang Nguyen,
Maximilian Ho-Kyoung Schreff,
Tim Lenz,
Jonas Kreutz,
Ahmed Martban,
Huajian Zeng,
Harold Soh
Abstract:
Building on the foundations of our previous work, this paper introduces Arena 4.0, a significant advancement over Arena 3.0, Arena-Bench, Arena 1.0, and Arena 2.0. Arena 4.0 offers three key novel contributions: (1) a generative-model-based world and scenario generation approach that utilizes large language models (LLMs) and diffusion models to dynamically generate complex, human-centric environme…
▽ More
Building on the foundations of our previous work, this paper introduces Arena 4.0, a significant advancement over Arena 3.0, Arena-Bench, Arena 1.0, and Arena 2.0. Arena 4.0 offers three key novel contributions: (1) a generative-model-based world and scenario generation approach that utilizes large language models (LLMs) and diffusion models to dynamically generate complex, human-centric environments from text prompts or 2D floorplans, useful for the development and benchmarking of social navigation strategies; (2) a comprehensive 3D model database, extendable with additional 3D assets that are semantically linked and annotated for dynamic spawning and arrangement within 3D worlds; and (3) a complete migration to ROS 2, enabling compatibility with modern hardware and enhanced functionalities for improved navigation, usability, and easier deployment on real robots. We evaluated the platform's performance through a comprehensive user study, demonstrating significant improvements in usability and efficiency compared to previous versions. Arena 4.0 is openly available at https://github.com/Arena-Rosnav.
△ Less
Submitted 19 September, 2024;
originally announced September 2024.
-
Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting
Authors:
Ce Hao,
Kelvin Lin,
Siyuan Luo,
Harold Soh
Abstract:
Diffusion policies have demonstrated robust performance in generative modeling, prompting their application in robotic manipulation controlled via language descriptions. In this paper, we introduce a zero-shot, open-vocabulary diffusion policy method for robot manipulation. Using Vision-Language Models (VLMs), our method transforms linguistic task descriptions into actionable keyframes in 3D space…
▽ More
Diffusion policies have demonstrated robust performance in generative modeling, prompting their application in robotic manipulation controlled via language descriptions. In this paper, we introduce a zero-shot, open-vocabulary diffusion policy method for robot manipulation. Using Vision-Language Models (VLMs), our method transforms linguistic task descriptions into actionable keyframes in 3D space. These keyframes serve to guide the diffusion process via inpainting. However, naively enforcing the diffusion process to adhere to the generated keyframes is problematic: the keyframes from the VLMs may be incorrect and lead to action sequences where the diffusion model performs poorly. To address these challenges, we develop an inpainting optimization strategy that balances adherence to the keyframes v.s. the training data distribution. Experimental evaluations demonstrate that our approach surpasses the performance of traditional fine-tuned language-conditioned methods in both simulated and real-world settings.
△ Less
Submitted 23 September, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Arena 3.0: Advancing Social Navigation in Collaborative and Highly Dynamic Environments
Authors:
Linh Kästner,
Volodymyir Shcherbyna,
Huajian Zeng,
Tuan Anh Le,
Maximilian Ho-Kyoung Schreff,
Halid Osmaev,
Nam Truong Tran,
Diego Diaz,
Jan Golebiowski,
Harold Soh,
Jens Lambrecht
Abstract:
Building upon our previous contributions, this paper introduces Arena 3.0, an extension of Arena-Bench, Arena 1.0, and Arena 2.0. Arena 3.0 is a comprehensive software stack containing multiple modules and simulation environments focusing on the development, simulation, and benchmarking of social navigation approaches in collaborative environments. We significantly enhance the realism of human beh…
▽ More
Building upon our previous contributions, this paper introduces Arena 3.0, an extension of Arena-Bench, Arena 1.0, and Arena 2.0. Arena 3.0 is a comprehensive software stack containing multiple modules and simulation environments focusing on the development, simulation, and benchmarking of social navigation approaches in collaborative environments. We significantly enhance the realism of human behavior simulation by incorporating a diverse array of new social force models and interaction patterns, encompassing both human-human and human-robot dynamics. The platform provides a comprehensive set of new task modes, designed for extensive benchmarking and testing and is capable of generating realistic and human-centric environments dynamically, catering to a broad spectrum of social navigation scenarios. In addition, the platform's functionalities have been abstracted across three widely used simulators, each tailored for specific training and testing purposes. The platform's efficacy has been validated through an extensive benchmark and user evaluations of the platform by a global community of researchers and students, which noted the substantial improvement compared to previous versions and expressed interests to utilize the platform for future research and development. Arena 3.0 is openly available at https://github.com/Arena-Rosnav.
△ Less
Submitted 2 June, 2024;
originally announced June 2024.
-
Out-of-Distribution Detection with a Single Unconditional Diffusion Model
Authors:
Alvin Heng,
Alexandre H. Thiery,
Harold Soh
Abstract:
Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples. Traditionally, unsupervised methods utilize a deep generative model for OOD detection. However, such approaches require a new model to be trained for each inlier dataset. This paper explores whether a single model can perform OOD detection across diverse tasks. To that end, we introd…
▽ More
Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples. Traditionally, unsupervised methods utilize a deep generative model for OOD detection. However, such approaches require a new model to be trained for each inlier dataset. This paper explores whether a single model can perform OOD detection across diverse tasks. To that end, we introduce Diffusion Paths (DiffPath), which uses a single diffusion model originally trained to perform unconditional generation for OOD detection. We introduce a novel technique of measuring the rate-of-change and curvature of the diffusion paths connecting samples to the standard normal. Extensive experiments show that with a single model, DiffPath is competitive with prior work using individual models on a variety of OOD tasks involving different distributions. Our code is publicly available at https://github.com/clear-nus/diffpath.
△ Less
Submitted 12 October, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning
Authors:
Zeyu Feng,
Hao Luan,
Pranav Goyal,
Harold Soh
Abstract:
Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints/instructions at test time. We propose a data-driven diffusion-based framework,…
▽ More
Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints/instructions at test time. We propose a data-driven diffusion-based framework, LTLDoG, that modifies the inference steps of the reverse process given an instruction specified using finite linear temporal logic ($\text{LTL}_f$). LTLDoG leverages a satisfaction value function on $\text{LTL}_f$ and guides the sampling steps using its gradient field. This value function can also be trained to generalize to new instructions not observed during training, enabling flexible test-time adaptability. Experiments in robot navigation and manipulation illustrate that the method is able to generate trajectories that satisfy formulae that specify obstacle avoidance and visitation sequences. Code and supplementary material are available online at https://github.com/clear-nus/ltldog.
△ Less
Submitted 30 September, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
-
Octopi: Object Property Reasoning with Large Tactile-Language Models
Authors:
Samson Yu,
Kelvin Lin,
Anxing Xiao,
Jiafei Duan,
Harold Soh
Abstract:
Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal information about objects in the environment and language serves as an abstraction and communication medium for additional context. Although these works have demonstrated success on a variety of physical reasoning tasks, they a…
▽ More
Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal information about objects in the environment and language serves as an abstraction and communication medium for additional context. Although these works have demonstrated success on a variety of physical reasoning tasks, they are limited to physical properties that can be inferred from visual or language inputs. In this work, we investigate combining tactile perception with language, which enables embodied systems to obtain physical properties through interaction and apply commonsense reasoning. We contribute a new dataset PhysiCLeAR, which comprises both physical/property reasoning tasks and annotated tactile videos obtained using a GelSight tactile sensor. We then introduce Octopi, a system that leverages both tactile representation learning and large vision-language models to predict and reason about tactile inputs with minimal language fine-tuning. Our evaluations on PhysiCLeAR show that Octopi is able to effectively use intermediate physical property predictions to improve its performance on various tactile-related tasks. PhysiCLeAR and Octopi are available at https://github.com/clear-nus/octopi.
△ Less
Submitted 4 June, 2024; v1 submitted 4 May, 2024;
originally announced May 2024.
-
Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction
Authors:
Bowen Zhang,
Harold Soh
Abstract:
In this work, we are interested in automated methods for knowledge graph creation (KGC) from input text. Progress on large language models (LLMs) has prompted a series of recent works applying them to KGC, e.g., via zero/few-shot prompting. Despite successes on small domain-specific datasets, these models face difficulties scaling up to text common in many real-world applications. A principal issu…
▽ More
In this work, we are interested in automated methods for knowledge graph creation (KGC) from input text. Progress on large language models (LLMs) has prompted a series of recent works applying them to KGC, e.g., via zero/few-shot prompting. Despite successes on small domain-specific datasets, these models face difficulties scaling up to text common in many real-world applications. A principal issue is that, in prior methods, the KG schema has to be included in the LLM prompt to generate valid triplets; larger and more complex schemas easily exceed the LLMs' context window length. Furthermore, there are scenarios where a fixed pre-defined schema is not available and we would like the method to construct a high-quality KG with a succinct self-generated schema. To address these problems, we propose a three-phase framework named Extract-Define-Canonicalize (EDC): open information extraction followed by schema definition and post-hoc canonicalization. EDC is flexible in that it can be applied to settings where a pre-defined target schema is available and when it is not; in the latter case, it constructs a schema automatically and applies self-canonicalization. To further improve performance, we introduce a trained component that retrieves schema elements relevant to the input text; this improves the LLMs' extraction performance in a retrieval-augmented generation-like manner. We demonstrate on three KGC benchmarks that EDC is able to extract high-quality triplets without any parameter tuning and with significantly larger schemas compared to prior works. Code for EDC is available at https://github.com/clear-nus/edc.
△ Less
Submitted 2 October, 2024; v1 submitted 4 April, 2024;
originally announced April 2024.
-
HyperCLOVA X Technical Report
Authors:
Kang Min Yoo,
Jaegeun Han,
Sookyo In,
Heewon Jeon,
Jisu Jeong,
Jaewook Kang,
Hyunwook Kim,
Kyung-Min Kim,
Munhyong Kim,
Sungju Kim,
Donghyun Kwak,
Hanock Kwak,
Se Jung Kwon,
Bado Lee,
Dongsoo Lee,
Gichang Lee,
Jooho Lee,
Baeseong Park,
Seongjin Shin,
Joonsang Yu,
Seolki Baek,
Sumin Byeon,
Eungsup Cho,
Dooseok Choe,
Jeesung Han
, et al. (371 additional authors not shown)
Abstract:
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…
▽ More
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
△ Less
Submitted 13 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Improving Demand Forecasting in Open Systems with Cartogram-Enhanced Deep Learning
Authors:
Sangjoon Park,
Yongsung Kwon,
Hyungjoon Soh,
Mi Jin Lee,
Seung-Woo Son
Abstract:
Predicting temporal patterns across various domains poses significant challenges due to their nuanced and often nonlinear trajectories. To address this challenge, prediction frameworks have been continuously refined, employing data-driven statistical methods, mathematical models, and machine learning. Recently, as one of the challenging systems, shared transport systems such as public bicycles hav…
▽ More
Predicting temporal patterns across various domains poses significant challenges due to their nuanced and often nonlinear trajectories. To address this challenge, prediction frameworks have been continuously refined, employing data-driven statistical methods, mathematical models, and machine learning. Recently, as one of the challenging systems, shared transport systems such as public bicycles have gained prominence due to urban constraints and environmental concerns. Predicting rental and return patterns at bicycle stations remains a formidable task due to the system's openness and imbalanced usage patterns across stations. In this study, we propose a deep learning framework to predict rental and return patterns by leveraging cartogram approaches. The cartogram approach facilitates the prediction of demand for newly installed stations with no training data as well as long-period prediction, which has not been achieved before. We apply this method to public bicycle rental-and-return data in Seoul, South Korea, employing a spatial-temporal convolutional graph attention network. Our improved architecture incorporates batch attention and modified node feature updates for better prediction accuracy across different time scales. We demonstrate the effectiveness of our framework in predicting temporal patterns and its potential applications.
△ Less
Submitted 26 May, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Don't Start from Scratch: Behavioral Refinement via Interpolant-based Policy Diffusion
Authors:
Kaiqi Chen,
Eugene Lim,
Kelvin Lin,
Yiyang Chen,
Harold Soh
Abstract:
Imitation learning empowers artificial agents to mimic behavior by learning from demonstrations. Recently, diffusion models, which have the ability to model high-dimensional and multimodal distributions, have shown impressive performance on imitation learning tasks. These models learn to shape a policy by diffusing actions (or states) from standard Gaussian noise. However, the target policy to be…
▽ More
Imitation learning empowers artificial agents to mimic behavior by learning from demonstrations. Recently, diffusion models, which have the ability to model high-dimensional and multimodal distributions, have shown impressive performance on imitation learning tasks. These models learn to shape a policy by diffusing actions (or states) from standard Gaussian noise. However, the target policy to be learned is often significantly different from Gaussian and this mismatch can result in poor performance when using a small number of diffusion steps (to improve inference speed) and under limited data. The key idea in this work is that initiating from a more informative source than Gaussian enables diffusion methods to mitigate the above limitations. We contribute both theoretical results, a new method, and empirical findings that show the benefits of using an informative source policy. Our method, which we call BRIDGER, leverages the stochastic interpolants framework to bridge arbitrary policies, thus enabling a flexible approach towards imitation learning. It generalizes prior work in that standard Gaussians can still be applied, but other source policies can be used if available. In experiments on challenging simulation benchmarks and on real robots, BRIDGER outperforms state-of-the-art diffusion policies. We provide further analysis on design considerations when applying BRIDGER. Code for BRIDGER is available at https://github.com/clear-nus/bridger.
△ Less
Submitted 10 July, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
Probable Object Location (POLo) Score Estimation for Efficient Object Goal Navigation
Authors:
Jiaming Wang,
Harold Soh
Abstract:
To advance the field of autonomous robotics, particularly in object search tasks within unexplored environments, we introduce a novel framework centered around the Probable Object Location (POLo) score. Utilizing a 3D object probability map, the POLo score allows the agent to make data-driven decisions for efficient object search. We further enhance the framework's practicality by introducing POLo…
▽ More
To advance the field of autonomous robotics, particularly in object search tasks within unexplored environments, we introduce a novel framework centered around the Probable Object Location (POLo) score. Utilizing a 3D object probability map, the POLo score allows the agent to make data-driven decisions for efficient object search. We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score. Our approach addresses critical limitations of both end-to-end reinforcement learning methods, which suffer from memory decay over long-horizon tasks, and traditional map-based methods that neglect visibility constraints. Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods, including end-to-end RL techniques and prior map-based strategies. To provide a comprehensive evaluation, we introduce new performance metrics that offer insights into the efficiency and effectiveness of various agents in object goal navigation.
△ Less
Submitted 14 November, 2023;
originally announced November 2023.
-
GRaCE: Balancing Multiple Criteria to Achieve Stable, Collision-Free, and Functional Grasps
Authors:
Tasbolat Taunyazov,
Kelvin Lin,
Harold Soh
Abstract:
This paper addresses the multi-faceted problem of robot grasping, where multiple criteria may conflict and differ in importance. We introduce a probabilistic framework, Grasp Ranking and Criteria Evaluation (GRaCE), which employs hierarchical rule-based logic and a rank-preserving utility function for grasps based on various criteria such as stability, kinematic constraints, and goal-oriented func…
▽ More
This paper addresses the multi-faceted problem of robot grasping, where multiple criteria may conflict and differ in importance. We introduce a probabilistic framework, Grasp Ranking and Criteria Evaluation (GRaCE), which employs hierarchical rule-based logic and a rank-preserving utility function for grasps based on various criteria such as stability, kinematic constraints, and goal-oriented functionalities. GRaCE's probabilistic nature means the framework handles uncertainty in a principled manner, i.e., the method is able to leverage the probability that a given criteria is satisfied. Additionally, we propose GRaCE-OPT, a hybrid optimization strategy that combines gradient-based and gradient-free methods to effectively navigate the complex, non-convex utility function. Experimental results in both simulated and real-world scenarios show that GRaCE requires fewer samples to achieve comparable or superior performance relative to existing methods. The modular architecture of GRaCE allows for easy customization and adaptation to specific application needs.
△ Less
Submitted 29 May, 2024; v1 submitted 16 September, 2023;
originally announced September 2023.
-
Refining 6-DoF Grasps with Context-Specific Classifiers
Authors:
Tasbolat Taunyazov,
Heng Zhang,
John Patrick Eala,
Na Zhao,
Harold Soh
Abstract:
In this work, we present GraspFlow, a refinement approach for generating context-specific grasps. We formulate the problem of grasp synthesis as a sampling problem: we seek to sample from a context-conditioned probability distribution of successful grasps. However, this target distribution is unknown. As a solution, we devise a discriminator gradient-flow method to evolve grasps obtained from a si…
▽ More
In this work, we present GraspFlow, a refinement approach for generating context-specific grasps. We formulate the problem of grasp synthesis as a sampling problem: we seek to sample from a context-conditioned probability distribution of successful grasps. However, this target distribution is unknown. As a solution, we devise a discriminator gradient-flow method to evolve grasps obtained from a simpler distribution in a manner that mimics sampling from the desired target distribution. Unlike existing approaches, GraspFlow is modular, allowing grasps that satisfy multiple criteria to be obtained simply by incorporating the relevant discriminators. It is also simple to implement, requiring minimal code given existing auto-differentiation libraries and suitable discriminators. Experiments show that GraspFlow generates stable and executable grasps on a real-world Panda robot for a diverse range of objects. In particular, in 60 trials on 20 different household objects, the first attempted grasp was successful 94% of the time, and 100% grasp success was achieved by the second grasp. Moreover, incorporating a functional discriminator for robot-human handover improved the functional aspect of the grasp by up to 33%.
△ Less
Submitted 14 August, 2023;
originally announced August 2023.
-
Latent Emission-Augmented Perspective-Taking (LEAPT) for Human-Robot Interaction
Authors:
Kaiqi Chen,
Jing Yu Lim,
Kingsley Kuan,
Harold Soh
Abstract:
Perspective-taking is the ability to perceive or understand a situation or concept from another individual's point of view, and is crucial in daily human interactions. Enabling robots to perform perspective-taking remains an unsolved problem; existing approaches that use deterministic or handcrafted methods are unable to accurately account for uncertainty in partially-observable settings. This wor…
▽ More
Perspective-taking is the ability to perceive or understand a situation or concept from another individual's point of view, and is crucial in daily human interactions. Enabling robots to perform perspective-taking remains an unsolved problem; existing approaches that use deterministic or handcrafted methods are unable to accurately account for uncertainty in partially-observable settings. This work proposes to address this limitation via a deep world model that enables a robot to perform both perception and conceptual perspective taking, i.e., the robot is able to infer what a human sees and believes. The key innovation is a decomposed multi-modal latent state space model able to generate and augment fictitious observations/emissions. Optimizing the ELBO that arises from this probabilistic graphical model enables the learning of uncertainty in latent space, which facilitates uncertainty estimation from high-dimensional observations. We tasked our model to predict human observations and beliefs on three partially-observable HRI tasks. Experiments show that our method significantly outperforms existing baselines and is able to infer visual observations available to other agent and their internal beliefs.
△ Less
Submitted 12 August, 2023;
originally announced August 2023.
-
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Authors:
Xudong Shen,
Hannah Brown,
Jiashu Tao,
Martin Strobel,
Yao Tong,
Akshay Narayan,
Harold Soh,
Finale Doshi-Velez
Abstract:
There is increasing attention being given to how to regulate AI systems. As governing bodies grapple with what values to encapsulate into regulation, we consider the technical half of the question: To what extent can AI experts vet an AI system for adherence to regulatory requirements? We investigate this question through the lens of two public sector procurement checklists, identifying what we ca…
▽ More
There is increasing attention being given to how to regulate AI systems. As governing bodies grapple with what values to encapsulate into regulation, we consider the technical half of the question: To what extent can AI experts vet an AI system for adherence to regulatory requirements? We investigate this question through the lens of two public sector procurement checklists, identifying what we can do now, what should be possible with technical innovation, and what requirements need a more interdisciplinary approach.
△ Less
Submitted 27 March, 2024; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Fisher information as general metrics of quantum synchronization
Authors:
Yuan Shen,
Hong Yi Soh,
Leong-Chuan Kwek,
Weijun Fan
Abstract:
Quantum synchronization has emerged as a crucial phenomenon in quantum nonlinear dynamics with potential applications in quantum information processing. Multiple measures for quantifying quantum synchronization exist. However, there is currently no widely agreed metric that is universally adopted. In this paper, we propose using classical and quantum Fisher information (FI) as alternative metrics…
▽ More
Quantum synchronization has emerged as a crucial phenomenon in quantum nonlinear dynamics with potential applications in quantum information processing. Multiple measures for quantifying quantum synchronization exist. However, there is currently no widely agreed metric that is universally adopted. In this paper, we propose using classical and quantum Fisher information (FI) as alternative metrics to detect and measure quantum synchronization. We establish the connection between FI and quantum synchronization, demonstrating that both classical and quantum FI can be deployed as more general indicators of quantum phase synchronization, in some regimes where all other existing measures fail to provide reliable results. We show advantages in FI-based measures, especially in 2-to-1 synchronization. Furthermore, we analyze the impact of noise on the synchronization measures, revealing the robustness and susceptibility of each method in the presence of dissipation and decoherence. Our results open up new avenues for understanding and exploiting quantum synchronization.
△ Less
Submitted 12 June, 2023;
originally announced June 2023.
-
Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
Authors:
Alvin Heng,
Harold Soh
Abstract:
The recent proliferation of large-scale text-to-image models has led to growing concerns that such models may be misused to generate harmful, misleading, and inappropriate content. Motivated by this issue, we derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models. Our method, dubbed Selective Amnesia, enables controllable forgetting wh…
▽ More
The recent proliferation of large-scale text-to-image models has led to growing concerns that such models may be misused to generate harmful, misleading, and inappropriate content. Motivated by this issue, we derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models. Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten. Selective Amnesia can be applied to conditional variational likelihood models, which encompass a variety of popular deep generative frameworks, including variational autoencoders and large-scale text-to-image diffusion models. Experiments across different models demonstrate that our approach induces forgetting on a variety of concepts, from entire classes in standard datasets to celebrity and nudity prompts in text-to-image models. Our code is publicly available at https://github.com/clear-nus/selective-amnesia.
△ Less
Submitted 16 October, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Generative Modeling with Flow-Guided Density Ratio Learning
Authors:
Alvin Heng,
Abdul Fatir Ansari,
Harold Soh
Abstract:
We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in…
▽ More
We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in the case of sample refinement, where the source and target distributions of the flow are close to each other. However, this assumption is invalid for generation and a naive application of the stale estimator fails due to the large chasm between the two distributions. FDRL proposes to train a density ratio estimator such that it learns from progressively improving samples during the training process. We show that this simple method alleviates the density chasm problem, allowing FDRL to generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks. We also show the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be directly applied to unpaired image-to-image translation with no modifications needed to the framework. Our code is publicly available at ttps://github.com/clear-nus/fdrl.
△ Less
Submitted 4 June, 2024; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Large Language Models as Zero-Shot Human Models for Human-Robot Interaction
Authors:
Bowen Zhang,
Harold Soh
Abstract:
Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we…
▽ More
Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large-language models (LLMs) -- which have consumed vast amounts of human-generated text data -- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with a LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.
△ Less
Submitted 1 October, 2024; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Enhancing quantum synchronization through homodyne measurement, noise and squeezing
Authors:
Yuan Shen,
Hong Yi Soh,
Weijun Fan,
Leong-Chuan Kwek
Abstract:
Quantum synchronization has been a central topic in quantum nonlinear dynamics. Despite rapid development in this field, very few have studied how to efficiently boost synchronization. Homodyne measurement emerges as one of the successful candidates for this task, but preferably in the semi-classical regime. In our work, we focus on the phase synchronization of a harmonic-driven quantum Stuart-Lan…
▽ More
Quantum synchronization has been a central topic in quantum nonlinear dynamics. Despite rapid development in this field, very few have studied how to efficiently boost synchronization. Homodyne measurement emerges as one of the successful candidates for this task, but preferably in the semi-classical regime. In our work, we focus on the phase synchronization of a harmonic-driven quantum Stuart-Landau oscillator, and show that the enhancement induced by homodyne measurement persists into the quantum regime. Interestingly, optimal two-photon damping rates exist when the oscillator and driving are at resonance and with a small single-photon damping rate. We also report noise-induced enhancement in quantum synchronization when the single-photon damping rate is sufficiently large. Apart from these results, we discover that adding a squeezing Hamiltonian can further boost synchronization, especially in the semi-classical regime. Furthermore, the addition of squeezing causes the optimal two-photon pumping rates to shift and converge.
△ Less
Submitted 18 July, 2023; v1 submitted 26 February, 2023;
originally announced February 2023.
-
Translating Natural Language to Planning Goals with Large-Language Models
Authors:
Yaqi Xie,
Chen Yu,
Tongyao Zhu,
Jinbin Bai,
Ze Gong,
Harold Soh
Abstract:
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains. Unfortunately, recent work has also shown that LLMs are unable to perform accurate reasoning nor solve planning problems, which may limit their usefulness for robotics-related tasks. In…
▽ More
Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains. Unfortunately, recent work has also shown that LLMs are unable to perform accurate reasoning nor solve planning problems, which may limit their usefulness for robotics-related tasks. In this work, our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language. If so, LLM can act as a natural interface between the planner and human users; the translated goal can be handed to domain-independent AI planners that are very effective at planning. Our empirical results on GPT 3.5 variants show that LLMs are much better suited towards translation rather than planning. We find that LLMs are able to leverage commonsense knowledge and reasoning to furnish missing details from under-specified goals (as is often the case in natural language). However, our experiments also reveal that LLMs can fail to generate goals in tasks that involve numerical or physical (e.g., spatial) reasoning, and that LLMs are sensitive to the prompts used. As such, these models are promising for translation to structured planning languages, but care should be taken in their use.
△ Less
Submitted 10 February, 2023;
originally announced February 2023.
-
Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series
Authors:
Abdul Fatir Ansari,
Alvin Heng,
Andre Lim,
Harold Soh
Abstract:
Learning accurate predictive models of real-world dynamic phenomena (e.g., climate, biological) remains a challenging task. One key issue is that the data generated by both natural and artificial processes often comprise time series that are irregularly sampled and/or contain missing observations. In this work, we propose the Neural Continuous-Discrete State Space Model (NCDSSM) for continuous-tim…
▽ More
Learning accurate predictive models of real-world dynamic phenomena (e.g., climate, biological) remains a challenging task. One key issue is that the data generated by both natural and artificial processes often comprise time series that are irregularly sampled and/or contain missing observations. In this work, we propose the Neural Continuous-Discrete State Space Model (NCDSSM) for continuous-time modeling of time series through discrete-time observations. NCDSSM employs auxiliary variables to disentangle recognition from dynamics, thus requiring amortized inference only for the auxiliary variables. Leveraging techniques from continuous-discrete filtering theory, we demonstrate how to perform accurate Bayesian inference for the dynamic states. We propose three flexible parameterizations of the latent dynamics and an efficient training objective that marginalizes the dynamic states during inference. Empirical results on multiple benchmark datasets across various domains show improved imputation and forecasting performance of NCDSSM over existing models.
△ Less
Submitted 18 June, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
Heterogeneous Beliefs and Multi-Population Learning in Network Games
Authors:
Shuyue Hu,
Harold Soh,
Georgios Piliouras
Abstract:
The effect of population heterogeneity in multi-agent learning is practically relevant but remains far from being well-understood. Motivated by this, we introduce a model of multi-population learning that allows for heterogeneous beliefs within each population and where agents respond to their beliefs via smooth fictitious play (SFP).We show that the system state -- a probability distribution over…
▽ More
The effect of population heterogeneity in multi-agent learning is practically relevant but remains far from being well-understood. Motivated by this, we introduce a model of multi-population learning that allows for heterogeneous beliefs within each population and where agents respond to their beliefs via smooth fictitious play (SFP).We show that the system state -- a probability distribution over beliefs -- evolves according to a system of partial differential equations akin to the continuity equations that commonly desccribe transport phenomena in physical systems. We establish the convergence of SFP to Quantal Response Equilibria in different classes of games capturing both network competition as well as network coordination. We also prove that the beliefs will eventually homogenize in all network games. Although the initial belief heterogeneity disappears in the limit, we show that it plays a crucial role for equilibrium selection in the case of coordination games as it helps select highly desirable equilibria. Contrary, in the case of network competition, the resulting limit behavior is independent of the initialization of beliefs, even when the underlying game has many distinct Nash equilibria.
△ Less
Submitted 12 January, 2023;
originally announced January 2023.
-
Mirror descent of Hopfield model
Authors:
Hyungjoon Soh,
Dongyeob Kim,
Juno Hwang,
Junghyo Jo
Abstract:
Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for utilizing mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate t…
▽ More
Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for utilizing mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate that by using the Hopfield model as a prototype for neural networks, mirror descent can effectively train the model with significantly improved performance compared to traditional gradient descent methods that rely on random parameter initialization. Our findings highlight the potential of mirror descent as a promising initialization technique for enhancing the optimization of machine learning models.
△ Less
Submitted 9 May, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Safety-Constrained Policy Transfer with Successor Features
Authors:
Zeyu Feng,
Bowen Zhang,
Jianxin Bi,
Harold Soh
Abstract:
In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We…
▽ More
In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account.
△ Less
Submitted 10 November, 2022;
originally announced November 2022.
-
Observed Adversaries in Deep Reinforcement Learning
Authors:
Eugene Lim,
Harold Soh
Abstract:
In this work, we point out the problem of observed adversaries for deep policies. Specifically, recent work has shown that deep reinforcement learning is susceptible to adversarial attacks where an observed adversary acts under environmental constraints to invoke natural but adversarial observations. This setting is particularly relevant for HRI since HRI-related robots are expected to perform the…
▽ More
In this work, we point out the problem of observed adversaries for deep policies. Specifically, recent work has shown that deep reinforcement learning is susceptible to adversarial attacks where an observed adversary acts under environmental constraints to invoke natural but adversarial observations. This setting is particularly relevant for HRI since HRI-related robots are expected to perform their tasks around and with other agents. In this work, we demonstrate that this effect persists even with low-dimensional observations. We further show that these adversarial attacks transfer across victims, which potentially allows malicious attackers to train an adversary without access to the target victim.
△ Less
Submitted 13 October, 2022;
originally announced October 2022.
-
SCALES: From Fairness Principles to Constrained Decision-Making
Authors:
Sreejith Balakrishnan,
Jianxin Bi,
Harold Soh
Abstract:
This paper proposes SCALES, a general framework that translates well-established fairness principles into a common representation based on the Constraint Markov Decision Process (CMDP). With the help of causal language, our framework can place constraints on both the procedure of decision making (procedural fairness) as well as the outcomes resulting from decisions (outcome fairness). Specifically…
▽ More
This paper proposes SCALES, a general framework that translates well-established fairness principles into a common representation based on the Constraint Markov Decision Process (CMDP). With the help of causal language, our framework can place constraints on both the procedure of decision making (procedural fairness) as well as the outcomes resulting from decisions (outcome fairness). Specifically, we show that well-known fairness principles can be encoded either as a utility component, a non-causal component, or a causal component in a SCALES-CMDP. We illustrate SCALES using a set of case studies involving a simulated healthcare scenario and the real-world COMPAS dataset. Experiments demonstrate that our framework produces fair policies that embody alternative fairness principles in single-step and sequential decision-making scenarios.
△ Less
Submitted 22 September, 2022;
originally announced September 2022.
-
MIRROR: Differentiable Deep Social Projection for Assistive Human-Robot Communication
Authors:
Kaiqi Chen,
Jeffrey Fong,
Harold Soh
Abstract:
Communication is a hallmark of intelligence. In this work, we present MIRROR, an approach to (i) quickly learn human models from human demonstrations, and (ii) use the models for subsequent communication planning in assistive shared-control settings. MIRROR is inspired by social projection theory, which hypothesizes that humans use self-models to understand others. Likewise, MIRROR leverages self-…
▽ More
Communication is a hallmark of intelligence. In this work, we present MIRROR, an approach to (i) quickly learn human models from human demonstrations, and (ii) use the models for subsequent communication planning in assistive shared-control settings. MIRROR is inspired by social projection theory, which hypothesizes that humans use self-models to understand others. Likewise, MIRROR leverages self-models learned using reinforcement learning to bootstrap human modeling. Experiments with simulated humans show that this approach leads to rapid learning and more robust models compared to existing behavioral cloning and state-of-the-art imitation learning methods. We also present a human-subject study using the CARLA simulator which shows that (i) MIRROR is able to scale to complex domains with high-dimensional observations and complicated world physics and (ii) provides effective assistive communication that enabled participants to drive more safely in adverse weather conditions.
△ Less
Submitted 6 March, 2022;
originally announced March 2022.
-
The Dynamics of Q-learning in Population Games: a Physics-Inspired Continuity Equation Model
Authors:
Shuyue Hu,
Chin-Wing Leung,
Ho-fung Leung,
Harold Soh
Abstract:
Although learning has found wide application in multi-agent systems, its effects on the temporal evolution of a system are far from understood. This paper focuses on the dynamics of Q-learning in large-scale multi-agent systems modeled as population games. We revisit the replicator equation model for Q-learning dynamics and observe that this model is inappropriate for our concerned setting. Motiva…
▽ More
Although learning has found wide application in multi-agent systems, its effects on the temporal evolution of a system are far from understood. This paper focuses on the dynamics of Q-learning in large-scale multi-agent systems modeled as population games. We revisit the replicator equation model for Q-learning dynamics and observe that this model is inappropriate for our concerned setting. Motivated by this, we develop a new formal model, which bears a formal connection with the continuity equation in physics. We show that our model always accurately describes the Q-learning dynamics in population games across different initial settings of MASs and game configurations. We also show that our model can be applied to different exploration mechanisms, describe the mean dynamics, and be extended to Q-learning in 2-player and n-player games. Last but not least, we show that our model can provide insights into algorithm parameters and facilitate parameter tuning.
△ Less
Submitted 2 March, 2022;
originally announced March 2022.
-
Deep Explicit Duration Switching Models for Time Series
Authors:
Abdul Fatir Ansari,
Konstantinos Benidis,
Richard Kurle,
Ali Caner Turkmen,
Harold Soh,
Alexander J. Smola,
Yuyang Wang,
Tim Januschowski
Abstract:
Many complex time series can be effectively subdivided into distinct regimes that exhibit persistent dynamics. Discovering the switching behavior and the statistical patterns in these regimes is important for understanding the underlying dynamical system. We propose the Recurrent Explicit Duration Switching Dynamical System (RED-SDS), a flexible model that is capable of identifying both state- and…
▽ More
Many complex time series can be effectively subdivided into distinct regimes that exhibit persistent dynamics. Discovering the switching behavior and the statistical patterns in these regimes is important for understanding the underlying dynamical system. We propose the Recurrent Explicit Duration Switching Dynamical System (RED-SDS), a flexible model that is capable of identifying both state- and time-dependent switching dynamics. State-dependent switching is enabled by a recurrent state-to-switch connection and an explicit duration count variable is used to improve the time-dependent switching behavior. We demonstrate how to perform efficient inference using a hybrid algorithm that approximates the posterior of the continuous states via an inference network and performs exact inference for the discrete switches and counts. The model is trained by maximizing a Monte Carlo lower bound of the marginal log-likelihood that can be computed efficiently as a byproduct of the inference routine. Empirical results on multiple datasets demonstrate that RED-SDS achieves considerable improvement in time series segmentation and competitive forecasting performance against the state of the art.
△ Less
Submitted 26 October, 2021;
originally announced October 2021.
-
Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning
Authors:
Kaiqi Chen,
Yong Lee,
Harold Soh
Abstract:
This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor performance on downstream tasks and over-reliance on specific sensors. As a solution, we contribute a new multi-modal deep latent state-space model, trained using a…
▽ More
This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor performance on downstream tasks and over-reliance on specific sensors. As a solution, we contribute a new multi-modal deep latent state-space model, trained using a mutual information lower-bound. The key innovation is a specially-designed density ratio estimator that encourages consistency between the latent codes of each modality. We tasked our method to learn policies (in a self-supervised manner) on multi-modal Natural MuJoCo benchmarks and a challenging Table Wiping task. Experiments show our method significantly outperforms state-of-the-art deep reinforcement learning methods, particularly in the presence of missing observations.
△ Less
Submitted 5 July, 2021;
originally announced July 2021.
-
Extended Tactile Perception: Vibration Sensing through Tools and Grasped Objects
Authors:
Tasbolat Taunyazov,
Luar Shui Song,
Eugene Lim,
Hian Hian See,
David Lee,
Benjamin C. K. Tee,
Harold Soh
Abstract:
Humans display the remarkable ability to sense the world through tools and other held objects. For example, we are able to pinpoint impact locations on a held rod and tell apart different textures using a rigid probe. In this work, we consider how we can enable robots to have a similar capacity, i.e., to embody tools and extend perception using standard grasped objects. We propose that vibro-tacti…
▽ More
Humans display the remarkable ability to sense the world through tools and other held objects. For example, we are able to pinpoint impact locations on a held rod and tell apart different textures using a rigid probe. In this work, we consider how we can enable robots to have a similar capacity, i.e., to embody tools and extend perception using standard grasped objects. We propose that vibro-tactile sensing using dynamic tactile sensors on the robot fingers, along with machine learning models, enables robots to decipher contact information that is transmitted as vibrations along rigid objects. This paper reports on extensive experiments using the BioTac micro-vibration sensor and a new event dynamic sensor, the NUSkin, capable of multi-taxel sensing at 4~kHz. We demonstrate that fine localization on a held rod is possible using our approach (with errors less than 1 cm on a 20 cm rod). Next, we show that vibro-tactile perception can lead to reasonable grasp stability prediction during object handover, and accurate food identification using a standard fork. We find that multi-taxel vibro-tactile sensing at sufficiently high sampling rate led to the best performance across the various tasks and objects. Taken together, our results provides both evidence and guidelines for using vibro-tactile perception to extend tactile perception, which we believe will lead to enhanced competency with tools and better physical human-robot-interaction.
△ Less
Submitted 29 September, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Embedding Symbolic Temporal Knowledge into Deep Sequential Models
Authors:
Yaqi Xie,
Fan Zhou,
Harold Soh
Abstract:
Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning. In recent years, deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources. However, when data is limited, simpler models such as logic/rule-based methods work surprisingly well, especially w…
▽ More
Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning. In recent years, deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources. However, when data is limited, simpler models such as logic/rule-based methods work surprisingly well, especially when relevant prior knowledge is applied in their construction. However, unlike DNNs, these "structured" models can be difficult to extend, and do not work well with raw unstructured data. In this work, we seek to learn flexible DNNs, yet leverage prior temporal knowledge when available. Our approach is to embed symbolic knowledge expressed as linear temporal logic (LTL) and use these embeddings to guide the training of deep models. Specifically, we construct semantic-based embeddings of automata generated from LTL formula via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.
△ Less
Submitted 28 January, 2021;
originally announced January 2021.
-
Refining Deep Generative Models via Discriminator Gradient Flow
Authors:
Abdul Fatir Ansari,
Ming Liang Ang,
Harold Soh
Abstract:
Deep generative modeling has seen impressive advances in recent years, to the point where it is now commonplace to see simulated samples (e.g., images) that closely resemble real-world data. However, generation quality is generally inconsistent for any given model and can vary dramatically between samples. We introduce Discriminator Gradient flow (DGflow), a new technique that improves generated s…
▽ More
Deep generative modeling has seen impressive advances in recent years, to the point where it is now commonplace to see simulated samples (e.g., images) that closely resemble real-world data. However, generation quality is generally inconsistent for any given model and can vary dramatically between samples. We introduce Discriminator Gradient flow (DGflow), a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences between the real and the generated data distributions. The gradient flow takes the form of a non-linear Fokker-Plank equation, which can be easily simulated by sampling from the equivalent McKean-Vlasov process. By refining inferior samples, our technique avoids wasteful sample rejection used by previous methods (DRS & MH-GAN). Compared to existing works that focus on specific GAN variants, we show our refinement approach can be applied to GANs with vector-valued critics and even other deep generative models such as VAEs and Normalizing Flows. Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods.
△ Less
Submitted 5 June, 2021; v1 submitted 1 December, 2020;
originally announced December 2020.
-
Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization
Authors:
Sreejith Balakrishnan,
Quoc Phong Nguyen,
Bryan Kian Hsiang Low,
Harold Soh
Abstract:
The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed problem at its core; multiple reward functions coincide with the observed behavior and the actual reward function is not identifiable without prior knowledge or…
▽ More
The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed problem at its core; multiple reward functions coincide with the observed behavior and the actual reward function is not identifiable without prior knowledge or supplementary information. This paper presents an IRL framework called Bayesian optimization-IRL (BO-IRL) which identifies multiple solutions that are consistent with the expert demonstrations by efficiently exploring the reward function space. BO-IRL achieves this by utilizing Bayesian Optimization along with our newly proposed kernel that (a) projects the parameters of policy invariant reward functions to a single point in a latent space and (b) ensures nearby points in the latent space correspond to reward functions yielding similar likelihoods. This projection allows the use of standard stationary kernels in the latent space to capture the correlations present across the reward function space. Empirical results on synthetic and real-world environments (model-free and model-based) show that BO-IRL discovers multiple reward functions while minimizing the number of expensive exact policy optimizations.
△ Less
Submitted 17 November, 2020;
originally announced November 2020.
-
Event-Driven Visual-Tactile Sensing and Learning for Robots
Authors:
Tasbolat Taunyazov,
Weicong Sng,
Hian Hian See,
Brian Lim,
Jethro Kuan,
Abdul Fatir Ansari,
Benjamin C. K. Tee,
Harold Soh
Abstract:
This work contributes an event-driven visual-tactile perception system, comprising a novel biologically-inspired tactile sensor and multi-modal spike-based learning. Our neuromorphic fingertip tactile sensor, NeuTouch, scales well with the number of taxels thanks to its event-based nature. Likewise, our Visual-Tactile Spiking Neural Network (VT-SNN) enables fast perception when coupled with event…
▽ More
This work contributes an event-driven visual-tactile perception system, comprising a novel biologically-inspired tactile sensor and multi-modal spike-based learning. Our neuromorphic fingertip tactile sensor, NeuTouch, scales well with the number of taxels thanks to its event-based nature. Likewise, our Visual-Tactile Spiking Neural Network (VT-SNN) enables fast perception when coupled with event sensors. We evaluate our visual-tactile system (using the NeuTouch and Prophesee event camera) on two robot tasks: container classification and rotational slip detection. On both tasks, we observe good accuracies relative to standard deep learning methods. We have made our visual-tactile datasets freely-available to encourage research on multi-modal event-driven robot perception, which we believe is a promising approach towards intelligent power-efficient robot systems.
△ Less
Submitted 15 September, 2020;
originally announced September 2020.
-
TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition
Authors:
Fuqiang Gu,
Weicong Sng,
Tasbolat Taunyazov,
Harold Soh
Abstract:
Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. How…
▽ More
Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural network makes it arguably more suitable for processing the event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90\% when classifying a variety of different household objects.
△ Less
Submitted 31 July, 2020;
originally announced August 2020.
-
Getting to Know One Another: Calibrating Intent, Capabilities and Trust for Human-Robot Collaboration
Authors:
Joshua Lee,
Jeffrey Fong,
Bing Cai Kok,
Harold Soh
Abstract:
Common experience suggests that agents who know each other well are better able to work together. In this work, we address the problem of calibrating intention and capabilities in human-robot collaboration. In particular, we focus on scenarios where the robot is attempting to assist a human who is unable to directly communicate her intent. Moreover, both agents may have differing capabilities that…
▽ More
Common experience suggests that agents who know each other well are better able to work together. In this work, we address the problem of calibrating intention and capabilities in human-robot collaboration. In particular, we focus on scenarios where the robot is attempting to assist a human who is unable to directly communicate her intent. Moreover, both agents may have differing capabilities that are unknown to one another. We adopt a decision-theoretic approach and propose the TICC-POMDP for modeling this setting, with an associated online solver. Experiments show our approach leads to better team performance both in simulation and in a real-world study with human subjects.
△ Less
Submitted 3 August, 2020;
originally announced August 2020.
-
Going Beyond the Debye Length: Overcoming Charge Screening Limitations in Next-Generation Bioelectronic Sensors
Authors:
Vladimir Kesler,
Boris Murmann,
H. Tom Soh
Abstract:
Electronic biosensors are a natural fit for field-deployable diagnostic devices, because they can be miniaturized, mass produced, and integrated with circuitry. Unfortunately, progress in the development of such platforms has been hindered by the fact that mobile ions present in biological samples screen charges from the target molecule, greatly reducing sensor sensitivity. Under physiological con…
▽ More
Electronic biosensors are a natural fit for field-deployable diagnostic devices, because they can be miniaturized, mass produced, and integrated with circuitry. Unfortunately, progress in the development of such platforms has been hindered by the fact that mobile ions present in biological samples screen charges from the target molecule, greatly reducing sensor sensitivity. Under physiological conditions, the thickness of the resulting electric double layer is less than 1 nm, and it has generally been assumed that electronic detection beyond this distance is virtually impossible. However, a few recently-described sensor design strategies seem to defy this conventional wisdom, exploiting the physics of electrical double layers in ways that traditional models do not capture. In the first strategy, charge screening is decreased by constraining the space in which double layers can form. The second strategy uses external stimuli to prevent double layers from reaching equilibrium, thereby effectively reducing charge screening. The goal of this article is to describe these relatively new concepts, and to offer theoretical insights into mechanisms that may enable electronic biosensing beyond the double-layer. If these concepts can be further developed and translated into practical electronic biosensors, we foresee exciting opportunities for the next generation of diagnostic technologies.
△ Less
Submitted 26 July, 2020;
originally announced July 2020.
-
The Evolutionary Dynamics of Independent Learning Agents in Population Games
Authors:
Shuyue Hu,
Chin-Wing Leung,
Ho-fung Leung,
Harold Soh
Abstract:
Understanding the evolutionary dynamics of reinforcement learning under multi-agent settings has long remained an open problem. While previous works primarily focus on 2-player games, we consider population games, which model the strategic interactions of a large population comprising small and anonymous agents. This paper presents a formal relation between stochastic processes and the dynamics of…
▽ More
Understanding the evolutionary dynamics of reinforcement learning under multi-agent settings has long remained an open problem. While previous works primarily focus on 2-player games, we consider population games, which model the strategic interactions of a large population comprising small and anonymous agents. This paper presents a formal relation between stochastic processes and the dynamics of independent learning agents who reason based on the reward signals. Using a master equation approach, we provide a novel unified framework for characterising population dynamics via a single partial differential equation (Theorem 1). Through a case study involving Cross learning agents, we illustrate that Theorem 1 allows us to identify qualitatively different evolutionary dynamics, to analyse steady states, and to gain insights into the expected behaviour of a population. In addition, we present extensive experimental results validating that Theorem 1 holds for a variety of learning methods and population games.
△ Less
Submitted 29 June, 2020;
originally announced June 2020.
-
ST-MNIST -- The Spiking Tactile MNIST Neuromorphic Dataset
Authors:
Hian Hian See,
Brian Lim,
Si Li,
Haicheng Yao,
Wen Cheng,
Harold Soh,
Benjamin C. K. Tee
Abstract:
Tactile sensing is an essential modality for smart robots as it enables them to interact flexibly with physical objects in their environment. Recent advancements in electronic skins have led to the development of data-driven machine learning methods that exploit this important sensory modality. However, current datasets used to train such algorithms are limited to standard synchronous tactile sens…
▽ More
Tactile sensing is an essential modality for smart robots as it enables them to interact flexibly with physical objects in their environment. Recent advancements in electronic skins have led to the development of data-driven machine learning methods that exploit this important sensory modality. However, current datasets used to train such algorithms are limited to standard synchronous tactile sensors. There is a dearth of neuromorphic event-based tactile datasets, principally due to the scarcity of large-scale event-based tactile sensors. Having such datasets is crucial for the development and evaluation of new algorithms that process spatio-temporal event-based data. For example, evaluating spiking neural networks on conventional frame-based datasets is considered sub-optimal. Here, we debut a novel neuromorphic Spiking Tactile MNIST (ST-MNIST) dataset, which comprises handwritten digits obtained by human participants writing on a neuromorphic tactile sensor array. We also describe an initial effort to evaluate our ST-MNIST dataset using existing artificial and spiking neural network models. The classification accuracies provided herein can serve as performance benchmarks for future work. We anticipate that our ST-MNIST dataset will be of interest and useful to the neuromorphic and robotics research communities.
△ Less
Submitted 8 May, 2020;
originally announced May 2020.
-
Design and Analysis of a Sample-and-Hold CMOS Electrochemical Sensor for Aptamer-based Therapeutic Drug Monitoring
Authors:
Jun-Chau Chien,
Sam W. Baker,
H. Tom Soh,
Amin Arbabian
Abstract:
In this paper, we present the design and the analysis of an electrochemical circuit for measuring the concentrations of therapeutic drugs using structure-switching aptamers. Aptamers are single-stranded nucleic acids, whose sequence is selected to exhibit high affinity and specificity toward a molecular target, and change its conformation upon binding. This property, when coupled with a redox repo…
▽ More
In this paper, we present the design and the analysis of an electrochemical circuit for measuring the concentrations of therapeutic drugs using structure-switching aptamers. Aptamers are single-stranded nucleic acids, whose sequence is selected to exhibit high affinity and specificity toward a molecular target, and change its conformation upon binding. This property, when coupled with a redox reporter and electrochemical detection, enables reagent-free biosensing with a sub-minute temporal resolution for in vivo therapeutic drug monitoring. Specifically, we design a chronoamperometry-based electrochemical circuit that measures the direct changes in the electron transfer (ET) kinetics of a methylene blue reporter conjugated at the distal-end of the aptamer. To overcome the high-frequency noise amplification issue when interfacing with a large-size (> 0.25 mm2) implantable electrode, we present a sample-and-hold (S/H) circuit technique in which the desired electrode potentials are held onto noiseless capacitors during the recording of the redox currents. This allows disconnecting the feedback amplifiers to avoid its noise injection while reducing the total power consumption. A prototype circuit implemented in 65-nm CMOS demonstrates a cell-capacitance-insensitive input-referred noise (IRN) current of 15.2 pArms at a 2.5-kHz filtering bandwidth. Tested in human whole blood samples, changes in the ET kinetics from the redox-labeled aminoglycoside aptamers at different kanamycin concentrations are measured from the recorded current waveforms. By employing principal component analysis (PCA) to compensate for the sampling errors, a detection limit (SNR = 1) of 3.1 uM under 1-sec acquisition is achieved at 0.22-mW power consumption.
△ Less
Submitted 7 May, 2020;
originally announced May 2020.
-
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
Authors:
Jiyeon Ham,
Yo Joong Choe,
Kyubyong Park,
Ilji Choi,
Hyungjoon Soh
Abstract:
Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). Although several benchmark datasets for those tasks have been released in English and a few other languages, there are no publicly available NLI or STS datasets in the Korean language. Motivated by this, we construct and release new datasets for Korean NLI and STS, dubbed K…
▽ More
Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). Although several benchmark datasets for those tasks have been released in English and a few other languages, there are no publicly available NLI or STS datasets in the Korean language. Motivated by this, we construct and release new datasets for Korean NLI and STS, dubbed KorNLI and KorSTS, respectively. Following previous approaches, we machine-translate existing English training sets and manually translate development and test sets into Korean. To accelerate research on Korean NLU, we also establish baselines on KorNLI and KorSTS. Our datasets are publicly available at https://github.com/kakaobrain/KorNLUDatasets.
△ Less
Submitted 5 October, 2020; v1 submitted 7 April, 2020;
originally announced April 2020.
-
Overlimiting current in non-uniform arrays of microchannels
Authors:
Hyekyung Lee,
Shima Alizadeh,
Tae Jin Kim,
Seung-min Park,
Hyongsok Tom Soh,
Ali Mani,
Sung Jae Kim
Abstract:
Overlimiting current (OLC) through electrolytes interfaced with perm-selective membranes has been extensively researched in recent years for understanding the fundamental mechanisms of transport and developing efficient applications from electrochemistry to sample analysis and separation. Predominant mechanisms responsible for OLC include surface conduction, convection by electro-osmotic flow, and…
▽ More
Overlimiting current (OLC) through electrolytes interfaced with perm-selective membranes has been extensively researched in recent years for understanding the fundamental mechanisms of transport and developing efficient applications from electrochemistry to sample analysis and separation. Predominant mechanisms responsible for OLC include surface conduction, convection by electro-osmotic flow, and electro-osmotic instability depending on input parameters such as surface charge and geometric constrictions. This work studies how a network of microchannels in a non-uniform array, which mimicks a natural pore configuration, can contribute to OLC. To this end, micro/nanofluidic devices are fabricated with arrays of parallel microchannels with non-uniform size distributions. All cases maintain the same surface and bulk conduction to allow probing the sensitivity only by the non-uniformity of the channels. Both experimental and theoretical current-voltage relations demonstrate that OLCs increase with increasing non-uniformity. Furthermore, the visualization of internal recirculating flows indicates that the non-uniform arrays induce flow loops across the network enhancing advective transport. These evidences confirm a new driving mechanism of OLC, inspired by natural micro/nanoporous materials with random geometric structure. Therefore, this result can advance not only the fundamental understanding of nanoelectrokinetics but also the design rule of engineering applications of electrochemical membrane.
△ Less
Submitted 21 October, 2019;
originally announced October 2019.
-
A Characteristic Function Approach to Deep Implicit Generative Modeling
Authors:
Abdul Fatir Ansari,
Jonathan Scarlett,
Harold Soh
Abstract:
Implicit Generative Models (IGMs) such as GANs have emerged as effective data-driven models for generating samples, particularly images. In this paper, we formulate the problem of learning an IGM as minimizing the expected distance between characteristic functions. Specifically, we minimize the distance between characteristic functions of the real and generated data distributions under a suitably-…
▽ More
Implicit Generative Models (IGMs) such as GANs have emerged as effective data-driven models for generating samples, particularly images. In this paper, we formulate the problem of learning an IGM as minimizing the expected distance between characteristic functions. Specifically, we minimize the distance between characteristic functions of the real and generated data distributions under a suitably-chosen weighting distribution. This distance metric, which we term as the characteristic function distance (CFD), can be (approximately) computed with linear time-complexity in the number of samples, in contrast with the quadratic-time Maximum Mean Discrepancy (MMD). By replacing the discrepancy measure in the critic of a GAN with the CFD, we obtain a model that is simple to implement and stable to train. The proposed metric enjoys desirable theoretical properties including continuity and differentiability with respect to generator parameters, and continuity in the weak topology. We further propose a variation of the CFD in which the weighting distribution parameters are also optimized during training; this obviates the need for manual tuning, and leads to an improvement in test power relative to CFD. We demonstrate experimentally that our proposed method outperforms WGAN and MMD-GAN variants on a variety of unsupervised image generation benchmarks.
△ Less
Submitted 16 June, 2020; v1 submitted 16 September, 2019;
originally announced September 2019.
-
Robot Capability and Intention in Trust-based Decisions across Tasks
Authors:
Yaqi Xie,
Indu P Bodala,
Desmond C. Ong,
David Hsu,
Harold Soh
Abstract:
In this paper, we present results from a human-subject study designed to explore two facets of human mental models of robots---inferred capability and intention---and their relationship to overall trust and eventual decisions. In particular, we examine delegation situations characterized by uncertainty, and explore how inferred capability and intention are applied across different tasks. We develo…
▽ More
In this paper, we present results from a human-subject study designed to explore two facets of human mental models of robots---inferred capability and intention---and their relationship to overall trust and eventual decisions. In particular, we examine delegation situations characterized by uncertainty, and explore how inferred capability and intention are applied across different tasks. We develop an online survey where human participants decide whether to delegate control to a simulated UAV agent. Our study shows that human estimations of robot capability and intent correlate strongly with overall self-reported trust. However, overall trust is not independently sufficient to determine whether a human will decide to trust (delegate) a given task to a robot. Instead, our study reveals that estimations of robot intention, capability, and overall trust are integrated when deciding to delegate. From a broader perspective, these results suggest that calibrating overall trust alone is insufficient; to make correct decisions, humans need (and use) multi-faceted mental models when collaborating with robots across multiple contexts.
△ Less
Submitted 3 September, 2019;
originally announced September 2019.
-
Embedding Symbolic Knowledge into Deep Networks
Authors:
Yaqi Xie,
Ziwei Xu,
Mohan S. Kankanhalli,
Kuldeep S. Meel,
Harold Soh
Abstract:
In this work, we aim to leverage prior symbolic knowledge to improve the performance of deep models. We propose a graph embedding network that projects propositional formulae (and assignments) onto a manifold via an augmented Graph Convolutional Network (GCN). To generate semantically-faithful embeddings, we develop techniques to recognize node heterogeneity, and semantic regularization that incor…
▽ More
In this work, we aim to leverage prior symbolic knowledge to improve the performance of deep models. We propose a graph embedding network that projects propositional formulae (and assignments) onto a manifold via an augmented Graph Convolutional Network (GCN). To generate semantically-faithful embeddings, we develop techniques to recognize node heterogeneity, and semantic regularization that incorporate structural constraints into the embedding. Experiments show that our approach improves the performance of models trained to perform entailment checking and visual relation prediction. Interestingly, we observe a connection between the tractability of the propositional theory representation and the ease of embedding. Future exploration of this connection may elucidate the relationship between knowledge compilation and vector representation learning.
△ Less
Submitted 29 October, 2019; v1 submitted 3 September, 2019;
originally announced September 2019.