Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers

Authors: Jianxin Bi, Kelvin Lim, Kaiqi Chen, Yifei Huang, Harold Soh

Abstract: Recent advances in diffusion-based robot policies have demonstrated significant potential in imitating multi-modal behaviors. However, these approaches typically require large quantities of demonstration data paired with corresponding robot action labels, creating a substantial data collection burden. In this work, we propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers by leveraging observational demonstration data. Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation. This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder, requiring minimal action-labeled data. Through experiments on simulated robot manipulation tasks and a real robot experiment with multi-modal expert demonstrations, we demonstrate that our approach significantly enhances action-data efficiency and achieves high task success rates with limited action data.

Submitted 9 October, 2024; originally announced October 2024.

arXiv:2410.05856 [pdf, other]

Stochastic Bandits for Egalitarian Assignment

Authors: Eugene Lim, Vincent Y. F. Tan, Harold Soh

Abstract: We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.

Submitted 8 October, 2024; originally announced October 2024.

arXiv:2410.02389 [pdf, other]

Diffusion Meets Options: Hierarchical Generative Skill Composition for Temporally-Extended Tasks

Authors: Zeyu Feng, Hao Luan, Kevin Yuchen Ma, Harold Soh

Abstract: Safe and successful deployment of robots requires not only the ability to generate complex plans but also the capacity to frequently replan and correct execution errors. This paper addresses the challenge of long-horizon trajectory planning under temporally extended objectives in a receding horizon manner. To this end, we propose DOPPLER, a data-driven hierarchical framework that generates and updates plans based on instructions specified by linear temporal logic (LTL). Our method decomposes temporal tasks into a chain of options with hierarchical reinforcement learning from offline non-expert datasets. It leverages diffusion models to generate options with low-level actions. We devise a determinantal-guided posterior sampling technique during batch generation, which improves the speed and diversity of diffusion-generated options, leading to more efficient querying. Experiments on robot navigation and manipulation tasks demonstrate that DOPPLER can generate sequences of trajectories that progressively satisfy the specified formulae for obstacle avoidance and sequential visitation. Demonstration videos are available online at: https://philiptheother.github.io/doppler/.

Submitted 3 October, 2024; originally announced October 2024.

arXiv:2409.12471 [pdf, other]

Arena 4.0: A Comprehensive ROS2 Development and Benchmarking Platform for Human-centric Navigation Using Generative-Model-based Environment Generation

Authors: Volodymyr Shcherbyna, Linh Kästner, Diego Diaz, Huu Giang Nguyen, Maximilian Ho-Kyoung Schreff, Tim Lenz, Jonas Kreutz, Ahmed Martban, Huajian Zeng, Harold Soh

Abstract: Building on the foundations of our previous work, this paper introduces Arena 4.0, a significant advancement over Arena 3.0, Arena-Bench, Arena 1.0, and Arena 2.0. Arena 4.0 offers three key novel contributions: (1) a generative-model-based world and scenario generation approach that utilizes large language models (LLMs) and diffusion models to dynamically generate complex, human-centric environments from text prompts or 2D floorplans, useful for the development and benchmarking of social navigation strategies; (2) a comprehensive 3D model database, extendable with additional 3D assets that are semantically linked and annotated for dynamic spawning and arrangement within 3D worlds; and (3) a complete migration to ROS 2, enabling compatibility with modern hardware and enhanced functionalities for improved navigation, usability, and easier deployment on real robots. We evaluated the platform's performance through a comprehensive user study, demonstrating significant improvements in usability and efficiency compared to previous versions. Arena 4.0 is openly available at https://github.com/Arena-Rosnav.

Submitted 19 September, 2024; originally announced September 2024.

Comments: 7 pages, 7 figures

arXiv:2406.09767 [pdf, other]

Language-Guided Manipulation with Diffusion Policies and Constrained Inpainting

Authors: Ce Hao, Kelvin Lin, Siyuan Luo, Harold Soh

Abstract: Diffusion policies have demonstrated robust performance in generative modeling, prompting their application in robotic manipulation controlled via language descriptions. In this paper, we introduce a zero-shot, open-vocabulary diffusion policy method for robot manipulation. Using Vision-Language Models (VLMs), our method transforms linguistic task descriptions into actionable keyframes in 3D space. These keyframes serve to guide the diffusion process via inpainting. However, naively forcing the diffusion process to adhere to the generated keyframes is problematic: the keyframes from the VLMs may be incorrect and lead to action sequences where the diffusion model performs poorly. To address these challenges, we develop an inpainting optimization strategy that balances adherence to the keyframes versus the training data distribution. Experimental evaluations demonstrate that our approach surpasses the performance of traditional fine-tuned language-conditioned methods in both simulated and real-world settings.

Submitted 23 September, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.00837 [pdf, other]

Arena 3.0: Advancing Social Navigation in Collaborative and Highly Dynamic Environments

Authors: Linh Kästner, Volodymyir Shcherbyna, Huajian Zeng, Tuan Anh Le, Maximilian Ho-Kyoung Schreff, Halid Osmaev, Nam Truong Tran, Diego Diaz, Jan Golebiowski, Harold Soh, Jens Lambrecht

Abstract: Building upon our previous contributions, this paper introduces Arena 3.0, an extension of Arena-Bench, Arena 1.0, and Arena 2.0. Arena 3.0 is a comprehensive software stack containing multiple modules and simulation environments focusing on the development, simulation, and benchmarking of social navigation approaches in collaborative environments. We significantly enhance the realism of human behavior simulation by incorporating a diverse array of new social force models and interaction patterns, encompassing both human-human and human-robot dynamics. The platform provides a comprehensive set of new task modes, designed for extensive benchmarking and testing, and is capable of generating realistic and human-centric environments dynamically, catering to a broad spectrum of social navigation scenarios. In addition, the platform's functionalities have been abstracted across three widely used simulators, each tailored for specific training and testing purposes. The platform's efficacy has been validated through an extensive benchmark and user evaluations of the platform by a global community of researchers and students, who noted the substantial improvement compared to previous versions and expressed interest in using the platform for future research and development. Arena 3.0 is openly available at https://github.com/Arena-Rosnav.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 11 pages, 6 figures

Journal ref: Robotics: Science and Systems 2024, Delft, Netherlands

arXiv:2405.11881 [pdf, other]

Out-of-Distribution Detection with a Single Unconditional Diffusion Model

Authors: Alvin Heng, Alexandre H. Thiery, Harold Soh

Abstract: Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples. Traditionally, unsupervised methods utilize a deep generative model for OOD detection. However, such approaches require a new model to be trained for each inlier dataset. This paper explores whether a single model can perform OOD detection across diverse tasks. To that end, we introduce Diffusion Paths (DiffPath), which uses a single diffusion model originally trained to perform unconditional generation for OOD detection. We introduce a novel technique of measuring the rate-of-change and curvature of the diffusion paths connecting samples to the standard normal. Extensive experiments show that with a single model, DiffPath is competitive with prior work using individual models on a variety of OOD tasks involving different distributions. Our code is publicly available at https://github.com/clear-nus/diffpath.

Submitted 12 October, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.04235 [pdf, other]

doi 10.1109/LRA.2024.3443501

LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning

Authors: Zeyu Feng, Hao Luan, Pranav Goyal, Harold Soh

Abstract: Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints/instructions at test time. We propose a data-driven diffusion-based framework, LTLDoG, that modifies the inference steps of the reverse process given an instruction specified using finite linear temporal logic ($\text{LTL}_f$). LTLDoG leverages a satisfaction value function on $\text{LTL}_f$ and guides the sampling steps using its gradient field. This value function can also be trained to generalize to new instructions not observed during training, enabling flexible test-time adaptability. Experiments in robot navigation and manipulation illustrate that the method is able to generate trajectories that satisfy formulae that specify obstacle avoidance and visitation sequences. Code and supplementary material are available online at https://github.com/clear-nus/ltldog.

Submitted 30 September, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Journal ref: in IEEE Robotics and Automation Letters, vol. 9, no. 10, pp. 8571-8578, Oct. 2024

arXiv:2405.02794 [pdf, other]

Octopi: Object Property Reasoning with Large Tactile-Language Models

Authors: Samson Yu, Kelvin Lin, Anxing Xiao, Jiafei Duan, Harold Soh

Abstract: Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal information about objects in the environment and language serves as an abstraction and communication medium for additional context. Although these works have demonstrated success on a variety of physical reasoning tasks, they are limited to physical properties that can be inferred from visual or language inputs. In this work, we investigate combining tactile perception with language, which enables embodied systems to obtain physical properties through interaction and apply commonsense reasoning. We contribute a new dataset PhysiCLeAR, which comprises both physical/property reasoning tasks and annotated tactile videos obtained using a GelSight tactile sensor. We then introduce Octopi, a system that leverages both tactile representation learning and large vision-language models to predict and reason about tactile inputs with minimal language fine-tuning. Our evaluations on PhysiCLeAR show that Octopi is able to effectively use intermediate physical property predictions to improve its performance on various tactile-related tasks. PhysiCLeAR and Octopi are available at https://github.com/clear-nus/octopi.

Submitted 4 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

Comments: Accepted at Robotics: Science and Systems (R:SS 2024)

arXiv:2404.03868 [pdf, other]

Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction

Authors: Bowen Zhang, Harold Soh

Abstract: In this work, we are interested in automated methods for knowledge graph creation (KGC) from input text. Progress on large language models (LLMs) has prompted a series of recent works applying them to KGC, e.g., via zero/few-shot prompting. Despite successes on small domain-specific datasets, these models face difficulties scaling up to text common in many real-world applications. A principal issue is that, in prior methods, the KG schema has to be included in the LLM prompt to generate valid triplets; larger and more complex schemas easily exceed the LLMs' context window length. Furthermore, there are scenarios where a fixed pre-defined schema is not available and we would like the method to construct a high-quality KG with a succinct self-generated schema. To address these problems, we propose a three-phase framework named Extract-Define-Canonicalize (EDC): open information extraction followed by schema definition and post-hoc canonicalization. EDC is flexible in that it can be applied to settings where a pre-defined target schema is available and when it is not; in the latter case, it constructs a schema automatically and applies self-canonicalization. To further improve performance, we introduce a trained component that retrieves schema elements relevant to the input text; this improves the LLMs' extraction performance in a retrieval-augmented generation-like manner. We demonstrate on three KGC benchmarks that EDC is able to extract high-quality triplets without any parameter tuning and with significantly larger schemas compared to prior works. Code for EDC is available at https://github.com/clear-nus/edc.

Submitted 2 October, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 18 pages, 3 figures, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

arXiv:2404.01954 [pdf, other]

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2403.16049 [pdf, other]

doi 10.1016/j.chaos.2024.115032

Improving Demand Forecasting in Open Systems with Cartogram-Enhanced Deep Learning

Authors: Sangjoon Park, Yongsung Kwon, Hyungjoon Soh, Mi Jin Lee, Seung-Woo Son

Abstract: Predicting temporal patterns across various domains poses significant challenges due to their nuanced and often nonlinear trajectories. To address this challenge, prediction frameworks have been continuously refined, employing data-driven statistical methods, mathematical models, and machine learning. Recently, as one of the challenging systems, shared transport systems such as public bicycles have gained prominence due to urban constraints and environmental concerns. Predicting rental and return patterns at bicycle stations remains a formidable task due to the system's openness and imbalanced usage patterns across stations. In this study, we propose a deep learning framework to predict rental and return patterns by leveraging cartogram approaches. The cartogram approach facilitates the prediction of demand for newly installed stations with no training data as well as long-period prediction, which has not been achieved before. We apply this method to public bicycle rental-and-return data in Seoul, South Korea, employing a spatial-temporal convolutional graph attention network. Our improved architecture incorporates batch attention and modified node feature updates for better prediction accuracy across different time scales. We demonstrate the effectiveness of our framework in predicting temporal patterns and its potential applications.

Submitted 26 May, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: 11 pages, 7 figures

arXiv:2402.16075 [pdf, other]

Don't Start from Scratch: Behavioral Refinement via Interpolant-based Policy Diffusion

Authors: Kaiqi Chen, Eugene Lim, Kelvin Lin, Yiyang Chen, Harold Soh

Abstract: Imitation learning empowers artificial agents to mimic behavior by learning from demonstrations. Recently, diffusion models, which have the ability to model high-dimensional and multimodal distributions, have shown impressive performance on imitation learning tasks. These models learn to shape a policy by diffusing actions (or states) from standard Gaussian noise. However, the target policy to be learned is often significantly different from Gaussian and this mismatch can result in poor performance when using a small number of diffusion steps (to improve inference speed) and under limited data. The key idea in this work is that initiating from a more informative source than Gaussian enables diffusion methods to mitigate the above limitations. We contribute theoretical results, a new method, and empirical findings that show the benefits of using an informative source policy. Our method, which we call BRIDGER, leverages the stochastic interpolants framework to bridge arbitrary policies, thus enabling a flexible approach towards imitation learning. It generalizes prior work in that standard Gaussians can still be applied, but other source policies can be used if available. In experiments on challenging simulation benchmarks and on real robots, BRIDGER outperforms state-of-the-art diffusion policies. We provide further analysis on design considerations when applying BRIDGER. Code for BRIDGER is available at https://github.com/clear-nus/bridger.

Submitted 10 July, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

arXiv:2311.07992 [pdf, other]

Probable Object Location (POLo) Score Estimation for Efficient Object Goal Navigation

Authors: Jiaming Wang, Harold Soh

Abstract: To advance the field of autonomous robotics, particularly in object search tasks within unexplored environments, we introduce a novel framework centered around the Probable Object Location (POLo) score. Utilizing a 3D object probability map, the POLo score allows the agent to make data-driven decisions for efficient object search. We further enhance the framework's practicality by introducing POLoNet, a neural network trained to approximate the computationally intensive POLo score. Our approach addresses critical limitations of both end-to-end reinforcement learning methods, which suffer from memory decay over long-horizon tasks, and traditional map-based methods that neglect visibility constraints. Our experiments, involving the first phase of the OVMM 2023 challenge, demonstrate that an agent equipped with POLoNet significantly outperforms a range of baseline methods, including end-to-end RL techniques and prior map-based strategies. To provide a comprehensive evaluation, we introduce new performance metrics that offer insights into the efficiency and effectiveness of various agents in object goal navigation.

Submitted 14 November, 2023; originally announced November 2023.

Comments: Under review

arXiv:2309.08887 [pdf, other]

GRaCE: Balancing Multiple Criteria to Achieve Stable, Collision-Free, and Functional Grasps

Authors: Tasbolat Taunyazov, Kelvin Lin, Harold Soh

Abstract: This paper addresses the multi-faceted problem of robot grasping, where multiple criteria may conflict and differ in importance. We introduce a probabilistic framework, Grasp Ranking and Criteria Evaluation (GRaCE), which employs hierarchical rule-based logic and a rank-preserving utility function for grasps based on various criteria such as stability, kinematic constraints, and goal-oriented functionalities. GRaCE's probabilistic nature means the framework handles uncertainty in a principled manner, i.e., the method is able to leverage the probability that a given criterion is satisfied. Additionally, we propose GRaCE-OPT, a hybrid optimization strategy that combines gradient-based and gradient-free methods to effectively navigate the complex, non-convex utility function. Experimental results in both simulated and real-world scenarios show that GRaCE requires fewer samples to achieve comparable or superior performance relative to existing methods. The modular architecture of GRaCE allows for easy customization and adaptation to specific application needs.

Submitted 29 May, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

arXiv:2308.06928 [pdf, other]

Refining 6-DoF Grasps with Context-Specific Classifiers

Authors: Tasbolat Taunyazov, Heng Zhang, John Patrick Eala, Na Zhao, Harold Soh

Abstract: In this work, we present GraspFlow, a refinement approach for generating context-specific grasps. We formulate the problem of grasp synthesis as a sampling problem: we seek to sample from a context-conditioned probability distribution of successful grasps. However, this target distribution is unknown. As a solution, we devise a discriminator gradient-flow method to evolve grasps obtained from a simpler distribution in a manner that mimics sampling from the desired target distribution. Unlike existing approaches, GraspFlow is modular, allowing grasps that satisfy multiple criteria to be obtained simply by incorporating the relevant discriminators. It is also simple to implement, requiring minimal code given existing auto-differentiation libraries and suitable discriminators. Experiments show that GraspFlow generates stable and executable grasps on a real-world Panda robot for a diverse range of objects. In particular, in 60 trials on 20 different household objects, the first attempted grasp was successful 94% of the time, and 100% grasp success was achieved by the second grasp. Moreover, incorporating a functional discriminator for robot-human handover improved the functional aspect of the grasp by up to 33%.

Submitted 14 August, 2023; originally announced August 2023.

Comments: IROS 2023, Code and Datasets are available at https://github.com/tasbolat1/graspflow

arXiv:2308.06498 [pdf, other]

Latent Emission-Augmented Perspective-Taking (LEAPT) for Human-Robot Interaction

Authors: Kaiqi Chen, Jing Yu Lim, Kingsley Kuan, Harold Soh

Abstract: Perspective-taking is the ability to perceive or understand a situation or concept from another individual's point of view, and is crucial in daily human interactions. Enabling robots to perform perspective-taking remains an unsolved problem; existing approaches that use deterministic or handcrafted methods are unable to accurately account for uncertainty in partially-observable settings. This work proposes to address this limitation via a deep world model that enables a robot to perform both perception and conceptual perspective taking, i.e., the robot is able to infer what a human sees and believes. The key innovation is a decomposed multi-modal latent state space model able to generate and augment fictitious observations/emissions. Optimizing the ELBO that arises from this probabilistic graphical model enables the learning of uncertainty in latent space, which facilitates uncertainty estimation from high-dimensional observations. We tasked our model to predict human observations and beliefs on three partially-observable HRI tasks. Experiments show that our method significantly outperforms existing baselines and is able to infer the visual observations available to other agents and their internal beliefs.

Submitted 12 August, 2023; originally announced August 2023.

arXiv:2306.12609 [pdf, other]

Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities

Authors: Xudong Shen, Hannah Brown, Jiashu Tao, Martin Strobel, Yao Tong, Akshay Narayan, Harold Soh, Finale Doshi-Velez

Abstract: There is increasing attention being given to how to regulate AI systems. As governing bodies grapple with what values to encapsulate into regulation, we consider the technical half of the question: To what extent can AI experts vet an AI system for adherence to regulatory requirements? We investigate this question through the lens of two public sector procurement checklists, identifying what we can do now, what should be possible with technical innovation, and what requirements need a more interdisciplinary approach.

Submitted 27 March, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

Comments: scheduled for publication in the Communications of the ACM, titled "Directions of Technical Innovation for Regulatable AI Systems"

arXiv:2306.06897 [pdf, other]

doi 10.3390/e25081116

Fisher information as general metrics of quantum synchronization

Authors: Yuan Shen, Hong Yi Soh, Leong-Chuan Kwek, Weijun Fan

Abstract: Quantum synchronization has emerged as a crucial phenomenon in quantum nonlinear dynamics with potential applications in quantum information processing. Multiple measures for quantifying quantum synchronization exist. However, there is currently no widely agreed metric that is universally adopted. In this paper, we propose using classical and quantum Fisher information (FI) as alternative metrics to detect and measure quantum synchronization. We establish the connection between FI and quantum synchronization, demonstrating that both classical and quantum FI can be deployed as more general indicators of quantum phase synchronization, in some regimes where all other existing measures fail to provide reliable results. We show advantages in FI-based measures, especially in 2-to-1 synchronization. Furthermore, we analyze the impact of noise on the synchronization measures, revealing the robustness and susceptibility of each method in the presence of dissipation and decoherence. Our results open up new avenues for understanding and exploiting quantum synchronization.

Submitted 12 June, 2023; originally announced June 2023.

Comments: 9 pages, 12 figures

Journal ref: Entropy 2023, 25(8), 1116

arXiv:2305.10120 [pdf, other]

Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models

Authors: Alvin Heng, Harold Soh

Abstract: The recent proliferation of large-scale text-to-image models has led to growing concerns that such models may be misused to generate harmful, misleading, and inappropriate content. Motivated by this issue, we derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models. Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten. Selective Amnesia can be applied to conditional variational likelihood models, which encompass a variety of popular deep generative frameworks, including variational autoencoders and large-scale text-to-image diffusion models. Experiments across different models demonstrate that our approach induces forgetting on a variety of concepts, from entire classes in standard datasets to celebrity and nudity prompts in text-to-image models. Our code is publicly available at https://github.com/clear-nus/selective-amnesia.

Submitted 16 October, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2303.03714 [pdf, other]

Generative Modeling with Flow-Guided Density Ratio Learning

Authors: Alvin Heng, Abdul Fatir Ansari, Harold Soh

Abstract: We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in the case of sample refinement, where the source and target distributions of the flow are close to each other. However, this assumption is invalid for generation and a naive application of the stale estimator fails due to the large chasm between the two distributions. FDRL proposes to train a density ratio estimator such that it learns from progressively improving samples during the training process. We show that this simple method alleviates the density chasm problem, allowing FDRL to generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks. We also show the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be directly applied to unpaired image-to-image translation with no modifications needed to the framework. Our code is publicly available at https://github.com/clear-nus/fdrl.

Submitted 4 June, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

arXiv:2303.03548 [pdf, other]

doi 10.1109/IROS55552.2023.10341488

Large Language Models as Zero-Shot Human Models for Human-Robot Interaction

Authors: Bowen Zhang, Harold Soh

Abstract: Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large language models (LLMs) -- which have consumed vast amounts of human-generated text data -- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.

Submitted 1 October, 2024; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: 8 pages

arXiv:2302.13465 [pdf, other]

doi 10.1103/PhysRevE.108.024204

Enhancing quantum synchronization through homodyne measurement, noise and squeezing

Authors: Yuan Shen, Hong Yi Soh, Weijun Fan, Leong-Chuan Kwek

Abstract: Quantum synchronization has been a central topic in quantum nonlinear dynamics. Despite rapid development in this field, very few have studied how to efficiently boost synchronization. Homodyne measurement emerges as one of the successful candidates for this task, but preferably in the semi-classical regime. In our work, we focus on the phase synchronization of a harmonic-driven quantum Stuart-Landau oscillator, and show that the enhancement induced by homodyne measurement persists into the quantum regime. Interestingly, optimal two-photon damping rates exist when the oscillator and driving are at resonance and with a small single-photon damping rate. We also report noise-induced enhancement in quantum synchronization when the single-photon damping rate is sufficiently large. Apart from these results, we discover that adding a squeezing Hamiltonian can further boost synchronization, especially in the semi-classical regime. Furthermore, the addition of squeezing causes the optimal two-photon pumping rates to shift and converge.

Submitted 18 July, 2023; v1 submitted 26 February, 2023; originally announced February 2023.

Comments: 6 pages, 8 figures

Journal ref: Phys. Rev. E 108, 024204, 2023

arXiv:2302.05128 [pdf, other]

Translating Natural Language to Planning Goals with Large-Language Models

Authors: Yaqi Xie, Chen Yu, Tongyao Zhu, Jinbin Bai, Ze Gong, Harold Soh

Abstract: Recent large language models (LLMs) have demonstrated remarkable performance on a variety of natural language processing (NLP) tasks, leading to intense excitement about their applicability across various domains. Unfortunately, recent work has also shown that LLMs are unable to perform accurate reasoning or solve planning problems, which may limit their usefulness for robotics-related tasks. In this work, our central question is whether LLMs are able to translate goals specified in natural language to a structured planning language. If so, LLMs can act as a natural interface between the planner and human users; the translated goal can be handed to domain-independent AI planners that are very effective at planning. Our empirical results on GPT 3.5 variants show that LLMs are much better suited towards translation rather than planning. We find that LLMs are able to leverage commonsense knowledge and reasoning to furnish missing details from under-specified goals (as is often the case in natural language). However, our experiments also reveal that LLMs can fail to generate goals in tasks that involve numerical or physical (e.g., spatial) reasoning, and that LLMs are sensitive to the prompts used. As such, these models are promising for translation to structured planning languages, but care should be taken in their use.

Submitted 10 February, 2023; originally announced February 2023.

arXiv:2301.11308 [pdf, other]

Neural Continuous-Discrete State Space Models for Irregularly-Sampled Time Series

Authors: Abdul Fatir Ansari, Alvin Heng, Andre Lim, Harold Soh

Abstract: Learning accurate predictive models of real-world dynamic phenomena (e.g., climate, biological) remains a challenging task. One key issue is that the data generated by both natural and artificial processes often comprise time series that are irregularly sampled and/or contain missing observations. In this work, we propose the Neural Continuous-Discrete State Space Model (NCDSSM) for continuous-time modeling of time series through discrete-time observations. NCDSSM employs auxiliary variables to disentangle recognition from dynamics, thus requiring amortized inference only for the auxiliary variables. Leveraging techniques from continuous-discrete filtering theory, we demonstrate how to perform accurate Bayesian inference for the dynamic states. We propose three flexible parameterizations of the latent dynamics and an efficient training objective that marginalizes the dynamic states during inference. Empirical results on multiple benchmark datasets across various domains show improved imputation and forecasting performance of NCDSSM over existing models.

Submitted 18 June, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

Comments: ICML 2023 Camera Ready Version; Code available at https://github.com/clear-nus/NCDSSM

arXiv:2301.04929 [pdf, other]

Heterogeneous Beliefs and Multi-Population Learning in Network Games

Authors: Shuyue Hu, Harold Soh, Georgios Piliouras

Abstract: The effect of population heterogeneity in multi-agent learning is practically relevant but remains far from being well-understood. Motivated by this, we introduce a model of multi-population learning that allows for heterogeneous beliefs within each population and where agents respond to their beliefs via smooth fictitious play (SFP). We show that the system state -- a probability distribution over beliefs -- evolves according to a system of partial differential equations akin to the continuity equations that commonly describe transport phenomena in physical systems. We establish the convergence of SFP to Quantal Response Equilibria in different classes of games capturing both network competition as well as network coordination. We also prove that the beliefs will eventually homogenize in all network games. Although the initial belief heterogeneity disappears in the limit, we show that it plays a crucial role for equilibrium selection in the case of coordination games as it helps select highly desirable equilibria. In contrast, in the case of network competition, the resulting limit behavior is independent of the initialization of beliefs, even when the underlying game has many distinct Nash equilibria.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2211.15880 [pdf, other]

Mirror descent of Hopfield model

Authors: Hyungjoon Soh, Dongyeob Kim, Juno Hwang, Junghyo Jo

Abstract: Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for utilizing mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate that by using the Hopfield model as a prototype for neural networks, mirror descent can effectively train the model with significantly improved performance compared to traditional gradient descent methods that rely on random parameter initialization. Our findings highlight the potential of mirror descent as a promising initialization technique for enhancing the optimization of machine learning models.

Submitted 9 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: 3 figures

arXiv:2211.05361 [pdf, other]

Safety-Constrained Policy Transfer with Successor Features

Authors: Zeyu Feng, Bowen Zhang, Jianxin Bi, Harold Soh

Abstract: In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account.

Submitted 10 November, 2022; originally announced November 2022.

arXiv:2210.06787 [pdf, other]

Observed Adversaries in Deep Reinforcement Learning

Authors: Eugene Lim, Harold Soh

Abstract: In this work, we point out the problem of observed adversaries for deep policies. Specifically, recent work has shown that deep reinforcement learning is susceptible to adversarial attacks where an observed adversary acts under environmental constraints to invoke natural but adversarial observations. This setting is particularly relevant for HRI since HRI-related robots are expected to perform their tasks around and with other agents. In this work, we demonstrate that this effect persists even with low-dimensional observations. We further show that these adversarial attacks transfer across victims, which potentially allows malicious attackers to train an adversary without access to the target victim.

Submitted 13 October, 2022; originally announced October 2022.

Report number: AIHRI/2022/7817

arXiv:2209.10860 [pdf, other]

doi 10.1145/3514094.3534190

SCALES: From Fairness Principles to Constrained Decision-Making

Authors: Sreejith Balakrishnan, Jianxin Bi, Harold Soh

Abstract: This paper proposes SCALES, a general framework that translates well-established fairness principles into a common representation based on the Constraint Markov Decision Process (CMDP). With the help of causal language, our framework can place constraints on both the procedure of decision making (procedural fairness) as well as the outcomes resulting from decisions (outcome fairness). Specifically, we show that well-known fairness principles can be encoded either as a utility component, a non-causal component, or a causal component in a SCALES-CMDP. We illustrate SCALES using a set of case studies involving a simulated healthcare scenario and the real-world COMPAS dataset. Experiments demonstrate that our framework produces fair policies that embody alternative fairness principles in single-step and sequential decision-making scenarios.

Submitted 22 September, 2022; originally announced September 2022.

Comments: Accepted to the 2022 AAAI/ACM Conference on AI, Ethics, and Society (AIES '22), Updated version with additional citations, 14 pages

arXiv:2203.02877 [pdf, other]

MIRROR: Differentiable Deep Social Projection for Assistive Human-Robot Communication

Authors: Kaiqi Chen, Jeffrey Fong, Harold Soh

Abstract: Communication is a hallmark of intelligence. In this work, we present MIRROR, an approach to (i) quickly learn human models from human demonstrations, and (ii) use the models for subsequent communication planning in assistive shared-control settings. MIRROR is inspired by social projection theory, which hypothesizes that humans use self-models to understand others. Likewise, MIRROR leverages self-models learned using reinforcement learning to bootstrap human modeling. Experiments with simulated humans show that this approach leads to rapid learning and more robust models compared to existing behavioral cloning and state-of-the-art imitation learning methods. We also present a human-subject study using the CARLA simulator which shows that (i) MIRROR is able to scale to complex domains with high-dimensional observations and complicated world physics and (ii) provides effective assistive communication that enabled participants to drive more safely in adverse weather conditions.

Submitted 6 March, 2022; originally announced March 2022.

Comments: 17 pages

arXiv:2203.01500 [pdf, other]

The Dynamics of Q-learning in Population Games: a Physics-Inspired Continuity Equation Model

Authors: Shuyue Hu, Chin-Wing Leung, Ho-fung Leung, Harold Soh

Abstract: Although learning has found wide application in multi-agent systems, its effects on the temporal evolution of a system are far from understood. This paper focuses on the dynamics of Q-learning in large-scale multi-agent systems modeled as population games. We revisit the replicator equation model for Q-learning dynamics and observe that this model is inappropriate for our concerned setting. Motivated by this, we develop a new formal model, which bears a formal connection with the continuity equation in physics. We show that our model always accurately describes the Q-learning dynamics in population games across different initial settings of MASs and game configurations. We also show that our model can be applied to different exploration mechanisms, describe the mean dynamics, and be extended to Q-learning in 2-player and n-player games. Last but not least, we show that our model can provide insights into algorithm parameters and facilitate parameter tuning.

Submitted 2 March, 2022; originally announced March 2022.

Comments: The 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022)

arXiv:2110.13878 [pdf, other]

Deep Explicit Duration Switching Models for Time Series

Authors: Abdul Fatir Ansari, Konstantinos Benidis, Richard Kurle, Ali Caner Turkmen, Harold Soh, Alexander J. Smola, Yuyang Wang, Tim Januschowski

Abstract: Many complex time series can be effectively subdivided into distinct regimes that exhibit persistent dynamics. Discovering the switching behavior and the statistical patterns in these regimes is important for understanding the underlying dynamical system. We propose the Recurrent Explicit Duration Switching Dynamical System (RED-SDS), a flexible model that is capable of identifying both state- and time-dependent switching dynamics. State-dependent switching is enabled by a recurrent state-to-switch connection and an explicit duration count variable is used to improve the time-dependent switching behavior. We demonstrate how to perform efficient inference using a hybrid algorithm that approximates the posterior of the continuous states via an inference network and performs exact inference for the discrete switches and counts. The model is trained by maximizing a Monte Carlo lower bound of the marginal log-likelihood that can be computed efficiently as a byproduct of the inference routine. Empirical results on multiple datasets demonstrate that RED-SDS achieves considerable improvement in time series segmentation and competitive forecasting performance against the state of the art.

Submitted 26 October, 2021; originally announced October 2021.

Comments: Accepted at NeurIPS 2021

arXiv:2107.02339 [pdf, other]

Multi-Modal Mutual Information (MuMMI) Training for Robust Self-Supervised Deep Reinforcement Learning

Authors: Kaiqi Chen, Yong Lee, Harold Soh

Abstract: This work focuses on learning useful and robust deep world models using multiple, possibly unreliable, sensors. We find that current methods do not sufficiently encourage a shared representation between modalities; this can cause poor performance on downstream tasks and over-reliance on specific sensors. As a solution, we contribute a new multi-modal deep latent state-space model, trained using a mutual information lower-bound. The key innovation is a specially-designed density ratio estimator that encourages consistency between the latent codes of each modality. We tasked our method to learn policies (in a self-supervised manner) on multi-modal Natural MuJoCo benchmarks and a challenging Table Wiping task. Experiments show our method significantly outperforms state-of-the-art deep reinforcement learning methods, particularly in the presence of missing observations.

Submitted 5 July, 2021; originally announced July 2021.

Comments: 10 pages, Published in ICRA 2021

arXiv:2106.00489 [pdf, other]

Extended Tactile Perception: Vibration Sensing through Tools and Grasped Objects

Authors: Tasbolat Taunyazov, Luar Shui Song, Eugene Lim, Hian Hian See, David Lee, Benjamin C. K. Tee, Harold Soh

Abstract: Humans display the remarkable ability to sense the world through tools and other held objects. For example, we are able to pinpoint impact locations on a held rod and tell apart different textures using a rigid probe. In this work, we consider how we can enable robots to have a similar capacity, i.e., to embody tools and extend perception using standard grasped objects. We propose that vibro-tactile sensing using dynamic tactile sensors on the robot fingers, along with machine learning models, enables robots to decipher contact information that is transmitted as vibrations along rigid objects. This paper reports on extensive experiments using the BioTac micro-vibration sensor and a new event dynamic sensor, the NUSkin, capable of multi-taxel sensing at 4 kHz. We demonstrate that fine localization on a held rod is possible using our approach (with errors less than 1 cm on a 20 cm rod). Next, we show that vibro-tactile perception can lead to reasonable grasp stability prediction during object handover, and accurate food identification using a standard fork. We find that multi-taxel vibro-tactile sensing at a sufficiently high sampling rate led to the best performance across the various tasks and objects. Taken together, our results provide both evidence and guidelines for using vibro-tactile perception to extend tactile perception, which we believe will lead to enhanced competency with tools and better physical human-robot interaction.

Submitted 29 September, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: 9 pages, 7 figures. This version adds additional related work and updated results

Journal ref: IROS 2021

arXiv:2101.11981 [pdf, other]

Embedding Symbolic Temporal Knowledge into Deep Sequential Models

Authors: Yaqi Xie, Fan Zhou, Harold Soh

Abstract: Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning. In recent years, deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources. However, when data is limited, simpler models such as logic/rule-based methods work surprisingly well, especially when relevant prior knowledge is applied in their construction. However, unlike DNNs, these "structured" models can be difficult to extend, and do not work well with raw unstructured data. In this work, we seek to learn flexible DNNs, yet leverage prior temporal knowledge when available. Our approach is to embed symbolic knowledge expressed as linear temporal logic (LTL) and use these embeddings to guide the training of deep models. Specifically, we construct semantic-based embeddings of automata generated from LTL formulae via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.

Submitted 28 January, 2021; originally announced January 2021.

arXiv:2012.00780 [pdf, other]

Refining Deep Generative Models via Discriminator Gradient Flow

Authors: Abdul Fatir Ansari, Ming Liang Ang, Harold Soh

Abstract: Deep generative modeling has seen impressive advances in recent years, to the point where it is now commonplace to see simulated samples (e.g., images) that closely resemble real-world data. However, generation quality is generally inconsistent for any given model and can vary dramatically between samples. We introduce Discriminator Gradient flow (DGflow), a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences between the real and the generated data distributions. The gradient flow takes the form of a non-linear Fokker-Planck equation, which can be easily simulated by sampling from the equivalent McKean-Vlasov process. By refining inferior samples, our technique avoids wasteful sample rejection used by previous methods (DRS & MH-GAN). Compared to existing works that focus on specific GAN variants, we show our refinement approach can be applied to GANs with vector-valued critics and even other deep generative models such as VAEs and Normalizing Flows. Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods.

Submitted 5 June, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

Comments: ICLR 2021 Camera Ready; Code available at https://github.com/clear-nus/DGflow; Updated Related Work

arXiv:2011.08541 [pdf, other]

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Authors: Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Abstract: The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed problem at its core; multiple reward functions coincide with the observed behavior and the actual reward function is not identifiable without prior knowledge or supplementary information. This paper presents an IRL framework called Bayesian optimization-IRL (BO-IRL) which identifies multiple solutions that are consistent with the expert demonstrations by efficiently exploring the reward function space. BO-IRL achieves this by utilizing Bayesian Optimization along with our newly proposed kernel that (a) projects the parameters of policy invariant reward functions to a single point in a latent space and (b) ensures nearby points in the latent space correspond to reward functions yielding similar likelihoods. This projection allows the use of standard stationary kernels in the latent space to capture the correlations present across the reward function space. Empirical results on synthetic and real-world environments (model-free and model-based) show that BO-IRL discovers multiple reward functions while minimizing the number of expensive exact policy optimizations.

Submitted 17 November, 2020; originally announced November 2020.

Comments: Accepted to 34th Conference on Neural Information Processing Systems (NeurIPS 2020). Includes Appendix. 21 pages

arXiv:2009.07083 [pdf, other]

Event-Driven Visual-Tactile Sensing and Learning for Robots

Authors: Tasbolat Taunyazov, Weicong Sng, Hian Hian See, Brian Lim, Jethro Kuan, Abdul Fatir Ansari, Benjamin C. K. Tee, Harold Soh

Abstract: This work contributes an event-driven visual-tactile perception system, comprising a novel biologically-inspired tactile sensor and multi-modal spike-based learning. Our neuromorphic fingertip tactile sensor, NeuTouch, scales well with the number of taxels thanks to its event-based nature. Likewise, our Visual-Tactile Spiking Neural Network (VT-SNN) enables fast perception when coupled with event sensors. We evaluate our visual-tactile system (using the NeuTouch and Prophesee event camera) on two robot tasks: container classification and rotational slip detection. On both tasks, we observe good accuracies relative to standard deep learning methods. We have made our visual-tactile datasets freely available to encourage research on multi-modal event-driven robot perception, which we believe is a promising approach towards intelligent power-efficient robot systems.

Submitted 15 September, 2020; originally announced September 2020.

Comments: RSS 2020, Code and Datasets are available at https://clear-nus.github.io/visuotactile/index.html

arXiv:2008.08046 [pdf, other]

TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition

Authors: Fuqiang Gu, Weicong Sng, Tasbolat Taunyazov, Harold Soh

Abstract: Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural networks makes them arguably more suitable for processing event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90% when classifying a variety of different household objects.

Submitted 31 July, 2020; originally announced August 2020.

Comments: IROS 2020

ACM Class: I.2.9

arXiv:2008.00699 [pdf, other]

Getting to Know One Another: Calibrating Intent, Capabilities and Trust for Human-Robot Collaboration

Authors: Joshua Lee, Jeffrey Fong, Bing Cai Kok, Harold Soh

Abstract: Common experience suggests that agents who know each other well are better able to work together. In this work, we address the problem of calibrating intention and capabilities in human-robot collaboration. In particular, we focus on scenarios where the robot is attempting to assist a human who is unable to directly communicate her intent. Moreover, both agents may have differing capabilities that are unknown to one another. We adopt a decision-theoretic approach and propose the TICC-POMDP for modeling this setting, with an associated online solver. Experiments show our approach leads to better team performance both in simulation and in a real-world study with human subjects.

Submitted 3 August, 2020; originally announced August 2020.

Comments: IROS 2020

arXiv:2007.13201 [pdf]

Going Beyond the Debye Length: Overcoming Charge Screening Limitations in Next-Generation Bioelectronic Sensors

Authors: Vladimir Kesler, Boris Murmann, H. Tom Soh

Abstract: Electronic biosensors are a natural fit for field-deployable diagnostic devices, because they can be miniaturized, mass produced, and integrated with circuitry. Unfortunately, progress in the development of such platforms has been hindered by the fact that mobile ions present in biological samples screen charges from the target molecule, greatly reducing sensor sensitivity. Under physiological conditions, the thickness of the resulting electric double layer is less than 1 nm, and it has generally been assumed that electronic detection beyond this distance is virtually impossible. However, a few recently-described sensor design strategies seem to defy this conventional wisdom, exploiting the physics of electrical double layers in ways that traditional models do not capture. In the first strategy, charge screening is decreased by constraining the space in which double layers can form. The second strategy uses external stimuli to prevent double layers from reaching equilibrium, thereby effectively reducing charge screening. The goal of this article is to describe these relatively new concepts, and to offer theoretical insights into mechanisms that may enable electronic biosensing beyond the double-layer. If these concepts can be further developed and translated into practical electronic biosensors, we foresee exciting opportunities for the next generation of diagnostic technologies.

Submitted 26 July, 2020; originally announced July 2020.
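
Note on the "less than 1 nm" figure in the abstract above: it can be sanity-checked with the textbook Debye length for a symmetric monovalent electrolyte, lambda_D = sqrt(eps_r * eps_0 * k_B * T / (2 * N_A * e^2 * I)). The sketch below is only an illustration and is not taken from the paper. It assumes an ionic strength of about 0.15 M for "physiological conditions", water at 25 °C, and a relative permittivity of 78.4; the function name debye_length_nm is hypothetical.

```python
import math

# Physical constants in SI units.
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_B = 1.380649e-23             # Boltzmann constant, J/K
N_A = 6.02214076e23            # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19     # elementary charge, C


def debye_length_nm(ionic_strength_molar, temperature_k=298.15,
                    rel_permittivity=78.4):
    """Debye length in nanometres for a symmetric monovalent electrolyte.

    The default temperature and permittivity (water at 25 C) are
    assumptions made for this illustration, not values from the paper.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = rel_permittivity * EPSILON_0 * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e9    # m -> nm


# Assumed physiological ionic strength (~0.15 M) and a 100-fold dilution.
physiological = debye_length_nm(0.15)
diluted = debye_length_nm(0.0015)
print(f"0.15 M:   {physiological:.2f} nm")
print(f"0.0015 M: {diluted:.2f} nm")
```

Under these assumptions the screening length comes out below 1 nm, and because it scales as 1/sqrt(I), a 100-fold dilution lengthens it tenfold.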

arXiv:2006.16068 [pdf, other]

The Evolutionary Dynamics of Independent Learning Agents in Population Games

Authors: Shuyue Hu, Chin-Wing Leung, Ho-fung Leung, Harold Soh

Abstract: Understanding the evolutionary dynamics of reinforcement learning under multi-agent settings has long remained an open problem. While previous works primarily focus on 2-player games, we consider population games, which model the strategic interactions of a large population comprising small and anonymous agents. This paper presents a formal relation between stochastic processes and the dynamics of independent learning agents who reason based on the reward signals. Using a master equation approach, we provide a novel unified framework for characterising population dynamics via a single partial differential equation (Theorem 1). Through a case study involving Cross learning agents, we illustrate that Theorem 1 allows us to identify qualitatively different evolutionary dynamics, to analyse steady states, and to gain insights into the expected behaviour of a population. In addition, we present extensive experimental results validating that Theorem 1 holds for a variety of learning methods and population games.

Submitted 29 June, 2020; originally announced June 2020.

arXiv:2005.04319 [pdf, other]

ST-MNIST -- The Spiking Tactile MNIST Neuromorphic Dataset

Authors: Hian Hian See, Brian Lim, Si Li, Haicheng Yao, Wen Cheng, Harold Soh, Benjamin C. K. Tee

Abstract: Tactile sensing is an essential modality for smart robots as it enables them to interact flexibly with physical objects in their environment. Recent advancements in electronic skins have led to the development of data-driven machine learning methods that exploit this important sensory modality. However, current datasets used to train such algorithms are limited to standard synchronous tactile sensors. There is a dearth of neuromorphic event-based tactile datasets, principally due to the scarcity of large-scale event-based tactile sensors. Having such datasets is crucial for the development and evaluation of new algorithms that process spatio-temporal event-based data. For example, evaluating spiking neural networks on conventional frame-based datasets is considered sub-optimal. Here, we debut a novel neuromorphic Spiking Tactile MNIST (ST-MNIST) dataset, which comprises handwritten digits obtained by human participants writing on a neuromorphic tactile sensor array. We also describe an initial effort to evaluate our ST-MNIST dataset using existing artificial and spiking neural network models. The classification accuracies provided herein can serve as performance benchmarks for future work. We anticipate that our ST-MNIST dataset will be of interest and useful to the neuromorphic and robotics research communities.

Submitted 8 May, 2020; originally announced May 2020.

Comments: Corresponding authors: Benjamin C.K. Tee and Harold Soh; dataset available at http://www.benjamintee.com/stmnist; 10 pages, 4 figures, and 2 tables

arXiv:2005.03479 [pdf]

Design and Analysis of a Sample-and-Hold CMOS Electrochemical Sensor for Aptamer-based Therapeutic Drug Monitoring

Authors: Jun-Chau Chien, Sam W. Baker, H. Tom Soh, Amin Arbabian

Abstract: In this paper, we present the design and the analysis of an electrochemical circuit for measuring the concentrations of therapeutic drugs using structure-switching aptamers. Aptamers are single-stranded nucleic acids whose sequences are selected to exhibit high affinity and specificity toward a molecular target and that change conformation upon binding. This property, when coupled with a redox reporter and electrochemical detection, enables reagent-free biosensing with a sub-minute temporal resolution for in vivo therapeutic drug monitoring. Specifically, we design a chronoamperometry-based electrochemical circuit that measures the direct changes in the electron transfer (ET) kinetics of a methylene blue reporter conjugated at the distal end of the aptamer. To overcome the high-frequency noise amplification issue when interfacing with a large-size (> 0.25 mm²) implantable electrode, we present a sample-and-hold (S/H) circuit technique in which the desired electrode potentials are held onto noiseless capacitors during the recording of the redox currents. This allows disconnecting the feedback amplifiers to avoid their noise injection while reducing the total power consumption. A prototype circuit implemented in 65-nm CMOS demonstrates a cell-capacitance-insensitive input-referred noise (IRN) current of 15.2 pA rms at a 2.5-kHz filtering bandwidth. Tested in human whole blood samples, changes in the ET kinetics from the redox-labeled aminoglycoside aptamers at different kanamycin concentrations are measured from the recorded current waveforms. By employing principal component analysis (PCA) to compensate for the sampling errors, a detection limit (SNR = 1) of 3.1 µM under 1-sec acquisition is achieved at 0.22-mW power consumption.

Submitted 7 May, 2020; originally announced May 2020.

Comments: Submitted to IEEE Journal of Solid-State Circuits (JSSC). This is the initial submission version

arXiv:2004.03289 [pdf, other]

KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding

Authors: Jiyeon Ham, Yo Joong Choe, Kyubyong Park, Ilji Choi, Hyungjoon Soh

Abstract: Natural language inference (NLI) and semantic textual similarity (STS) are key tasks in natural language understanding (NLU). Although several benchmark datasets for those tasks have been released in English and a few other languages, there are no publicly available NLI or STS datasets in the Korean language. Motivated by this, we construct and release new datasets for Korean NLI and STS, dubbed KorNLI and KorSTS, respectively. Following previous approaches, we machine-translate existing English training sets and manually translate development and test sets into Korean. To accelerate research on Korean NLU, we also establish baselines on KorNLI and KorSTS. Our datasets are publicly available at https://github.com/kakaobrain/KorNLUDatasets.

Submitted 5 October, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: Findings of EMNLP 2020. Datasets available at https://github.com/kakaobrain/KorNLUDatasets

arXiv:1910.09546 [pdf]

Overlimiting current in non-uniform arrays of microchannels

Authors: Hyekyung Lee, Shima Alizadeh, Tae Jin Kim, Seung-min Park, Hyongsok Tom Soh, Ali Mani, Sung Jae Kim

Abstract: Overlimiting current (OLC) through electrolytes interfaced with perm-selective membranes has been extensively researched in recent years for understanding the fundamental mechanisms of transport and developing efficient applications from electrochemistry to sample analysis and separation. Predominant mechanisms responsible for OLC include surface conduction, convection by electro-osmotic flow, and electro-osmotic instability depending on input parameters such as surface charge and geometric constrictions. This work studies how a network of microchannels in a non-uniform array, which mimics a natural pore configuration, can contribute to OLC. To this end, micro/nanofluidic devices are fabricated with arrays of parallel microchannels with non-uniform size distributions. All cases maintain the same surface and bulk conduction to allow probing the sensitivity only by the non-uniformity of the channels. Both experimental and theoretical current-voltage relations demonstrate that OLCs increase with increasing non-uniformity. Furthermore, the visualization of internal recirculating flows indicates that the non-uniform arrays induce flow loops across the network enhancing advective transport. This evidence confirms a new driving mechanism of OLC, inspired by natural micro/nanoporous materials with random geometric structure. Therefore, this result can advance not only the fundamental understanding of nanoelectrokinetics but also the design rules for engineering applications of electrochemical membranes.

Submitted 21 October, 2019; originally announced October 2019.

arXiv:1909.07425 [pdf, other]

A Characteristic Function Approach to Deep Implicit Generative Modeling

Authors: Abdul Fatir Ansari, Jonathan Scarlett, Harold Soh

Abstract: Implicit Generative Models (IGMs) such as GANs have emerged as effective data-driven models for generating samples, particularly images. In this paper, we formulate the problem of learning an IGM as minimizing the expected distance between characteristic functions. Specifically, we minimize the distance between characteristic functions of the real and generated data distributions under a suitably-chosen weighting distribution. This distance metric, which we term the characteristic function distance (CFD), can be (approximately) computed with linear time-complexity in the number of samples, in contrast with the quadratic-time Maximum Mean Discrepancy (MMD). By replacing the discrepancy measure in the critic of a GAN with the CFD, we obtain a model that is simple to implement and stable to train. The proposed metric enjoys desirable theoretical properties including continuity and differentiability with respect to generator parameters, and continuity in the weak topology. We further propose a variation of the CFD in which the weighting distribution parameters are also optimized during training; this obviates the need for manual tuning, and leads to an improvement in test power relative to CFD. We demonstrate experimentally that our proposed method outperforms WGAN and MMD-GAN variants on a variety of unsupervised image generation benchmarks.

Submitted 16 June, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

Comments: CVPR 2020 (Oral), Code available at https://github.com/clear-nus/OCFGAN
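
Note on the linear-time claim in the abstract above: an expected squared distance between characteristic functions can be estimated by Monte Carlo, drawing a handful of frequency vectors from the weighting distribution and comparing the two empirical characteristic functions at each. The sketch below is a minimal standard-library illustration, not the authors' implementation (see the linked repository for that). The Gaussian weighting distribution, the function names, and the defaults for num_freqs, scale, and seed are all assumptions.

```python
import cmath
import random


def empirical_cf(samples, freq):
    """Empirical characteristic function of `samples` at one frequency vector."""
    total = 0j
    for x in samples:
        phase = sum(t * xi for t, xi in zip(freq, x))
        total += cmath.exp(1j * phase)
    return total / len(samples)


def cfd_squared(xs, ys, num_freqs=32, scale=1.0, seed=0):
    """Monte Carlo estimate of E_t |phi_X(t) - phi_Y(t)|^2.

    Frequencies t are drawn from an isotropic Gaussian with standard
    deviation `scale` (an assumed weighting distribution). Each frequency
    needs one pass over each sample set, so the cost grows linearly with
    the number of samples.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    acc = 0.0
    for _ in range(num_freqs):
        freq = [rng.gauss(0.0, scale) for _ in range(dim)]
        acc += abs(empirical_cf(xs, freq) - empirical_cf(ys, freq)) ** 2
    return acc / num_freqs


# Toy data: two draws from the same 2-D Gaussian and one shifted draw.
data_rng = random.Random(1)
same_a = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
same_b = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
shifted = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(500)]

print(cfd_squared(same_a, same_b))   # small: same underlying distribution
print(cfd_squared(same_a, shifted))  # larger: the means differ
```

Unlike a kernel MMD estimate, which compares all pairs of samples, this estimator never forms pairwise terms; the work is proportional to the number of frequencies times the number of samples.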

arXiv:1909.05329 [pdf, other]

Robot Capability and Intention in Trust-based Decisions across Tasks

Authors: Yaqi Xie, Indu P Bodala, Desmond C. Ong, David Hsu, Harold Soh

Abstract: In this paper, we present results from a human-subject study designed to explore two facets of human mental models of robots (inferred capability and intention) and their relationship to overall trust and eventual decisions. In particular, we examine delegation situations characterized by uncertainty, and explore how inferred capability and intention are applied across different tasks. We develop an online survey where human participants decide whether to delegate control to a simulated UAV agent. Our study shows that human estimations of robot capability and intent correlate strongly with overall self-reported trust. However, overall trust is not independently sufficient to determine whether a human will decide to trust (delegate) a given task to a robot. Instead, our study reveals that estimations of robot intention, capability, and overall trust are integrated when deciding to delegate. From a broader perspective, these results suggest that calibrating overall trust alone is insufficient; to make correct decisions, humans need (and use) multi-faceted mental models when collaborating with robots across multiple contexts.

Submitted 3 September, 2019; originally announced September 2019.

Journal ref: ACM/IEEE Conference on Human Robot Interaction (HRI), 2019

arXiv:1909.01161 [pdf, other]

Embedding Symbolic Knowledge into Deep Networks

Authors: Yaqi Xie, Ziwei Xu, Mohan S. Kankanhalli, Kuldeep S. Meel, Harold Soh

Abstract: In this work, we aim to leverage prior symbolic knowledge to improve the performance of deep models. We propose a graph embedding network that projects propositional formulae (and assignments) onto a manifold via an augmented Graph Convolutional Network (GCN). To generate semantically-faithful embeddings, we develop techniques to recognize node heterogeneity, and semantic regularization that incorporates structural constraints into the embedding. Experiments show that our approach improves the performance of models trained to perform entailment checking and visual relation prediction. Interestingly, we observe a connection between the tractability of the propositional theory representation and the ease of embedding. Future exploration of this connection may elucidate the relationship between knowledge compilation and vector representation learning.

Submitted 29 October, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

Comments: *Equal contribution; Accepted at conference Neural Information Processing Systems (NeurIPS), 2019

Showing 1–50 of 62 results for author: Soh, H