
Enhancing Federated Domain Adaptation with Multi-Domain Prototype-Based Federated Fine-Tuning

Authors: Jingyuan Zhang, Yiyang Duan, Shuaicheng Niu, Yang Cao, Wei Yang Bryan Lim

Abstract: Federated Domain Adaptation (FDA) is a Federated Learning (FL) scenario where models are trained across multiple clients with unique data domains but a shared category space, without transmitting private data. The primary challenge in FDA is data heterogeneity, which causes significant divergences in gradient updates when using conventional averaging-based aggregation methods, reducing the efficacy of the global model. This further undermines both in-domain and out-of-domain performance (within the same federated system but outside the local client). To address this, we propose a novel framework called Multi-domain Prototype-based Federated Fine-Tuning (MPFT). MPFT fine-tunes a pre-trained model using multi-domain prototypes, i.e., pretrained representations enriched with domain-specific information from category-specific local data. This enables supervised learning on the server to derive a globally optimized adapter that is subsequently distributed to local clients, without the intrusion of data privacy. Empirical results show that MPFT significantly improves both in-domain and out-of-domain accuracy over conventional methods, enhancing knowledge preservation and adaptation in FDA. Notably, MPFT achieves convergence within a single communication round, greatly reducing computation and communication costs. To ensure privacy, MPFT applies differential privacy to protect the prototypes. Additionally, we develop a prototype-based feature space hijacking attack to evaluate robustness, confirming that raw data samples remain unrecoverable even after extensive training epochs. The complete implementation of MPFT is available at https://anonymous.4open.science/r/DomainFL/.

Submitted 10 October, 2024; originally announced October 2024.
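
The abstract above describes prototypes built from category-specific local data. A minimal sketch of one common reading, in which a prototype is the mean embedding of each class on a client, is shown below. The function name `class_prototypes`, the plain-list embeddings, and the use of a simple mean are assumptions for illustration; the paper's actual prototype construction and its differential-privacy step are not specified in the abstract and are not reproduced here.

```python
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Return one prototype per class: the element-wise mean of that class's embeddings.

    `embeddings` is a list of equal-length lists of floats (assumed to come from a
    frozen pre-trained encoder); `labels` holds the class label of each embedding.
    Both names are hypothetical and chosen only for this sketch.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        # Accumulate the running element-wise sum for this class.
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    # Divide each class sum by its sample count to obtain the mean vector.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


if __name__ == "__main__":
    protos = class_prototypes(
        [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
        ["cat", "cat", "dog"],
    )
    print(protos)
```

Under this reading, a client would send only the per-class vectors to the server rather than raw samples, which is what allows server-side supervised training on prototypes.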

arXiv:2409.13315 [pdf, other]

Exploring the Performance-Reproducibility Trade-off in Quality-Diversity

Authors: Manon Flageat, Hannah Janmohamed, Bryan Lim, Antoine Cully

Abstract: Quality-Diversity (QD) algorithms have exhibited promising results across many domains and applications. However, uncertainty in fitness and behaviour estimations of solutions remains a major challenge when QD is used in complex real-world applications. While several approaches have been proposed to improve the performance in uncertain applications, many fail to address a key challenge: determining how to prioritise solutions that perform consistently under uncertainty, in other words, solutions that are reproducible. Most prior methods improve fitness and reproducibility jointly, ignoring the possibility that they could be contradictory objectives. For example, in robotics, solutions may reliably walk at 90% of the maximum velocity in uncertain environments, while solutions that walk faster are also more prone to falling over. As this is a trade-off, neither one of these two solutions is "better" than the other. Thus, algorithms cannot intrinsically select one solution over the other, but can only enforce given preferences over these two contradictory objectives. In this paper, we formalise this problem as the performance-reproducibility trade-off for uncertain QD. We propose four new a-priori QD algorithms that find optimal solutions for given preferences over the trade-offs. We also propose an a-posteriori QD algorithm for when these preferences cannot be defined in advance. Our results show that our approaches successfully find solutions that satisfy given preferences. Importantly, by simply accounting for this trade-off, our approaches perform better than existing uncertain QD methods. This suggests that considering the performance-reproducibility trade-off unlocks important stepping stones that are usually missed when only performance is optimised.

Submitted 20 September, 2024; originally announced September 2024.

arXiv:2409.10587 [pdf, other]

SoccerNet 2024 Challenges Results

Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Victor Joos, Floriane Magera, Jan Held, Seyed Abolfazl Ghasemzadeh, Xin Zhou, Karolina Seweryn, Mateusz Kowalczyk, Zuzanna Mróz, Szymon Łukasik, Michał Hałoń, Hassan Mkhallati, Adrien Deliège, Carlos Hinojosa, Karen Sanchez, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Adam Gorski, et al. (59 additional authors not shown)

Abstract: The SoccerNet 2024 challenges represent the fourth annual video understanding challenges organized by the SoccerNet team. These challenges aim to advance research across multiple themes in football, including broadcast video understanding, field understanding, and player understanding. This year, the challenges encompass four vision-based tasks. (1) Ball Action Spotting, focusing on precisely localizing when and which soccer actions related to the ball occur, (2) Dense Video Captioning, focusing on describing the broadcast with natural language and anchored timestamps, (3) Multi-View Foul Recognition, a novel task focusing on analyzing multiple viewpoints of a potential foul incident to classify whether a foul occurred and assess its severity, (4) Game State Reconstruction, another novel task focusing on reconstructing the game state from broadcast videos onto a 2D top-view map of the field. Detailed information about the tasks, challenges, and leaderboards can be found at https://www.soccer-net.org, with baselines and development kits available at https://github.com/SoccerNet.

Submitted 16 September, 2024; originally announced September 2024.

Comments: 7 pages, 1 figure

arXiv:2409.07160 [pdf]

Distance Measurement for UAVs in Deep Hazardous Tunnels

Authors: Vishal Choudhary, Shashi Kant Gupta, Shaohui Foong, Hock Beng Lim

Abstract: The localization of Unmanned aerial vehicles (UAVs) in deep tunnels is extremely challenging due to their inaccessibility and hazardous environment. Conventional outdoor localization techniques (such as using GPS) and indoor localization techniques (such as those based on WiFi, Infrared (IR), Ultra-Wideband, etc.) do not work in deep tunnels. We are developing a UAV-based system for the inspection of defects in the Deep Tunnel Sewerage System (DTSS) in Singapore. To enable the UAV localization in the DTSS, we have developed a distance measurement module based on the optical flow technique. However, the standard optical flow technique does not work well in tunnels with poor lighting and a lack of features. Thus, we have developed an enhanced optical flow algorithm with prediction, to improve the distance measurement for UAVs in deep hazardous tunnels.

Submitted 11 September, 2024; originally announced September 2024.

arXiv:2408.03560 [pdf, other]

In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language Models

Authors: Ayrton San Joaquin, Bin Wang, Zhengyuan Liu, Nicholas Asher, Brian Lim, Philippe Muller, Nancy F. Chen

Abstract: Despite advancements, fine-tuning Large Language Models (LLMs) remains costly due to the extensive parameter count and substantial data requirements for model generalization. Accessibility to computing resources remains a barrier for the open-source community. To address this challenge, we propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and evaluation samples with a trained model. Notably, we assess the model's internal gradients to estimate this relationship, aiming to rank the contribution of each training point. To enhance efficiency, we propose an optimization to compute influence functions with a reduced number of layers while achieving similar accuracy. By applying our algorithm to instruction fine-tuning data of LLMs, we can achieve similar performance with just 50% of the training data. Meanwhile, using influence functions to analyze model coverage to certain testing samples could provide a reliable and interpretable signal on the training set's coverage of those test points.

Submitted 2 October, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

Comments: EMNLP 2024 - Findings

arXiv:2406.00116 [pdf, other]

A Sim2Real Approach for Identifying Task-Relevant Properties in Interpretable Machine Learning

Authors: Eura Nofshin, Esther Brown, Brian Lim, Weiwei Pan, Finale Doshi-Velez

Abstract: Explanations of an AI's function can assist human decision-makers, but the most useful explanation depends on the decision's context, referred to as the downstream task. User studies are necessary to determine the best explanations for each task. Unfortunately, testing every explanation and task combination is impractical, especially considering the many factors influencing human+AI collaboration beyond the explanation's content. This work leverages two insights to streamline finding the most effective explanation. First, explanations can be characterized by properties, such as faithfulness or complexity, which indicate if they contain the right information for the task. Second, we introduce XAIsim2real, a pipeline for running synthetic user studies. In our validation study, XAIsim2real accurately predicts user preferences across three tasks, making it a valuable tool for refining explanation choices before full studies. Additionally, it uncovers nuanced relationships, like how cognitive budget limits a user's engagement with complex explanations, a trend confirmed with real users.

Submitted 18 September, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.18802 [pdf, other]

Enhancing Security and Privacy in Federated Learning using Update Digests and Voting-Based Defense

Authors: Wenjie Li, Kai Fan, Jingyuan Zhang, Hui Li, Wei Yang Bryan Lim, Qiang Yang

Abstract: Federated Learning (FL) is a promising privacy-preserving machine learning paradigm that allows data owners to collaboratively train models while keeping their data localized. Despite its potential, FL faces challenges related to the trustworthiness of both clients and servers, especially in the presence of curious or malicious adversaries. In this paper, we introduce a novel framework named Federated Learning with Update Digest (FLUD), which addresses the critical issues of privacy preservation and resistance to Byzantine attacks within distributed learning environments. FLUD utilizes an innovative approach, the LinfSample method, allowing clients to compute the $l_{\infty}$ norm across sliding windows of updates as an update digest. This digest enables the server to calculate a shared distance matrix, significantly reducing the overhead associated with Secure Multi-Party Computation (SMPC) by three orders of magnitude while effectively distinguishing between benign and malicious updates. Additionally, FLUD integrates a privacy-preserving, voting-based defense mechanism that employs optimized SMPC protocols to minimize communication rounds. Our comprehensive experiments demonstrate FLUD's effectiveness in countering Byzantine adversaries while incurring low communication and runtime overhead. FLUD offers a scalable framework for secure and reliable FL in distributed environments, facilitating its application in scenarios requiring robust data management and security.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 14 pages
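
The FLUD abstract above mentions computing the $l_{\infty}$ norm across sliding windows of an update to form a digest. A minimal sketch of that idea follows. The function names, the `window` and `stride` parameters, the choice to drop a trailing partial window, and the Euclidean distance between digests are all assumptions made for illustration; the abstract does not give the details of LinfSample or of the distance matrix, and the SMPC layer is omitted entirely.

```python
def linf_digest(update, window, stride=None):
    """Compress a flat update vector into per-window l-infinity norms.

    Each digest entry is the largest absolute value inside one window.
    `stride` defaults to `window` (non-overlapping windows); this default is an
    assumption of the sketch, not something stated in the abstract.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    stride = window if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be positive")
    digest = []
    # Only full windows are used; a trailing partial window is dropped.
    for start in range(0, len(update) - window + 1, stride):
        digest.append(max(abs(x) for x in update[start:start + window]))
    return digest


def digest_distance(a, b):
    """Euclidean distance between two equal-length digests (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


if __name__ == "__main__":
    benign = linf_digest([1, -3, 2, 0.5, -0.2, 4], window=2)
    scaled = linf_digest([10, -30, 20, 5, -2, 40], window=2)
    print(benign, scaled, digest_distance(benign, scaled))
```

The digest is shorter than the update by roughly a factor of `stride`, which is the intuition behind comparing digests instead of full updates when pairwise distances are expensive to compute.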

arXiv:2404.15794 [pdf, other]

Large Language Models as In-context AI Generators for Quality-Diversity

Authors: Bryan Lim, Manon Flageat, Antoine Cully

Abstract: Quality-Diversity (QD) approaches are a promising direction to develop open-ended processes as they can discover archives of high-quality solutions across diverse niches. While already successful in many applications, QD approaches usually rely on combining only one or two solutions to generate new candidate solutions. As observed in open-ended processes such as technological evolution, wisely combining a large diversity of these solutions could lead to more innovative solutions and potentially boost the productivity of QD search. In this work, we propose to exploit the pattern-matching capabilities of generative models to enable such efficient solution combinations. We introduce In-context QD, a framework of techniques that aim to elicit the in-context capabilities of pre-trained Large Language Models (LLMs) to generate interesting solutions using few-shot and many-shot prompting with quality-diverse examples from the QD archive as context. Applied to a series of common QD domains, In-context QD displays promising results compared to both QD baselines and similar strategies developed for single-objective optimization. Additionally, this result holds across multiple values of parameter sizes and archive population sizes, as well as across domains with distinct characteristics from BBO functions to policy search. Finally, we perform an extensive ablation that highlights the key prompt design considerations that encourage the generation of promising solutions for QD.

Submitted 5 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.11922 [pdf, other]

Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration

Authors: Hans Jarett J. Ong, Brian Godwin S. Lim

Abstract: Effective causal discovery is essential for learning the causal graph from observational data. The linear non-Gaussian acyclic model (LiNGAM) operates under the assumption of a linear data generating process with non-Gaussian noise in determining the causal graph. Its assumption of unmeasured confounders being absent, however, poses practical limitations. In response, empirical research has shown that the reformulation of LiNGAM as a shortest path problem (LiNGAM-SPP) addresses this limitation. Within LiNGAM-SPP, mutual information is chosen to serve as the measure of independence. A challenge is introduced: parameter tuning is now needed due to its reliance on kNN mutual information estimators. The paper proposes a threefold enhancement to the LiNGAM-SPP framework. First, the need for parameter tuning is eliminated by using the pairwise likelihood ratio in lieu of kNN-based mutual information. This substitution is validated on a general data generating process and benchmark real-world data sets, outperforming existing methods especially when given a larger set of features. The incorporation of prior knowledge is then enabled by a node-skipping strategy implemented on the graph representation of all causal orderings to eliminate violations based on the provided input of relative orderings. Flexibility relative to existing approaches is achieved. Last among the three enhancements is the utilization of the distribution of paths in the graph representation of all causal orderings. From this, crucial properties of the true causal graph such as the presence of unmeasured confounders and sparsity may be inferred. To some extent, the expected performance of the causal discovery algorithm may be predicted. The refinements above advance the practicality and performance of LiNGAM-SPP, showcasing the potential of graph-search-based methodologies in advancing causal discovery.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.06733 [pdf, other]

Incremental XAI: Memorable Understanding of AI with Incremental Explanations

Authors: Jessica Y. Bo, Pan Hao, Brian Y. Lim

Abstract: Many explainable AI (XAI) techniques strive for interpretability by providing concise salient information, such as sparse linear factors. However, users either only see inaccurate global explanations, or highly-varying local explanations. We propose to provide more detailed explanations by leveraging the human cognitive capacity to accumulate knowledge by incrementally receiving more details. Focusing on linear factor explanations (factors $\times$ values = outcome), we introduce Incremental XAI to automatically partition explanations for general and atypical instances by providing Base + Incremental factors to help users read and remember more faithful explanations. Memorability is improved by reusing base factors and reducing the number of factors shown in atypical cases. In modeling, formative, and summative user studies, we evaluated the faithfulness, memorability and understandability of Incremental XAI against baseline explanation methods. This work contributes towards more usable explanation that users can better ingrain to facilitate intuitive engagement with AI.

Submitted 10 April, 2024; originally announced April 2024.

Comments: CHI 2024

arXiv:2403.12529 [pdf, other]

Contextualized Messages Boost Graph Representations

Authors: Brian Godwin Lim, Galvin Brice Lim, Renzo Roel Tan, Kazushi Ikeda

Abstract: Graph neural networks (GNNs) have gained significant attention in recent years for their ability to process data that may be represented as graphs. This has prompted several studies to explore their representational capability based on the graph isomorphism task. These works inherently assume a countable node feature representation, potentially limiting their applicability. Interestingly, only a few study GNNs with uncountable node feature representation. In the paper, a novel perspective on the representational capability of GNNs is investigated across all levels (node-level, neighborhood-level, and graph-level) when the space of node feature representation is uncountable. More specifically, the strict injective and metric requirements are softly relaxed by employing a pseudometric distance on the space of input to create a soft-injective function such that distinct inputs may produce similar outputs if and only if the pseudometric deems the inputs to be sufficiently similar on some representation. As a consequence, a simple and computationally efficient soft-isomorphic relational graph convolution network (SIR-GCN) that emphasizes the contextualized transformation of neighborhood feature representations via anisotropic and dynamic message functions is proposed. A mathematical discussion on the relationship between SIR-GCN and widely used GNNs is then laid out to put the contribution into context, establishing SIR-GCN as a generalization of classical GNN methodologies. Experiments on synthetic and benchmark datasets then demonstrate the relative superiority of SIR-GCN, outperforming comparable models in node and graph property prediction tasks.

Submitted 30 September, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2401.14565 [pdf, other]

TIFu: Tri-directional Implicit Function for High-Fidelity 3D Character Reconstruction

Authors: Byoungsung Lim, Seong-Whan Lee

Abstract: Recent advances in implicit function-based approaches have shown promising results in 3D human reconstruction from a single RGB image. However, these methods are not sufficient to extend to more general cases, often generating dragged or disconnected body parts, particularly for animated characters. We argue that these limitations stem from the use of the existing point-level 3D shape representation, which lacks holistic 3D context understanding. Voxel-based reconstruction methods are more suitable for capturing the entire 3D space at once, however, these methods are not practical for high-resolution reconstructions due to their excessive memory usage. To address these challenges, we introduce Tri-directional Implicit Function (TIFu), which is a vector-level representation that increases global 3D consistencies while significantly reducing memory usage compared to voxel representations. We also introduce a new algorithm in 3D reconstruction at an arbitrary resolution by aggregating vectors along three orthogonal axes, resolving inherent problems with regressing fixed dimension of vectors. Our approach achieves state-of-the-art performances in both our self-curated character dataset and the benchmark 3D human dataset. We provide both quantitative and qualitative analyses to support our findings.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2312.07178 [pdf, other]

Beyond Expected Return: Accounting for Policy Reproducibility when Evaluating Reinforcement Learning Algorithms

Authors: Manon Flageat, Bryan Lim, Antoine Cully

Abstract: Many applications in Reinforcement Learning (RL) usually have noise or stochasticity present in the environment. Beyond their impact on learning, these uncertainties lead the exact same policy to perform differently, i.e. yield different return, from one roll-out to another. Common evaluation procedures in RL summarise the consequent return distributions using solely the expected return, which does not account for the spread of the distribution. Our work defines this spread as the policy reproducibility: the ability of a policy to obtain similar performance when rolled out many times, a crucial property in some real-world applications. We highlight that existing procedures that only use the expected return are limited on two fronts: first an infinite number of return distributions with a wide range of performance-reproducibility trade-offs can have the same expected return, limiting its effectiveness when used for comparing policies; second, the expected return metric does not leave any room for practitioners to choose the best trade-off value for considered applications. In this work, we address these limitations by recommending the use of Lower Confidence Bound, a metric taken from Bayesian optimisation that provides the user with a preference parameter to choose a desired performance-reproducibility trade-off. We also formalise and quantify policy reproducibility, and demonstrate the benefit of our metrics using extensive experiments of popular RL algorithms on common uncertain RL tasks.

Submitted 22 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.
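
The abstract above recommends a Lower Confidence Bound with a preference parameter for comparing policies whose returns vary across roll-outs. A minimal sketch of the usual form from Bayesian optimisation, mean minus a multiple of the standard deviation, is given below. The function name, the parameter name `k`, and the use of the population standard deviation as the spread measure are assumptions; the abstract does not state which spread statistic the paper uses.

```python
import statistics


def lower_confidence_bound(returns, k=1.0):
    """Score a policy by mean return minus k times the spread of its returns.

    `returns` holds the returns of repeated roll-outs of one policy.
    `k` is the preference parameter: k = 0 reduces to the expected return,
    larger k penalises irreproducible policies more strongly.
    """
    mean = statistics.fmean(returns)
    spread = statistics.pstdev(returns)  # population std; an assumed choice
    return mean - k * spread


if __name__ == "__main__":
    steady = [9.0, 10.0, 11.0]   # same mean, small spread
    erratic = [0.0, 10.0, 20.0]  # same mean, large spread
    print(lower_confidence_bound(steady), lower_confidence_bound(erratic))
```

Both example policies have the same expected return of 10, so the expected return alone cannot separate them, while any positive `k` ranks the steadier policy higher.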

arXiv:2311.01829 [pdf, other]

Mix-ME: Quality-Diversity for Multi-Agent Learning

Authors: Garðar Ingvarsson, Mikayel Samvelyan, Bryan Lim, Manon Flageat, Antoine Cully, Tim Rocktäschel

Abstract: In many real-world systems, such as adaptive robotics, achieving a single, optimised solution may be insufficient. Instead, a diverse set of high-performing solutions is often required to adapt to varying contexts and requirements. This is the realm of Quality-Diversity (QD), which aims to discover a collection of high-performing solutions, each with their own unique characteristics. QD methods have recently seen success in many domains, including robotics, where they have been used to discover damage-adaptive locomotion controllers. However, most existing work has focused on single-agent settings, despite many tasks of interest being multi-agent. To this end, we introduce Mix-ME, a novel multi-agent variant of the popular MAP-Elites algorithm that forms new solutions using a crossover-like operator by mixing together agents from different teams. We evaluate the proposed methods on a variety of partially observable continuous control tasks. Our evaluation shows that these multi-agent variants obtained by Mix-ME not only compete with single-agent baselines but also often outperform them in multi-agent settings under partial observability.

Submitted 3 November, 2023; originally announced November 2023.

Comments: 15 pages, 7 figures. Submitted and accepted to the ALOE workshop at NeurIPS 2023

arXiv:2309.06006 [pdf, ps, other]

SoccerNet 2023 Challenges Results

Authors: Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, et al. (77 additional authors not shown)

Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2-3-7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available on https://www.soccer-net.org. Baselines and development kits can be found on https://github.com/SoccerNet.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2308.03665 [pdf, other]

QDax: A Library for Quality-Diversity and Population-based Algorithms with Hardware Acceleration

Authors: Felix Chalumeau, Bryan Lim, Raphael Boige, Maxime Allard, Luca Grillotti, Manon Flageat, Valentin Macé, Arthur Flajolet, Thomas Pierrot, Antoine Cully

Abstract: QDax is an open-source library with a streamlined and modular API for Quality-Diversity (QD) optimization algorithms in Jax. The library serves as a versatile tool for optimization purposes, ranging from black-box optimization to continuous control. QDax offers implementations of popular QD, Neuroevolution, and Reinforcement Learning (RL) algorithms, supported by various examples. All the implementations can be just-in-time compiled with Jax, facilitating efficient execution across multiple accelerators, including GPUs and TPUs. These implementations effectively demonstrate the framework's flexibility and user-friendliness, easing experimentation for research purposes. Furthermore, the library is thoroughly documented and tested with 95% coverage.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2304.12080 [pdf, other]

doi 10.1145/3583133.3590625

Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning

Authors: Simón C. Smith, Bryan Lim, Hannah Janmohamed, Antoine Cully

Abstract: Learning algorithms, like Quality-Diversity (QD), can be used to acquire repertoires of diverse robotics skills. This learning is commonly done via computer simulation due to the large number of evaluations required. However, training in a virtual environment generates a gap between simulation and reality. Here, we build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot. This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour and improve sample efficiency. A behaviour selection policy filters out uninteresting or unsafe policies predicted by the model. RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning. We demonstrate that our method enables a physical quadruped robot to learn a repertoire of behaviours in two hours without human supervision. We successfully test the solution repertoire using a maze navigation task. Finally, we compare our approach to the MAP-Elites algorithm. We show that dynamics awareness and a recovery policy are required for training on a physical robot for optimal archive generation. Video available at https://youtu.be/BgGNvIsRh7Q

Submitted 24 April, 2023; originally announced April 2023.

Comments: 5 pages, 2 figures, 1 linked video, to be presented as a poster at the Genetic and Evolutionary Computation Conference Companion (GECCO 2023 Companion), July, 2023, Lisbon, Portugal

arXiv:2304.07948 [pdf, other]

Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach

Authors: Siyue Zhang, Minrui Xu, Wei Yang Bryan Lim, Dusit Niyato

Abstract: Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption. Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy and address the issue of workload imbalance. To tackle the challenge of multi-objective scheduling, i.e., maximizing GPU utilization while reducing operational costs, we propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities. Compared with other algorithms, our proposed method improves the system utility by up to 28.6% attributable to higher GPU utilization, lower energy cost, and less carbon emission.

Submitted 16 April, 2023; originally announced April 2023.

arXiv:2304.03672 [pdf, other]

doi 10.1145/3583131.3590498

Don't Bet on Luck Alone: Enhancing Behavioral Reproducibility of Quality-Diversity Solutions in Uncertain Domains

Authors: Luca Grillotti, Manon Flageat, Bryan Lim, Antoine Cully

Abstract: Quality-Diversity (QD) algorithms are designed to generate collections of high-performing solutions while maximizing their diversity in a given descriptor space. However, in the presence of unpredictable noise, the fitness and descriptor of the same solution can differ significantly from one evaluation to another, leading to uncertainty in the estimation of such values. Given the elitist nature of QD algorithms, they commonly end up with many degenerate solutions in such noisy settings. In this work, we introduce Archive Reproducibility Improvement Algorithm (ARIA), a plug-and-play approach that improves the reproducibility of the solutions present in an archive. We propose it as a separate optimization module, relying on natural evolution strategies, that can be executed on top of any QD algorithm. Our module mutates solutions to (1) optimize their probability of belonging to their niche, and (2) maximize their fitness. The performance of our method is evaluated on various tasks, including a classical optimization problem and two high-dimensional control tasks in simulated robotic environments. We show that our algorithm enhances the quality and descriptor space coverage of any given archive by at least 50%.

Submitted 7 April, 2023; originally announced April 2023.

Comments: The two first authors contributed equally to this research

ACM Class: I.2.8

arXiv:2303.09097 [pdf, other]

IRIS: Interpretable Rubric-Informed Segmentation for Action Quality Assessment

Authors: Hitoshi Matsuyama, Nobuo Kawaguchi, Brian Y. Lim

Abstract: AI-driven Action Quality Assessment (AQA) of sports videos can mimic Olympic judges to help score performances as a second opinion or for training. However, these AI methods are uninterpretable and do not justify their scores, which is important for algorithmic accountability. Indeed, to account for their decisions, instead of scoring subjectively, sports judges use a consistent set of criteria - rubric - on multiple actions in each performance sequence. Therefore, we propose IRIS to perform Interpretable Rubric-Informed Segmentation on action sequences for AQA. We investigated IRIS for scoring videos of figure skating performance. IRIS predicts (1) action segments, (2) technical element score differences of each segment relative to base scores, (3) multiple program component scores, and (4) the summed final score. In a modeling study, we found that IRIS performs better than non-interpretable, state-of-the-art models. In a formative user study, practicing figure skaters agreed with the rubric-informed explanations, found them useful, and trusted AI judgments more. This work highlights the importance of using judgment rubrics to account for AI decisions.

Submitted 16 March, 2023; originally announced March 2023.

Comments: 28th International Conference on Intelligent User Interfaces (IUI 2023)

arXiv:2303.06164 [pdf, other]

Understanding the Synergies between Quality-Diversity and Deep Reinforcement Learning

Authors: Bryan Lim, Manon Flageat, Antoine Cully

Abstract: The synergies between Quality-Diversity (QD) and Deep Reinforcement Learning (RL) have led to powerful hybrid QD-RL algorithms that have shown tremendous potential and bring the best of both fields. However, only a single deep RL algorithm (TD3) has been used in prior hybrid methods despite notable progress made by other RL algorithms. Additionally, there are fundamental differences in the optimization procedures between QD and RL which would benefit from a more principled approach. We propose Generalized Actor-Critic QD-RL, a unified modular framework for actor-critic deep RL methods in the QD-RL setting. This framework provides a path to study insights from Deep RL in the QD-RL setting, which is an important and efficient way to make progress in QD-RL. We introduce two new algorithms, PGA-ME (SAC) and PGA-ME (DroQ), which apply recent advancements in Deep RL to the QD-RL setting and solve the humanoid environment, which was not possible using existing QD-RL algorithms. However, we also find that not all insights from Deep RL can be effectively translated to QD-RL. Critically, this work also demonstrates that the actor-critic models in QD-RL are generally insufficiently trained and performance gains can be achieved without any additional environment evaluations.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.06137 [pdf, other]

doi 10.1145/3638529.3654089

Enhancing MAP-Elites with Multiple Parallel Evolution Strategies

Authors: Manon Flageat, Bryan Lim, Antoine Cully

Abstract: With the development of fast and massively parallel evaluations in many domains, Quality-Diversity (QD) algorithms, that already proved promising in a large range of applications, have seen their potential multiplied. However, we have yet to understand how to best use a large number of evaluations as using them for random variations alone is not always effective. High-dimensional search spaces are a typical situation where random variations struggle to effectively search. Another situation is uncertain settings where solutions can appear better than they truly are and naively evaluating more solutions might mislead QD algorithms. In this work, we propose MAP-Elites-Multi-ES (MEMES), a novel QD algorithm based on Evolution Strategies (ES) designed to exploit fast parallel evaluations more effectively. MEMES maintains multiple (up to 100) simultaneous ES processes, each with its own independent objective and reset mechanism designed for QD optimisation, all on just a single GPU. We show that MEMES outperforms both gradient-based and mutation-based QD algorithms on black-box optimisation and QD-Reinforcement-Learning tasks, demonstrating its benefit across domains. Additionally, our approach outperforms sampling-based QD methods in uncertain domains when given the same evaluation budget. Overall, MEMES generates reproducible solutions that are high-performing and diverse through large-scale ES optimisation on easily accessible hardware.

Submitted 12 April, 2024; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2302.09466 [pdf]

doi 10.1145/3544548.3581402

RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions

Authors: Yunlong Wang, Shuyuan Shen, Brian Y. Lim

Abstract: Generative AI models have shown impressive ability to produce images with text prompts, which could benefit creativity in visual art creation and self-expression. However, it is unclear how precisely the generated images express contexts and emotions from the input texts. We explored the emotional expressiveness of AI-generated images and developed RePrompt, an automatic method to refine text prompts toward precise expression of the generated images. Inspired by crowdsourced editing strategies, we curated intuitive text features, such as the number and concreteness of nouns, and trained a proxy model to analyze the feature effects on the AI-generated image. With model explanations of the proxy model, we curated a rubric to adjust text prompts to optimize image generation for precise emotion expression. We conducted simulation and user studies, which showed that RePrompt significantly improves the emotional expressiveness of AI-generated images, especially for negative emotions.

Submitted 19 March, 2023; v1 submitted 18 February, 2023; originally announced February 2023.

Comments: To appear in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23)

arXiv:2302.01241 [pdf, other]

Diagrammatization: Rationalizing with diagrammatic AI explanations for abductive-deductive reasoning on hypotheses

Authors: Brian Y. Lim, Joseph P. Cahaly, Chester Y. F. Sng, Adam Chew

Abstract: Many visualizations have been developed for explainable AI (XAI), but they often require further reasoning by users to interpret. We argue that XAI should support diagrammatic and abductive reasoning for the AI to perform hypothesis generation and evaluation to reduce the interpretability gap. We propose Diagrammatization to i) perform Peircean abductive-deductive reasoning, ii) follow domain conventions, and iii) explain with diagrams visually or verbally. We implemented DiagramNet for a clinical application to predict cardiac diagnoses from heart auscultation, and explain with shape-based murmur diagrams. In modeling studies, we found that DiagramNet not only provides faithful murmur shape explanations, but also has better prediction performance than baseline models. We further demonstrate the interpretability and trustworthiness of diagrammatic explanations in a qualitative user study with medical students, showing that clinically-relevant, diagrammatic explanations are preferred over technical saliency map explanations. This work contributes insights into providing domain-conventional abductive explanations for user-centric XAI.

Submitted 12 July, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2211.15868 [pdf, other]

Kinematic-aware Hierarchical Attention Network for Human Pose Estimation in Videos

Authors: Kyung-Min Jin, Byoung-Sung Lim, Gun-Hee Lee, Tae-Kyung Kang, Seong-Whan Lee

Abstract: Previous video-based human pose estimation methods have shown promising results by leveraging aggregated features of consecutive frames. However, most approaches compromise accuracy to mitigate jitter or do not sufficiently comprehend the temporal aspects of human motion. Furthermore, occlusion increases uncertainty between consecutive frames, which results in unsmooth results. To address these issues, we design an architecture that exploits the keypoint kinematic features with the following components. First, we effectively capture the temporal features by leveraging individual keypoint's velocity and acceleration. Second, the proposed hierarchical transformer encoder aggregates spatio-temporal dependencies and refines the 2D or 3D input pose estimated from existing estimators. Finally, we provide an online cross-supervision between the refined input pose generated from the encoder and the final pose from our decoder to enable joint optimization. We demonstrate comprehensive results and validate the effectiveness of our model in various tasks: 2D pose estimation, 3D pose estimation, body mesh recovery, and sparsely annotated multi-human pose estimation. Our code is available at https://github.com/KyungMinJin/HANet.

Submitted 28 November, 2022; originally announced November 2022.

arXiv:2211.12610 [pdf, other]

Efficient Exploration using Model-Based Quality-Diversity with Gradients

Authors: Bryan Lim, Manon Flageat, Antoine Cully

Abstract: Exploration is a key challenge in Reinforcement Learning, especially in long-horizon, deceptive and sparse-reward environments. For such applications, population-based approaches have proven effective. Methods such as Quality-Diversity deal with this by encouraging novel solutions and producing a diversity of behaviours. However, these methods are driven by either undirected sampling (i.e. mutations) or use approximated gradients (i.e. Evolution Strategies) in the parameter space, which makes them highly sample-inefficient. In this paper, we propose a model-based Quality-Diversity approach. It extends existing QD methods to use gradients for efficient exploitation and leverage perturbations in imagination for efficient exploration. Our approach optimizes all members of a population simultaneously to maintain both performance and diversity efficiently by leveraging the effectiveness of QD algorithms as good data generators to train deep models. We demonstrate that it maintains the divergent search capabilities of population-based approaches on tasks with deceptive rewards while significantly improving their sample efficiency and quality of solutions.

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2211.03057 [pdf, other]

Towards Green Metaverse Networking: Technologies, Advancements and Future Directions

Authors: Siyue Zhang, Wei Yang Bryan Lim, Wei Chong Ng, Zehui Xiong, Dusit Niyato, Xuemin Sherman Shen, Chunyan Miao

Abstract: As the Metaverse is iteratively being defined, its potential to unleash the next wave of digital disruption and create real-life value becomes increasingly clear. With distinctive features of immersive experience, simultaneous interactivity, and user agency, the Metaverse has the capability to transform all walks of life. However, the enabling technologies of the Metaverse, i.e., digital twin, artificial intelligence, blockchain, and extended reality, are known to be energy-hungry, therefore raising concerns about the sustainability of its large-scale deployment and development. This article proposes Green Metaverse Networking for the first time to optimize energy efficiencies of all network components for Metaverse sustainable development. We first analyze energy consumption, efficiency, and sustainability of energy-intensive technologies in the Metaverse. Next, focusing on computation and networking, we present major advancements related to energy efficiency and their integration into the Metaverse. A case study of energy conservation by incorporating semantic communication and stochastic resource allocation in the Metaverse is presented. Finally, we outline the critical challenges of Metaverse sustainable development, thereby indicating potential directions of future research towards the green Metaverse.

Submitted 13 April, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

arXiv:2211.02193 [pdf, other]

Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning

Authors: Manon Flageat, Bryan Lim, Luca Grillotti, Maxime Allard, Simón C. Smith, Antoine Cully

Abstract: We present a Quality-Diversity benchmark suite for Deep Neuroevolution in Reinforcement Learning domains for robot control. The suite includes the definition of tasks, environments, behavioral descriptors, and fitness. We specify different benchmarks based on the complexity of both the task and the agent controlled by a deep neural network. The benchmark uses standard Quality-Diversity metrics, including coverage, QD-score, maximum fitness, and an archive profile metric to quantify the relation between coverage and fitness. We also present how to quantify the robustness of the solutions with respect to environmental stochasticity by introducing corrected versions of the same metrics. We believe that our benchmark is a valuable tool for the community to compare and improve their findings. The source code is available online: https://github.com/adaptive-intelligent-robotics/QDax

Submitted 3 November, 2022; originally announced November 2022.

Comments: Accepted at GECCO Workshop on Quality Diversity Algorithm Benchmarks

arXiv:2210.09918 [pdf, other]

Online Damage Recovery for Physical Robots with Hierarchical Quality-Diversity

Authors: Maxime Allard, Simón C. Smith, Konstantinos Chatzilygeroudis, Bryan Lim, Antoine Cully

Abstract: In real-world environments, robots need to be resilient to damages and robust to unforeseen scenarios. Quality-Diversity (QD) algorithms have been successfully used to make robots adapt to damages in seconds by leveraging a diverse set of learned skills. A high diversity of skills increases the chances of a robot to succeed at overcoming new situations since there are more potential alternatives to solve a new task. However, finding and storing a large behavioural diversity of multiple skills often leads to an increase in computational complexity. Furthermore, robot planning in a large skill space is an additional challenge that arises with an increased number of skills. Hierarchical structures can help reduce this search and storage complexity by breaking down skills into primitive skills. In this paper, we introduce the Hierarchical Trial and Error algorithm, which uses a hierarchical behavioural repertoire to learn diverse skills and leverages them to make the robot adapt quickly in the physical world. We show that the hierarchical decomposition of skills enables the robot to learn more complex behaviours while keeping the learning of the repertoire tractable. Experiments with a hexapod robot show that, in the most challenging scenarios, our method solves maze navigation tasks with 20% fewer actions in simulation and 43% fewer actions in the physical world than the best baselines, while having 78% fewer complete failures.

Submitted 18 October, 2022; originally announced October 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2204.05726

arXiv:2210.04819 [pdf, other]

Efficient Learning of Locomotion Skills through the Discovery of Diverse Environmental Trajectory Generator Priors

Authors: Shikha Surana, Bryan Lim, Antoine Cully

Abstract: Data-driven learning based methods have recently been particularly successful at learning robust locomotion controllers for a variety of unstructured terrains. Prior work has shown that incorporating good locomotion priors in the form of trajectory generators (TGs) is effective at efficiently learning complex locomotion skills. However, defining a good, single TG as tasks/environments become increasingly more complex remains a challenging problem as it requires extensive tuning and risks reducing the effectiveness of the prior. In this paper, we present Evolved Environmental Trajectory Generators (EETG), a method that learns a diverse set of specialised locomotion priors using Quality-Diversity algorithms while maintaining a single policy within the Policies Modulating TG (PMTG) architecture. The results demonstrate that EETG enables a quadruped robot to successfully traverse a wide range of environments, such as slopes, stairs, rough terrain, and balance beams. Our experiments show that learning a diverse set of specialized TG priors is significantly (5 times) more efficient than using a single, fixed prior when dealing with a wide range of environments.

Submitted 22 June, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

arXiv:2210.03516 [pdf, other]

Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery

Authors: Felix Chalumeau, Raphael Boige, Bryan Lim, Valentin Macé, Maxime Allard, Arthur Flajolet, Antoine Cully, Thomas Pierrot

Abstract: Deep Reinforcement Learning (RL) has emerged as a powerful paradigm for training neural policies to solve complex control tasks. However, these policies tend to be overfit to the exact specifications of the task and environment they were trained on, and thus do not perform well when conditions deviate slightly or when composed hierarchically to solve even more complex tasks. Recent work has shown that training a mixture of policies, as opposed to a single one, that are driven to explore different regions of the state-action space can address this shortcoming by generating a diverse set of behaviors, referred to as skills, that can be collectively used to great effect in adaptation tasks or for hierarchical planning. This is typically realized by including a diversity term - often derived from information theory - in the objective function optimized by RL. However, these approaches often require careful hyperparameter tuning to be effective. In this work, we demonstrate that less widely-used neuroevolution methods, specifically Quality Diversity (QD), are a competitive alternative to information-theory-augmented RL for skill discovery.
Through an extensive empirical evaluation comparing eight state-of-the-art algorithms (four flagship algorithms from each line of work) on the basis of (i) metrics directly evaluating the skills' diversity, (ii) the skills' performance on adaptation tasks, and (iii) the skills' performance when used as primitives for hierarchical planning, QD methods are found to provide equal, and sometimes improved, performance whilst being less sensitive to hyperparameters and more scalable. As no single method is found to provide near-optimal performance across all environments, there is a rich scope for further research which we support by proposing future directions and providing optimized open-source implementations.

Submitted 8 September, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

Comments: Camera ready version for ICLR2023 (spotlight)

arXiv:2209.11957 [pdf, ps, other]

Cooperative Resource Management in Quantum Key Distribution (QKD) Networks for Semantic Communication

Authors: Rakpong Kaewpuang, Minrui Xu, Wei Yang Bryan Lim, Dusit Niyato, Han Yu, Jiawen Kang, Xuemin Sherman Shen

Abstract: Increasing privacy and security concerns in intelligence-native 6G networks require quantum key distribution-secured semantic information communication (QKD-SIC). In QKD-SIC systems, edge devices connected via quantum channels can efficiently encrypt semantic information from the semantic source, and securely transmit the encrypted semantic information to the semantic destination. In this paper, we consider an efficient resource (i.e., QKD and KM wavelengths) sharing problem to support QKD-SIC systems under the uncertainty of semantic information generated by edge devices. In such a system, QKD service providers offer QKD services with different subscription options to the edge devices. As such, to reduce the cost for the edge device users, we propose a QKD resource management framework for the edge devices communicating semantic information. The framework is based on a two-stage stochastic optimization model to achieve optimal QKD deployment. Moreover, to reduce the deployment cost of QKD service providers, QKD resources in the proposed framework can be utilized based on efficient QKD-SIC resource management, including semantic information transmission among edge devices, secret-key provisioning, and cooperation formation among QKD service providers. In detail, the formulated two-stage stochastic optimization model can achieve the optimal QKD-SIC resource deployment while meeting the secret-key requirements for semantic information transmission of edge devices. Moreover, to share the cost of the QKD resource pool among cooperative QKD service providers forming a coalition in a fair and interpretable manner, the proposed framework leverages the concept of Shapley value from cooperative game theory as a solution. Experimental results demonstrate that the proposed framework can reduce the deployment cost by about 40% compared with existing non-cooperative baselines.

Submitted 24 September, 2022; originally announced September 2022.

Comments: 16 pages, 20 figures, journal paper. arXiv admin note: text overlap with arXiv:2208.11270

arXiv:2208.14661 [pdf, other]

Stochastic Resource Allocation for Semantic Communication-aided Virtual Transportation Networks in the Metaverse

Authors: Wei Chong Ng, Hongyang Du, Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Chunyan Miao

Abstract: The physical-virtual world synchronization to develop the Metaverse will require a massive transmission and exchange of data. In this paper, we introduce semantic communication for the development of virtual transportation networks in the Metaverse. Leveraging the perception capabilities of edge devices, virtual service providers (VSPs) can subscribe to their preferred edge devices to receive the semantic data of interest. However, the demands of the VSPs are highly dependent on the users that they are serving. To address the resource allocation problem amid stochastic user demand, we propose a stochastic semantic transmission scheme (SSTS) based on two-stage stochastic integer programming. Using real data captured by edge devices we deploy in Singapore, the simulation results show that SSTS can minimize the transmission cost of the VSPs while accounting for the users' demand uncertainties.

Submitted 31 August, 2022; originally announced August 2022.

Comments: 6 pages, 5 figures and 3 tables

arXiv:2208.05040 [pdf, ps, other]

Economics of Semantic Communication System: An Auction Approach

Authors: Zi Qin Liew, Hongyang Du, Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Chunyan Miao, Dong In Kim

Abstract: Semantic communication technologies enable wireless edge devices to communicate effectively by transmitting semantic meaning of data. Edge components, such as vehicles in next-generation intelligent transport systems, use well-trained semantic models to encode and decode semantic information extracted from raw and sensor data. However, the limitation in computing resources makes it difficult to support the training process of accurate semantic models on edge devices. As such, edge devices can buy the pretrained semantic models from semantic model providers, which is called "semantic model trading". Upon collecting semantic information with the semantic models, the edge devices can then sell the extracted semantic information, e.g., information about urban road conditions or traffic signs, to the interested buyers for profit, which is called "semantic information trading". To facilitate both types of trades, effective incentive mechanisms should be designed. Thus, in this paper, we propose a hierarchical trading system to support both semantic model trading and semantic information trading jointly. The proposed incentive mechanism helps to maximize the revenue of semantic model providers in the semantic model trading, and effectively incentivizes model providers to participate in the development of semantic communication systems. For semantic information trading, our designed auction approach can support the trading between multiple semantic information sellers and buyers, while ensuring individual rationality, incentive compatibility, and budget balance, and moreover, allowing them to achieve higher utilities than the baseline method.

Submitted 1 August, 2022; originally announced August 2022.

arXiv:2207.00427 [pdf, other]

Semantic Communications for Future Internet: Fundamentals, Applications, and Challenges

Authors: Wanting Yang, Hongyang Du, Ziqin Liew, Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Xuefen Chi, Xuemin Sherman Shen, Chunyan Miao

Abstract: With the increasing demand for intelligent services, the sixth-generation (6G) wireless networks will shift from a traditional architecture that focuses solely on high transmission rate to a new architecture that is based on the intelligent connection of everything. Semantic communication (SemCom), a revolutionary architecture that integrates user as well as application requirements and meaning of information into the data processing and transmission, is predicted to become a new core paradigm in 6G. While SemCom is expected to progress beyond the classical Shannon paradigm, several obstacles need to be overcome on the way to a SemCom-enabled smart wireless Internet. In this paper, we first highlight the motivations and compelling reasons of SemCom in 6G. Then, we outline the major 6G visions and key enabler techniques which lay the foundation of SemCom. Meanwhile, we highlight some benefits of SemCom-empowered 6G and present a SemCom-native 6G network architecture. Next, we show the evolution of SemCom from its introduction to classical SemCom related theory and modern AI-enabled SemCom. Following that, focusing on modern SemCom, we classify SemCom into three categories, i.e., semantic-oriented communication, goal-oriented communication, and semantic-aware communication, and introduce three types of semantic metrics. We then discuss the applications, the challenges and technologies related to semantics and communication. Finally, we introduce future research opportunities. In a nutshell, this paper investigates the fundamentals of SemCom, its applications in 6G networks, and the existing challenges and open issues for further direction.

Submitted 13 November, 2022; v1 submitted 10 June, 2022; originally announced July 2022.

Comments: arXiv admin note: text overlap with arXiv:2103.05391 by other authors

arXiv:2205.14931 [pdf, other]

A multimedia recommendation model based on collaborative graph

Authors: Breda Lim, Shubhi Bansal, Ahmed Buru, Kayla Manthey

Abstract: As one of the main solutions to the information overload problem, recommender systems are widely used in daily life. In the recently emerging micro-video recommendation scenario, micro-videos contain rich multimedia information, involving text, image, video and other multimodal data, and this rich multimodal information conceals users' deep interest in the items. Most of the current recommendation algorithms based on multimodal data use multimodal information to expand the information on the item side, but ignore the different preferences of users for different modal information, and lack fine-grained mining of the internal connections of multimodal information. To address the problems in the micro-video recommender system mentioned above, we design a hybrid recommendation model based on multimodal information. The model introduces multimodal information and user-side auxiliary information into the network structure, fully explores the deep interest of users, and measures the importance of each dimension of the user and item feature representations in the scoring prediction task. It also improves the application of graph neural networks in recommender systems by using an attention mechanism to fuse the multi-layer state output information, allowing the shallow structural features provided by the intermediate layers to better participate in the prediction task. The recommendation accuracy is improved compared with traditional recommendation algorithms on different data sets, and the feasibility and effectiveness of our model are verified.

Submitted 30 May, 2022; originally announced May 2022.

arXiv:2204.03655 [pdf, other]

doi 10.1145/3512290.3528715

Learning to Walk Autonomously via Reset-Free Quality-Diversity

Authors: Bryan Lim, Alexander Reichenbach, Antoine Cully

Abstract: Quality-Diversity (QD) algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills. However, the generation of behavioural repertoires has mainly been limited to simulation environments instead of real-world learning. This is because existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions. This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments. We build on Dynamics-Aware Quality-Diversity (DA-QD) and introduce a behaviour selection policy that leverages the diversity of the imagined repertoire and environmental information to intelligently select behaviours that can act as automatic resets. We demonstrate this through a task of learning to walk within defined training zones with obstacles. Our experiments show that we can learn full repertoires of legged locomotion controllers autonomously without manual resets with high sample efficiency in spite of harsh safety constraints. Finally, using an ablation of different target objectives, we show that it is important for RF-QD to have diverse types of solutions available for the behaviour selection policy over solutions optimised with a specific objective. Videos and code available at https://sites.google.com/view/rf-qd.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.08648 [pdf, other]

Artificial Intelligence Enables Real-Time and Intuitive Control of Prostheses via Nerve Interface

Authors: Diu Khue Luu, Anh Tuan Nguyen, Ming Jiang, Markus W. Drealan, Jian Xu, Tong Wu, Wing-kin Tam, Wenfeng Zhao, Brian Z. H. Lim, Cynthia K. Overstreet, Qi Zhao, Jonathan Cheng, Edward W. Keefer, Zhi Yang

Abstract: Objective: The next generation prosthetic hand that moves and feels like a real hand requires a robust neural interconnection between the human minds and machines. Methods: Here we present a neuroprosthetic system to demonstrate that principle by employing an artificial intelligence (AI) agent to translate the amputee's movement intent through a peripheral nerve interface. The AI agent is designed based on the recurrent neural network (RNN) and could simultaneously decode six degree-of-freedom (DOF) from multichannel nerve data in real-time. The decoder's performance is characterized in motor decoding experiments with three human amputees. Results: First, we show the AI agent enables amputees to intuitively control a prosthetic hand with individual finger and wrist movements up to 97-98% accuracy. Second, we demonstrate the AI agent's real-time performance by measuring the reaction time and information throughput in a hand gesture matching task. Third, we investigate the AI agent's long-term uses and show the decoder's robust predictive performance over a 16-month implant duration. Conclusion & significance: Our study demonstrates the potential of AI-enabled nerve technology, underling the next generation of dexterous and intuitive prosthetic hands.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2203.05471 [pdf, other]

A Full Dive into Realizing the Edge-enabled Metaverse: Visions, Enabling Technologies, and Challenges

Authors: Minrui Xu, Wei Chong Ng, Wei Yang Bryan Lim, Jiawen Kang, Zehui Xiong, Dusit Niyato, Qiang Yang, Xuemin Sherman Shen, Chunyan Miao

Abstract: Dubbed "the successor to the mobile Internet", the concept of the Metaverse has grown in popularity. While there exist lite versions of the Metaverse today, they are still far from realizing the full vision of an immersive, embodied, and interoperable Metaverse. Without addressing the issues of implementation from the communication and networking, as well as computation perspectives, it is difficult for the Metaverse to succeed the Internet, especially in terms of its accessibility to billions of users today. In this survey, we focus on the edge-enabled Metaverse to realize its ultimate vision. We first provide readers with a succinct tutorial of the Metaverse, an introduction to the architecture, as well as current developments. To enable ubiquitous, seamless, and embodied access to the Metaverse, we discuss the communication and networking challenges and survey cutting-edge solutions and concepts that leverage next-generation communication systems for users to immerse as and interact with embodied avatars in the Metaverse. Moreover, given the high computation costs required, e.g., to render 3D virtual worlds and run data-hungry artificial intelligence-driven avatars, we discuss the computation challenges and cloud-edge-end computation framework-driven solutions to realize the Metaverse on resource-constrained edge devices. Next, we explore how blockchain technologies can aid in the interoperable development of the Metaverse, not just in terms of empowering the economic circulation of virtual user-generated content but also to manage physical edge resources in a decentralized, transparent, and immutable manner. Finally, we discuss the future research directions towards realizing the true vision of the edge-enabled Metaverse.

Submitted 20 August, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

arXiv:2202.11697 [pdf, other]

Stochastic Coded Offloading Scheme for Unmanned Aerial Vehicle-Assisted Edge Computing

Authors: Wei Chong Ng, Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Chunyan Miao, Zhu Han, Dong In Kim

Abstract: Unmanned aerial vehicles (UAVs) have gained wide research interests due to their technological advancement and high mobility. The UAVs are equipped with increasingly advanced capabilities to run computationally intensive applications enabled by machine learning techniques. However, because of both energy and computation constraints, the UAVs face issues hovering in the sky while performing computation due to weather uncertainty. To overcome the computation constraints, the UAVs can partially or fully offload their computation tasks to the edge servers. In ordinary computation offloading operations, the UAVs can retrieve the result from the returned output. Nevertheless, if the UAVs are unable to retrieve the entire result from the edge servers, i.e., straggling edge servers, this operation will fail. In this paper, we propose a coded distributed computing approach for computation offloading to mitigate straggling edge servers. The UAVs can retrieve the returned result when the number of returned copies is greater than or equal to the recovery threshold. There is a shortfall if the returned copies are less than the recovery threshold. To minimize the cost of the network, energy consumption by the UAVs, and prevent over and under subscription of the resources, we devise a two-phase Stochastic Coded Offloading Scheme (SCOS). In the first phase, the appropriate UAVs are allocated to the charging stations amid weather uncertainty. In the second phase, we use the $z$-stage Stochastic Integer Programming (SIP) to optimize the number of computation subtasks offloaded and computed locally, while taking into account the computation shortfall and demand uncertainty. By using a real dataset, the simulation results show that our proposed scheme is fully dynamic, and minimizes the cost of the network and UAV energy consumption amid stochastic uncertainties.

Submitted 10 February, 2022; originally announced February 2022.

Comments: Accepted by IEEE Internet of Things Journal. 20 pages, 18 figures. arXiv admin note: text overlap with arXiv:2110.14873

arXiv:2202.07349 [pdf, other]

IF-City: Intelligible Fair City Planning to Measure, Explain and Mitigate Inequality

Authors: Yan Lyu, Hangxin Lu, Min Kyung Lee, Gerhard Schmitt, Brian Y. Lim

Abstract: With the increasing pervasiveness of Artificial Intelligence (AI), many visual analytics tools have been proposed to examine fairness, but they mostly focus on data scientist users. Instead, tackling fairness must be inclusive and involve domain experts with specialized tools and workflows. Thus, domain-specific visualizations are needed for algorithmic fairness. Furthermore, while much work on AI fairness has focused on predictive decisions, less has been done for fair allocation and planning, which require human expertise and iterative design to integrate myriad constraints. We propose the Intelligible Fair Allocation (IF-Alloc) Framework that leverages explanations of causal attribution (Why), contrastive (Why Not) and counterfactual reasoning (What If, How To) to aid domain experts to assess and alleviate unfairness in allocation problems. We apply the framework to fair urban planning for designing cities that provide equal access to amenities and benefits for diverse resident types. Specifically, we propose an interactive visual tool, Intelligible Fair City Planner (IF-City), to help urban planners to perceive inequality across groups, identify and attribute sources of inequality, and mitigate inequality with automatic allocation simulations and constraint-satisfying recommendations. We demonstrate and evaluate the usage and usefulness of IF-City on a real neighborhood in New York City, US, with practicing urban planners from multiple countries, and discuss generalizing our findings, application, and framework to other use cases and applications of fair allocation.

Submitted 15 February, 2022; originally announced February 2022.

Comments: 18 pages including references and bios, 11 figures, submitted to IEEE Transactions on Visualization and Computer Graphics

arXiv:2202.06471 [pdf, other]

Semantic Communication Meets Edge Intelligence

Authors: Wanting Yang, Zi Qin Liew, Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Xuefen Chi, Xianbin Cao, Khaled B. Letaief

Abstract: The development of emerging applications, such as autonomous transportation systems, is expected to result in an explosive growth in mobile data traffic. As the available spectrum resource becomes more and more scarce, there is a growing need for a paradigm shift from Shannon's Classical Information Theory (CIT) to semantic communication (SemCom). Specifically, the former adopts a "transmit-before-understanding" approach while the latter leverages artificial intelligence (AI) techniques to "understand-before-transmit", thereby alleviating bandwidth pressure by reducing the amount of data to be exchanged without negating the semantic effectiveness of the transmitted symbols. However, the semantic extraction (SE) procedure incurs costly computation and storage overheads. In this article, we introduce an edge-driven training, maintenance, and execution of SE. We further investigate how edge intelligence can be enhanced with SemCom through improving the generalization capabilities of intelligent agents at lower computation overheads and reducing the communication overhead of information exchange. Finally, we present a case study involving semantic-aware resource optimization for the wireless powered Internet of Things (IoT).

Submitted 13 February, 2022; originally announced February 2022.

arXiv:2202.01258 [pdf, other]

Accelerated Quality-Diversity through Massive Parallelism

Authors: Bryan Lim, Maxime Allard, Luca Grillotti, Antoine Cully

Abstract: Quality-Diversity (QD) optimization algorithms are a well-known approach to generate large collections of diverse and high-quality solutions. However, derived from evolutionary computation, QD algorithms are population-based methods which are known to be data-inefficient and require large amounts of computational resources. This makes QD algorithms slow when used in applications where solution evaluations are computationally costly. A common approach to speed up QD algorithms is to evaluate solutions in parallel, for instance by using physical simulators in robotics. Yet, this approach is limited to several dozen parallel evaluations as most physics simulators can only be parallelized more with a greater number of CPUs. With recent advances in simulators that run on accelerators, thousands of evaluations can now be performed in parallel on a single GPU/TPU. In this paper, we present QDax, an accelerated implementation of MAP-Elites which leverages massive parallelism on accelerators to make QD algorithms more accessible. We show that QD algorithms are ideal candidates to take advantage of progress in hardware acceleration. We demonstrate that QD algorithms can scale with massive parallelism to be run at interactive timescales without any significant effect on the performance. Results across standard optimization functions and four neuroevolution benchmark environments show that experiment runtimes are reduced by two orders of magnitude, turning days of computation into minutes. More surprisingly, we observe that reducing the number of generations by two orders of magnitude, and thus having significantly shorter lineage, does not impact the performance of QD algorithms. These results show that QD can now benefit from hardware acceleration, which contributed significantly to the bloom of deep learning.

Submitted 10 October, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

arXiv:2201.12835

Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning

Authors: Wencan Zhang, Mariella Dimiccoli, Brian Y. Lim

Abstract: Model explanations such as saliency maps can improve user trust in AI by highlighting important features for a prediction. However, these become distorted and misleading when explaining predictions of images that are subject to systematic error (bias). Furthermore, the distortions persist despite model fine-tuning on images biased by different factors (blur, color temperature, day/night). We present Debiased-CAM to recover explanation faithfulness across various bias types and levels by training a multi-input, multi-task model with auxiliary tasks for explanation and bias level predictions. In simulation studies, the approach not only enhanced prediction accuracy, but also generated highly faithful explanations about these predictions as if the images were unbiased. In user studies, debiased explanations improved user task performance, perceived truthfulness and perceived helpfulness. Debiased training can provide a versatile platform for robust performance and explanation faithfulness for a wide range of applications with data biases.

Submitted 28 February, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

Comments: This work was intended as a replacement of arXiv:2012.05567 and any subsequent updates will appear there

ACM Class: I.2.0

arXiv:2201.01634 [pdf, other]

Realizing the Metaverse with Edge Intelligence: A Match Made in Heaven

Authors: Wei Yang Bryan Lim, Zehui Xiong, Dusit Niyato, Xianbin Cao, Chunyan Miao, Sumei Sun, Qiang Yang

Abstract: Dubbed "the successor to the mobile Internet", the concept of the Metaverse has recently exploded in popularity. While there exist lite versions of the Metaverse today, we are still far from realizing the vision of a seamless, shardless, and interoperable Metaverse given the stringent sensing, communication, and computation requirements. Moreover, the birth of the Metaverse comes amid growing privacy concerns among users. In this article, we begin by providing a preliminary definition of the Metaverse. We discuss the architecture of the Metaverse and mainly focus on motivating the convergence of edge intelligence and the infrastructure layer of the Metaverse. We present major edge-based technological developments and their integration to support the Metaverse engine. Then, we present our research attempts through a case study of virtual city development in the Metaverse. Finally, we discuss the open research issues.

Submitted 5 January, 2022; originally announced January 2022.

Comments: 9 pages, 5 figures

arXiv:2112.14005 [pdf, other]

doi 10.1145/3491102.3501826

Towards Relatable Explainable AI with the Perceptual Process

Authors: Wencan Zhang, Brian Y. Lim

Abstract: Machine learning models need to provide contrastive explanations, since people often seek to understand why a puzzling prediction occurred instead of some expected outcome. Current contrastive explanations are rudimentary comparisons between examples or raw features, which remain difficult to interpret, since they lack semantic meaning. We argue that explanations must be more relatable to other concepts, hypotheticals, and associations. Inspired by the perceptual process from cognitive psychology, we propose the XAI Perceptual Processing Framework and RexNet model for relatable explainable AI with Contrastive Saliency, Counterfactual Synthetic, and Contrastive Cues explanations. We investigated the application of vocal emotion recognition, and implemented a modular multi-task deep neural network to predict and explain emotions from speech. From think-aloud and controlled studies, we found that counterfactual explanations were useful and further enhanced with semantic cues, but not saliency explanations. This work provides insights into providing and evaluating relatable contrastive explainable AI for perception applications.

Submitted 28 March, 2022; v1 submitted 28 December, 2021; originally announced December 2021.

Comments: 14 pages, 7 figures, 4 tables, accepted by CHI 2022

ACM Class: I.2.0

arXiv:2110.14873 [pdf, other]

Optimal Stochastic Coded Computation Offloading in Unmanned Aerial Vehicles Network

Authors: Wei Chong Ng, Wei Yang Bryan Lim, Jer Shyuan Ng, Suttinee Sawadsitang, Zehui Xiong, Dusit Niyato

Abstract: Today, modern unmanned aerial vehicles (UAVs) are equipped with increasingly advanced capabilities that can run applications enabled by machine learning techniques, which require computationally intensive operations such as matrix multiplications. Due to computation constraints, the UAVs can offload their computation tasks to edge servers. To mitigate stragglers, coded distributed computing (CDC) based offloading can be adopted. In this paper, we propose an Optimal Task Allocation Scheme (OTAS) based on Stochastic Integer Programming with the objective to minimize energy consumption during computation offloading. The simulation results show that amid uncertainty of task completion, the energy consumption in the UAV network is minimized.

Submitted 28 October, 2021; originally announced October 2021.

Comments: To be published in IEEE Global Communications Conference

arXiv:2110.14325 [pdf, other]

Unified Resource Allocation Framework for the Edge Intelligence-Enabled Metaverse

Authors: Wei Chong Ng, Wei Yang Bryan Lim, Jer Shyuan Ng, Zehui Xiong, Dusit Niyato, Chunyan Miao

Abstract: Dubbed as the next-generation Internet, the metaverse is a virtual world that allows users to interact with each other or objects in real-time using their avatars. The metaverse is envisioned to support novel ecosystems of service provision in an immersive environment brought about by an intersection of the virtual and physical worlds. The native AI systems in the metaverse will personalize the user experience over time and shape the experience in a scalable, seamless, and synchronous way. However, the metaverse is characterized by diverse resource types amid a highly dynamic demand environment. In this paper, we propose the case study of virtual education in the metaverse and address the unified resource allocation problem amid stochastic user demand. We propose a stochastic optimal resource allocation scheme (SORAS) based on stochastic integer programming with the objective of minimizing the cost of the virtual service provider. The simulation results show that SORAS can minimize the cost of the virtual service provider while accounting for the uncertainty in users' demand.

Submitted 27 October, 2021; originally announced October 2021.

Comments: 6 pages, 10 figures

arXiv:2110.01423 [pdf, ps, other]

Economics of Semantic Communication System in Wireless Powered Internet of Things

Authors: Zi Qin Liew, Yanyu Cheng, Wei Yang Bryan Lim, Dusit Niyato, Chunyan Miao, Sumei Sun

Abstract: The semantic communication system enables wireless devices to communicate effectively with the semantic meaning of the data. Wireless powered Internet of Things (IoT) that adopts the semantic communication system relies on harvested energy to transmit semantic information. However, the issue of energy constraint in the semantic communication system is not well studied. In this paper, we propose a semantic-based energy valuation and take an economic approach to solve the energy allocation problem as an incentive mechanism design. In our model, IoT devices (bidders) place their bids for the energy, and the power transmitter (auctioneer) decides the winner and payment by using a deep-learning-based optimal auction. Results show that the revenue of the wireless power transmitter is maximized while satisfying Individual Rationality (IR) and Incentive Compatibility (IC).

Submitted 4 October, 2021; originally announced October 2021.

arXiv:2109.10231 [pdf]

SalienTrack: providing salient information for semi-automated self-tracking feedback with model explanations

Authors: Yunlong Wang, Jiaying Liu, Homin Park, Jordan Schultz-McArdle, Stephanie Rosenthal, Judy Kay, Brian Y. Lim

Abstract: Self-tracking can improve people's awareness of their unhealthy behaviors and support reflection to inform behavior change. Increasingly, new technologies make tracking easier, leading to large amounts of tracked data. However, much of that information is not salient for reflection and self-awareness. To tackle this burden for reflection, we created the SalienTrack framework, which aims to 1) identify salient tracking events, 2) select the salient details of those events, 3) explain why they are informative, and 4) present the details as manually elicited or automatically shown feedback. We implemented SalienTrack in the context of nutrition tracking. To do this, we first conducted a field study to collect photo-based mobile food tracking over 1-5 weeks. We then report how we used this data to train an explainable-AI model of salience. Finally, we created interfaces to present salient information and conducted a formative user study to gain insights about how SalienTrack could be integrated into an interface for reflection. Our key contributions are the SalienTrack framework, a demonstration of its implementation for semi-automated feedback in an important and challenging self-tracking context and a discussion of the broader uses of the framework.

Submitted 16 February, 2022; v1 submitted 21 September, 2021; originally announced September 2021.
