-
A Basic Łukasiewicz m-valued conditional logic
Authors:
Shuquan Huo
Abstract:
This paper is devoted to the construction of a conditional logic system for Łukasiewicz m-valued propositional logic. We construct the conditional logic system ŁCR based on Łukasiewicz m-valued propositional logic. We construct a world semantics for the system by generalizing the conditional and the accessibility relation from the classical bivalent setting to the m-valued one, and prove its soundness, completeness, and finite model property. Conditionals of ŁCR cannot be generalized directly to variable strict conditionals, but they are stricter than classical conditionals.
Submitted 27 July, 2024;
originally announced July 2024.
-
A survey on fairness of large language models in e-commerce: progress, application, and challenge
Authors:
Qingyang Ren,
Zilin Jiang,
Jinghan Cao,
Sijia Li,
Chiqu Li,
Yiyang Liu,
Shuning Huo,
Tiange He,
Yuan Chen
Abstract:
This survey explores the fairness of large language models (LLMs) in e-commerce, examining their progress, applications, and the challenges they face. LLMs have become pivotal in the e-commerce domain, offering innovative solutions and enhancing customer experiences. This work presents a comprehensive survey on the applications and challenges of LLMs in e-commerce. The paper begins by introducing the key principles underlying the use of LLMs in e-commerce, detailing the processes of pretraining, fine-tuning, and prompting that tailor these models to specific needs. It then explores the varied applications of LLMs in e-commerce, including product reviews, where they synthesize and analyze customer feedback; product recommendations, where they leverage consumer data to suggest relevant items; product information translation, enhancing global accessibility; and product question and answer sections, where they automate customer support. The paper critically addresses the fairness challenges in e-commerce, highlighting how biases in training data and algorithms can lead to unfair outcomes, such as reinforcing stereotypes or discriminating against certain groups. These issues not only undermine consumer trust, but also raise ethical and legal concerns. Finally, the work outlines future research directions, emphasizing the need for more equitable and transparent LLMs in e-commerce. It advocates for ongoing efforts to mitigate biases and improve the fairness of these systems, ensuring they serve diverse global markets effectively and ethically. Through this comprehensive analysis, the survey provides a holistic view of the current landscape of LLMs in e-commerce, offering insights into their potential and limitations, and guiding future endeavors in creating fairer and more inclusive e-commerce environments.
Submitted 21 June, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Assessing and Verifying Task Utility in LLM-Powered Applications
Authors:
Negar Arabzadeh,
Siqing Huo,
Nikhil Mehta,
Qinqyun Wu,
Chi Wang,
Ahmed Awadallah,
Charles L. A. Clarke,
Julia Kiseleva
Abstract:
The rapid development of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents, assisting humans in their daily tasks. However, a significant gap remains in assessing to what extent LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the need to verify the utility of LLM-powered applications, particularly by ensuring alignment between the application's functionality and end-user needs. We introduce AgentEval, a novel framework designed to simplify the utility verification process by automatically proposing a set of criteria tailored to the unique purpose of any given application. This allows for a comprehensive assessment, quantifying the utility of an application against the suggested criteria. We present a comprehensive analysis of the effectiveness and robustness of AgentEval on two open-source datasets, covering math problem solving and ALFWorld household-related tasks. For reproducibility purposes, we make the data, code and all the logs publicly available at https://bit.ly/3w3yKcS .
Submitted 12 May, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
-
Towards an In-Depth Comprehension of Case Relevance for Better Legal Retrieval
Authors:
Haitao Li,
You Chen,
Zhekai Ge,
Qingyao Ai,
Yiqun Liu,
Quan Zhou,
Shuai Huo
Abstract:
Legal retrieval techniques play an important role in preserving the fairness and equality of the judicial system. As a well-known annual international competition, COLIEE aims to advance the development of state-of-the-art retrieval models for legal texts. This paper elaborates on the methodology employed by the TQM team in COLIEE2024. Specifically, we explored various lexical matching and semantic retrieval models, with a focus on enhancing the understanding of case relevance. Additionally, we endeavor to integrate various features using the learning-to-rank technique. Furthermore, fine heuristic pre-processing and post-processing methods have been proposed to mitigate irrelevant information. Consequently, our methodology achieved remarkable performance in COLIEE2024, securing first place in Task 1 and third place in Task 3. We anticipate that our proposed approach can contribute valuable insights to the advancement of legal retrieval technology.
Submitted 1 April, 2024;
originally announced April 2024.
-
LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models
Authors:
Yichao Wu,
Yafei Xiang,
Shuning Huo,
Yulu Gong,
Penghao Liang
Abstract:
In addressing the computational and memory demands of fine-tuning Large Language Models (LLMs), we propose LoRA-SP (Streamlined Partial Parameter Adaptation), a novel approach utilizing randomized half-selective parameter freezing within the Low-Rank Adaptation (LoRA) framework. This method efficiently balances pre-trained knowledge retention and adaptability for task-specific optimizations. Through a randomized mechanism, LoRA-SP determines which parameters to update or freeze, significantly reducing computational and memory requirements without compromising model performance. We evaluated LoRA-SP across several benchmark NLP tasks, demonstrating its ability to achieve competitive performance with substantially lower resource consumption compared to traditional full-parameter fine-tuning and other parameter-efficient techniques. LoRA-SP's innovative approach not only facilitates the deployment of advanced NLP models in resource-limited settings but also opens new research avenues into effective and efficient model adaptation strategies.
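The idea of randomly freezing half of a set of adapter parameters can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say at what granularity parameters are selected, so the per-name selection, the function name `choose_trainable`, and the `seed` argument below are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def choose_trainable(param_names: List[str], seed: int = 0) -> Dict[str, bool]:
    """Randomly mark half of the given parameter names as frozen.

    Hypothetical helper: returns a mapping name -> True if the parameter
    stays trainable, False if it is frozen. Selection granularity (one
    flag per named parameter) is an assumption, not taken from the paper.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    frozen = set(rng.sample(param_names, len(param_names) // 2))
    return {name: name not in frozen for name in param_names}
```

In a training loop, the returned mask would typically decide which adapter tensors receive gradient updates; how LoRA-SP itself applies the selection is not specified in the abstract.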
Submitted 28 February, 2024;
originally announced March 2024.
-
Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research
Authors:
Shuning Huo,
Yafei Xiang,
Hanyi Yu,
Mengran Zhu,
Yulu Gong
Abstract:
In recent years, advancements in natural language processing (NLP) have been fueled by deep learning techniques, particularly through the utilization of powerful computing resources like GPUs and TPUs. Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence. NLP, as a vital application of AI, aims to bridge the gap between humans and computers through natural language interaction. This paper delves into the current landscape and future prospects of large-scale model-based NLP, focusing on the question-answering systems within this domain. Practical cases and developments in artificial intelligence-driven question-answering systems are analyzed to foster further exploration and research in the realm of large-scale NLP.
Submitted 25 February, 2024;
originally announced February 2024.
-
Machine Learning-Based Vehicle Intention Trajectory Recognition and Prediction for Autonomous Driving
Authors:
Hanyi Yu,
Shuning Huo,
Mengran Zhu,
Yulu Gong,
Yafei Xiang
Abstract:
In recent years, the expansion of internet technology and advancements in automation have brought significant attention to autonomous driving technology. Major automobile manufacturers, including Volvo, Mercedes-Benz, and Tesla, have progressively introduced products ranging from assisted-driving vehicles to semi-autonomous vehicles. However, this period has also witnessed several traffic safety incidents involving self-driving vehicles. For instance, in March 2016, a Google self-driving car was involved in a minor collision with a bus. At the time of the accident, the autonomous vehicle was attempting to merge into the right lane but failed to dynamically respond to the real-time environmental information during the lane change. It incorrectly assumed that the approaching bus would slow down to avoid it, leading to a low-speed collision with the bus. This incident highlights the current technological shortcomings and safety concerns associated with autonomous lane-changing behavior, despite the rapid advancements in autonomous driving technology. Lane-changing is among the most common and hazardous behaviors in highway driving, significantly impacting traffic safety and flow. Therefore, lane-changing is crucial for traffic safety, and accurately predicting drivers' lane change intentions can markedly enhance driving safety. This paper introduces a deep learning-based prediction method for autonomous driving lane change behavior, aiming to facilitate safe lane changes and thereby improve road safety.
Submitted 25 February, 2024;
originally announced February 2024.
-
Text Understanding and Generation Using Transformer Models for Intelligent E-commerce Recommendations
Authors:
Yafei Xiang,
Hanyi Yu,
Yulu Gong,
Shuning Huo,
Mengran Zhu
Abstract:
With the rapid development of artificial intelligence technology, Transformer-based pre-trained models have become an important tool for large language model (LLM) tasks. In the field of e-commerce, these models are especially widely used, from text understanding to generating recommendation systems, which provide powerful technical support for improving user experience and optimizing service processes. This paper reviews the core application scenarios of Transformer-based pre-trained models in e-commerce text understanding and recommendation generation, including but not limited to automatic generation of product descriptions, sentiment analysis of user comments, construction of personalized recommendation systems, and automated processing of customer service conversations. Through a detailed analysis of the model's working principle, implementation process, and application effects in specific cases, this paper emphasizes the unique advantages of pre-trained models in understanding complex user intentions and improving the quality of recommendations. In addition, the challenges and improvement directions for the future are also discussed, such as how to further improve the generalization ability of the model, the ability to handle large-scale data sets, and technical strategies to protect user privacy. Ultimately, the paper points out that the application of Transformer-based pre-trained models in e-commerce has not only driven technological innovation, but also brought substantial benefits to merchants and consumers. Looking forward, these models will continue to play a key role in e-commerce and beyond.
Submitted 25 February, 2024;
originally announced February 2024.
-
Utilizing GANs for Fraud Detection: Model Training with Synthetic Transaction Data
Authors:
Mengran Zhu,
Yulu Gong,
Yafei Xiang,
Hanyi Yu,
Shuning Huo
Abstract:
Anomaly detection is a critical challenge across various research domains, aiming to identify instances that deviate from normal data distributions. This paper explores the application of Generative Adversarial Networks (GANs) in fraud detection, comparing their advantages with traditional methods. GANs, a type of Artificial Neural Network (ANN), have shown promise in modeling complex data distributions, making them effective tools for anomaly detection. The paper systematically describes the principles of GANs and their derivative models, emphasizing their application in fraud detection across different datasets. By building a collection of adversarial verification graphs, we aim to prevent fraud caused by bots or automated systems and ensure that the users in the transaction are real. The objective of the experiment is to design and implement a fake face verification code and fraud detection system based on the Generative Adversarial Network (GAN) algorithm to enhance the security of the transaction process. The study demonstrates the potential of GANs in enhancing transaction security through deep learning techniques.
Submitted 15 February, 2024;
originally announced February 2024.
-
Utilizing Deep Learning for Enhancing Network Resilience in Finance
Authors:
Yulu Gong,
Mengran Zhu,
Shuning Huo,
Yafei Xiang,
Hanyi Yu
Abstract:
In the age of the Internet, people's lives are increasingly dependent on today's network technology. Maintaining network integrity and protecting the legitimate interests of users is at the heart of network construction. Threat detection is an important part of a complete and effective defense system. How to effectively detect unknown threats is one of the concerns of network protection. Currently, network threat detection is usually based on rules and traditional machine learning methods, which create artificial rules or extract common spatiotemporal features. These methods cannot be applied to large-scale data applications, and the emergence of unknown risks causes the detection accuracy of the original model to decline. With this in mind, this paper uses deep learning for advanced threat detection to improve protective measures in the financial industry. Many network researchers have shifted their focus to exception-based intrusion detection techniques. The detection technology mainly uses statistical machine learning methods: collecting normal program and network behavior data, extracting multidimensional features, and training decision machine learning models on this basis (commonly used models include naive Bayes, decision trees, support vector machines, random forests, etc.).
Submitted 18 February, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Explicit-Implicit Subgoal Planning for Long-Horizon Tasks with Sparse Reward
Authors:
Fangyuan Wang,
Anqing Duan,
Peng Zhou,
Shengzeng Huo,
Guodong Guo,
Chenguang Yang,
David Navarro-Alarcon
Abstract:
The challenges inherent in long-horizon tasks in robotics persist due to the typical inefficient exploration and sparse rewards in traditional reinforcement learning approaches. To address these challenges, we have developed a novel algorithm, termed Explicit-Implicit Subgoal Planning (EISP), designed to tackle long-horizon tasks through a divide-and-conquer approach. We utilize two primary criteria, feasibility and optimality, to ensure the quality of the generated subgoals. EISP consists of three components: a hybrid subgoal generator, a hindsight sampler, and a value selector. The hybrid subgoal generator uses an explicit model to infer subgoals and an implicit model to predict the final goal, inspired by the human way of thinking, which infers subgoals from the current state and final goal and reasons about the final goal conditioned on the current state and given subgoals. Additionally, the hindsight sampler selects valid subgoals from an offline dataset to enhance the feasibility of the generated subgoals, while the value selector utilizes the value function in reinforcement learning to filter the optimal subgoals from subgoal candidates. To validate our method, we conduct four long-horizon tasks in both simulation and the real world. The obtained quantitative and qualitative data indicate that our approach achieves promising performance compared to other baseline methods. These experimental results can be seen on the website https://sites.google.com/view/vaesi .
Submitted 15 June, 2024; v1 submitted 24 December, 2023;
originally announced December 2023.
-
Retrieving Supporting Evidence for Generative Question Answering
Authors:
Siqing Huo,
Negar Arabzadeh,
Charles L. A. Clarke
Abstract:
Current large language models (LLMs) can exhibit near-human levels of performance on many natural language-based tasks, including open-domain question answering. Unfortunately, at this time, they also convincingly hallucinate incorrect answers, so that responses to questions must be verified against external sources before they can be accepted at face value. In this paper, we report two simple experiments to automatically validate generated answers against a corpus. We base our experiments on questions and passages from the MS MARCO (V1) test collection, and a retrieval pipeline consisting of sparse retrieval, dense retrieval and neural rerankers. In the first experiment, we validate the generated answer in its entirety. After presenting a question to an LLM and receiving a generated answer, we query the corpus with the combination of the question + generated answer. We then present the LLM with the combination of the question + generated answer + retrieved answer, prompting it to indicate if the generated answer can be supported by the retrieved answer. In the second experiment, we consider the generated answer at a more granular level, prompting the LLM to extract a list of factual statements from the answer and verifying each statement separately. We query the corpus with each factual statement and then present the LLM with the statement and the corresponding retrieved evidence. The LLM is prompted to indicate if the statement can be supported and make necessary edits using the retrieved material. With an accuracy of over 80%, we find that an LLM is capable of verifying its generated answer when a corpus of supporting material is provided. However, manual assessment of a random sample of questions reveals that incorrect generated answers are missed by this verification process. While this verification process can reduce hallucinations, it cannot entirely eliminate them.
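The two verification experiments described in this abstract can be sketched as a pair of small functions. This is an illustrative outline only, not the authors' code: the callables `generate`, `retrieve`, `judge`, and `extract` are hypothetical stand-ins for an LLM call, the retrieval pipeline, a yes/no LLM prompt, and statement extraction, and the prompt wording is an assumption. The editing step of the second experiment is omitted.

```python
from typing import Callable, List, Tuple


def _is_yes(reply: str) -> bool:
    """Interpret a free-text judge reply as a yes/no decision (assumed format)."""
    return reply.strip().lower().startswith("yes")


def verify_answer(
    question: str,
    generate: Callable[[str], str],
    retrieve: Callable[[str], str],
    judge: Callable[[str], str],
) -> bool:
    """First experiment: validate the generated answer as a whole."""
    answer = generate(question)
    # Query the corpus with the question combined with the generated answer.
    passage = retrieve(f"{question} {answer}")
    prompt = (
        f"Question: {question}\n"
        f"Generated answer: {answer}\n"
        f"Retrieved answer: {passage}\n"
        "Is the generated answer supported by the retrieved answer? (yes/no)"
    )
    return _is_yes(judge(prompt))


def verify_statements(
    answer: str,
    extract: Callable[[str], List[str]],
    retrieve: Callable[[str], str],
    judge: Callable[[str], str],
) -> List[Tuple[str, bool]]:
    """Second experiment: check each extracted factual statement separately."""
    results = []
    for statement in extract(answer):
        evidence = retrieve(statement)  # one corpus query per statement
        prompt = (
            f"Statement: {statement}\n"
            f"Evidence: {evidence}\n"
            "Is the statement supported by the evidence? (yes/no)"
        )
        results.append((statement, _is_yes(judge(prompt))))
    return results
```

Passing the model and retriever in as callables keeps the sketch independent of any particular LLM API or search stack.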
Submitted 20 September, 2023;
originally announced September 2023.
-
Denjoy Domains and BMOA
Authors:
Shengjin Huo,
Michel Zinsmeister
Abstract:
A Denjoy domain is a plane domain whose complement is a closed subset $E$ of the extended real line $\bar{R}$ containing $\infty$: such a domain is called Carleson-homogeneous if there exists $C>0$ such that for all $z\in E$ and $r>0$, one has $\vert E\cap [z-r,z+r]\vert\geq Cr$, where $\vert\cdot\vert$ is the Lebesgue measure on the line. We prove that if $U=\bar{ \mathbb C}\backslash K$ is a Carleson-homogeneous Denjoy domain then, if $f$ stands for one of its universal coverings, $\log {f'}\in BMOA.$ In order to prove this result, we develop ideas from [On Carleson measures induced by Beltrami coefficients being compatible with Fuchsian groups, Ann. Fenn. Math. 46(2021),67-77] leading to a general theorem about planar domains giving sufficient conditions ensuring that $\log {f'}\in BMOA$ for any universal covering $f.$
Submitted 27 July, 2023;
originally announced July 2023.
-
PSO-Based Optimal Coverage Path Planning for Surface Defect Inspection of 3C Components with a Robotic Line Scanner
Authors:
Hongpeng Chen,
Shengzeng Huo,
Muhammad Muddassir,
Hoi-Yin Lee,
Anqing Duan,
Pai Zheng,
David Navarro-Alarcon
Abstract:
The automatic inspection of surface defects is an important task for quality control in the computers, communications, and consumer electronics (3C) industry. Conventional devices for defect inspection (viz. line-scan sensors) have a limited field of view; thus, a robot-aided defect inspection system needs to scan the object from multiple viewpoints. Optimally selecting the robot's viewpoints and planning a path is regarded as coverage path planning (CPP), a problem that enables inspecting the object's complete surface while reducing the scanning time and avoiding misdetection of defects. However, the development of CPP strategies for robotic line scanners has not been sufficiently studied by researchers. To fill this gap in the literature, in this paper, we present a new approach for robotic line scanners to detect surface defects of 3C free-form objects automatically. Our proposed solution consists of generating a local path by a new hybrid region segmentation method and an adaptive planning algorithm to ensure the coverage of the complete object surface. An optimization method for the global path sequence is developed to maximize the scanning efficiency. To verify our proposed methodology, we conduct detailed simulation-based and experimental studies on various free-form workpieces, and compare its performance with a state-of-the-art solution. The reported results demonstrate the feasibility and effectiveness of our approach.
Submitted 28 July, 2024; v1 submitted 10 July, 2023;
originally announced July 2023.
-
Video object detection for privacy-preserving patient monitoring in intensive care
Authors:
Raphael Emberger,
Jens Michael Boss,
Daniel Baumann,
Marko Seric,
Shufan Huo,
Lukas Tuggener,
Emanuela Keller,
Thilo Stadelmann
Abstract:
Patient monitoring in intensive care units, although assisted by biosensors, needs continuous supervision of staff. To reduce the burden on staff members, IT infrastructures are built to record monitoring data and develop clinical decision support systems. These systems, however, are vulnerable to artifacts (e.g. muscle movement due to ongoing treatment), which are often indistinguishable from real and potentially dangerous signals. Video recordings could facilitate the reliable classification of biosignals using object detection (OD) methods to find sources of unwanted artifacts. Due to privacy restrictions, only blurred videos can be stored, which severely impairs the possibility to detect clinically relevant events such as interventions or changes in patient status with standard OD methods. Hence, new kinds of approaches are necessary that exploit every kind of available information due to the reduced information content of blurred footage and that are at the same time easily implementable within the IT infrastructure of a normal hospital. In this paper, we propose a new method for exploiting information in the temporal succession of video frames. To be efficiently implementable using off-the-shelf object detectors that comply with given hardware constraints, we repurpose the image color channels to account for temporal consistency, leading to an improved detection rate of the object classes. Our method outperforms a standard YOLOv5 baseline model by +1.7% mAP@.5 while also training over ten times faster on our proprietary dataset. We conclude that this approach has shown effectiveness in the preliminary experiments and holds potential for more general video OD in the future.
Submitted 26 June, 2023;
originally announced June 2023.
-
Retrieving Supporting Evidence for LLMs Generated Answers
Authors:
Siqing Huo,
Negar Arabzadeh,
Charles L. A. Clarke
Abstract:
Current large language models (LLMs) can exhibit near-human levels of performance on many natural language tasks, including open-domain question answering. Unfortunately, they also convincingly hallucinate incorrect answers, so that responses to questions must be verified against external sources before they can be accepted at face value. In this paper, we report a simple experiment to automatically verify generated answers against a corpus. After presenting a question to an LLM and receiving a generated answer, we query the corpus with the combination of the question + generated answer. We then present the LLM with the combination of the question + generated answer + retrieved answer, prompting it to indicate if the generated answer can be supported by the retrieved answer. We base our experiment on questions and passages from the MS MARCO (V1) test collection, exploring three retrieval approaches ranging from standard BM25 to a full question answering stack, including a reader based on the LLM. For a large fraction of questions, we find that an LLM is capable of verifying its generated answer if appropriate supporting material is provided. However, with an accuracy of 70-80%, this approach cannot be fully relied upon to detect hallucinations.
Submitted 23 June, 2023;
originally announced June 2023.
-
Efficient Robot Skill Learning with Imitation from a Single Video for Contact-Rich Fabric Manipulation
Authors:
Shengzeng Huo,
Anqing Duan,
Lijun Han,
Luyin Hu,
Hesheng Wang,
David Navarro-Alarcon
Abstract:
Classical policy search algorithms for robotics typically require performing extensive explorations, which are time-consuming and expensive to implement with real physical platforms. To facilitate the efficient learning of robot manipulation skills, in this work, we propose a new approach comprised of three modules: (1) learning of general prior knowledge with random explorations in simulation, including state representations, dynamic models, and the constrained action space of the task; (2) extraction of a state alignment-based reward function from a single demonstration video; (3) real-time optimization of the imitation policy under systematic safety constraints with sampling-based model predictive control. This solution results in an efficient one-shot imitation-from-video strategy that simplifies the learning and execution of robot skills in real applications. Specifically, we learn priors in a scene of a task family and then deploy the policy in a novel scene immediately following a single demonstration, preventing time-consuming and risky explorations in the environment. As we do not make a strong assumption of dynamic consistency between the scenes, learning priors can be conducted in simulation to avoid collecting data in real-world circumstances. We evaluate the effectiveness of our approach in the context of contact-rich fabric manipulation, which is a common scenario in industrial and domestic tasks. Detailed numerical simulations and real-world hardware experiments reveal that our method can achieve rapid skill acquisition for challenging manipulation tasks.
Submitted 23 April, 2023;
originally announced April 2023.
-
A Dual-Arm Collaborative Framework for Dexterous Manipulation in Unstructured Environments with Contrastive Planning
Authors:
Shengzeng Huo,
Fangyuan Wang,
Luyin Hu,
Peng Zhou,
Jihong Zhu,
Hesheng Wang,
David Navarro-Alarcon
Abstract:
Most object manipulation strategies for robots are based on the assumption that the object is rigid (i.e., with fixed geometry) and the goal's details have been fully specified (e.g., the exact target pose). However, there are many tasks that involve spatial relations in human environments where these conditions may be hard to satisfy, e.g., bending and placing a cable inside an unknown container. To develop advanced robotic manipulation capabilities in unstructured environments that avoid these assumptions, we propose a novel long-horizon framework that exploits contrastive planning in finding promising collaborative actions. Using simulation data collected by random actions, we learn an embedding model in a contrastive manner that encodes the spatio-temporal information from successful experiences, which facilitates the subgoal planning through clustering in the latent space. Based on the keypoint correspondence-based action parameterization, we design a leader-follower control scheme for the collaboration between dual arms. All models of our policy are automatically trained in simulation and can be directly transferred to real-world environments. To validate the proposed framework, we conduct a detailed experimental study on a complex scenario subject to environmental and reachability constraints in both simulation and real environments.
Submitted 13 September, 2022;
originally announced September 2022.
-
Towards Hybrid-Optimization Video Coding
Authors:
Shuai Huo,
Dong Liu,
Li Li,
Siwei Ma,
Feng Wu,
Wen Gao
Abstract:
Video coding is essentially a mathematical optimization problem of rate and distortion. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the two existing frameworks represent two directions of optimization solutions. Block-based hybrid coding represents the discrete optimization solution because the coding modes, which are unrelated to one another, are mathematically discrete. It searches for the best one among multiple starting points (i.e., modes). However, the search is not efficient enough. On the other hand, end-to-end learned coding represents the continuous optimization solution because gradient descent operates on a continuous function. It optimizes a group of model parameters efficiently with a numerical algorithm. However, limited to a single starting point, it easily falls into a local optimum. To better solve the optimization problem, we propose to regard video coding as a hybrid of discrete and continuous optimization problems, and to use both search and numerical algorithms to solve it. Our idea is to provide multiple discrete starting points in the global space and to efficiently find the local optimum around each point with a numerical algorithm. Finally, we search for the global optimum among those local optima. Guided by this hybrid optimization idea, we design a hybrid-optimization video coding framework, which is built entirely on continuous deep networks and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Meanwhile, compared to the discrete optimization framework, our method achieves performance comparable to the HEVC reference software HM16.10 in PSNR.
Submitted 12 July, 2022;
originally announced July 2022.
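The multi-start strategy described in this abstract (several discrete starting points, a numerical optimizer run from each, then a search over the resulting local optima) can be illustrated on a toy one-dimensional objective. This is only a sketch under stated assumptions: the objective `loss`, its gradient, the step size, and the starting points are hypothetical stand-ins and are unrelated to the authors' actual video codec.

```python
def loss(x):
    # Hypothetical non-convex objective with two local minima.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy objective.
    return 4 * x**3 - 6 * x + 1


def descend(x, lr=0.01, steps=2000):
    # Continuous part: plain gradient descent from one starting point.
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def hybrid_optimize(starts):
    # Discrete part: refine every starting point, then search
    # among the resulting local optima for the best one.
    candidates = [descend(s) for s in starts]
    return min(candidates, key=loss)


best = hybrid_optimize([-2.0, 0.5, 2.0])
```

A single run started at `2.0` settles in the shallower minimum near `x ≈ 1.13`, whereas the search over all three refined candidates returns the deeper one near `x ≈ -1.30`, which is the point of combining search with numerical optimization.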
-
Natural Language Sentence Generation from API Specifications
Authors:
Siyu Huo,
Kushal Mukherjee,
Jayachandu Bandlamudi,
Vatche Isahagian,
Vinod Muthusamy,
Yara Rizk
Abstract:
APIs are everywhere; they provide access to automation solutions that could help businesses automate some of their tasks. Unfortunately, they may not be accessible to the business users who need them but are not equipped with the necessary technical skills to leverage them. Wrapping these APIs with chatbot capabilities is one solution to make these automation solutions interactive. In this work, we propose a system to generate sentences to train intent recognition models, a crucial component within chatbots to understand natural language utterances from users. Evaluation of our approach based on deep learning models showed promising and inspiring results, and the human-in-the-loop interaction will provide further improvement on the system.
Submitted 1 June, 2022;
originally announced June 2022.
-
Strongly quasisymmetric homeomorphisms being compatible with Fuchsian groups
Authors:
Shengjin Huo,
Mengzhen Zhao
Abstract:
In this paper we first introduced a domain called generalized Dirichlet fundamental domain $\mathcal{F}^{*}$ for a Fuchsian group $G$ whose generators contain parabolic elements. This allows us to show that a quasisymmetric homeomorphism $h$ being compatible with a convergence Fuchsian group $G$ of first kind is a strongly quasisymmetric homeomorphism if and only if it has a quasiconformal extension $f$ to the upper half plane $\mathbb{H}$ onto itself such that the induced measure $λ_μ=|μ|^{2}/Im(z)dxdy$ by the Beltrami coefficient $μ$ of $f$ is a Carleson measure on the generalized Dirichlet fundamental domain $\mathcal{F}^{*}.$
We also show that the above property also holds for Carleson-Denjoy domains.
Submitted 8 June, 2022;
originally announced June 2022.
-
An adaptive mixture-population Monte Carlo method for likelihood-free inference
Authors:
Zhijian He,
Shifeng Huo,
Tianhui Yang
Abstract:
This paper focuses on variational inference with intractable likelihood functions that can be unbiasedly estimated. A flexible variational approximation based on Gaussian mixtures is developed, by adopting the mixture population Monte Carlo (MPMC) algorithm in \cite{cappe2008adaptive}. MPMC updates iteratively the parameters of mixture distributions with importance sampling computations, instead of the complicated gradient estimation of the optimization objective in usual variational Bayes. Noticing that MPMC uses a fixed number of mixture components, which is difficult to predict for real applications, we further propose an automatic component--updating procedure to derive an appropriate number of components. The derived adaptive MPMC algorithm is capable of finding good approximations of the multi-modal posterior distributions even with a standard Gaussian as the initial distribution, as demonstrated in our numerical experiments.
Submitted 1 December, 2021;
originally announced December 2021.
-
Full-attention based Neural Architecture Search using Context Auto-regression
Authors:
Yuan Zhou,
Haiyang Wang,
Shuwei Huo,
Boyu Wang
Abstract:
Self-attention architectures have emerged as a recent advancement for improving the performance of vision tasks. Manual determination of the architecture for self-attention networks relies on the experience of experts and cannot automatically adapt to various scenarios. Meanwhile, neural architecture search (NAS) has significantly advanced the automatic design of neural architectures. Thus, it is appropriate to consider using NAS methods to discover a better self-attention architecture automatically. However, it is challenging to directly use existing NAS methods to search attention networks because of the uniform cell-based search space and the lack of long-term content dependencies. To address this issue, we propose a full-attention based NAS method. More specifically, a stage-wise search space is constructed that allows various attention operations to be adopted for different layers of a network. To extract global features, a self-supervised search algorithm is proposed that uses context auto-regression to discover the full-attention architecture. To verify the efficacy of the proposed methods, we conducted extensive experiments on various learning tasks, including image classification, fine-grained image recognition, and zero-shot image retrieval. The empirical results show strong evidence that our method is capable of discovering high-performance, full-attention architectures while guaranteeing the required search efficiency.
Submitted 13 November, 2021;
originally announced November 2021.
-
Action Planning for Packing Long Linear Elastic Objects into Compact Boxes with Bimanual Robotic Manipulation
Authors:
Wanyu Ma,
Bin Zhang,
Lijun Han,
Shengzeng Huo,
Hesheng Wang,
David Navarro-Alarcon
Abstract:
In this paper, we propose a new action planning approach to automatically pack long linear elastic objects into common-size boxes with a bimanual robotic system. For that, we developed a hybrid geometric model to handle large-scale occlusions combining an online vision-based method and an offline reference template. Then, a reference point generator is introduced to automatically plan the reference poses for the predesigned action primitives. Finally, an action planner integrates these components enabling the execution of high-level behaviors and the accomplishment of packing manipulation tasks. To validate the proposed approach, we conducted a detailed experimental study with multiple types and lengths of objects and packing boxes.
Submitted 19 July, 2022; v1 submitted 22 October, 2021;
originally announced October 2021.
-
Keypoint-Based Bimanual Shaping of Deformable Linear Objects under Environmental Constraints using Hierarchical Action Planning
Authors:
Shengzeng Huo,
Anqing Duan,
Chengxi Li,
Peng Zhou,
Wanyu Ma,
David Navarro-Alarcon
Abstract:
This paper addresses the problem of contact-based manipulation of deformable linear objects (DLOs) towards desired shapes with a dual-arm robotic system. To alleviate the burden of high-dimensional continuous state-action spaces, we model the DLO as a kinematic multibody system via our proposed keypoint detection network. This new perception network is trained on a synthetic labeled image dataset and transferred to real manipulation scenarios without conducting any manual annotations. Our goal-conditioned policy can efficiently learn to rearrange the configuration of the DLO based on the detected keypoints. The proposed hierarchical action framework tackles the manipulation problem in a coarse-to-fine manner (with high-level task planning and low-level motion control) by leveraging on two action primitives. The identification of deformation properties is avoided since the algorithm replans its motion after each bimanual execution. The conducted experimental results reveal that our method achieves high performance in state representation of the DLO, and is robust to uncertain environmental constraints.
Submitted 17 October, 2021;
originally announced October 2021.
-
Learning Cloth Folding Tasks with Refined Flow Based Spatio-Temporal Graphs
Authors:
Peng Zhou,
Omar Zahra,
Anqing Duan,
Shengzeng Huo,
Zeyu Wu,
David Navarro-Alarcon
Abstract:
Cloth folding is a widespread domestic task that is seamlessly performed by humans but which is highly challenging for autonomous robots to execute due to the highly deformable nature of textiles; it is hard to engineer and learn manipulation pipelines that execute it efficiently. In this paper, we propose a new solution for robotic cloth folding (using a standard folding board) via learning from demonstrations. Our demonstration video encoding is based on a high-level abstraction, namely, a refined optical flow-based spatiotemporal graph, as opposed to a low-level encoding such as image pixels. By constructing a new spatiotemporal graph with an advanced visual correspondence descriptor, the policy learning can focus on key points and relations with a 3D spatial configuration, which allows quick generalization across different environments. To further boost the policy search, we combine optical flow and static motion saliency maps to discriminate the dominant motions for better handling of the system dynamics in real time, which aligns with the attentional motion mechanism that dominates the human imitation process. To validate the proposed approach, we analyzed the manual folding procedure and developed a custom-made end-effector to efficiently interact with the folding board. Multiple experiments on a real robotic platform were conducted to validate the effectiveness and robustness of the proposed method.
Submitted 16 October, 2021;
originally announced October 2021.
-
Thermally-driven formation of Ge quantum dots on self-catalysed thin GaAs nanowires
Authors:
Yunyan Zhang,
H. Aruni Fonseka,
Hui Yang,
Xuezhe Yu,
Pamela Jurczak,
Suguo Huo,
Ana M. Sanchez,
Huiyun Liu
Abstract:
Embedding quantum dots (QDs) on nanowire (NW) sidewalls allows the integration of multi-layers of QDs into the active region of radial p-i-n junctions to greatly enhance light emission/absorption. However, the surface curvature makes the growth much more challenging compared with growths on thin-films, particularly on NWs with small diameters (Ø <100 nm). Moreover, the {110} sidewall facets of self-catalyzed NWs favor two-dimensional growth (2D), with the realization of three-dimensional (3D) Stranski-Krastanow growth becoming extremely challenging. Here, we demonstrate thermally-driven formation of Ge dots on the {110} sidewalls facets of thin self-catalyzed NWs without using any surfactant or surface treatment. The 2D-3D transition of the pseudomorphic Ge layer grown on GaAs NWs is driven by energy minimization under high-temperature annealing. This method opens a new avenue to integrate QDs on NWs without any restriction on NW diameter or elastic strain, which can allow the formation of QDs in a wider range of materials systems where the growth of islands by traditional mechanisms is not possible, with benefits for novel NWQD-based optoelectronic devices.
Submitted 31 March, 2021;
originally announced March 2021.
-
On the dimension distortions of quasi-symmetric homeomorphisms
Authors:
Shengjin Huo
Abstract:
In this paper, we first generalize a result of Bishop and Steger [Representation theoretic rigidity in PSL(2, R). Acta Math., 170, (1993), 121-149] by proving that, for a Fuchsian group $G$ of divergence type and non-lattice, if $h$ is a quasi-symmetric homeomorphism of the real axis $\mathbb{R}$ corresponding to a quasi-conformal compact deformation of $G$, then for any $E\subset \mathbb{R}$ we have $\max(\dim E, \dim h(\mathbb{R}\setminus E))=1$. Furthermore, we show that Bishop and Steger's result does not hold for the covering groups of all '$d$-dimensional jungle gyms' ($d$ is any positive integer), which generalizes Gönye's results [Differentiability of quasi-conformal maps on the jungle gym. Trans. Amer. Math. Soc. Vol 359 (2007), 9-32], where the author discussed the case of the '$1$-dimensional jungle gym'.
Submitted 15 February, 2021;
originally announced February 2021.
-
LaSeSOM: A Latent and Semantic Representation Framework for Soft Object Manipulation
Authors:
Peng Zhou,
Jihong Zhu,
Shengzeng Huo,
David Navarro-Alarcon
Abstract:
Soft object manipulation has recently gained popularity within the robotics community due to its potential applications in many economically important areas. Although great progress has recently been achieved in these types of tasks, most state-of-the-art methods are case-specific; they can only be used to perform a single deformation task (e.g., bending), as their shape representation algorithms typically rely on "hard-coded" features. In this paper, we present LaSeSOM, a new feedback latent representation framework for semantic soft object manipulation. Our new method introduces internal latent representation layers between low-level geometric feature extraction and high-level semantic shape analysis; this allows the identification of each compressed semantic function and the formation of a valid shape classifier from different feature extraction levels. The proposed latent framework makes soft object representation more generic (independent of the object's geometry and its mechanical properties) and scalable (it can work with 1D/2D/3D tasks). Its high-level semantic layer enables (quasi) shape planning tasks with soft objects, a valuable and underexplored capability in many soft manipulation tasks. To validate this new methodology, we report a detailed experimental study with robotic manipulators.
Submitted 18 October, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
On Carleson Measures of Beltrami Coefficients Being Compatible with Infinitely Generated Fuchsian Groups Related to Denjoy Domain
Authors:
Shengjin Huo
Abstract:
Let $Ω$ be a Carleson-Denjoy domain and $G$ be its covering group. Let $μ$ be a Beltrami coefficient on the unit disk which is compatible with the group $G$. In this paper we show that if $\frac{|μ|^{2}}{1-|z|^{2}}dxdy$ satisfies the Carleson condition on the infinite boundary of the Dirichlet fundamental domain of $G$, then $\frac{|μ|^{2}}{1-|z|^{2}}dxdy$ is a Carleson measure on the unit disk. We also show that the above property does not hold for Denjoy domain.
Submitted 13 October, 2020;
originally announced October 2020.
-
A Robotic Line Scan System with Adaptive ROI for Inspection of Defects over Convex Free-form Specular Surfaces
Authors:
Shengzeng Huo,
David Navarro-Alarcon,
David Chik
Abstract:
In this paper, we present a new robotic system to perform defect inspection tasks over free-form specular surfaces. The autonomous procedure is achieved by a six-DOF manipulator, equipped with a line scan camera and a high-intensity lighting system. Our method first uses the object's CAD mesh model to implement a K-means unsupervised learning algorithm that segments the object's surface into areas with similar curvature. Then, the scanning path is computed by using an adaptive algorithm that adjusts the camera's ROI to observe regions with irregular shapes properly. A novel iterative closest point-based projection registration method that robustly localizes the object in the robot's coordinate frame system is proposed to deal with the blind spot problem of specular objects captured by depth sensors. Finally, an image processing pipeline automatically detects surface defects in the captured high-resolution images. A detailed experimental study with a vision-guided robotic scanning system is reported to validate the proposed methodology.
Submitted 25 August, 2020;
originally announced August 2020.
-
Generating Adjacency Matrix for Video Relocalization
Authors:
Yuan Zhou,
Mingfei Wang,
Ruolin Wang,
Shuwei Huo
Abstract:
In this paper, we continue our work on the video relocalization task. Building on the use of graph convolution to extract intra-video and inter-video frame features, we improve the method with a similarity-metric-based graph convolution, whose weighted adjacency matrix is obtained by calculating a similarity metric between the features of any two different time steps in the graph. Experiments on the ActivityNet v1.2 and Thumos14 datasets show the effectiveness of this improvement, and the method outperforms state-of-the-art methods.
Submitted 26 January, 2022; v1 submitted 19 August, 2020;
originally announced August 2020.
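The weighted adjacency matrix mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which similarity metric the authors use, so cosine similarity is an assumption here, as are the function names `similarity_adjacency` and `propagate` and the toy feature vectors. The aggregation step is a bare similarity-weighted sum, not the paper's full graph convolution.

```python
import math


def cosine(u, v):
    # Cosine similarity between two feature vectors (assumed metric).
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_adjacency(features):
    # Edge weight between two *different* time steps is their similarity;
    # the diagonal stays zero because no self-edges are added.
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i][j] = cosine(features[i], features[j])
    return adj


def propagate(adj, features):
    # One aggregation step: each node receives the similarity-weighted
    # sum of the other nodes' features.
    n, dim = len(features), len(features[0])
    return [
        [sum(adj[i][k] * features[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Three hypothetical time steps with 2-D features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = similarity_adjacency(frames)
fused = propagate(adjacency, frames)
```

Because the weights come from the features themselves, the graph changes with every query and proposal pair instead of being fixed in advance.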
-
Graph Neural Network for Video Relocalization
Authors:
Yuan Zhou,
Mingfei Wang,
Ruolin Wang,
Shuwei Huo
Abstract:
In this paper, we focus on the video relocalization task, which uses a query video clip as input to retrieve a semantically related video clip from another untrimmed long video. We find that, in video relocalization datasets, there is no consistent relationship between frame-level feature similarity and video-level feature similarity, which affects feature fusion among frames. However, existing video relocalization methods do not fully consider this. Taking this phenomenon into account, we treat video features as a graph by concatenating the query video feature and the proposal video feature along the time dimension, where each time step is treated as a node and each row of the feature matrix is treated as the feature of the corresponding node. Then, with the power of graph neural networks, we propose a Multi-Graph Feature Fusion Module to fuse the relation features of this graph. After evaluating our method on the ActivityNet v1.2 and Thumos14 datasets, we find that it outperforms state-of-the-art methods.
Submitted 26 January, 2022; v1 submitted 20 July, 2020;
originally announced July 2020.
-
Experimental realization of three-dimensional elastic phononic topological insulator
Authors:
Shao-yong Huo,
Jiu-jiu Chen,
Hong-bo Huang,
Yong-jian Wei,
Zhu-hua Tan,
Lu-yang Feng,
Xiao-ping Xie
Abstract:
Three-dimensional (3D) elastic phononic topological insulator, featuring two-dimensional (2D) surface states, which support the high-efficient and robust elastic wave propagation without backscattering in all spatial dimensions, remains a challenge due to the nature of multiple polarized elastic modes and their complex hybridization in 3D. Here, a 3D elastic phononic topological insulator is designed and observed experimentally by emulating the quantum valley Hall effects. The spatial inversion of adjacent atoms gives rise to a valley topological phase to an insulating regime with complete 3D topological phononic bandgap. The 2D surface states protected by valley topology are unveiled numerically, which are confirmed experimentally to have a great robustness against the straight channel and sharp bends. Further engineering the elastic valley layer with appropriate interlayer coupling, we also demonstrate that layer pseudospin can be created in 3D elastic system which leads to 2D topological layer-dependent surface states and layer-selective transport. Our work will be a key step for the manipulation of elastic wave in 2D topological plane and the applications of 3D elastic topological-insulator-based devices with layer-selective functionality.
Submitted 12 April, 2020;
originally announced April 2020.
-
Soft-Root-Sign Activation Function
Authors:
Yuan Zhou,
Dandan Li,
Shuwei Huo,
Sun-Yuan Kung
Abstract:
The choice of activation function in deep networks has a significant effect on the training dynamics and task performance. At present, the most effective and widely used activation function is ReLU. However, because of its non-zero mean, missing negative outputs, and unbounded output, ReLU is at a potential disadvantage during optimization. To this end, we introduce a novel activation function that overcomes these three challenges. The proposed nonlinearity, namely "Soft-Root-Sign" (SRS), is smooth, non-monotonic, and bounded. Notably, the bounded property of SRS distinguishes it from most state-of-the-art activation functions. In contrast to ReLU, SRS can adaptively adjust the output through a pair of independent trainable parameters to capture negative information and provide a zero-mean property, which leads not only to better generalization performance but also to faster learning. It also avoids and rectifies an output distribution scattered over the non-negative real numbers, making it more compatible with batch normalization (BN) and less sensitive to initialization. In experiments, we evaluated SRS on deep networks applied to a variety of tasks, including image classification, machine translation, and generative modelling. Our SRS matches or exceeds models with ReLU and other state-of-the-art nonlinearities, showing that the proposed activation function generalizes and can achieve high performance across tasks. An ablation study further verified its compatibility with BN and its self-adaptability to different initializations.
Submitted 1 March, 2020;
originally announced March 2020.
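The properties claimed for SRS (bounded, non-monotonic, two parameters) can be checked numerically with a scalar sketch. The abstract itself does not state the formula. The form used below, SRS(x) = x / (x/α + e^(−x/β)), and the default values α = 2 and β = 3 are recalled from the paper and should be treated as assumptions. In a real network α and β would be trainable, whereas here they are plain arguments.

```python
import math


def srs(x, alpha=2.0, beta=3.0):
    # Assumed form: SRS(x) = x / (x / alpha + exp(-x / beta)).
    # With the default alpha and beta the denominator stays positive.
    t = -x / beta
    if t > 700.0:
        # exp(t) would overflow a float; the output tends to 0 here.
        return 0.0
    return x / (x / alpha + math.exp(t))


samples = {x: srs(x) for x in (-10.0, -1.0, 0.0, 1.0, 100.0)}
```

For large positive inputs the output approaches α, and for negative inputs it dips below zero before returning toward zero, which matches the bounded and non-monotonic behaviour described in the abstract.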
-
Efficient Global String Kernel with Random Features: Beyond Counting Substructures
Authors:
Lingfei Wu,
Ian En-Hsu Yen,
Siyu Huo,
Liang Zhao,
Kun Xu,
Liang Ma,
Shouling Ji,
Charu Aggarwal
Abstract:
Analysis of large-scale sequential data has been one of the most crucial tasks in areas such as bioinformatics, text, and audio mining. Existing string kernels, however, either (i) rely on local features of short substructures in the string, which hardly capture long discriminative patterns, (ii) sum over too many substructures, such as all possible subsequences, which leads to diagonal dominance of the kernel matrix, or (iii) rely on non-positive-definite similarity measures derived from the edit distance. Furthermore, while there have been works addressing the computational challenge with respect to string length, most of them still incur quadratic complexity in the number of training samples when used in a kernel-based classifier. In this paper, we present a new class of global string kernels that aims to (i) discover global properties hidden in the strings through global alignments, (ii) maintain positive-definiteness of the kernel without introducing a diagonally dominant kernel matrix, and (iii) have a training cost linear in not only the length of the strings but also the number of training samples. To this end, the proposed kernels are explicitly defined through a series of different random feature maps, each corresponding to a distribution of random strings. We show that kernels defined this way are always positive-definite, and exhibit computational benefits as they always produce Random String Embeddings (RSE) that can be directly used in any linear classification model. Our extensive experiments on nine benchmark datasets corroborate that RSE achieves better or comparable accuracy in comparison to state-of-the-art baselines, especially for longer strings. In addition, we empirically show that RSE scales linearly with both the number and the length of strings.
Submitted 25 November, 2019;
originally announced November 2019.
-
Comb Convolution for Efficient Convolutional Architecture
Authors:
Dandan Li,
Yuan Zhou,
Shuwei Huo,
Sun-Yuan Kung
Abstract:
Convolutional neural networks (CNNs) inherently suffer from massively redundant computation (FLOPs) due to the dense connection pattern between feature maps and convolution kernels. Recent research has investigated the sparse relationship between channels but has ignored the spatial relationship within a channel. In this paper, we present a novel convolutional operator, namely comb convolution, to exploit the intra-channel sparse relationship among neurons. The proposed convolutional operator eliminates nearly 50% of connections by inserting uniform mappings into standard convolutions and removing about half of the spatial connections in the convolutional layer. Notably, our work is orthogonal and complementary to existing methods that reduce channel-wise redundancy. Thus, it has great potential to further increase efficiency by integrating comb convolution into existing architectures. Experimental results demonstrate that by simply replacing standard convolutions with comb convolutions in state-of-the-art CNN architectures (e.g., VGGNets, Xception and SE-Net), we can achieve a 50% FLOPs reduction while still maintaining accuracy.
Submitted 1 November, 2019;
originally announced November 2019.
-
P2L: Predicting Transfer Learning for Images and Semantic Relations
Authors:
Bishwaranjan Bhattacharjee,
John R. Kender,
Matthew Hill,
Parijat Dube,
Siyu Huo,
Michael R. Glass,
Brian Belgodere,
Sharath Pankanti,
Noel Codella,
Patrick Watson
Abstract:
Transfer learning enhances learning across tasks by leveraging previously learned representations -- if they are properly chosen. We describe an efficient method to accurately estimate the appropriateness of a previously trained model for use in a new learning task. We use this measure, which we call "Predict To Learn" ("P2L"), in the two very different domains of images and semantic relations, where it predicts, from a set of "source" models, the one model most likely to produce effective transfer for training a given "target" model. We validate our approach thoroughly by assembling a collection of candidate source models, then fine-tuning each candidate to perform each of a collection of target tasks, and finally measuring how well transfer has been enhanced. Across 95 tasks within multiple domains (image classification and semantic relations), the P2L approach was able to select the best transfer learning model on average, while the heuristic of choosing the model trained with the largest data set selected the best model in only 55 cases. These results suggest that P2L captures important information in common between source and target tasks, and that this shared informational structure contributes to successful transfer learning more than simple data size.
Submitted 15 October, 2020; v1 submitted 20 August, 2019;
originally announced August 2019.
-
Edge states and corner modes in second-order topological phononic crystal plates
Authors:
Shao-yong Huo,
Hong-bo Huang,
Lu-yang Feng,
Jiu-jiu Chen
Abstract:
We realize an elastic second-order topological insulator hosting both one-dimensional gapped edge states and zero-dimensional in-gap corner modes in double-sided pillared phononic crystal plates with a square lattice. Changing the width of two neighboring pillars breaks the inversion symmetry and induces a band inversion that emulates the quantum spin Hall effect, in which gapless edge states are obtained. Further breaking the spatial symmetry at the interface gaps the gapless edge states, induces edge topological transitions, and gives rise to the zero-dimensional in-gap corner modes. Our work offers a novel way to trap and robustly guide elastic waves.
Submitted 21 May, 2019;
originally announced May 2019.
-
A large data theory for nonlinear wave on the Schwarzschild background
Authors:
Saisai Huo,
Jinhua Wang
Abstract:
We study both the scattering and Cauchy problems for the semilinear wave equation with null quadratic form on the Schwarzschild background. Prescribing scattering data that are given by short pulse data on future null infinity and are trivial on the future event horizon, we construct a class of globally smooth solutions backwards up to any finite time and show that the wave travels in such a way that almost all of the (large) energy is focused in an outgoing null strip, while little radiates out of this strip. Conversely, for a class of Cauchy data with large energy norms, there exists a unique global solution in the future development, and most of the wave packet is confined in an incoming null strip and reflected to the future event horizon, whereas little is transmitted to future null infinity.
Submitted 19 September, 2019; v1 submitted 20 May, 2019;
originally announced May 2019.
-
IPC: A Benchmark Data Set for Learning with Graph-Structured Data
Authors:
Patrick Ferber,
Tengfei Ma,
Siyu Huo,
Jie Chen,
Michael Katz
Abstract:
Benchmark data sets are an indispensable ingredient of the evaluation of graph-based machine learning methods. We release a new data set, compiled from International Planning Competitions (IPC), for benchmarking graph classification, regression, and related tasks. Apart from the graph construction (based on AI planning problems) that is interesting in its own right, the data set possesses distinctly different characteristics from popularly used benchmarks. The data set, named IPC, consists of two self-contained versions, grounded and lifted, both including graphs of large and skewedly distributed sizes, posing substantial challenges for the computation of graph models such as graph kernels and graph neural networks. The graphs in this data set are directed and the lifted version is acyclic, offering the opportunity of benchmarking specialized models for directed (acyclic) structures. Moreover, the graph generator and the labeling are computer programmed; thus, the data set may be extended easily if a larger scale is desired. The data set is accessible from https://github.com/IBM/IPC-graph-data.
Submitted 15 May, 2019;
originally announced May 2019.
-
Online Planner Selection with Graph Neural Networks and Adaptive Scheduling
Authors:
Tengfei Ma,
Patrick Ferber,
Siyu Huo,
Jie Chen,
Michael Katz
Abstract:
Automated planning is one of the foundational areas of AI. Since no single planner can work well for all tasks and domains, portfolio-based techniques have become increasingly popular in recent years. In particular, deep learning has emerged as a promising methodology for online planner selection. Owing to the recent development of structural graph representations of planning tasks, we propose a graph neural network (GNN) approach to selecting candidate planners. GNNs have advantages over a straightforward alternative, convolutional neural networks, in that they are invariant to node permutations and that they incorporate node labels for better inference.
Additionally, for cost-optimal planning, we propose a two-stage adaptive scheduling method to further improve the likelihood that a given task is solved in time. The scheduler may switch at halftime to a different planner, conditioned on the observed performance of the first one. Experimental results validate the effectiveness of the proposed method against strong baselines, both deep learning and non-deep learning based.
The code is available at https://github.com/matenure/GNN_planner.
Submitted 20 November, 2019; v1 submitted 31 October, 2018;
originally announced November 2018.
-
Self-ordering induces multiple topological transitions for elastic waves in phononic crystals
Authors:
Jiujiu Chen,
Hongbo Huang,
Shaoyong Huo,
Zhuhua Tan,
Xiaoping Xie,
Jianchun Cheng
Abstract:
Topological defects with symmetry-breaking phase transitions have captured much attention. Vortices generated by topological defects exhibit exotic properties, and their flow direction can be switched by altering the spin configurations. In contrast to the electromagnetic and acoustic domains, the topological transport of elastic waves in periodic structures with topological defects is not well explored due to the mode conversion between the longitudinal and transverse modes. Here, we propose an elastic topological insulator with spontaneously broken symmetry based on the topological theory of defects and homotopy theory. Multiple topological transitions for elastic waves are achieved by topologically modifying the ellipse orientation in a triangular lattice of elliptical cylinders. The solid system, independent of the number of molecules in order parameter space, breaks through the limit of the point-group symmetry to emulate elastic pseudospin-orbit coupling. The transport robustness of the edge states is experimentally demonstrated. Our approach provides new possibilities for controlling and transporting elastic waves.
Submitted 28 April, 2018;
originally announced May 2018.
-
Landau level-superfluid modified factor and effective X/$γ$-ray coefficient of a magnetar
Authors:
Z. F. Gao,
Q. H. Peng,
N. Wang,
C. K. Chou,
W. S. Huo
Abstract:
As soon as the energy of electrons near the Fermi surface is higher than $Q$, the threshold energy of inverse $β$-decay, the electron capture process will dominate. The resulting high-energy neutrons will destroy anisotropic ${}^3P_2$ neutron superfluid Cooper pairs. By colliding with the neutrons produced in the process $n+ (n\uparrow n\downarrow)\longrightarrow n+ n+ n$, the kinetic energy of the outgoing neutrons will be transformed into thermal energy. The transformed thermal energy would be transported from the star interior to the star surface by conduction, and would then be transformed into radiation energy as soft X-rays and gamma-rays. After a highly efficient modulation within the pulsar magnetosphere, the surface thermal emission (mainly soft X/$γ$-ray emission) is shaped into a spectrum with the observed characteristics of magnetars. By introducing two important parameters, the Landau level-superfluid modified factor and the effective X/$γ$-ray coefficient, we numerically simulate the process of magnetar cooling and magnetic field decay, and then compute magnetars' soft X/$γ$-ray luminosities $L_{X}$. Further, we obtain schematic diagrams of $L_{X}$ as a function of magnetic field strength $B$. The observations are compared with the calculations.
Submitted 10 December, 2013;
originally announced December 2013.
-
Enhanced Tumor Accumulation of Sub-2 nm Gold Nanoclusters for Cancer Radiation Therapy
Authors:
Xiao-Dong Zhang,
Jie Chen,
Zhentao Luo,
Di Wu,
Xiu Shen,
Sha-Sha Song,
Yuan-Ming Sun,
Pei-Xun Liu,
Jing Zhao,
Shuaidong Huo,
Saijun Fan,
Feiyue Fan,
Xing-Jie Liang,
Jianping Xie
Abstract:
A new type of metabolizable and efficient radiosensitizer for cancer radiotherapy is presented in this study by combining ultrasmall Au nanoclusters (NCs, <2 nm) with biocompatible coating ligands (glutathione, GSH). The new nano-construct (GSH-coated Au25 NCs) inherits attractive features of both the Au core (strong radiosensitizing effect) and the GSH shell (good biocompatibility). It can preferentially accumulate in tumors via the improved EPR effect, which leads to strong enhancement of cancer radiotherapy. After the treatment, the small-sized GSH-Au25 NCs can be efficiently cleared by the kidney, minimizing any potential side effects due to the accumulation of Au25 NCs in the body.
Submitted 30 August, 2013;
originally announced August 2013.
-
Azimuthal correlation between the $(\vec{p}_l,\vec{p}_{X_b})$ and $(\vec{p}_l,\vec{P}_t)$ planes in the semileptonic rest frame decay of a polarized top quark: An $O(α_s)$ effect
Authors:
S. Groote,
W. S. Huo,
A. Kadeer,
J. G. Korner
Abstract:
The azimuthal correlation between the planes formed by the vectors $(\vec{p}_\ell,\vec{p}_{X_b})$ and $(\vec{p}_\ell,\vec{P}_t)$ in the semileptonic rest frame decay of a polarized top quark $t(\uparrow) \to X_b + l^+ + ν_\ell$ belongs to a class of polarization observables involving the top quark which vanish at the Born term level in the standard model. We determine the next-to-leading order QCD corrections to the aforementioned azimuthal correlation and compare the result to the corresponding contribution of a non-standard-model right-chiral quark current.
Submitted 25 February, 2008; v1 submitted 2 February, 2006;
originally announced February 2006.
-
Temporal Dark Solitons in Nonuniform Bose-Einstein Condensates
Authors:
Tao Hong,
Yu Zhu Wang,
Yun Sheng Huo
Abstract:
We discuss temporal dark solitons in confined nonuniform Bose condensates. As a kind of localized high excitation, these solitons can be viewed as macroscopic quasiparticles, having relative motion to the background condensate. We obtain an analytic expression for one dark soliton under a slowly varying approximation and discuss its special propagation properties in a nonuniform condensate. We then show numerically that this approximation is reasonable and that this kind of soliton exhibits these propagation properties in the nonuniform condensate. Finally, we simulate the generation of dark-soliton-like pulses in the condensate, and indicate that the excitation experiment done by Ketterle and co-workers [Phys. Rev. Lett. 79, 553 (1997)] can also be interpreted in terms of temporal dark solitons.
Submitted 2 November, 2002;
originally announced November 2002.