-
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Authors:
Howard Yen,
Tianyu Gao,
Minmin Hou,
Ke Ding,
Daniel Fleischer,
Peter Izsak,
Moshe Wasserblat,
Danqi Chen
Abstract:
There have been many benchmarks for evaluating long-context language models (LCLMs), but developers often rely on synthetic tasks like needle-in-a-haystack (NIAH) or arbitrary subsets of tasks. It remains unclear whether they translate to the diverse downstream applications of LCLMs, and the inconsistency further complicates model comparison. We investigate the underlying reasons behind current practices and find that existing benchmarks often provide noisy signals due to low coverage of applications, insufficient lengths, unreliable metrics, and incompatibility with base models. In this work, we present HELMET (How to Evaluate Long-context Models Effectively and Thoroughly), a comprehensive benchmark encompassing seven diverse, application-centric categories. We also address many issues in previous benchmarks by adding controllable lengths up to 128k tokens, model-based evaluation for reliable metrics, and few-shot prompting for robustly evaluating base models. Consequently, we demonstrate that HELMET offers more reliable and consistent rankings of frontier LCLMs. Through a comprehensive study of 51 LCLMs, we find that (1) synthetic tasks like NIAH are not good predictors of downstream performance; (2) the diverse categories in HELMET exhibit distinct trends and low correlation with each other; and (3) while most LCLMs achieve perfect NIAH scores, open-source models significantly lag behind closed ones when the task requires full-context reasoning or following complex instructions -- the gap widens with increased lengths. Finally, we recommend using our RAG tasks for fast model development, as they are easy to run and more predictive of other downstream performance; ultimately, we advocate for a holistic evaluation across diverse tasks.
Submitted 10 October, 2024; v1 submitted 3 October, 2024;
originally announced October 2024.
-
Asymptotic stability of the composite wave of rarefaction wave and contact wave to nonlinear viscoelasticity model with non-convex flux
Authors:
Zhenhua Guo,
Meichen Hou,
Guiqin Qiu,
Lingda Xu
Abstract:
In this paper, we consider wave propagation in viscoelastic materials, described by a model that Tai-Ping Liu derived via the Chapman-Enskog expansion to approximate the viscoelastic dynamic system with fading memory (see T. P. Liu (1988)). By constructing a set of linear diffusion waves coupled with high-order diffusion waves, which achieve cancellations and approximate the viscous contact wave well with explicit expressions, we obtain the nonlinear stability of the composite wave by a continuum argument.
We emphasize that the stress function in our paper is a general non-convex function, which leads to several essential differences from strictly hyperbolic systems such as the Euler system. Our method is completely new and can be applied to more general systems. We also establish a new weighted Poincaré-type inequality, which is more challenging to obtain than in the convex case and plays an important role in studying systems with non-convex flux.
Submitted 16 September, 2024;
originally announced September 2024.
-
Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation
Authors:
Wentao Xiao,
Yueyang Zhan,
Rui Xi,
Mengshu Hou,
Jianming Liao
Abstract:
Approximate nearest neighbor search (ANNS) is a fundamental and essential component in data mining and information retrieval, with graph-based methodologies demonstrating superior performance compared to alternative approaches. Extensive research efforts have been dedicated to improving search efficiency by developing various graph-based indices, such as HNSW (Hierarchical Navigable Small World). However, the performance of HNSW and most graph-based indices becomes unacceptable when faced with a large number of real-time deletions, insertions, and updates. Furthermore, during update operations, HNSW can result in some data points becoming unreachable, a situation we refer to as the 'unreachable points phenomenon'. This phenomenon could significantly affect the search accuracy of the graph in certain situations.
To address these issues, we present efficient measures to overcome the shortcomings of HNSW, specifically addressing poor performance over long periods of delete and update operations and resolving the issues caused by the unreachable points phenomenon. Our proposed MN-RU algorithm effectively improves update efficiency and suppresses the growth rate of unreachable points, ensuring better overall performance and maintaining the integrity of the graph. Our results demonstrate that our methods outperform existing approaches. Furthermore, since our methods are based on HNSW, they can be easily integrated with existing indices widely used in the industrial field, making them practical for future real-world applications. Code is available at https://github.com/xwt1/MN-RU.git
Submitted 15 July, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
-
Boundedness of weak solutions to degenerate Kolmogorov equations of hypoelliptic type in bounded domains
Authors:
Mingyi Hou
Abstract:
We establish the boundedness of weak subsolutions for a class of degenerate Kolmogorov equations of hypoelliptic type, compatible with a homogeneous Lie group structure, within bounded product domains using the De Giorgi iteration. We employ the renormalization formula to handle boundary values and provide energy estimates. An $L^1$-$L^p$ type embedding estimate derived from the fundamental solution is utilized to incorporate lower-order divergence terms. This work naturally extends the boundedness theory for uniformly parabolic equations, with matching exponents for the coefficients.
Submitted 30 June, 2024;
originally announced July 2024.
-
Applications of Deep Learning parameterization of Ocean Momentum Forcing
Authors:
Guosong Wang,
Min Hou,
Xinrong Wu,
Xidong Wang,
Zhigang Gao,
Hongli Fu,
Bo Dan,
Chunjian Sun,
Xiaoshuang Zhang
Abstract:
Mesoscale eddies are of utmost importance in understanding ocean dynamics and the transport of heat, salt, and nutrients. Accurate representation of these eddies in ocean models is essential for improving model predictions. However, accurately representing these mesoscale features in numerical models is challenging due to their relatively small size. In this study, we propose a convolutional neural network (CNN) that combines data-driven techniques with physical principles to develop a robust and interpretable parameterization scheme for mesoscale eddies in ocean modeling. We first analyze a high-resolution reanalysis dataset to extract subgrid eddy momentum and use machine learning algorithms to identify patterns and correlations. To ensure physical consistency, we introduce conservation of momentum constraints in our CNN parameterization scheme through soft and hard constraints. The interpretability analysis illustrates that the pre-trained CNN parameterization shows promising results in accurately solving the resolved mean velocity at the local scale and effectively capturing the representation of unresolved subgrid turbulence processes at the global scale. Furthermore, to validate the CNN parameterization scheme offline, we conduct simulations using the MITgcm ocean model. A series of experiments is conducted to compare the performance of the model with the CNN parameterization scheme and high-resolution simulations. The offline validation using MITgcm simulations demonstrates the effectiveness of the CNN parameterization scheme in improving the representation of mesoscale eddies in the ocean model. Incorporating the CNN parameterization scheme leads to better agreement with high-resolution simulations and a more accurate representation of the kinetic energy spectra.
Submitted 5 June, 2024;
originally announced June 2024.
-
"Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations
Authors:
Muhan Hou,
Koen Hindriks,
A. E. Eiben,
Kim Baraka
Abstract:
Reinforcement Learning (RL) has achieved great success in sequential decision-making problems, but often at the cost of a large number of agent-environment interactions. To improve sample efficiency, methods like Reinforcement Learning from Expert Demonstrations (RLED) introduce external expert demonstrations to facilitate agent exploration during the learning process. In practice, these demonstrations, which are often collected from human users, are costly and hence often constrained to a limited amount. How to select the best set of human demonstrations that is most beneficial for learning therefore becomes a major concern. This paper presents EARLY (Episodic Active Learning from demonstration querY), an algorithm that enables a learning agent to generate optimized queries of expert demonstrations in a trajectory-based feature space. Based on a trajectory-level estimate of uncertainty in the agent's current policy, EARLY determines the optimized timing and content for feature-based queries. By querying episodic demonstrations as opposed to isolated state-action pairs, EARLY improves the human teaching experience and achieves better learning performance. We validate the effectiveness of our method in three simulated navigation tasks of increasing difficulty. The results show that our method is able to achieve expert-level performance for all three tasks with convergence over 30% faster than other baseline methods when demonstrations are generated by simulated oracle policies. The results of a follow-up pilot user study (N=18) further validate that our method can still maintain a significantly better convergence in the case of human expert demonstrators while achieving a better user experience in perceived task load and consuming significantly less human time.
Submitted 2 October, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering
Authors:
Hongyu Yang,
Liyang He,
Min Hou,
Shuanghong Shen,
Rui Li,
Jiahui Hou,
Jianhui Ma,
Junda Zhao
Abstract:
Code Community Question Answering (CCQA) seeks to tackle programming-related issues, thereby boosting productivity in both software engineering and academic research. Recent advancements in Reinforcement Learning from Human Feedback (RLHF) have transformed the fine-tuning process of Large Language Models (LLMs) to produce responses that closely mimic human behavior. Leveraging LLMs with RLHF for practical CCQA applications has thus emerged as a promising area of study. Unlike standard code question-answering tasks, CCQA involves multiple possible answers, with varying user preferences for each response. Additionally, code communities often show a preference for new APIs. These challenges prevent LLMs from generating responses that cater to the diverse preferences of users in CCQA tasks. To address these issues, we propose a novel framework called Aligning LLMs through Multi-perspective User Preference Ranking-based Feedback for Programming Question Answering (ALMupQA) to create user-focused responses. Our approach starts with Multi-perspective Preference Ranking Alignment (MPRA), which synthesizes varied user preferences based on the characteristics of answers from code communities. We then introduce a Retrieval-augmented In-context Learning (RIL) module to mitigate the problem of outdated answers by retrieving responses to similar questions from a question bank. Due to the limited availability of high-quality, multi-answer CCQA datasets, we also developed a dataset named StaCCQA from real code communities. Extensive experiments demonstrated the effectiveness of the ALMupQA framework in terms of accuracy and user preference. Compared to the base model, ALMupQA showed nearly an 11% improvement in BLEU, with increases of 20% and 17.5% in BERTScore and CodeBERTScore, respectively.
Submitted 27 May, 2024;
originally announced June 2024.
-
Multimodality Invariant Learning for Multimedia-Based New Item Recommendation
Authors:
Haoyue Bai,
Le Wu,
Min Hou,
Miaomiao Cai,
Zhuangzhuang He,
Yuyang Zhou,
Richang Hong,
Meng Wang
Abstract:
Multimedia-based recommendation provides personalized item suggestions by learning the content preferences of users. With the proliferation of digital devices and apps, a huge number of new items are created rapidly over time. Quickly providing recommendations for new items at inference time is challenging. Worse, real-world items exhibit varying degrees of missing modalities (e.g., many short videos are uploaded without text descriptions). Though many efforts have been devoted to multimedia-based recommendation, they either could not deal with new multimedia items or assumed modality completeness in the modeling process.
In this paper, we highlight the necessity of tackling the modality missing issue for new item recommendation. We argue that users' inherent content preference is stable and is better kept invariant to arbitrary modality missing environments. Therefore, we approach this problem from a novel perspective of invariant learning. However, constructing environments from finite user behavior training data that generalize to any pattern of missing modalities is challenging. To tackle this issue, we propose a novel Multimodality Invariant Learning reCommendation (a.k.a. MILK) framework. Specifically, MILK first designs a cross-modality alignment module to keep semantic consistency from pretrained multimedia item features. After that, MILK designs multi-modal heterogeneous environments with cyclic mixup to augment training data, in order to mimic any modality missing for invariant user preference learning. Extensive experiments on three real datasets verify the superiority of our proposed framework. The code is available at https://github.com/HaoyueBai98/MILK.
Submitted 28 April, 2024;
originally announced May 2024.
-
Weak and Perron's Solutions to Linear Kinetic Fokker-Planck Equations of Divergence Form in Bounded Domains
Authors:
Benny Avelin,
Mingyi Hou
Abstract:
In this paper, we investigate weak solutions and Perron-Wiener-Brelot solutions to the linear kinetic Fokker-Planck equation in bounded domains. We establish the existence of weak solutions by applying the Lions-Lax-Milgram theorem and the vanishing viscosity method in product domains. Additionally, we demonstrate the regularity of weak solutions and establish a strong maximum principle. Furthermore, we construct a Perron solution and provide examples of barriers in arbitrary bounded domains. Our findings are based on recent advancements in the theory of kinetic Fokker-Planck equations with rough coefficients, particularly focusing on the characterization of a weaker notion of trace and the convolution-translation.
Submitted 20 June, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
-
Long-time behaviour of supercritical finite circular mechanism branching processes
Authors:
Junping Li,
Mixuan Hou
Abstract:
This paper concentrates on the limit behaviour of discrete-time branching processes with a circular mechanism. Three types of limit behaviour are given explicitly under various moment conditions on the branching rates. It is shown that the rate of the first is geometric, while the other two are supergeometric.
Submitted 29 April, 2024;
originally announced April 2024.
-
Uncovering an Excess of X-ray Point Sources in the Halos of Virgo Late-type Galaxies
Authors:
Zhensong Hu,
Meicun Hou,
Zhiyuan Li
Abstract:
We present a systematic search for extraplanar X-ray point sources around 19 late-type, highly inclined disk galaxies residing in the Virgo cluster, based on archival Chandra observations reaching a source detection sensitivity of $L\rm(0.5- 8~keV)\sim10^{38}\rm~erg~s^{-1}$. Based on the cumulative source surface density distribution as a function of projected vertical distance from the disk mid-plane, we identify a statistically significant ($\sim3.3\sigma$) excess of $\sim20$ X-ray sources within a projected vertical off-disk distance of $0.92'-2.5'$ ($\sim4.4-12\ \mathrm{kpc}$), the presence of which cannot be explained by the bulk stellar content of the individual galaxies, nor by the cosmic X-ray background. On the other hand, there is no significant evidence for an excess of extraplanar X-ray sources in a comparison sample of field late-type edge-on galaxies, for which Chandra observations reaching a similar source detection sensitivity are available. We discuss possible origins for the observed excess, which include low-mass X-ray binaries (LMXBs) associated with globular clusters, supernova-kicked LMXBs, high-mass X-ray binaries born in recent star formation induced by ram pressure stripping of the disk gas, as well as a class of intra-cluster X-ray sources previously identified around early-type member galaxies of Virgo. We find that none of these X-ray populations can naturally dominate the observed extraplanar excess, although supernova-kicked LMXBs and the effect of ram pressure are most likely relevant.
Submitted 23 April, 2024;
originally announced April 2024.
-
OceanPlan: Hierarchical Planning and Replanning for Natural Language AUV Piloting in Large-scale Unexplored Ocean Environments
Authors:
Ruochu Yang,
Fumin Zhang,
Mengxue Hou
Abstract:
We develop a hierarchical LLM-task-motion planning and replanning framework to efficiently ground an abstracted human command into tangible Autonomous Underwater Vehicle (AUV) control through enhanced representations of the world. We also incorporate a holistic replanner to provide real-world feedback to all planners for robust AUV operation. While there has been extensive research on bridging the gap between LLMs and robotic missions, existing approaches are unable to guarantee the success of AUV applications in the vast and unknown ocean environment. To tackle specific challenges in marine robotics, we design a hierarchical planner to compose executable motion plans, which achieves planning efficiency and solution quality by decomposing long-horizon missions into sub-tasks. At the same time, a real-time data stream is obtained by a replanner to address environmental uncertainties during plan execution. Experiments validate that our proposed framework delivers successful AUV performance of long-duration missions through natural language piloting.
Submitted 22 March, 2024;
originally announced March 2024.
-
An X-ray Census of Active Galactic Nuclei in the Virgo and Fornax Clusters of Galaxies with SRG/eROSITA
Authors:
Meicun Hou,
Zhensong Hu,
Zhiyuan Li
Abstract:
We present a uniform and sensitive X-ray census of active galactic nuclei (AGNs) in the two nearest galaxy clusters, Virgo and Fornax, utilizing the newly released X-ray source catalogs from the first all-sky scan of SRG/eROSITA. A total of 50 and 10 X-ray sources are found positionally coincident with the nuclei of member galaxies in Virgo and Fornax, respectively, down to a 0.2-2.3 keV luminosity of $\sim10^{39}\rm~erg~s^{-1}$ and reaching out to a projected distance well beyond the virial radius of both clusters. The majority of the nuclear X-ray sources are newly identified. There is weak evidence that the nuclear X-ray sources are preferentially found in late-type hosts. Several hosts are dwarf galaxies with a stellar mass below $\sim10^{9}\rm~M_\odot$. We find that contamination by non-nuclear X-ray emission can be neglected in most cases, indicating the dominance of a genuine AGN. In the meantime, no nuclear X-ray source exhibits a luminosity higher than a few times $10^{41}\rm~erg~s^{-1}$. The X-ray AGN occupation rate is only $\sim$ 3% in both clusters, apparently much lower than that in field galaxies inferred from previous X-ray studies. Both aspects suggest that the cluster environment effectively suppresses AGN activity. The findings of this census have important implications on the interplay between galaxies and their central massive black holes in cluster environments.
Submitted 11 April, 2024; v1 submitted 11 February, 2024;
originally announced February 2024.
-
Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach
Authors:
Xin Chen,
Mingliang Hou,
Tao Tang,
Achhardeep Kaur,
Feng Xia
Abstract:
With the arrival of the big data era, mobility profiling has become a viable method of utilizing enormous amounts of mobility data to create an intelligent transportation system. Mobility profiling can extract potential patterns in urban traffic from mobility data and is critical for a variety of traffic-related applications. However, due to the high level of complexity and the huge amount of data, mobility profiling faces huge challenges. Digital Twin (DT) technology paves the way for cost-effective and performance-optimised management by digitally creating a virtual representation of the network to simulate its behaviour. In order to capture the complex spatio-temporal features in traffic scenarios, we construct alignment diagrams to assist in completing the spatio-temporal correlation representation and design a dilated alignment convolution network (DACN) to learn the fine-grained correlations, i.e., spatio-temporal interactions. We propose a digital twin mobility profiling (DTMP) framework to learn node profiles on a mobility network DT model. Extensive experiments have been conducted on three real-world datasets. Experimental results demonstrate the effectiveness of DTMP.
Submitted 6 February, 2024;
originally announced February 2024.
-
Learn Once Plan Arbitrarily (LOPA): Attention-Enhanced Deep Reinforcement Learning Method for Global Path Planning
Authors:
Guoming Huang,
Mingxin Hou,
Xiaofang Yuan,
Shuqiao Huang,
Yaonan Wang
Abstract:
Deep reinforcement learning (DRL) methods have recently shown promise in path planning tasks. However, when dealing with global planning tasks, these methods face serious challenges such as poor convergence and generalization. To this end, we propose an attention-enhanced DRL method called LOPA (Learn Once Plan Arbitrarily) in this paper. Firstly, we analyze the reasons for these problems from the perspective of DRL's observation, revealing that the traditional design causes DRL to be disturbed by irrelevant map information. Secondly, we develop LOPA, which utilizes a novel attention-enhanced mechanism to attain an improved attention capability towards the key information of the observation. Such a mechanism is realized in two steps: (1) an attention model is built to transform the DRL's observation into two dynamic views, local and global, significantly guiding LOPA to focus on the key information on the given maps; (2) a dual-channel network is constructed to process these two views and integrate them to attain an improved reasoning capability. LOPA is validated via multi-objective global path planning experiments. The results suggest that LOPA has improved convergence and generalization performance as well as great path planning efficiency.
Submitted 7 January, 2024;
originally announced January 2024.
-
A unified uncertainty-aware exploration: Combining epistemic and aleatory uncertainty
Authors:
Parvin Malekzadeh,
Ming Hou,
Konstantinos N. Plataniotis
Abstract:
Exploration is a significant challenge in practical reinforcement learning (RL), and uncertainty-aware exploration that incorporates the quantification of epistemic and aleatory uncertainty has been recognized as an effective exploration strategy. However, capturing the combined effect of aleatory and epistemic uncertainty for decision-making is difficult. Existing works estimate aleatory and epistemic uncertainty separately and consider the composite uncertainty as an additive combination of the two. Nevertheless, the additive formulation leads to excessive risk-taking behavior, causing instability. In this paper, we propose an algorithm that clarifies the theoretical connection between aleatory and epistemic uncertainty, unifies aleatory and epistemic uncertainty estimation, and quantifies the combined effect of both uncertainties for a risk-sensitive exploration. Our method builds on a novel extension of distributional RL that estimates a parameterized return distribution whose parameters are random variables encoding epistemic uncertainty. Experimental results on tasks with exploration and risk challenges show that our method outperforms alternative approaches.
Submitted 5 January, 2024;
originally announced January 2024.
-
X-Ray Constraints on the Hot Gaseous Corona of Edge-on Late-type Galaxies in Virgo
Authors:
Meicun Hou,
Lin He,
Zhensong Hu,
Zhiyuan Li,
Christine Jones,
William Forman,
Yuanyuan Su,
Jing Wang,
Luis C. Ho
Abstract:
We present a systematic study of the putative hot gas corona around late-type galaxies (LTGs) residing in the Virgo cluster, based on archival Chandra observations. Our sample consists of 21 nearly edge-on galaxies representing a star formation rate (SFR) range of $(0.2-3)\rm~M_\odot~yr^{-1}$ and a stellar mass ($M_*$) range of $(0.2-10) \times 10^{10}\rm~M_{\odot}$, the majority of which have not been explored with high-sensitivity X-ray observations so far. Significant extraplanar diffuse X-ray (0.5-2 keV) emission is detected in only three LTGs, which are also the three galaxies with the highest SFR. A stacking analysis is performed for the remaining galaxies without individual detection, dividing the whole sample into two subsets based on SFR, stellar mass, or specific SFR. Only the high-SFR bin yields a significant detection, which has a value of $L\rm_X \sim3\times10^{38}\rm~erg~s^{-1}$ per galaxy. The stacked extraplanar X-ray signals of the Virgo LTGs are consistent with the empirical $L\rm_X - SFR$ and $L\rm_X - M_*$ relations found among highly inclined disk galaxies in the field, but appear to be systematically lower than that of a comparison sample of simulated cluster star-forming galaxies identified from the Illustris-TNG100 simulation. The apparent paucity of hot gas coronae in the sampled Virgo LTGs might be understood as the net outcome of the long-lasting effect of ram pressure stripping exerted by the hot intra-cluster medium and in-disk star-forming activity acting on shorter timescales. A better understanding of the roles of environmental effects in regulating the hot gas content of cluster galaxies invites sensitive X-ray observations for a large galaxy sample.
Submitted 7 December, 2023;
originally announced December 2023.
-
A manometric feature descriptor with linear-SVM to distinguish esophageal contraction vigor
Authors:
Jialin Liu,
Lu Yan,
Xiaowei Liu,
Yuzhuo Dai,
Fanggen Lu,
Yuanting Ma,
Muzhou Hou,
Zheng Wang
Abstract:
In clinical practice, if a patient presents with nonmechanical obstructive dysphagia, esophageal chest pain, and gastroesophageal reflux symptoms, the physician will usually assess the esophageal dynamic function. High-resolution manometry (HRM) is a commonly used clinical technique for assessing esophageal dynamic function comprehensively and objectively. However, after the results of HRM are obtained, doctors still need to evaluate them against a variety of parameters. This work is burdensome, and the process is complex. We conducted image processing of HRM to predict the esophageal contraction vigor for assisting the evaluation of esophageal dynamic function. First, we used Feature-Extraction and Histogram of Gradients (FE-HOG) to analyze features of the proposal of swallow (PoS) and to further extract higher-order features. Then, using these features, we classify esophageal contraction vigor as normal, weak, or failed with a linear-SVM. Our data set includes 3000 training sets, 500 validation sets and 411 test sets. After verification, our accuracy reaches 86.83%, which is higher than that of other common machine learning methods.
Submitted 27 November, 2023;
originally announced November 2023.
-
DINO-Mix: Enhancing Visual Place Recognition with Foundational Vision Model and Feature Mixing
Authors:
Gaoshuang Huang,
Yang Zhou,
Xiaofei Hu,
Chenglong Zhang,
Luying Zhao,
Wenjian Gan,
Mingbo Hou
Abstract:
Utilizing visual place recognition (VPR) technology to ascertain the geographical location of publicly available images is a pressing issue for real-world VPR applications. Although most current VPR methods achieve favorable results under ideal conditions, their performance in complex environments, characterized by lighting variations, seasonal changes, and occlusions caused by moving objects, is generally unsatisfactory. In this study, we utilize the DINOv2 model as the backbone network for trimming and fine-tuning to extract robust image features. We propose a novel VPR architecture called DINO-Mix, which combines a foundational vision model with feature aggregation. This architecture relies on the powerful image feature extraction capabilities of foundational vision models. We employ an MLP-Mixer-based mix module to aggregate image features, resulting in globally robust and generalizable descriptors that enable high-precision VPR. We experimentally demonstrate that the proposed DINO-Mix architecture significantly outperforms current state-of-the-art (SOTA) methods. In test sets having lighting variations, seasonal changes, and occlusions (Tokyo24/7, Nordland, SF-XL-Testv1), our proposed DINO-Mix architecture achieved Top-1 accuracy rates of 91.75%, 80.18%, and 82%, respectively. Compared with SOTA methods, our architecture exhibited an average accuracy improvement of 5.14%.
Submitted 5 December, 2023; v1 submitted 31 October, 2023;
originally announced November 2023.
-
Students' Perspective on AI Code Completion: Benefits and Challenges
Authors:
Wannita Takerngsaksiri,
Cleshan Warusavitarne,
Christian Yaacoub,
Matthew Hee Keng Hou,
Chakkrit Tantithamthavorn
Abstract:
AI Code Completion (e.g., GitHub's Copilot) has revolutionized how computer science students interact with programming languages. However, AI code completion has been studied from the perspective of developers, not of students, who represent the future generation of our digital world. In this paper, we investigated the benefits, challenges, and expectations of AI code completion from students' perspectives. To facilitate the study, we first developed an open-source Visual Studio Code Extension tool AutoAurora, powered by a state-of-the-art large language model StarCoder, as an AI code completion research instrument. Next, we conducted an interview study with ten student participants and applied grounded theory to help analyze insightful findings regarding the benefits, challenges, and expectations of students on AI code completion. Our findings show that AI code completion enhanced students' productivity and efficiency by providing correct syntax suggestions, offering alternative solutions, and functioning as a coding tutor. However, over-reliance on AI code completion may lead to a surface-level understanding of programming concepts, diminishing problem-solving skills and restricting creativity. In the future, AI code completion should be explainable and provide best coding practices to enhance the education process.
Submitted 31 May, 2024; v1 submitted 31 October, 2023;
originally announced November 2023.
-
Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning
Authors:
Parvin Malekzadeh,
Ming Hou,
Konstantinos N. Plataniotis
Abstract:
Sample efficiency is central to developing practical reinforcement learning (RL) for complex and large-scale decision-making problems. The ability to transfer and generalize knowledge gained from previous experiences to downstream tasks can significantly improve sample efficiency. Recent research indicates that successor feature (SF) RL algorithms enable knowledge generalization between tasks with different rewards but identical transition dynamics. It has recently been hypothesized that combining model-based (MB) methods with SF algorithms can alleviate the limitation of fixed transition dynamics. Furthermore, uncertainty-aware exploration is widely recognized as another appealing approach for improving sample efficiency. Combining the two ideas of hybrid model-based successor features (MB-SF) and uncertainty leads to an approach to the problem of sample-efficient, uncertainty-aware knowledge transfer across tasks with different transition dynamics and/or reward functions. In this paper, the uncertainty of the value of each action is approximated by a Kalman filter (KF)-based multiple-model adaptive estimation. This KF-based framework treats the parameters of a model as random variables. To the best of our knowledge, this is the first attempt at formulating a hybrid MB-SF algorithm capable of generalizing knowledge across large or continuous state space tasks with various transition dynamics while requiring less computation at decision time than MB methods. The number of samples required to learn the tasks was compared to recent SF and MB baselines. The results show that our algorithm generalizes its knowledge across different transition dynamics, learns downstream tasks with significantly fewer samples than starting from scratch, and outperforms existing approaches.
Submitted 22 July, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking
Authors:
Yang Yu,
Qi Liu,
Kai Zhang,
Yuren Zhang,
Chao Song,
Min Hou,
Yuqing Yuan,
Zhihao Ye,
Zaixi Zhang,
Sanshi Lei Yu
Abstract:
User modeling, which aims to capture users' characteristics or interests, heavily relies on task-specific labeled data and suffers from the data sparsity issue. Several recent studies tackled this problem by pre-training the user model on massive user behavior sequences with a contrastive learning task. Generally, these methods assume that different views of the same behavior sequence constructed via data augmentation are semantically consistent, i.e., reflect similar characteristics or interests of the user, and thus maximize their agreement in the feature space. However, due to the diverse interests and heavy noise in user behaviors, existing augmentation methods tend to lose certain characteristics of the user or introduce noisy behaviors. Thus, forcing the user model to directly maximize the similarity between the augmented views may result in a negative transfer. To this end, we propose to replace the contrastive learning task with a new pretext task: Augmentation-Adaptive Self-Supervised Ranking (AdaptSSR), which alleviates the requirement of semantic consistency between the augmented views while pre-training a discriminative user model. Specifically, we adopt a multiple pairwise ranking loss which trains the user model to capture the similarity orders between the implicitly augmented view, the explicitly augmented view, and views from other users. We further employ an in-batch hard negative sampling strategy to facilitate model training. Moreover, considering the distinct impacts of data augmentation on different behavior sequences, we design an augmentation-adaptive fusion mechanism to automatically adjust the similarity order constraint applied to each sample based on the estimated similarity between the augmented views. Extensive experiments on both public and industrial datasets with six downstream tasks verify the effectiveness of AdaptSSR.
Submitted 24 October, 2023; v1 submitted 14 October, 2023;
originally announced October 2023.
-
OceanChat: Piloting Autonomous Underwater Vehicles in Natural Language
Authors:
Ruochu Yang,
Mengxue Hou,
Junkai Wang,
Fumin Zhang
Abstract:
In the trending research of fusing Large Language Models (LLMs) and robotics, we aim to pave the way for innovative development of AI systems that can enable Autonomous Underwater Vehicles (AUVs) to seamlessly interact with humans in an intuitive manner. We propose OceanChat, a system that leverages a closed-loop LLM-guided task and motion planning framework to tackle AUV missions in the wild. LLMs translate an abstract human command into a high-level goal, while a task planner further grounds the goal into a task sequence with logical constraints. To assist the AUV with understanding the task sequence, we utilize a motion planner to incorporate real-time Lagrangian data streams received by the AUV, thus mapping the task sequence into an executable motion plan. Considering the highly dynamic and partially known nature of the underwater environment, an event-triggered replanning scheme is developed to enhance the system's robustness towards uncertainty. We also build a simulation platform HoloEco that generates photo-realistic simulation for a wide range of AUV applications. Experimental evaluation verifies that the proposed system can achieve improved performance in terms of both success rate and computation time. Project website: https://sites.google.com/view/oceanchat
Submitted 27 September, 2023;
originally announced September 2023.
-
Decoding Layer Saliency in Language Transformers
Authors:
Elizabeth M. Hou,
Gregory Castanon
Abstract:
In this paper, we introduce a strategy for identifying textual saliency in large-scale language models applied to classification tasks. In visual networks where saliency is more well-studied, saliency is naturally localized through the convolutional layers of the network; however, the same is not true in modern transformer-stack networks used to process natural language. We adapt gradient-based saliency methods for these networks, propose a method for evaluating the degree of semantic coherence of each layer, and demonstrate consistent improvement over numerous other methods for textual saliency on multiple benchmark classification datasets. Our approach requires no additional training or access to labelled data, and is comparatively very computationally efficient.
Submitted 9 August, 2023;
originally announced August 2023.
-
Vanishing viscosity limit to the planar rarefaction wave with vacuum for 3-D full compressible Navier-Stokes equations with temperature-dependent transport coefficients
Authors:
Meichen Hou,
Lingjun Liu,
Shu Wang,
Lingda Xu
Abstract:
In this paper, we construct a family of global-in-time solutions of the 3-D full compressible Navier-Stokes (N-S) equations with temperature-dependent transport coefficients (including viscosity and heat-conductivity), and show that at arbitrary times and arbitrary strength this family of solutions converges to planar rarefaction waves connected to the vacuum as the viscosity vanishes in the sense of $L^\infty(\mathbb{R}^3)$. We consider the Cauchy problem in $\mathbb{R}^3$ with perturbations of infinite global norm, in particular periodic perturbations. To deal with the infinite oscillation, we construct a suitable ansatz carrying this periodic oscillation such that the difference between the solution and the ansatz belongs to some Sobolev space and thus the energy method is feasible. The novelty of this paper is that the viscosity and heat-conductivity are temperature-dependent and that degeneracies are caused by the vacuum. Thus the a priori assumptions and two Gagliardo-Nirenberg type inequalities are essentially used. Moreover, more careful energy estimates are carried out in this paper: by studying the zero and non-zero modes of the solutions, we obtain not only the convergence rate concerning the viscosity and heat conductivity coefficients but also the exponential time decay rate for the non-zero mode.
Submitted 23 February, 2024; v1 submitted 6 August, 2023;
originally announced August 2023.
-
High-dimensional Optimal Density Control with Wasserstein Metric Matching
Authors:
Shaojun Ma,
Mengxue Hou,
Xiaojing Ye,
Haomin Zhou
Abstract:
We present a novel computational framework for density control in high-dimensional state spaces. The considered dynamical system consists of a large number of indistinguishable agents whose behaviors can be collectively modeled as a time-evolving probability distribution. The goal is to steer the agents from an initial distribution to reach (or approximate) a given target distribution within a fixed time horizon at minimum cost. To tackle this problem, we propose to model the drift as a nonlinear reduced-order model, such as a deep network, and enforce the matching to the target distribution at terminal time either strictly or approximately using the Wasserstein metric. The resulting saddle-point problem can be solved by an effective numerical algorithm that leverages the excellent representation power of deep networks and fast automatic differentiation for this challenging high-dimensional control problem. A variety of numerical experiments were conducted to demonstrate the performance of our method.
Submitted 24 July, 2023;
originally announced July 2023.
-
Landslide Surface Displacement Prediction Based on VSXC-LSTM Algorithm
Authors:
Menglin Kong,
Ruichen Li,
Fan Liu,
Xingquan Li,
Juan Cheng,
Muzhou Hou,
Cong Cao
Abstract:
Landslides are natural disasters that can easily threaten local ecology and people's lives and property. In this paper, we conduct modelling research on real unidirectional surface displacement data of recent landslides in the research area and propose a time series prediction framework named VMD-SegSigmoid-XGBoost-ClusterLSTM (VSXC-LSTM) based on variational mode decomposition, which can predict the landslide surface displacement more accurately. The model performs well on the test set. Except for the random item subsequence that is hard to fit, the root mean square error (RMSE) and the mean absolute percentage error (MAPE) of the trend item subsequence and the periodic item subsequence are both less than 0.1, and the RMSE is as low as 0.006 for the periodic item prediction module based on XGBoost. (Accepted at ICANN 2023.)
Submitted 24 July, 2023;
originally announced July 2023.
-
DEPHN: Different Expression Parallel Heterogeneous Network using virtual gradient optimization for Multi-task Learning
Authors:
Menglin Kong,
Ri Su,
Shaojie Zhao,
Muzhou Hou
Abstract:
Recommendation system algorithms based on multi-task learning (MTL) are the major method for Internet operators to understand users and predict their behaviors in the multi-behavior scenario of a platform. Task correlation is an important consideration of MTL goals; traditional models use shared-bottom models and gating experts to realize shared representation learning and information differentiation. However, the relationships between real-world tasks are often more complex, and existing methods do not handle information sharing properly. In this paper, we propose a Different Expression Parallel Heterogeneous Network (DEPHN) to model multiple tasks simultaneously. DEPHN constructs the experts at the bottom of the model by using different feature interaction methods to improve the generalization ability of the shared information flow. In view of the model's differentiating ability for different task information flows, DEPHN uses feature explicit mapping and virtual gradient coefficient for expert gating during the training process, and adaptively adjusts the learning intensity of the gated unit by considering the difference of gating values and task correlation. Extensive experiments on artificial and real-world datasets demonstrate that our proposed method can capture task correlation in complex situations and achieve better performance than baseline models. (Accepted at IJCNN 2023.)
Submitted 24 July, 2023;
originally announced July 2023.
-
FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks
Authors:
Menglin Kong,
Shaojie Zhao,
Juan Cheng,
Xingquan Li,
Ri Su,
Muzhou Hou,
Cong Cao
Abstract:
There are two fundamental problems in applying deep learning/machine learning methods to disease classification tasks: one is the insufficient number and poor quality of training samples; the other is how to effectively fuse multiple source features and thus train robust classification models. To address these problems, inspired by the process of human learning knowledge, we propose the Feature-aware Fusion Correlation Neural Network (FaFCNN), which introduces a feature-aware interaction module and a feature alignment module based on domain adversarial learning. This is a general framework for disease classification, and FaFCNN improves the way existing methods obtain sample correlation features. The experimental results show that training using augmented features obtained by pre-training gradient boosting decision tree yields more performance gains than random-forest based methods. On the low-quality dataset with a large amount of missing data in our setup, FaFCNN obtains a consistently optimal performance compared to competitive baselines. In addition, extensive experiments demonstrate the robustness of the proposed method and the effectiveness of each component of the model. (Accepted at IEEE SMC 2023.)
Submitted 24 July, 2023;
originally announced July 2023.
-
Coupled Attention Networks for Multivariate Time Series Anomaly Detection
Authors:
Feng Xia,
Xin Chen,
Shuo Yu,
Mingliang Hou,
Mujie Liu,
Linlin You
Abstract:
Multivariate time series anomaly detection (MTAD) plays a vital role in a wide variety of real-world application domains. Over the past few years, MTAD has attracted rapidly increasing attention from both academia and industry. Many deep learning and graph learning models have been developed for effective anomaly detection in multivariate time series data, which enable advanced applications such as smart surveillance and risk management with unprecedented capabilities. Nevertheless, MTAD is facing critical challenges deriving from the dependencies among sensors and variables, which often change over time. To address this issue, we propose a coupled attention-based neural network framework (CAN) for anomaly detection in multivariate time series data featuring dynamic variable relationships. We combine adaptive graph learning methods with graph attention to generate a global-local graph that can represent both global correlations and dynamic local correlations among sensors. To capture inter-sensor relationships and temporal dependencies, a convolutional neural network based on the global-local graph is integrated with a temporal self-attention module to construct a coupled attention module. In addition, we develop a multilevel encoder-decoder architecture that accommodates reconstruction and prediction tasks to better characterize multivariate time series data. Extensive experiments on real-world datasets have been conducted to evaluate the performance of the proposed CAN approach, and the results show that CAN significantly outperforms state-of-the-art baselines.
Submitted 12 June, 2023;
originally announced June 2023.
-
A Galerkin type method for kinetic Fokker Planck equations based on Hermite expansions
Authors:
Benny Avelin,
Mingyi Hou,
Kaj Nyström
Abstract:
In this paper, we develop a Galerkin-type approximation, with quantitative error estimates, for weak solutions to the Cauchy problem for kinetic Fokker-Planck equations in the domain $(0, T) \times D \times \mathbb{R}^d$, where $D$ is either $\mathbb{T}^d$ or $\mathbb{R}^d$. Our approach is based on a Hermite expansion in the velocity variable only, with a hyperbolic system that appears as the truncation of the Brinkman hierarchy, as well as ideas from [AAMN21] (arXiv:1902.04037v2) and additional energy-type estimates that we have developed. We also establish the regularity of the solution based on the regularity of the initial data and the source term.
Submitted 28 September, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
DADIN: Domain Adversarial Deep Interest Network for Cross Domain Recommender Systems
Authors:
Menglin Kong,
Muzhou Hou,
Shaojie Zhao,
Feng Liu,
Ri Su,
Yinghao Chen
Abstract:
Click-Through Rate (CTR) prediction is one of the main tasks of a recommendation system: it estimates how likely a user is to click on different items in order to produce the recommendation results. Cross-domain CTR prediction models have been proposed to overcome problems of data sparsity, long tail distribution of user-item interactions, and cold start of items or users. To make knowledge transfer from the source domain to the target domain smoother, an innovative deep learning cross-domain CTR prediction model, Domain Adversarial Deep Interest Network (DADIN), is proposed to convert the cross-domain recommendation task into a domain adaptation problem. The joint distribution alignment of two domains is realized by introducing domain agnostic layers and a specially designed loss, which is optimized together with the CTR prediction loss through adversarial training. It is found that the Area Under Curve (AUC) of DADIN is 0.08% higher than the most competitive baseline on the Huawei dataset and 0.71% higher than its competitors on the Amazon dataset, achieving state-of-the-art results on two real datasets. The ablation study shows that introducing the adversarial method yields AUC improvements of 2.34% on the Huawei dataset and 16.67% on the Amazon dataset.
Submitted 19 May, 2023;
originally announced May 2023.
-
Real-time Autonomous Glider Navigation Software
Authors:
Ruochu Yang,
Mengxue Hou,
Chad Lembke,
Catherine Edwards,
Fumin Zhang
Abstract:
Underwater gliders are widely utilized for ocean sampling, surveillance, and various other oceanic applications. In complex ocean environments, gliders may yield poor navigation performance due to strong ocean currents, thus requiring substantial human effort during the manual piloting process. To enhance navigation accuracy, we developed real-time autonomous glider navigation software, named GENIoS Python, which generates waypoints based on flow predictions to assist human piloting. The software is designed to closely check glider status, provide customizable experiment settings, utilize lightweight computing resources, communicate stably with dockservers, run robustly for extended operation time, and quantitatively compare flow estimates, which add to its value as an autonomous tool for underwater glider navigation.
Submitted 20 December, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
A Diverse Population of z ~ 2 ULIRGs Revealed by JWST Imaging
Authors:
J. -S. Huang,
Zi-Jian Li,
Cheng Cheng,
Meicun Hou,
Haojing Yan,
S. P. Willner,
Y. -S. Dai,
X. Z. Zheng,
J. Pan,
D. Rigopoulou,
T. Wang,
Zhiyuan Li,
Piaoran Liang,
A. Esamdin,
G. G. Fazio
Abstract:
Four ultra-luminous infrared galaxies (ULIRGs) observed with JWST/NIRCam in the Cosmic Evolution Early Release Science program offer an unbiased preview of the $z\approx2$ ULIRG population. The objects were originally selected at 24 $μ$m and have strong polycyclic aromatic hydrocarbon emission features observed with Spitzer/IRS. The four objects have similar stellar masses of ${\sim}10^{11}$ M$_\odot$ but otherwise are quite diverse. One is an isolated disk galaxy, but it has an active nucleus as shown by X-ray observations and by a bright point-source nucleus. Two others are merging pairs with mass ratios of 6-7:1. One has active nuclei in both components, while the other has only one active nucleus: the one in the less-massive neighbor, not the ULIRG. The fourth object is clumpy and irregular and is probably a merger, but there is no sign of an active nucleus. The intrinsic spectral energy distributions for the four AGNs in these systems are typical of type-2 QSOs. This study is consistent with the idea that even if internal processes can produce large luminosities at $z\sim2$, galaxy merging may still be necessary for the most luminous objects. The diversity of these four initial examples suggests that large samples will be needed to understand the $z\approx2$ ULIRG population.
Submitted 6 April, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
A Chandra X-ray Survey of Optically Selected Close Galaxy Pairs: Unexpectedly Low Occupation of Active Galactic Nuclei
Authors:
Lin He,
Meicun Hou,
Zhiyuan Li,
Shuai Feng,
Xin Liu
Abstract:
High-resolution X-ray observations offer a unique tool for probing the still elusive connection between galaxy mergers and active galactic nuclei (AGNs). We present an analysis of nuclear X-ray emission in an optically selected sample of 92 close galaxy pairs (with projected separations $\lesssim 20$ kpc and line-of-sight velocity offsets $<$ 500 km s$^{-1}$) at low redshift ($\bar{z} \sim 0.07$), based on archival Chandra observations. The parent sample of galaxy pairs is constructed without imposing an optical classification of nuclear activity, thus is largely free of selection effect for or against the presence of an AGN. Nor is this sample biased for or against gas-rich mergers. An X-ray source is detected in 70 of the 184 nuclei, giving a detection rate of $38\%^{+5\%}_{-5\%}$, down to a 0.5-8 keV limiting luminosity of $\lesssim 10^{40}\rm~erg~s^{-1}$. The detected and undetected nuclei show no systematic difference in their host galaxy properties such as galaxy morphology, stellar mass and stellar velocity dispersion. When potential contamination from star formation is avoided (i.e., $L_{\rm 2-10~keV} > 10^{41}\rm~erg~s^{-1}$), the detection rate becomes $18\%^{+3\%}_{-3\%}$ (32/184), which shows no excess compared to the X-ray detection rate of a comparison sample of optically classified single AGNs. The fraction of pairs containing dual AGN is only $2\%^{+2\%}_{-2\%}$. Moreover, most nuclei at the smallest projected separations probed by our sample (a few kpc) have an unexpectedly low apparent X-ray luminosity and Eddington ratio, which cannot be solely explained by circumnuclear obscuration. These findings suggest that close galaxy interaction is not a sufficient condition for triggering a high level of AGN activity.
Submitted 15 March, 2023;
originally announced March 2023.
-
Nonlinear stability of the composite wave of planar rarefaction waves and planar contact waves for viscous conservation laws with non-convex flux under multi-dimensional periodic perturbations
Authors:
Meichen Hou,
Lingda Xu
Abstract:
In this paper, we study the nonlinear stability of the composite wave consisting of planar rarefaction and planar contact waves for viscous conservation laws with degenerate flux under multi-dimensional periodic perturbations. To the best of our knowledge, this is the first stability result for the composite wave for conservation laws in several dimensions. Moreover, the perturbations studied in the present paper are periodic and therefore keep oscillating at infinity. A suitable ansatz is constructed to overcome the difficulty caused by this kind of perturbation, and delicate estimates are carried out on the zero and non-zero modes of the perturbations. We obtain satisfactory decay rates for zero modes and exponential decay rates for non-zero modes.
Submitted 13 February, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Anomaly Detection of Underwater Gliders Verified by Deployment Data
Authors:
Ruochu Yang,
Mengxue Hou,
Chad Lembke,
Catherine Edwards,
Fumin Zhang
Abstract:
This paper utilizes an anomaly detection algorithm to check if underwater gliders are operating normally in the unknown ocean environment. Glider pilots can be warned of the detected glider anomaly in real time, thus taking over the glider appropriately and avoiding further damage to the glider. The adopted algorithm is validated by two valuable sets of data in real glider deployments, the University of South Florida (USF) glider Stella and the Skidaway Institute of Oceanography (SkIO) glider Angus.
Submitted 27 December, 2022; v1 submitted 25 December, 2022;
originally announced December 2022.
-
NOEMA Detection of Circumnuclear Molecular Gas in X-ray Weak Dual Active Galactic Nuclei: No Evidence for Heavy Obscuration
Authors:
Meicun Hou,
Zhiyuan Li,
Xin Liu,
Zongnan Li,
Ruancun Li,
Ran Wang,
Jing Wang,
Luis C. Ho
Abstract:
Dual active galactic nuclei (AGN), which are the manifestation of two actively accreting supermassive black holes (SMBHs) hosted by a pair of merging galaxies, are a unique laboratory for studying the physics of SMBH feeding and feedback during an indispensable stage of galaxy evolution. In this work, we present NOEMA CO(2-1) observations of seven kpc-scale dual-AGN candidates drawn from a recent Chandra survey of low-redshift, optically classified AGN pairs. These systems are selected because they show unexpectedly low 2-10 keV X-ray luminosities for their small physical separations signifying an intermediate-to-late stage of merger. Circumnuclear molecular gas traced by the CO(2-1) emission is significantly detected in 6 of the 7 pairs and 10 of the 14 nuclei, with an estimated mass ranging between $(0.2 - 21) \times10^9\rm~M_{\odot}$. The primary nuclei, i.e., the ones with the higher stellar velocity dispersion, tend to have a higher molecular gas mass than the secondary. Most CO-detected nuclei show a compact morphology, with a velocity field consistent with a kpc-scale rotating structure. The inferred hydrogen column densities range between $5\times10^{21} - 2\times10^{23}\rm~cm^{-2}$, but mostly at a few times $10^{22}\rm~cm^{-2}$, in broad agreement with those derived from X-ray spectral analysis. Together with the relatively weak mid-infrared emission, the moderate column density argues against the prevalence of heavily obscured, intrinsically luminous AGNs in these seven systems, but favors a feedback scenario in which AGN activity triggered by a recent pericentric passage of the galaxy pair can expel circumnuclear gas and suppress further SMBH accretion.
Submitted 13 December, 2022;
originally announced December 2022.
-
A strong He II $λ$1640 emitter with extremely blue UV spectral slope at $z=8.16$: presence of Pop III stars?
Authors:
Xin Wang,
Cheng Cheng,
Junqiang Ge,
Xiao-Lei Meng,
Emanuele Daddi,
Haojing Yan,
Zhiyuan Ji,
Yifei Jin,
Tucker Jones,
Matthew A. Malkan,
Pablo Arrabal Haro,
Gabriel Brammer,
Masamune Oguri,
Meicun Hou,
Shiwu Zhang
Abstract:
Cosmic hydrogen reionization and cosmic production of first metals are major phase transitions of the universe occurring during the first billion years after the Big Bang; however, these are still underexplored observationally. Using JWST NIRSpec prism spectroscopy, we report the discovery of a sub-$L_\ast$ galaxy at $z_{\rm spec}=8.1623\pm0.0007$, dubbed RXJ2129-z8HeII, via the detection of a series of strong rest-frame UV/optical nebular emission lines and the clear Lyman break. RXJ2129-z8HeII shows a pronounced UV continuum with an extremely steep (i.e. blue) spectral slope of $β=-2.53_{-0.07}^{+0.06}$, the steepest amongst all spectroscopically confirmed galaxies at $z_{\rm spec}\gtrsim7$, in support of its very hard ionizing spectrum that could lead to a significant leakage of its ionizing flux. Therefore, RXJ2129-z8HeII is representative of the key galaxy population driving the cosmic reionization. More importantly, we detect a strong He II $λ$1640 emission line in its spectrum, one of the highest redshifts at which such a line is robustly detected. Its high rest-frame equivalent width (${\rm EW}=21\pm4$ Angstrom) and extreme flux ratios with respect to UV metal and Balmer lines raise the possibility that part of RXJ2129-z8HeII's stellar populations could be Pop III-like. Through careful photoionization modeling, we show that the physically calibrated phenomenological models of the ionizing spectra of Pop III stars with strong mass loss can successfully reproduce the emission line flux ratios observed in RXJ2129-z8HeII. Assuming the Eddington limit, the total mass of the Pop III stars within this system is estimated to be $(7.8\pm1.4)\times10^5 M_\odot$. To date, this galaxy presents the most compelling case in the early universe where trace Pop III stars might coexist with metal-enriched populations.
Submitted 9 May, 2024; v1 submitted 8 December, 2022;
originally announced December 2022.
-
ViT-CAT: Parallel Vision Transformers with Cross Attention Fusion for Popularity Prediction in MEC Networks
Authors:
Zohreh HajiAkhondi-Meybodi,
Arash Mohammadi,
Ming Hou,
Jamshid Abouei,
Konstantinos N. Plataniotis
Abstract:
Mobile Edge Caching (MEC) is a revolutionary technology for the Sixth Generation (6G) of wireless networks with the promise to significantly reduce users' latency via offering storage capacities at the edge of the network. The efficiency of the MEC network, however, critically depends on its ability to dynamically predict/update the storage of caching nodes with the top-K popular contents. Conventional statistical caching schemes are not robust to the time-variant nature of the underlying pattern of content requests, resulting in a surge of interest in using Deep Neural Networks (DNNs) for time-series popularity prediction in MEC networks. However, existing DNN models within the context of MEC fail to simultaneously capture both temporal correlations of historical request patterns and the dependencies between multiple contents. This necessitates an urgent quest to develop and design a new and innovative popularity prediction architecture to tackle this critical challenge. The paper addresses this gap by proposing a novel hybrid caching framework based on the attention mechanism. Referred to as the parallel Vision Transformers with Cross Attention (ViT-CAT) Fusion, the proposed architecture consists of two parallel ViT networks, one for collecting temporal correlation, and the other for capturing dependencies between different contents. Followed by a Cross Attention (CA) module as the Fusion Center (FC), the proposed ViT-CAT is also capable of learning the mutual information between temporal and spatial correlations, which improves the classification accuracy and decreases the model's complexity by about 8 times. Based on the simulation results, the proposed ViT-CAT architecture outperforms its counterparts in terms of classification accuracy, complexity, and cache-hit ratio.
Submitted 26 October, 2022;
originally announced October 2022.
-
Multi-Content Time-Series Popularity Prediction with Multiple-Model Transformers in MEC Networks
Authors:
Zohreh HajiAkhondi-Meybodi,
Arash Mohammadi,
Ming Hou,
Elahe Rahimian,
Shahin Heidarian,
Jamshid Abouei,
Konstantinos N. Plataniotis
Abstract:
Coded/uncoded content placement in Mobile Edge Caching (MEC) has evolved as an efficient solution to meet the significant growth of global mobile data traffic by boosting the content diversity in the storage of caching nodes. To meet the dynamic nature of the historical request pattern of multimedia contents, the main focus of recent research has shifted to developing data-driven and real-time caching schemes. In this regard and with the assumption that users' preferences remain unchanged over a short horizon, the Top-K popular contents are identified as the output of the learning model. Most existing data-driven popularity prediction models, however, are not suitable for the coded/uncoded content placement frameworks. On the one hand, in coded/uncoded content placement, in addition to classifying contents into two groups, i.e., popular and nonpopular, the probability of content request is required to identify which content should be stored partially/completely, and this information is not provided by existing data-driven popularity prediction models. On the other hand, the assumption that users' preferences remain unchanged over a short horizon only works for content with a smooth request pattern. To tackle these challenges, we develop a Multiple-model (hybrid) Transformer-based Edge Caching (MTEC) framework with higher generalization ability, suitable for various types of content with different time-varying behavior, that can be adapted to coded/uncoded content placement frameworks. Simulation results corroborate the effectiveness of the proposed MTEC caching framework in comparison to its counterparts in terms of the cache-hit ratio, classification accuracy, and the transferred byte volume.
Submitted 11 October, 2022;
originally announced October 2022.
-
Chance-Constrained AC Optimal Power Flow for Unbalanced Distribution Grids
Authors:
Kshitij Girigoudar,
Ashley M. Hou,
Line A. Roald
Abstract:
The growing penetration of distributed energy resources (DERs) is leading to continually changing operating conditions, which need to be managed efficiently by distribution grid operators. The intermittent nature of DERs such as solar photovoltaic (PV) systems as well as load forecasting errors not only increase uncertainty in the grid, but also pose significant power quality challenges such as voltage unbalance and voltage magnitude violations. This paper leverages a chance-constrained optimization approach to reduce the impact of uncertainty on distribution grid operation. We first present the chance-constrained optimal power flow (CC-OPF) problem for distribution grids and discuss a reformulation based on constraint tightening that does not require any approximations or relaxations of the three-phase AC power flow equations. We then propose two iterative solution algorithms capable of efficiently solving the reformulation. In the case studies, the performance of both algorithms is analyzed by running simulations on the IEEE 13-bus test feeder using real PV and load measurement data. The simulation results indicate that both methods are able to enforce the chance constraints in in- and out-of-sample evaluations.
Submitted 19 July, 2022;
originally announced July 2022.
-
Very Large Array Multi-band Radio Imaging of the Triple AGN Candidate SDSS J0849+1114
Authors:
Sijia Peng,
Zhiyuan Li,
Xin Liu,
Kristina Nyland,
Joan M. Wrobel,
Meicun Hou
Abstract:
Kpc-scale triple active galactic nuclei (AGNs), potential precursors of gravitationally-bound triple massive black holes (MBHs), are rarely seen objects and are believed to play an important role in the evolution of MBHs and their host galaxies. In this work we present multi-band (3.0, 6.0, 10.0, and 15.0 GHz), high-resolution radio imaging of the triple AGN candidate, SDSS J0849+1114, using the Very Large Array. Two of the three nuclei (A and C) are detected at 3.0, 6.0, and 15 GHz for the first time, both exhibiting a steep spectrum over 3--15 GHz (with a spectral index $-0.90 \pm 0.05$ and $-1.03 \pm 0.04$) consistent with a synchrotron origin. Nucleus A, the strongest nucleus among the three, shows a double-sided jet, with the jet orientation changing by $\sim20^{\circ}$ between its inner 1" and the outer 5.5" (8.1 kpc) components, which may be explained as the MBH's angular momentum having been altered by merger-enhanced accretion. Nucleus C also shows a two-sided jet, with the western jet inflating into a radio lobe with an extent of 1.5" (2.2 kpc). The internal energy of the radio lobe is estimated to be $\rm 5.0 \times 10^{55}$ erg, for an equipartition magnetic field strength of $\rm \sim 160\ μG$. No significant radio emission is detected at any of the four frequencies for nucleus B, yielding an upper limit of 15, 15, 15, and 18 $\rm μJy\ beam^{-1}$ at 3.0, 6.0, 10.0, and 15.0 GHz, based on which we constrain the star formation rate in nucleus B to be $\lesssim 0.4~\rm M_{\odot}~yr^{-1}$.
Submitted 29 June, 2022;
originally announced June 2022.
-
PHN: Parallel heterogeneous network with soft gating for CTR prediction
Authors:
Ri Su,
Alphonse Houssou Hounye,
Cong Cao,
Muzhou Hou
Abstract:
The Click-through Rate (CTR) prediction task is a basic task in recommendation systems. Most previous CTR models were built on the Wide \& Deep structure and gradually evolved into parallel structures with different modules. However, the simple accumulation of parallel structures can lead to higher structural complexity and longer training time. With a Sigmoid activation function in the output layer, the linearly added activation values of the parallel structures can easily push samples into the weak-gradient interval during training, producing a weak-gradient phenomenon and reducing the effectiveness of training. To this end, this paper proposes a Parallel Heterogeneous Network (PHN) model, which constructs a network with a parallel structure through three different interaction analysis methods and uses Soft Selection Gating (SSG) to feature heterogeneous data with different structures. Finally, residual links with trainable parameters are used in the network to mitigate the influence of the weak-gradient phenomenon. Furthermore, we demonstrate the effectiveness of PHN in a large number of comparative experiments, and visualize the performance of the model in terms of training process and structure.
Submitted 18 June, 2022;
originally announced June 2022.
-
A Chandra Survey of Milky Way Globular Clusters. III. Searching for X-ray Signature of Intermediate-mass Black Holes
Authors:
Zhao Su,
Zhiyuan Li,
Meicun Hou,
Mengfei Zhang,
Zhongqun Cheng
Abstract:
Globular clusters (GCs) are thought to harbor the long-sought population of intermediate-mass black holes (IMBHs). We present a systematic search for a putative IMBH in 81 Milky Way GCs, based on archival Chandra X-ray observations. We find in only six GCs a significant X-ray source positionally coincident with the cluster center, with 0.5-8 keV luminosities between $\sim1\times 10^{30}~{\rm erg~s^{-1}}$ and $\sim 4\times10^{33}~{\rm erg~s^{-1}}$. However, the spectral and temporal properties of these six sources can also be explained in terms of binary stars. The remaining 75 GCs do not have a detectable central source, most with $3σ$ upper limits ranging between $10^{29-32}~{\rm erg~s^{-1}}$ over 0.5-8 keV, which are significantly lower than predicted for canonical Bondi accretion. To help understand the feeble X-ray signature, we perform hydrodynamic simulations of stellar wind accretion onto a $1000~{\rm M_\odot}$ IMBH from the most-bound orbiting star, for stellar wind properties consistent with either a main-sequence (MS) star or an asymptotic giant branch (AGB) star. We find that the synthetic X-ray luminosity for the MS case ($\sim 10^{19}\rm~erg~s^{-1}$) is far below the current X-ray limits. The predicted X-ray luminosity for the AGB case ($\sim 10^{34}\rm~erg~s^{-1}$), on the other hand, is compatible with the detected central X-ray sources, in particular the ones in Terzan 5 and NGC 6652. However, the probability of having an AGB star as the most-bound star around the putative IMBH is very low. Our study strongly suggests that it is very challenging to detect the accretion-induced X-ray emission from IMBHs, even if they were prevalent in present-day GCs.
Submitted 19 August, 2022; v1 submitted 1 June, 2022;
originally announced June 2022.
-
JUNO: Jump-Start Reinforcement Learning-based Node Selection for UWB Indoor Localization
Authors:
Zohreh Hajiakhondi-Meybodi,
Ming Hou,
Arash Mohammadi
Abstract:
Ultra-Wideband (UWB) is one of the key technologies empowering the Internet of Things (IoT) concept to perform reliable, energy-efficient, and highly accurate monitoring, screening, and localization in indoor environments. Performance of UWB-based localization systems, however, can significantly degrade because of Non Line of Sight (NLoS) connections between a mobile user and UWB beacons. To mitigate the destructive effects of NLoS connections, we target development of a Reinforcement Learning (RL) anchor selection framework that can efficiently cope with the dynamic nature of indoor environments. Existing RL models in this context, however, lack the ability to generalize well to a new setting. Moreover, it takes a long time for the conventional RL models to reach the optimal policy. To tackle these challenges, we propose the Jump-start RL-based Uwb NOde selection (JUNO) framework, which performs real-time location predictions without relying on complex NLoS identification/mitigation methods. The effectiveness of the proposed JUNO framework is evaluated in terms of the location error, where the mobile user moves randomly through an ultra-dense indoor environment with a high chance of establishing NLoS connections. Simulation results corroborate the effectiveness of the proposed framework in comparison to its state-of-the-art counterparts.
Submitted 6 May, 2022;
originally announced May 2022.
-
AKF-SR: Adaptive Kalman Filtering-based Successor Representation
Authors:
Parvin Malekzadeh,
Mohammad Salimibeni,
Ming Hou,
Arash Mohammadi,
Konstantinos N. Plataniotis
Abstract:
Recent studies in neuroscience suggest that Successor Representation (SR)-based models provide adaptation to changes in the goal locations or reward function faster than model-free algorithms, together with lower computational cost compared to that of model-based algorithms. However, it is not known how such representation might help animals to manage uncertainty in their decision-making. Existing methods for SR learning do not capture uncertainty about the estimated SR. In order to address this issue, the paper presents a Kalman filter-based SR framework, referred to as Adaptive Kalman Filtering-based Successor Representation (AKF-SR). First, the Kalman temporal difference approach, which is a combination of the Kalman filter and the temporal difference method, is used within the AKF-SR framework to cast the SR learning procedure into a filtering problem, which benefits from uncertainty estimation of the SR and also reduces the memory requirement and the sensitivity to the model's parameters in comparison to deep neural network-based algorithms. An adaptive Kalman filtering approach is then applied within the proposed AKF-SR framework in order to tune the measurement noise covariance and measurement mapping function of the Kalman filter, the most important parameters affecting the filter's performance. Moreover, an active learning method is proposed that exploits the estimated uncertainty of the SR to form a behaviour policy leading to more visits to less certain values, improving the overall performance of an agent in terms of received rewards while interacting with its environment.
Submitted 31 March, 2022;
originally announced April 2022.
-
Low-dose CT reconstruction by self-supervised learning in the projection domain
Authors:
Long Zhou,
Xiaozhuang Wang,
Min Hou,
Ping Li,
Chunlong Fu,
Yanjun Ren,
Tingting Shao,
Xi Hu,
Jihong Sun,
Hongwei Ye
Abstract:
With the aim of minimizing excessive X-ray radiation administered to patients, low-dose computed tomography (LDCT) has become a distinct trend in radiology. However, while lowering the radiation dose reduces the risk to the patient, it also increases noise and artifacts, compromising image quality and clinical diagnosis. In most supervised learning methods, paired CT images are required, but such images are unlikely to be available in the clinic. We present a self-supervised learning model (Noise2Projection) that fully exploits the raw projection images to reduce noise and improve the quality of reconstructed LDCT images. Unlike existing self-supervised algorithms, the proposed method only requires noisy CT projection images and reduces noise by exploiting the correlation between nearby projection images. We trained and tested the model using clinical data, and the quantitative and qualitative results suggest that our model can effectively reduce LDCT image noise while also drastically removing artifacts in LDCT images.
Submitted 13 March, 2022;
originally announced March 2022.
-
Exploring Human Mobility for Multi-Pattern Passenger Prediction: A Graph Learning Framework
Authors:
Xiangjie Kong,
Kailai Wang,
Mingliang Hou,
Feng Xia,
Gour Karmakar,
Jianxin Li
Abstract:
Traffic flow prediction is an integral part of an intelligent transportation system and thus fundamental for various traffic-related applications. Buses are an indispensable means of travel for urban residents, with fixed routes and schedules, which leads to latent travel regularity. However, human mobility patterns, specifically the complex relationships between bus passengers, are deeply hidden in this fixed mobility mode. Although many models exist to predict traffic flow, human mobility patterns have not been well explored in this regard. To reduce this research gap and learn human mobility knowledge from these fixed travel behaviors, we propose a multi-pattern passenger flow prediction framework, MPGCN, based on Graph Convolutional Network (GCN). Firstly, we construct a novel sharing-stop network to model relationships between passengers based on bus record data. Then, we employ GCN to extract features from the graph by learning useful topology information and introduce a deep clustering method to recognize mobility patterns hidden in bus passengers. Furthermore, to fully utilize spatio-temporal information, we propose GCN2Flow to predict passenger flow based on various mobility patterns. To the best of our knowledge, this paper is the first work to adopt a multi-pattern approach to predict the bus passenger flow from graph learning. We design a case study for optimizing routes. Extensive experiments on a real-world bus dataset demonstrate that MPGCN has potential efficacy in passenger flow prediction and route optimization.
Submitted 17 February, 2022;
originally announced February 2022.
-
The Rational Selection of Goal Operations and the Integration of Search Strategies with Goal-Driven Autonomy
Authors:
Sravya Kondrakunta,
Venkatsampath Raja Gogineni,
Michael T. Cox,
Demetris Coleman,
Xiaobao Tan,
Tony Lin,
Mengxue Hou,
Fumin Zhang,
Frank McQuarrie,
Catherine R. Edwards
Abstract:
Intelligent physical systems as embodied cognitive systems must perform high-level reasoning while concurrently managing an underlying control architecture. The link between cognition and control must manage the problem of converting continuous values from the real world to symbolic representations (and back). To generate effective behaviors, reasoning must include a capacity to replan, acquire and update new information, detect and respond to anomalies, and perform various operations on system goals. However, these processes are not independent and need further exploration. This paper examines an agent's choices when multiple goal operations co-occur and interact, and it establishes a method of choosing between them. We demonstrate the benefits of this method, discuss the trade-offs involved, and show positive results in a dynamic marine search task.
Submitted 21 January, 2022;
originally announced January 2022.