Showing 1–50 of 346 results for author: Xiang, Y

  1. arXiv:2410.13099  [pdf]

    eess.IV cs.CV

    Adversarial Neural Networks in Medical Imaging Advancements and Challenges in Semantic Segmentation

    Authors: Houze Liu, Bo Zhang, Yanlin Xiang, Yuxiang Hu, Aoran Shen, Yang Lin

    Abstract: Recent advancements in artificial intelligence (AI) have precipitated a paradigm shift in medical imaging, particularly revolutionizing the domain of brain imaging. This paper systematically investigates the integration of deep learning -- a principal branch of AI -- into the semantic segmentation of brain images. Semantic segmentation serves as an indispensable technique for the delineation of di…

    Submitted 16 October, 2024; originally announced October 2024.

  2. arXiv:2410.12543  [pdf, other]

    cs.CL cs.AI

    LLM-based Translation Inference with Iterative Bilingual Understanding

    Authors: Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Tiejun Zhao, Min Zhang

    Abstract: The remarkable understanding and generation capabilities of large language models (LLMs) have greatly improved translation performance. However, incorrect understanding of the sentence to be translated can degrade translation quality. To address this issue, we proposed a novel Iterative Bilingual Understanding Translation (IBUT) method based on the cross-lingual capabilities of LLMs and the dual c…

    Submitted 16 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

    Comments: Work in progress

  3. arXiv:2410.10737  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Online Statistical Inference for Time-varying Sample-averaged Q-learning

    Authors: Saunak Kumar Panda, Ruiqi Liu, Yisha Xiang

    Abstract: Reinforcement learning (RL) has emerged as a key approach for training agents in complex and uncertain environments. Incorporating statistical inference in RL algorithms is essential for understanding and managing uncertainty in model performance. This paper introduces a time-varying batch-averaged Q-learning algorithm, termed sample-averaged Q-learning, which improves upon traditional single-sampl…

    Submitted 14 October, 2024; originally announced October 2024.

  4. arXiv:2410.09072  [pdf, other]

    cs.RO

    iTeach: Interactive Teaching for Robot Perception using Mixed Reality

    Authors: Jishnu Jaykumar P, Cole Salvato, Vinaya Bomnale, Jikai Wang, Yu Xiang

    Abstract: We introduce iTeach, a Mixed Reality (MR) framework to improve robot perception through real-time interactive teaching. By allowing human instructors to dynamically label robot RGB data, iTeach improves both the accuracy and adaptability of robot perception to new scenarios. The framework supports on-the-fly data collection and labeling, enhancing model performance and generalization. Applied to…

    Submitted 1 October, 2024; originally announced October 2024.

  5. arXiv:2410.06308  [pdf, other]

    math.NA cs.LG

    Quantifying Training Difficulty and Accelerating Convergence in Neural Network-Based PDE Solvers

    Authors: Chuqi Chen, Qixuan Zhou, Yahong Yang, Yang Xiang, Tao Luo

    Abstract: Neural network-based methods have emerged as powerful tools for solving partial differential equations (PDEs) in scientific and engineering applications, particularly when handling complex domains or incorporating empirical data. These methods leverage neural networks as basis functions to approximate PDE solutions. However, training such networks can be challenging, often resulting in limited acc…

    Submitted 8 October, 2024; originally announced October 2024.

  6. arXiv:2410.04790  [pdf, other]

    cs.CL

    GARLIC: LLM-Guided Dynamic Progress Control with Hierarchical Weighted Graph for Long Document QA

    Authors: Xinyu Wang, Yanzheng Xiang, Lin Gui, Yulan He

    Abstract: In the past, Retrieval-Augmented Generation (RAG) methods split text into chunks to enable language models to handle long documents. Recent tree-based RAG methods are able to retrieve detailed information while preserving global context. However, with the advent of more powerful LLMs, such as Llama 3.1, which offer better comprehension and support for longer inputs, we found that even recent tree-…

    Submitted 7 October, 2024; originally announced October 2024.

  7. arXiv:2409.19563  [pdf, other]

    cs.CV cs.AI

    CLIP-based Camera-Agnostic Feature Learning for Intra-camera Person Re-Identification

    Authors: Xuan Tan, Xun Gong, Yang Xiang

    Abstract: Contrastive Language-Image Pre-Training (CLIP) model excels in traditional person re-identification (ReID) tasks due to its inherent advantage in generating textual descriptions for pedestrian images. However, applying CLIP directly to intra-camera supervised person re-identification (ICS ReID) presents challenges. ICS ReID requires independent identity labeling within each camera, without associa…

    Submitted 29 September, 2024; originally announced September 2024.

    Comments: Submitted to IEEE TCSVT

  8. arXiv:2409.19510  [pdf, other]

    cs.CL

    CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought

    Authors: Yexing Du, Ziyang Ma, Yifan Yang, Keqi Deng, Xie Chen, Bo Yang, Yang Xiang, Ming Liu, Bing Qin

    Abstract: Speech Language Models (SLMs) have demonstrated impressive performance on speech translation tasks. However, existing research primarily focuses on direct instruction fine-tuning and often overlooks the inherent reasoning capabilities of SLMs. In this paper, we introduce a three-stage training framework designed to activate the chain-of-thought (CoT) capabilities of SLMs. We propose CoT-ST, a spee…

    Submitted 28 September, 2024; originally announced September 2024.

  9. arXiv:2409.15493  [pdf, other]

    cs.RO cs.CV

    Autonomous Exploration and Semantic Updating of Large-Scale Indoor Environments with Mobile Robots

    Authors: Sai Haneesh Allu, Itay Kadosh, Tyler Summers, Yu Xiang

    Abstract: We introduce a new robotic system that enables a mobile robot to autonomously explore an unknown environment, build a semantic map of the environment, and subsequently update the semantic map to reflect environment changes, such as location changes of objects. Our system leverages a LiDAR scanner for 2D occupancy grid mapping and an RGB-D camera for object perception. We introduce a semantic map r…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: 7 pages, 7 figures. Project page is available at https://irvlutd.github.io/SemanticMapping/

  10. arXiv:2409.14760  [pdf]

    cs.LG

    Isometric Immersion Learning with Riemannian Geometry

    Authors: Zihao Chen, Wenyong Wang, Yu Xiang

    Abstract: Manifold learning has been proven to be an effective method for capturing the implicitly intrinsic structure of non-Euclidean data, in which one of the primary challenges is how to maintain the distortion-free (isometry) of the data representations. Actually, there is still no manifold learning method that provides a theoretical guarantee of isometry. Inspired by Nash's isometric theorem, we intro…

    Submitted 23 September, 2024; originally announced September 2024.

  11. arXiv:2409.14519  [pdf, other]

    cs.RO cs.CV cs.LG

    RobotFingerPrint: Unified Gripper Coordinate Space for Multi-Gripper Grasp Synthesis

    Authors: Ninad Khargonkar, Luis Felipe Casas, Balakrishnan Prabhakaran, Yu Xiang

    Abstract: We introduce a novel representation named as the unified gripper coordinate space for grasp synthesis of multiple grippers. The space is a 2D surface of a sphere in 3D using longitude and latitude as its coordinates, and it is shared for all robotic grippers. We propose a new algorithm to map the palm surface of a gripper into the unified gripper coordinate space, and design a conditional variatio…

    Submitted 22 September, 2024; originally announced September 2024.

    Comments: 7 pages, 8 figures, 2 tables. Project page available at https://irvlutd.github.io/RobotFingerPrint

  12. arXiv:2409.13440  [pdf, other]

    eess.SP cs.AI cs.CR cs.LG

    Differentially Private Multimodal Laplacian Dropout (DP-MLD) for EEG Representative Learning

    Authors: Xiaowen Fu, Bingxin Wang, Xinzhou Guo, Guoqing Liu, Yang Xiang

    Abstract: Recently, multimodal electroencephalogram (EEG) learning has shown great promise in disease detection. At the same time, ensuring privacy in clinical studies has become increasingly crucial due to legal and ethical concerns. One widely adopted scheme for privacy protection is differential privacy (DP) because of its clear interpretation and ease of implementation. Although numerous methods have be…

    Submitted 20 September, 2024; originally announced September 2024.

  13. arXiv:2409.13083  [pdf, other]

    cs.CR cs.AI cs.DC

    FedAT: Federated Adversarial Training for Distributed Insider Threat Detection

    Authors: R G Gayathri, Atul Sajjanhar, Md Palash Uddin, Yong Xiang

    Abstract: Insider threats usually occur from within the workplace, where the attacker is an entity closely associated with the organization. The sequence of actions the entities take on the resources to which they have access rights allows us to identify the insiders. Insider Threat Detection (ITD) using Machine Learning (ML)-based approaches gained attention in the last few years. However, most techniques…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: 10 pages, 7 figures

  14. arXiv:2409.11652  [pdf, other]

    cs.CV cs.CR

    Relax DARTS: Relaxing the Constraints of Differentiable Architecture Search for Eye Movement Recognition

    Authors: Hongyu Zhu, Xin Jin, Hongchao Liao, Yan Xiang, Mounim A. El-Yacoubi, Huafeng Qin

    Abstract: Eye movement biometrics is a secure and innovative identification method. Deep learning methods have shown good performance, but their network architecture relies on manual design and combined priori knowledge. To address these issues, we introduce automated network search (NAS) algorithms to the field of eye movement recognition and present Relax DARTS, which is an improvement of the Differentiab…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: Accepted by CCBR 2024

  15. arXiv:2408.13985  [pdf, other]

    cs.CL

    TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models

    Authors: Zelin Li, Kehai Chen, Lemao Liu, Xuefeng Bai, Mingming Yang, Yang Xiang, Min Zhang

    Abstract: With the great advancements in large language models (LLMs), adversarial attacks against LLMs have recently attracted increasing attention. We found that pre-existing adversarial attack methodologies exhibit limited transferability and are notably inefficient, particularly when applied to LLMs. In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, reveal…

    Submitted 8 September, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

    Comments: 14 pages, 6 figures

  16. arXiv:2408.09945  [pdf, other]

    cs.CL cs.AI

    Benchmarking LLMs for Translating Classical Chinese Poetry: Evaluating Adequacy, Fluency, and Elegance

    Authors: Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

    Abstract: Large language models (LLMs) have shown remarkable performance in translation tasks. However, there is an increasing demand for high-quality translations that are not only adequate but also fluent and elegant. To evaluate the extent to which current LLMs can meet these demands, we introduce a suitable benchmark (PoetMT) for translating classical Chinese poetry into English. This task requires not only ade…

    Submitted 16 October, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

    Comments: Work in progress

  17. arXiv:2408.09896  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    Instruction-Based Molecular Graph Generation with Unified Text-Graph Diffusion Model

    Authors: Yuran Xiang, Haiteng Zhao, Chang Ma, Zhi-Hong Deng

    Abstract: Recent advancements in computational chemistry have increasingly focused on synthesizing molecules based on textual instructions. Integrating graph generation with these instructions is complex, leading most current methods to use molecular sequences with pre-trained large language models. In response to this challenge, we propose a novel framework, named…

    Submitted 19 August, 2024; originally announced August 2024.

  18. arXiv:2408.07613  [pdf, other]

    cs.CV

    Rethinking the Key Factors for the Generalization of Remote Sensing Stereo Matching Networks

    Authors: Liting Jiang, Feng Wang, Wenyi Zhang, Peifeng Li, Hongjian You, Yuming Xiang

    Abstract: Stereo matching, a critical step of 3D reconstruction, has fully shifted towards deep learning due to its strong feature representation of remote sensing images. However, ground truth for stereo matching task relies on expensive airborne LiDAR data, thus making it difficult to obtain enough samples for supervised learning. To improve the generalization ability of stereo matching networks on cross-…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Submitted to IEEE JSTARS

  19. arXiv:2408.07419  [pdf, other]

    cs.CV

    Unsupervised Stereo Matching Network For VHR Remote Sensing Images Based On Error Prediction

    Authors: Liting Jiang, Yuming Xiang, Feng Wang, Hongjian You

    Abstract: Stereo matching in remote sensing has recently garnered increased attention, primarily focusing on supervised learning. However, datasets with ground truth generated by expensive airborne LiDAR exhibit limited quantity and diversity, constraining the effectiveness of supervised networks. In contrast, unsupervised learning methods can leverage the increasing availability of very-high-resolution (VHR…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted to International Geoscience and Remote Sensing Symposium (IGARSS), 2024

  20. arXiv:2407.19216  [pdf, other]

    cs.CR cs.AI cs.SE

    EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

    Authors: Shigang Liu, Di Cao, Junae Kim, Tamas Abraham, Paul Montague, Seyit Camtepe, Jun Zhang, Yang Xiang

    Abstract: Recently, deep learning has demonstrated promising results in enhancing the accuracy of vulnerability detection and identifying vulnerabilities in software. However, these techniques are still vulnerable to attacks. Adversarial examples can exploit vulnerabilities within deep neural networks, posing a significant threat to system security. This study showcases the susceptibility of deep learning m…

    Submitted 27 July, 2024; originally announced July 2024.

  21. arXiv:2407.13911  [pdf, other]

    cs.CV cs.LG

    Continual Distillation Learning

    Authors: Qifan Zhang, Yunhui Guo, Yu Xiang

    Abstract: We study the problem of Continual Distillation Learning (CDL) that considers Knowledge Distillation (KD) in the Continual Learning (CL) setup. A teacher model and a student model need to learn a sequence of tasks, and the knowledge of the teacher model will be distilled to the student to improve the student model. We introduce a novel method named CDL-Prompt that utilizes prompt-based continual le…

    Submitted 18 July, 2024; originally announced July 2024.

  22. arXiv:2407.11529  [pdf, other]

    eess.IV cs.AI cs.CV

    Cross-Phase Mutual Learning Framework for Pulmonary Embolism Identification on Non-Contrast CT Scans

    Authors: Bizhe Bai, Yan-Jie Zhou, Yujian Hu, Tony C. W. Mok, Yilang Xiang, Le Lu, Hongkun Zhang, Minfeng Xu

    Abstract: Pulmonary embolism (PE) is a life-threatening condition where rapid and accurate diagnosis is imperative yet difficult due to predominantly atypical symptomatology. Computed tomography pulmonary angiography (CTPA) is acknowledged as the gold standard imaging tool in clinics, yet it can be contraindicated for emergency department (ED) patients and represents an onerous procedure, thus necessitating…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  23. arXiv:2407.07289  [pdf, other]

    cs.CV

    Deformable Feature Alignment and Refinement for Moving Infrared Dim-small Target Detection

    Authors: Dengyan Luo, Yanping Xiang, Hu Wang, Luping Ji, Shuai Li, Mao Ye

    Abstract: The detection of moving infrared dim-small targets has been a challenging and prevalent research topic. The current state-of-the-art methods are mainly based on ConvLSTM to aggregate information from adjacent frames to facilitate the detection of the current frame. However, these methods implicitly utilize motion information only in the training stage and fail to explicitly explore motion compensa…

    Submitted 9 July, 2024; originally announced July 2024.

  24. arXiv:2407.03945  [pdf, other]

    math.NA cs.LG

    A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

    Authors: Tianyu Jin, Georg Maierhofer, Katharina Schratz, Yang Xiang

    Abstract: The use of implicit time-stepping schemes for the numerical approximation of solutions to stiff nonlinear time-evolution equations brings well-known advantages including, typically, better stability behaviour and corresponding support of larger time steps, and better structure preservation properties. However, this comes at the price of having to solve a nonlinear equation at every time step of th…

    Submitted 4 July, 2024; originally announced July 2024.

  25. arXiv:2407.02280  [pdf, other]

    cs.CV cs.AI

    FedIA: Federated Medical Image Segmentation with Heterogeneous Annotation Completeness

    Authors: Yangyang Xiang, Nannan Wu, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Federated learning has emerged as a compelling paradigm for medical image segmentation, particularly in light of increasing privacy concerns. However, most of the existing research relies on relatively stringent assumptions regarding the uniformity and completeness of annotations across clients. Contrary to this, this paper highlights a prevalent challenge in medical practice: incomplete annotatio…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  26. arXiv:2406.17969  [pdf, other]

    cs.CL cs.AI

    Encourage or Inhibit Monosemanticity? Revisit Monosemanticity from a Feature Decorrelation Perspective

    Authors: Hanqi Yan, Yanzheng Xiang, Guangyi Chen, Yifei Wang, Lin Gui, Yulan He

    Abstract: To better interpret the intrinsic mechanism of large language models (LLMs), recent studies focus on monosemanticity on its basic units. A monosemantic neuron is dedicated to a single and specific concept, which forms a one-to-one correlation between neurons and concepts. Despite extensive research in monosemanticity probing, it remains unclear whether monosemanticity is beneficial or harmful to m…

    Submitted 15 October, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: EMNLP24, Main, Long

  27. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 16 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  28. arXiv:2406.07232  [pdf, other]

    cs.CL cs.AI

    DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms

    Authors: Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang

    Abstract: Recently, large language models (LLMs) enhanced by self-reflection have achieved promising performance on machine translation. The key idea is guiding LLMs to generate translation with human-like feedback. However, existing self-reflection methods lack effective feedback information, limiting the translation performance. To address this, we introduce a DUAL-REFLECT framework, leveraging the dual l…

    Submitted 21 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference

  29. arXiv:2406.07036  [pdf, other]

    cs.CL cs.AI

    Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model

    Authors: Hongbin Zhang, Kehai Chen, Xuefeng Bai, Yang Xiang, Min Zhang

    Abstract: Large language models (LLMs) have showcased impressive multilingual machine translation ability. However, unlike encoder-decoder style models, decoder-only LLMs lack an explicit alignment between source and target contexts. Analyzing contribution scores during generation processes revealed that LLMs can be biased towards previously generated tokens over corresponding source tokens, leading to unfa…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL2024 Findings

  30. arXiv:2406.06843  [pdf, other]

    cs.CV

    HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

    Authors: Jikai Wang, Qifan Zhang, Yu-Wei Chao, Bowen Wen, Xiaohu Guo, Yu Xiang

    Abstract: We introduce a data capture system and a new dataset named HO-Cap that can be used to study 3D reconstruction and pose tracking of hands and objects in videos. The capture system uses multiple RGB-D cameras and a HoloLens headset for data collection, avoiding the use of expensive 3D scanners or mocap systems. We propose a semi-automatic method to obtain annotations of shape and pose of hands and o…

    Submitted 16 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  31. arXiv:2406.03880  [pdf, other]

    cs.LG cs.AI

    Memorization in deep learning: A survey

    Authors: Jiaheng Wei, Yanjun Zhang, Leo Yu Zhang, Ming Ding, Chao Chen, Kok-Leong Ong, Jun Zhang, Yang Xiang

    Abstract: Deep Learning (DL) powered by Deep Neural Networks (DNNs) has revolutionized various domains, yet understanding the intricacies of DNN decision-making and learning processes remains a significant challenge. Recent investigations have uncovered an interesting memorization phenomenon in which DNNs tend to memorize specific details from examples rather than learning general patterns, affecting model…

    Submitted 6 June, 2024; originally announced June 2024.

  32. arXiv:2406.02630  [pdf, other]

    cs.CR cs.AI

    AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

    Authors: Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang

    Abstract: An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. AI agents, capable of perceiving user inputs, reasoning and planning tasks, and executing actions, have seen remarkable advancements in algorithm development and task performance. However, the security challenges they pose remain under-expl…

    Submitted 5 September, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Submitted to ACM Computing Survey

  33. arXiv:2405.17859  [pdf, other]

    cs.CV cs.RO

    Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation

    Authors: Yangxiao Lu, Jishnu Jaykumar P, Yunhui Guo, Nicholas Ruozzi, Yu Xiang

    Abstract: Novel Instance Detection and Segmentation (NIDS) aims at detecting and segmenting novel object instances given a few examples of each instance. We propose a unified framework (NIDS-Net) comprising object proposal generation, embedding creation for both instance templates and proposal regions, and embedding matching for instance label assignment. Leveraging recent advancements in large vision metho…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages, 9 figures, Code is available at: https://github.com/YoungSean/NIDS-Net

  34. arXiv:2405.16594  [pdf, ps, other]

    stat.ML cs.LG

    Training-Conditional Coverage Bounds under Covariate Shift

    Authors: Mehrdad Pournaderi, Yu Xiang

    Abstract: Training-conditional coverage guarantees in conformal prediction concern the concentration of the error distribution, conditional on the training data, below some nominal level. The conformal prediction methodology has recently been generalized to the covariate shift setting, namely, the covariate distribution changes between the training and test data. In this paper, we study the training-conditi…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2404.13731

  35. arXiv:2405.15258  [pdf, other]

    cs.CR

    Leakage-Resilient and Carbon-Neutral Aggregation Featuring the Federated AI-enabled Critical Infrastructure

    Authors: Zehang Deng, Ruoxi Sun, Minhui Xue, Sheng Wen, Seyit Camtepe, Surya Nepal, Yang Xiang

    Abstract: AI-enabled critical infrastructures (ACIs) integrate artificial intelligence (AI) technologies into various essential systems and services that are vital to the functioning of society, offering significant implications for efficiency, security and resilience. While adopting decentralized AI approaches (such as federated learning technology) in ACIs is plausible, private and sensitive data are stil…

    Submitted 24 May, 2024; originally announced May 2024.

  36. arXiv:2405.14099  [pdf, other]

    cs.LG math.NA

    Automatic Differentiation is Essential in Training Neural Networks for Solving Differential Equations

    Authors: Chuqi Chen, Yahong Yang, Yang Xiang, Wenrui Hao

    Abstract: Neural network-based approaches have recently shown significant promise in solving partial differential equations (PDEs) in science and engineering, especially in scenarios featuring complex domains or incorporation of empirical data. One advantage of the neural network methods for PDEs lies in its automatic differentiation (AD), which necessitates only the sample points themselves, unlike traditi…

    Submitted 2 September, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  37. arXiv:2405.12114  [pdf, other]

    cs.CV math.NA

    A New Cross-Space Total Variation Regularization Model for Color Image Restoration with Quaternion Blur Operator

    Authors: Zhigang Jia, Yuelian Xiang, Meixiang Zhao, Tingting Wu, Michael K. Ng

    Abstract: The cross-channel deblurring problem in color image processing is difficult to solve due to the complex coupling and structural blurring of color pixels. Until now, there are few efficient algorithms that can reduce color infection in deblurring process. To solve this challenging problem, we present a novel cross-space total variation (CSTV) regularization model for color image deblurring by intro…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 10 figures

  38. arXiv:2405.10616  [pdf, other]

    cs.CL cs.LG

    Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization

    Authors: Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang

    Abstract: In recent years, large language models (LLMs) have driven advances in natural language processing. Still, their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank compression, a promising technique, reduces non-essential parameters by decomposing weight matrices into products of two low-rank matrices. Yet, its application in L…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by 2024 ACL findings

  39. arXiv:2405.09298  [pdf]

    eess.IV cs.CV

    Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis

    Authors: Yujie Xiang, Bojing Liu, Mattias Rantalainen

    Abstract: AI-based analysis of histopathology whole slide images (WSIs) is central in computational pathology. However, image quality, including unsharp areas of WSIs, impacts model performance. We investigate the impact of blur and propose a multi-model approach to mitigate negative impact of unsharp image areas. In this study, we use a simulation approach, evaluating model performance under varying levels…

    Submitted 23 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    ACM Class: I.4; J.3

  40. arXiv:2405.06902   

    cs.LG stat.ML

    Causal Inference from Slowly Varying Nonstationary Processes

    Authors: Kang Du, Yu Xiang

    Abstract: Causal inference from observational data following the restricted structural causal models (SCM) framework hinges largely on the asymmetry between cause and effect from the data generating mechanisms, such as non-Gaussianity or non-linearity. This methodology can be adapted to stationary time series, yet inferring causal relationships from nonstationary time series remains a challenging task. In t…

    Submitted 29 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

    Comments: This work was intended as a replacement of arXiv:2012.13025 and any subsequent updates will appear there

  41. arXiv:2405.05498  [pdf, other]

    cs.SD eess.AS

    The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: Jingguang Tian, Shuaishuai Ye, Shunfei Chen, Yang Xiang, Zhaohui Yin, Xinhui Hu, Xinkang Xu

    Abstract: This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58% compared to the official baseline on t…

    Submitted 8 May, 2024; originally announced May 2024.

  42. arXiv:2405.04858  [pdf, other]

    cs.CV

    Pedestrian Attribute Recognition as Label-balanced Multi-label Learning

    Authors: Yibo Zhou, Hai-Miao Hu, Yirong Xiang, Xiaokang Zhang, Haotian Wu

    Abstract: Rooting in the scarcity of most attributes, realistic pedestrian attribute datasets exhibit unduly skewed data distribution, from which two types of model failures are delivered: (1) label imbalance: model predictions lean greatly towards the side of majority labels; (2) semantics imbalance: model is easily overfitted on the under-represented attributes due to their insufficient semantic diversity…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as ICML2024 main conference paper

  43. arXiv:2405.00273  [pdf, other]

    cs.CL cs.HC

    Social Life Simulation for Non-Cognitive Skills Learning

    Authors: Zihan Yan, Yaohong Xiang, Yun Huang

    Abstract: Non-cognitive skills are crucial for personal and social life well-being, and such skill development can be supported by narrative-based (e.g., storytelling) technologies. While generative AI enables interactive and role-playing storytelling, little is known about how users engage with and perceive the use of AI in social life simulation for non-cognitive skills learning. Additionally, the benefit…

    Submitted 19 July, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

  44. arXiv:2405.00026  [pdf]

    cs.CE cs.AI

    Enhancing Credit Card Fraud Detection A Neural Network and SMOTE Integrated Approach

    Authors: Mengran Zhu, Ye Zhang, Yulu Gong, Changxin Xu, Yafei Xiang

    Abstract: Credit card fraud detection is a critical challenge in the financial sector, demanding sophisticated approaches to accurately identify fraudulent transactions. This research proposes an innovative methodology combining Neural Networks (NN) and Synthetic Minority Over-sampling Technique (SMOTE) to enhance the detection performance. The study addresses the inherent imbalance in credit card transact…

    Submitted 26 February, 2024; originally announced May 2024.

  45. PromptCL: Improving Event Representation via Prompt Template and Contrastive Learning

    Authors: Yubo Feng, Lishuang Li, Yi Xiang, Xueyang Qin

    Abstract: The representation of events in text plays a significant role in various NLP tasks. Recent research demonstrates that contrastive learning has the ability to improve event comprehension capabilities of Pre-trained Language Models (PLMs) and enhance the performance of event representation learning. However, the efficacy of event representation learning based on contrastive learning and PLMs is limi…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: NLPCC 2023 Best Student Paper

    Journal ref: Natural Language Processing and Chinese Computing (NLPCC 2023)

  46. arXiv:2404.15245  [pdf, other]

    stat.ME cs.LG

    Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification

    Authors: Austin Goddard, Kang Du, Yu Xiang

    Abstract: Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear data generation mechanisms. We identify a unique form of invariance that exists solely in a binary setting that allows us to train models invariant over environ…

    Submitted 3 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted to the 2024 International Symposium on Information Theory (ISIT)

  47. arXiv:2404.13731  [pdf, ps, other]

    stat.ML cs.LG

    Training-Conditional Coverage Bounds for Uniformly Stable Learning Algorithms

    Authors: Mehrdad Pournaderi, Yu Xiang

    Abstract: The training-conditional coverage performance of the conformal prediction is known to be empirically sound. Recently, there have been efforts to support this observation with theoretical guarantees. The training-conditional coverage bounds for jackknife+ and full-conformal prediction regions have been established via the notion of $(m,n)$-stability by Liang and Barber [2023]. Although this notion…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted to the ISIT 2024 workshop on Information-Theoretic Methods for Trustworthy Machine Learning (IT-TML)

  48. arXiv:2404.12715  [pdf, other]

    cs.CL

    Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration

    Authors: Yichong Huang, Xiaocheng Feng, Baohang Li, Yang Xiang, Hui Wang, Bing Qin, Ting Liu

    Abstract: Large language models (LLMs) exhibit complementary strengths in various tasks, motivating the research of LLM ensembling. However, existing work focuses on training an extra reward model or fusion model to select or combine all candidate answers, posing a great challenge to the generalization on unseen data distributions. Besides, prior methods use textual responses as communication media, ignorin…

    Submitted 30 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: 16 pages, 9 figures, 9 tables

  49. arXiv:2404.11667  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

    Authors: Shivvrat Arya, Yu Xiang, Vibhav Gogate

    Abstract: We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633

  50. arXiv:2404.08690  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Towards Building a Robust Toxicity Predictor

    Authors: Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi

    Abstract: Recent NLP literature pays little attention to the robustness of toxicity language predictors, while these systems are most likely to be used in adversarial contexts. This paper presents a novel adversarial attack, ToxicTrap, introducing small word-level perturbations to fool SOTA text classifiers to predict toxic text samples as benign. ToxicTrap exploits greedy based search strategies t…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: ACL 2023