Showing 1–50 of 670 results for author: Liu, G

  1. arXiv:2410.14055  [pdf, other]

    stat.ML cs.LG

    Feedback Schrödinger Bridge Matching

    Authors: Panagiotis Theodoropoulos, Nikolaos Komianos, Vincent Pacelli, Guan-Horng Liu, Evangelos A. Theodorou

    Abstract: Recent advancements in diffusion bridges for distribution transport problems have heavily relied on matching frameworks, yet existing methods often face a trade-off between scalability and access to optimal pairings during training. Fully unsupervised methods make minimal assumptions but incur high computational costs, limiting their practicality. On the other hand, imposing full supervision of th…

    Submitted 17 October, 2024; originally announced October 2024.

  2. arXiv:2410.13907  [pdf, other]

    cs.CR cs.AI cs.CL

    NSmark: Null Space Based Black-box Watermarking Defense Framework for Pre-trained Language Models

    Authors: Haodong Zhao, Jinming Hu, Peixuan Li, Fangqi Li, Jinrui Sha, Peixuan Chen, Zhuosheng Zhang, Gongshen Liu

    Abstract: Pre-trained language models (PLMs) have emerged as critical intellectual property (IP) assets that necessitate protection. Although various watermarking strategies have been proposed, they remain vulnerable to Linear Functionality Equivalence Attacks (LFEA), which can invalidate most existing white-box watermarks without prior knowledge of the watermarking scheme or training data. This paper furth…

    Submitted 16 October, 2024; originally announced October 2024.

  3. arXiv:2410.11795  [pdf, other]

    cs.CV

    Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

    Authors: Zhiyuan Ma, Yuzhu Zhang, Guoli Jia, Liangliang Zhao, Yichao Ma, Mingjie Ma, Gaofeng Liu, Kaiyan Zhang, Jianjun Li, Bowen Zhou

    Abstract: As one of the most popular and sought-after generative models in the recent years, diffusion models have sparked the interests of many researchers and steadily shown excellent advantage in various generative tasks such as image synthesis, video generation, molecule design, 3D scene rendering and multimodal generation, relying on their dense theoretical principles and reliable application practices…

    Submitted 16 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    ACM Class: I.4.9

  4. arXiv:2410.10370  [pdf, other]

    cs.AI

    Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps

    Authors: Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang

    Abstract: Humor is a culturally nuanced aspect of human language that presents challenges for understanding and generation, requiring participants to possess good creativity and strong associative thinking. Similar to reasoning tasks like solving math problems, humor generation requires continuous reflection and revision to foster creative thinking, rather than relying on a sudden flash of inspiration like…

    Submitted 14 October, 2024; originally announced October 2024.

  5. arXiv:2410.09767  [pdf, other]

    cs.HC cs.AI

    LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition

    Authors: Huan Liu, Shusen Yang, Yuzhe Zhang, Mengze Wang, Fanyu Gong, Chengxi Xie, Guanjian Liu, Dalin Zhang

    Abstract: EEG-based emotion recognition (EER) is garnering increasing attention due to its potential in understanding and analyzing human emotions. Recently, significant advancements have been achieved using various deep learning-based techniques to address the EER problem. However, the absence of a convincing benchmark and open-source codebase complicates fair comparisons between different models and poses…

    Submitted 13 October, 2024; originally announced October 2024.

  6. arXiv:2410.09760  [pdf, other]

    cs.LG

    Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation

    Authors: Guozhi Liu, Weiwei Lin, Tiansheng Huang, Ruichao Mo, Qi Mu, Li Shen

    Abstract: Harmful fine-tuning attack poses a serious threat to the online fine-tuning service. Vaccine, a recent alignment-stage defense, applies uniform perturbation to all layers of embedding to make the model robust to the simulated embedding drift. However, applying layer-wise uniform perturbation may lead to excess perturbations for some particular safety-irrelevant layers, resulting in defense perform…

    Submitted 17 October, 2024; v1 submitted 13 October, 2024; originally announced October 2024.

  7. arXiv:2410.09302  [pdf, other]

    cs.LG cs.AI cs.CL

    Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

    Authors: Guanlin Liu, Kaixuan Ji, Renjie Zheng, Zheng Wu, Chen Dun, Quanquan Gu, Lin Yan

    Abstract: Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs) with human preferences and improving their ability to perform complex tasks. However, current approaches either require significant computational resources due to the use of multiple models and extensive online sampling for training (e.g., PPO) or are framed as bandit problems (e.g., DPO, DRO), which often st…

    Submitted 11 October, 2024; originally announced October 2024.

  8. arXiv:2410.08553  [pdf]

    cs.CR cs.AI cs.CL

    Balancing Innovation and Privacy: Data Security Strategies in Natural Language Processing Applications

    Authors: Shaobo Liu, Guiran Liu, Binrong Zhu, Yuanshuai Luo, Linxiao Wu, Rui Wang

    Abstract: This research addresses privacy protection in Natural Language Processing (NLP) by introducing a novel algorithm based on differential privacy, aimed at safeguarding user data in common applications such as chatbots, sentiment analysis, and machine translation. With the widespread application of NLP technology, the security and privacy protection of user data have become important issues that need…

    Submitted 11 October, 2024; originally announced October 2024.

  9. arXiv:2410.07525  [pdf, other]

    cs.LG cs.AI

    Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare

    Authors: Nan Fang, Guiliang Liu, Wei Gong

    Abstract: Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatment, such as excessive dosages or abrupt changes, often due to agents overlooking common-sense constraints. Consequently, Constrained Reinforcement Learning (CRL) is a natural choice for safe decisions. However, specifying the exact cost function is inherently difficult in healthcare. Recent Inverse Co…

    Submitted 14 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

  10. arXiv:2410.06541  [pdf, other]

    cs.CL cs.AI

    Chip-Tuning: Classify Before Language Models Say

    Authors: Fangwei Zhu, Dian Li, Jiajun Huang, Gang Liu, Hui Wang, Zhifang Sui

    Abstract: The rapid development in the performance of large language models (LLMs) is accompanied by the escalation of model size, leading to the increasing cost of model training and inference. Previous research has discovered that certain layers in LLMs exhibit redundancy, and removing these layers brings only marginal loss in model performance. In this paper, we adopt the probing technique to explain the…

    Submitted 11 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

  11. arXiv:2410.04534  [pdf, other]

    cs.SD cs.CV cs.GR cs.LG cs.MM eess.AS

    UniMuMo: Unified Text, Music and Motion Generation

    Authors: Han Yang, Kun Su, Yutong Zhang, Jiaben Chen, Kaizhi Qian, Gaowen Liu, Chuang Gan

    Abstract: We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To address the lack of time-synchronized data, we align unpaired music and motion data based on rhythmic patterns to leverage existing large-scale music-only and motion-only datasets. By converting music, motion, and text int…

    Submitted 6 October, 2024; originally announced October 2024.

  12. arXiv:2410.04335  [pdf, other]

    cs.CL

    ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model

    Authors: Shuhao Gu, Mengdi Zhao, Bowen Zhang, Liangdong Wang, Jijie Li, Guang Liu

    Abstract: Tokenizer is an essential component for large language models (LLMs), and a tokenizer with a high compression rate can improve the model's representation and processing efficiency. However, the tokenizer cannot ensure high compression rate in all scenarios, and an increase in the average input and output lengths will increases the training and inference costs of the model. Therefore, it is crucial…

    Submitted 5 October, 2024; originally announced October 2024.

  13. arXiv:2410.04223  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

    Authors: Gang Liu, Michael Sun, Wojciech Matusik, Meng Jiang, Jie Chen

    Abstract: While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inver…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: 27 pages, 11 figures, 4 tables

  14. arXiv:2410.02082  [pdf, other]

    cs.LG q-bio.QM

    FARM: Functional Group-Aware Representations for Small Molecules

    Authors: Thao Nguyen, Kuan-Hao Huang, Ge Liu, Martin D. Burke, Ying Diao, Heng Ji

    Abstract: We introduce Functional Group-Aware Representations for Small Molecules (FARM), a novel foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs. The key innovation of FARM lies in its functional group-aware tokenization, which directly incorporates functional group information into the representations. This strategic reduction in tokenization granularity…

    Submitted 6 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

    Comments: Preprint

  15. arXiv:2410.00313  [pdf, ps, other]

    cs.IT eess.SP

    Pre-Chirp-Domain Index Modulation for Full-Diversity Affine Frequency Division Multiplexing towards 6G

    Authors: Guangyao Liu, Tianqi Mao, Zhenyu Xiao, Ruiqi Liu, Miaowen Wen

    Abstract: Affine frequency division multiplexing (AFDM), tailored as a superior multicarrier technique utilizing chirp signals for high-mobility communications, is envisioned as a promising candidate for the sixth-generation (6G) wireless network. AFDM is based on the discrete affine Fourier transform (DAFT) with two adjustable parameters of the chirp signals, termed as the pre-chirp and post-chirp paramete…

    Submitted 17 October, 2024; v1 submitted 30 September, 2024; originally announced October 2024.

  16. arXiv:2409.19209  [pdf]

    cs.LG cond-mat.mtrl-sci physics.data-an

    Boosting SISSO Performance on Small Sample Datasets by Using Random Forests Prescreening for Complex Feature Selection

    Authors: Xiaolin Jiang, Guanqi Liu, Jiaying Xie, Zhenpeng Hu

    Abstract: In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression is a key to extracting material descriptors from large datasets, in particular the Sure Independence Screening and Sparsifying Operator (SISSO) method. While SISSO needs to store the entire expression space to impose heavy memory demands, it…

    Submitted 27 September, 2024; originally announced September 2024.

  17. arXiv:2409.18869  [pdf, other]

    cs.CV

    Emu3: Next-Token Prediction is All You Need

    Authors: Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, Jinsheng Wang, Fan Zhang, Yueze Wang, Zhen Li, Qiying Yu, Yingli Zhao, Yulong Ao, Xuebin Min, Tao Li, Boya Wu, Bo Zhao, Bowen Zhang, Liangdong Wang, Guang Liu, Zheqi He, Xi Yang, Jingjing Liu, Yonghua Lin, Tiejun Huang, Zhongyuan Wang

    Abstract: While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional approaches (e.g., CLIP combined with LLMs). In this paper, we introduce Emu3, a new suite of state-of-the-art multimodal models trained solely with next-token predi…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: Project Page: https://emu.baai.ac.cn

  18. arXiv:2409.18395  [pdf, other]

    cs.CR cs.AI

    Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning

    Authors: Arshiya Khan, Guannan Liu, Xing Gao

    Abstract: Large Language Models (LLMs) have shown significant challenges in detecting and repairing vulnerable code, particularly when dealing with vulnerabilities involving multiple aspects, such as variables, code flows, and code structures. In this study, we utilize GitHub Copilot as the LLM and focus on buffer overflow vulnerabilities. Our experiments reveal a notable gap in Copilot's abilities when dea…

    Submitted 26 September, 2024; originally announced September 2024.

  19. arXiv:2409.17622  [pdf, other]

    cs.LG cs.AI

    Neural P$^3$M: A Long-Range Interaction Modeling Enhancer for Geometric GNNs

    Authors: Yusong Wang, Chaoran Cheng, Shaoning Li, Yuxuan Ren, Bin Shao, Ge Liu, Pheng-Ann Heng, Nanning Zheng

    Abstract: Geometric graph neural networks (GNNs) have emerged as powerful tools for modeling molecular geometry. However, they encounter limitations in effectively capturing long-range interactions in large molecular systems. To address this challenge, we introduce Neural P$^3$M, a versatile enhancer of geometric GNNs to expand the scope of their capabilities by incorporating mesh points alongside atoms and…

    Submitted 26 September, 2024; originally announced September 2024.

    Comments: Published as a conference paper at NeurIPS 2024

  20. arXiv:2409.17517  [pdf, other]

    cs.LG cs.AI

    Dataset Distillation-based Hybrid Federated Learning on Non-IID Data

    Authors: Xiufang Shi, Wei Zhang, Mincheng Wu, Guangyi Liu, Zhenyu Wen, Shibo He, Tejal Shah, Rajiv Ranjan

    Abstract: In federated learning, the heterogeneity of client data has a great impact on the performance of model training. Many heterogeneity issues in this process are raised by non-independently and identically distributed (Non-IID) data. This study focuses on the issue of label distribution skew. To address it, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distil…

    Submitted 25 September, 2024; originally announced September 2024.

  21. arXiv:2409.15963  [pdf, other]

    cs.LG cs.AI

    Provably Efficient Exploration in Inverse Constrained Reinforcement Learning

    Authors: Bo Yue, Jian Li, Guiliang Liu

    Abstract: To obtain the optimal constraints in complex environments, Inverse Constrained Reinforcement Learning (ICRL) seeks to recover these constraints from expert demonstrations in a data-driven manner. Existing ICRL algorithms collect training samples from an interactive environment. However, the efficacy and efficiency of these sampling strategies remain unknown. To bridge this gap, we introduce a stra…

    Submitted 30 September, 2024; v1 submitted 24 September, 2024; originally announced September 2024.

  22. arXiv:2409.14000  [pdf]

    cs.CL cs.AI

    Graph Neural Network Framework for Sentiment Analysis Using Syntactic Feature

    Authors: Linxiao Wu, Yuanshuai Luo, Binrong Zhu, Guiran Liu, Rui Wang, Qian Yu

    Abstract: Amidst the swift evolution of social media platforms and e-commerce ecosystems, the domain of opinion mining has surged as a pivotal area of exploration within natural language processing. A specialized segment within this field focuses on extracting nuanced evaluations tied to particular elements within textual contexts. This research advances a composite framework that amalgamates the positional…

    Submitted 20 September, 2024; originally announced September 2024.

  23. arXiv:2409.13989  [pdf, other]

    cs.CL cs.AI cs.LG physics.chem-ph q-bio.BM

    ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models

    Authors: Yuqing Huang, Rongyang Zhang, Xuesong He, Xuyang Zhi, Hao Wang, Xin Li, Feiyang Xu, Deguang Liu, Huadong Liang, Yi Li, Jian Cui, Zimu Liu, Shijin Wang, Guoping Hu, Guiquan Liu, Qi Liu, Defu Lian, Enhong Chen

    Abstract: There is a growing interest in the role that LLMs play in chemistry which lead to an increased focus on the development of LLMs benchmarks tailored to chemical domains to assess the performance of LLMs across a spectrum of chemical tasks varying in type and complexity. However, existing benchmarks in this domain fail to adequately meet the specific requirements of chemical research professionals.…

    Submitted 20 September, 2024; originally announced September 2024.

  24. arXiv:2409.13440  [pdf, other]

    eess.SP cs.AI cs.CR cs.LG

    Differentially Private Multimodal Laplacian Dropout (DP-MLD) for EEG Representative Learning

    Authors: Xiaowen Fu, Bingxin Wang, Xinzhou Guo, Guoqing Liu, Yang Xiang

    Abstract: Recently, multimodal electroencephalogram (EEG) learning has shown great promise in disease detection. At the same time, ensuring privacy in clinical studies has become increasingly crucial due to legal and ethical concerns. One widely adopted scheme for privacy protection is differential privacy (DP) because of its clear interpretation and ease of implementation. Although numerous methods have be…

    Submitted 20 September, 2024; originally announced September 2024.

  25. arXiv:2409.07569  [pdf, other]

    cs.LG cs.AI

    A Comprehensive Survey on Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges

    Authors: Guiliang Liu, Sheng Xu, Shicheng Liu, Ashish Gaurav, Sriram Ganapathi Subramanian, Pascal Poupart

    Abstract: Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints followed by expert agents from their demonstration data. As an emerging research topic, ICRL has received considerable attention in recent years. This article presents a categorical survey of the latest advances in ICRL. It serves as a comprehensive reference for machine learning researchers and pra…

    Submitted 21 September, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: 29 pages

  26. arXiv:2409.07416  [pdf, other]

    cs.IR cs.AI cs.LG

    Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation

    Authors: Luo Ji, Gao Liu, Mingyang Yin, Hongxia Yang, Jingren Zhou

    Abstract: Modern listwise recommendation systems need to consider both long-term user perceptions and short-term interest shifts. Reinforcement learning can be applied on recommendation to study such a problem but is also subject to large search space, sparse user feedback and long interactive latency. Motivated by recent progress in hierarchical reinforcement learning, we propose a novel framework called m…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 18 pages, 4 figures

  27. arXiv:2409.04867  [pdf, other]

    cs.CV

    Fine-Grained Representation Learning via Multi-Level Contrastive Learning without Class Priors

    Authors: Houwang Jiang, Zhuxian Liu, Guodong Liu, Xiaolong Liu, Shihua Zhan

    Abstract: Recent advances in unsupervised representation learning often rely on knowing the number of classes to improve feature extraction and clustering. However, this assumption raises an important question: is the number of classes always necessary, and do class labels fully capture the fine-grained features within the data? In this paper, we propose Contrastive Disentangling (CD), a framework designed…

    Submitted 23 September, 2024; v1 submitted 7 September, 2024; originally announced September 2024.

  28. arXiv:2409.04593  [pdf, other]

    cs.CL

    Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

    Authors: Guanyu Lin, Tao Feng, Pengrui Han, Ge Liu, Jiaxuan You

    Abstract: As scientific research proliferates, researchers face the daunting task of navigating and reading vast amounts of literature. Existing solutions, such as document QA, fail to provide personalized and up-to-date information efficiently. We present Paper Copilot, a self-evolving, efficient LLM system designed to assist researchers, based on thought-retrieval, user profile and high performance optimi…

    Submitted 6 September, 2024; originally announced September 2024.

  29. arXiv:2409.03487  [pdf, other]

    cs.CV

    ScreenMark: Watermarking Arbitrary Visual Content on Screen

    Authors: Xiujian Liang, Gaozhi Liu, Yichao Si, Xiaoxiao Hu, Zhenxing Qian, Xinpeng Zhang

    Abstract: Digital watermarking has demonstrated its effectiveness in protecting multimedia content. However, existing watermarking are predominantly tailored for specific media types, rendering them less effective for the protection of content displayed on computer screens, which is often multimodal and dynamic. Visual Screen Content (VSC), is particularly susceptible to theft and leakage via screenshots, a…

    Submitted 12 September, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

  30. arXiv:2409.00947  [pdf, other]

    cs.CV cs.AI

    XNet v2: Fewer Limitations, Better Results and Greater Universality

    Authors: Yanfeng Zhou, Lingrui Li, Zichen Wang, Guole Liu, Ziwen Liu, Ge Yang

    Abstract: XNet introduces a wavelet-based X-shaped unified architecture for fully- and semi-supervised biomedical segmentation. So far, however, XNet still faces the limitations, including performance degradation when images lack high-frequency (HF) information, underutilization of raw images and insufficient fusion. To address these issues, we propose XNet v2, a low- and high-frequency complementary model.…

    Submitted 2 September, 2024; originally announced September 2024.

  31. arXiv:2409.00620  [pdf, other]

    cs.CV cs.AI

    Enhancing Vectorized Map Perception with Historical Rasterized Maps

    Authors: Xiaoyu Zhang, Guangwei Liu, Zihao Liu, Ningyi Xu, Yunhui Liu, Ji Zhao

    Abstract: In autonomous driving, there is growing interest in end-to-end online vectorized map perception in bird's-eye-view (BEV) space, with an expectation that it could replace traditional high-cost offline high-definition (HD) maps. However, the accuracy and robustness of these methods can be easily compromised in challenging conditions, such as occlusion or adverse weather, when relying only on onboard…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: Accepted by ECCV 2024

  32. arXiv:2408.15998  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

    Authors: Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu

    Abstract: The ability to accurately interpret complex visual information is a crucial topic of multimodal large language models (MLLMs). Recent work indicates that enhanced visual perception significantly reduces hallucinations and improves performance on resolution-sensitive tasks, such as optical character recognition and document analysis. A number of recent MLLMs achieve this goal using a mixture of vis…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: Github: https://github.com/NVlabs/Eagle, HuggingFace: https://huggingface.co/NVEagle

  33. arXiv:2408.15696  [pdf, other]

    cs.CY

    Comparing diversity, negativity, and stereotypes in Chinese-language AI technologies: a case study on Baidu, Ernie and Qwen

    Authors: Geng Liu, Carlo Alberto Bono, Francesco Pierri

    Abstract: Large Language Models (LLMs) and search engines have the potential to perpetuate biases and stereotypes by amplifying existing prejudices in their training data and algorithmic processes, thereby influencing public perception and decision-making. While most work has focused on Western-centric AI technologies, we study Chinese-based tools by investigating social biases embedded in the major Chinese…

    Submitted 28 August, 2024; originally announced August 2024.

  34. arXiv:2408.15538  [pdf, other]

    cs.AI cs.MA

    TrafficGamer: Reliable and Flexible Traffic Simulation for Safety-Critical Scenarios with Game-Theoretic Oracles

    Authors: Guanren Qiao, Guorui Quan, Jiawei Yu, Shujun Jia, Guiliang Liu

    Abstract: While modern Autonomous Vehicle (AV) systems can develop reliable driving policies under regular traffic conditions, they frequently struggle with safety-critical traffic scenarios. This difficulty primarily arises from the rarity of such scenarios in driving datasets and the complexities associated with predictive modeling among multiple vehicles. To support the testing and refinement of AV polic…

    Submitted 21 October, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

  35. arXiv:2408.14023  [pdf, other]

    cs.CV cs.AI

    Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos

    Authors: Jiajun Fei, Dian Li, Zhidong Deng, Zekun Wang, Gang Liu, Hui Wang

    Abstract: Multi-modal large language models (MLLMs) have demonstrated considerable potential across various downstream tasks that require cross-domain knowledge. MLLMs capable of processing videos, known as Video-MLLMs, have attracted broad interest in video-language understanding. However, videos, especially long videos, contain more visual tokens than images, making them difficult for LLMs to process. Exi…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: 10 pages, 5 figures

  36. arXiv:2408.13727  [pdf, other]

    cs.SE cs.AI

    LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models

    Authors: Aoxiao Zhong, Dengyao Mo, Guiyang Liu, Jinbu Liu, Qingda Lu, Qi Zhou, Jiesheng Wu, Quanzheng Li, Qingsong Wen

    Abstract: Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose sig…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM KDD 2024

  37. arXiv:2408.12161  [pdf, other]

    cs.CV

    Rebalancing Multi-Label Class-Incremental Learning

    Authors: Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Junzhou Xie, Yixi Shen, Fuyuan Hu, Guangcan Liu

    Abstract: Multi-label class-incremental learning (MLCIL) is essential for real-world multi-label applications, allowing models to learn new labels while retaining previously learned knowledge continuously. However, recent MLCIL approaches can only achieve suboptimal performance due to the oversight of the positive-negative imbalance problem, which manifests at both the label and loss levels because of the t…

    Submitted 22 August, 2024; originally announced August 2024.

  38. arXiv:2408.12152  [pdf, other]

    cs.IR

    Behavior Pattern Mining-based Multi-Behavior Recommendation

    Authors: Haojie Li, Zhiyong Cheng, Xu Yu, Jinhuan Liu, Guanfeng Liu, Junwei Du

    Abstract: Multi-behavior recommendation systems enhance effectiveness by leveraging auxiliary behaviors (such as page views and favorites) to address the limitations of traditional models that depend solely on sparse target behaviors like purchases. Existing approaches to multi-behavior recommendations typically follow one of two strategies: some derive initial node representations from individual behavior…

    Submitted 22 August, 2024; originally announced August 2024.

  39. arXiv:2408.12124  [pdf, other]

    cs.LG cs.HC eess.SP

    Recording Brain Activity While Listening to Music Using Wearable EEG Devices Combined with Bidirectional Long Short-Term Memory Networks

    Authors: Jingyi Wang, Zhiqun Wang, Guiran Liu

    Abstract: Electroencephalography (EEG) signals are crucial for investigating brain function and cognitive processes. This study aims to address the challenges of efficiently recording and analyzing high-dimensional EEG signals while listening to music to recognize emotional states. We propose a method combining Bidirectional Long Short-Term Memory (Bi-LSTM) networks with attention mechanisms for EEG signal…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: 15 pages

  40. arXiv:2408.11302  [pdf, other]

    cs.LG cs.CY

    Modeling Reference-dependent Choices with Graph Neural Networks

    Authors: Liang Zhang, Guannan Liu, Junjie Wu, Yong Tan

    Abstract: While the classic Prospect Theory has highlighted the reference-dependent and comparative nature of consumers' product evaluation processes, few models have successfully integrated this theoretical hypothesis into data-driven preference quantification, particularly in the realm of recommender systems development. To bridge this gap, we propose a new research problem of modeling reference-dependent…

    Submitted 20 August, 2024; originally announced August 2024.

  41. arXiv:2408.09878  [pdf, other]

    cs.CR

    Transferring Backdoors between Large Language Models by Knowledge Distillation

    Authors: Pengzhou Cheng, Zongru Wu, Tianjie Ju, Wei Du, Zhuosheng Zhang, Gongshen Liu

    Abstract: Backdoor Attacks have been a serious vulnerability against Large Language Models (LLMs). However, previous methods only reveal such risk in specific models, or present tasks transferability after attacking the pre-trained phase. So, how risky is the model transferability of a backdoor attack? In this paper, we focus on whether existing mini-LLMs may be unconsciously instructed in backdoor knowledg…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 13 pages, 16 figures, 5 tables

  42. arXiv:2408.08930  [pdf, other]

    cs.CR cs.AI cs.CL

    DePrompt: Desensitization and Evaluation of Personal Identifiable Information in Large Language Model Prompts

    Authors: Xiongtao Sun, Gan Liu, Zhipeng He, Hui Li, Xiaoguang Li

    Abstract: Prompt serves as a crucial link in interacting with large language models (LLMs), widely impacting the accuracy and interpretability of model outputs. However, acquiring accurate and high-quality responses necessitates precise prompts, which inevitably pose significant risks of personal identifiable information (PII) leakage. Therefore, this paper proposes DePrompt, a desensitization protection an…

    Submitted 15 August, 2024; originally announced August 2024.

  43. arXiv:2408.08661  [pdf, other]

    cs.CL cs.CR cs.LG

    MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector

    Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang

    Abstract: The increasing parameters and expansive dataset of large language models (LLMs) highlight the urgent demand for a technical solution to audit the underlying privacy risks and copyright issues associated with LLMs. Existing studies have partially addressed this need through an exploration of the pre-training data detection problem, which is an instance of a membership inference attack (MIA). This p…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: code and dataset: https://github.com/wjfu99/MIA-Tuner

  44. arXiv:2408.07410  [pdf, other]

    cs.CL

    Aquila2 Technical Report

    Authors: Bo-Wen Zhang, Liangdong Wang, Jijie Li, Shuhao Gu, Xinya Wu, Zhengduo Zhang, Boyan Gao, Yulong Ao, Guang Liu

    Abstract: This paper introduces the Aquila2 series, which comprises a wide range of bilingual models with parameter sizes of 7, 34, and 70 billion. These models are trained based on an innovative framework named HeuriMentor (HM), which offers real-time insights into model convergence and enhances the training process and data management. The HM System, comprising the Adaptive Training Engine (ATE), Training…

    Submitted 14 August, 2024; originally announced August 2024.

  45. InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

    Authors: Bo-Wen Zhang, Yan Yan, Lin Li, Guang Liu

    Abstract: Recent advancements in Chain-of-Thoughts (CoT) and Program-of-Thoughts (PoT) methods have greatly enhanced language models' mathematical reasoning capabilities, facilitating their integration into instruction tuning datasets with LLMs. However, existing methods for large-scale dataset creation require substantial seed data and high computational costs for data synthesis, posing significant challen…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: Accepted by CIKM 2024

    ACM Class: I.2.7

  46. arXiv:2408.06567  [pdf, other]

    cs.CL cs.AI

    AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies

    Authors: Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, Chengwei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, et al. (2 additional authors not shown)

    Abstract: In recent years, with the rapid application of large language models across various fields, the scale of these models has gradually increased, and the resources required for their pre-training have grown exponentially. Training an LLM from scratch will cost a lot of computation resources while scaling up from a smaller model is a more efficient approach and has thus attracted significant attention…

    Submitted 12 August, 2024; originally announced August 2024.

  47. arXiv:2408.06360  [pdf, other]

    cs.IR cs.CV

    Modality-Balanced Learning for Multimedia Recommendation

    Authors: Jinghao Zhang, Guofan Liu, Qiang Liu, Shu Wu, Liang Wang

    Abstract: Many recommender models have been proposed to investigate how to incorporate multimodal content information into traditional collaborative filtering framework effectively. The use of multimodal information is expected to provide more comprehensive information and lead to superior performance. However, the integration of multiple modalities often encounters the modal imbalance problem: since the in…

    Submitted 26 July, 2024; originally announced August 2024.

    Comments: ACM Multimedia 2024 (Oral)

  48. arXiv:2408.05804  [pdf, other]

    cs.LG cs.AI

    A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

    Authors: Grace Liu, Michael Tang, Benjamin Eysenbach

    Abstract: In this paper, we present empirical evidence of skills and directed exploration emerging from a simple RL algorithm long before any successful trials are observed. For example, in a manipulation task, the agent is given a single observation of the goal state and learns skills, first for moving its end-effector, then for pushing the block, and finally for picking up and placing the block. These ski…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: Code and videos: https://graliuce.github.io/sgcrl/

  49. arXiv:2408.01435  [pdf, other]

    cs.CV cs.RO

    A New Clustering-based View Planning Method for Building Inspection with Drone

    Authors: Yongshuai Zheng, Guoliang Liu, Yan Ding, Guohui Tian

    Abstract: With the rapid development of drone technology, the application of drones equipped with visual sensors for building inspection and surveillance has attracted much attention. View planning aims to find a set of near-optimal viewpoints for vision-related tasks to achieve the vision coverage goal. This paper proposes a new clustering-based two-step computational method using spectral clustering, loca…

    Submitted 19 July, 2024; originally announced August 2024.

  50. arXiv:2407.17333  [pdf, other]

    cs.LG

    Global Confidence Degree Based Graph Neural Network for Financial Fraud Detection

    Authors: Jiaxun Liu, Yue Tian, Guanjun Liu

    Abstract: Graph Neural Networks (GNNs) are widely used in financial fraud detection due to their excellent ability on handling graph-structured financial data and modeling multilayer connections by aggregating information of neighbors. However, these GNN-based methods focus on extracting neighbor-level information but neglect a global perspective. This paper presents the concept and calculation formula of G…

    Submitted 18 August, 2024; v1 submitted 24 July, 2024; originally announced July 2024.