
Showing 1–50 of 344 results for author: Wei, W

  1. arXiv:2410.14200  [pdf, other]

    eess.IV cs.CL cs.CV

    E3D-GPT: Enhanced 3D Visual Foundation for Medical Vision-Language Model

    Authors: Haoran Lai, Zihang Jiang, Qingsong Yao, Rongsheng Wang, Zhiyang He, Xiaodong Tao, Wei Wei, Weifu Lv, S. Kevin Zhou

    Abstract: The development of 3D medical vision-language models holds significant potential for disease diagnosis and patient treatment. However, compared to 2D medical images, 3D medical images, such as CT scans, face challenges related to limited training data and high dimension, which severely restrict the progress of 3D medical vision-language models. To address these issues, we collect a large amount of…

    Submitted 18 October, 2024; originally announced October 2024.

  2. arXiv:2410.14052  [pdf, other]

    cs.CL cs.AI cs.LG

    From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs

    Authors: Alireza Rezazadeh, Zichao Li, Wei Wei, Yujia Bao

    Abstract: Recent advancements in large language models have significantly improved their context windows, yet challenges in effective long-term memory management remain. We introduce MemTree, an algorithm that leverages a dynamic, tree-structured memory representation to optimize the organization, retrieval, and integration of information, akin to human cognitive schemas. MemTree organizes memory hierarchic…

    Submitted 17 October, 2024; originally announced October 2024.

  3. arXiv:2410.13122  [pdf, other]

    cs.CV cs.LG

    Boosting Imperceptibility of Stable Diffusion-based Adversarial Examples Generation with Momentum

    Authors: Nashrah Haque, Xiang Li, Zhehui Chen, Yanzhao Wu, Lei Yu, Arun Iyengar, Wenqi Wei

    Abstract: We propose a novel framework, Stable Diffusion-based Momentum Integrated Adversarial Examples (SD-MIAE), for generating adversarial examples that can effectively mislead neural network classifiers while maintaining visual imperceptibility and preserving the semantic similarity to the original class label. Our method leverages the text-to-image generation capabilities of the Stable Diffusion model…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 10 pages, 12 figures. To be published in IEEE TPS 2024 Proceedings. Code available on GitHub: https://github.com/nashrahhaque/SD-MIAE

  4. arXiv:2410.11143  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    LLM Unlearning via Loss Adjustment with Only Forget Data

    Authors: Yaxuan Wang, Jiaheng Wei, Chris Yuhao Liu, Jinlong Pang, Quan Liu, Ankit Parag Shah, Yujia Bao, Yang Liu, Wei Wei

    Abstract: Unlearning in Large Language Models (LLMs) is essential for ensuring ethical and responsible AI use, especially in addressing privacy leak, bias, safety, and evolving regulations. Existing approaches to LLM unlearning often rely on retain data or a reference LLM, yet they struggle to adequately balance unlearning performance with overall model utility. This challenge arises because leveraging expl…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: Paper under review

  5. arXiv:2410.10877  [pdf, other]

    cs.CL cs.AI

    Improving Data Efficiency via Curating LLM-Driven Rating Systems

    Authors: Jinlong Pang, Jiaheng Wei, Ankit Parag Shah, Zhaowei Zhu, Yaxuan Wang, Chen Qian, Yang Liu, Yujia Bao, Wei Wei

    Abstract: Instruction tuning is critical for adapting large language models (LLMs) to downstream tasks, and recent studies have demonstrated that small amounts of human-curated data can outperform larger datasets, challenging traditional data scaling laws. While LLM-based data quality rating systems offer a cost-effective alternative to human annotation, they often suffer from inaccuracies and biases, even…

    Submitted 9 October, 2024; originally announced October 2024.

  6. arXiv:2410.07194  [pdf]

    cs.CV cs.AI

    Technical Report: Competition Solution For Modelscope-Sora

    Authors: Shengfu Chen, Hailong Liu, Wenzhao Wei

    Abstract: This report presents the approach adopted in the Modelscope-Sora challenge, which focuses on fine-tuning data for video generation models. The challenge evaluates participants' ability to analyze, clean, and generate high-quality datasets for video-based text-to-video tasks under specific computational constraints. The provided methodology involves data processing techniques such as video descript…

    Submitted 23 September, 2024; originally announced October 2024.

  7. arXiv:2410.04936  [pdf, other]

    cs.AI

    Training Interactive Agent in Large FPS Game Map with Rule-enhanced Reinforcement Learning

    Authors: Chen Zhang, Huan Hu, Yuan Zhou, Qiyang Cao, Ruochen Liu, Wenya Wei, Elvis S. Liu

    Abstract: In the realm of competitive gaming, 3D first-person shooter (FPS) games have gained immense popularity, prompting the development of game AI systems to enhance gameplay. However, deploying game AI in practical scenarios still poses challenges, particularly in large-scale and complex FPS games. In this paper, we focus on the practical deployment of game AI in the online multiplayer competitive 3D F…

    Submitted 7 October, 2024; originally announced October 2024.

  8. arXiv:2410.04324  [pdf, other]

    cs.SD cs.AI eess.AS

    SONAR: A Synthetic AI-Audio Detection Framework and Benchmark

    Authors: Xiang Li, Pin-Yu Chen, Wenqi Wei

    Abstract: Recent advances in Text-to-Speech (TTS) and Voice-Conversion (VC) using generative Artificial Intelligence (AI) technology have made it possible to generate high-quality and realistic human-like audio. This introduces significant challenges to distinguishing AI-synthesized speech from the authentic human voice and could raise potential issues of misuse for malicious purposes such as impersonation…

    Submitted 10 October, 2024; v1 submitted 5 October, 2024; originally announced October 2024.

  9. arXiv:2409.15821  [pdf, other]

    cs.RO

    Intention-based and Risk-Aware Trajectory Prediction for Autonomous Driving in Complex Traffic Scenarios

    Authors: Wen Wei, Jiankun Wang

    Abstract: Accurately predicting the trajectory of surrounding vehicles is a critical challenge for autonomous vehicles. In complex traffic scenarios, there are two significant issues with the current autonomous driving system: the cognitive uncertainty of prediction and the lack of risk awareness, which limit the further development of autonomous driving. To address this challenge, we introduce a novel traj…

    Submitted 24 September, 2024; originally announced September 2024.

  10. Data Poisoning and Leakage Analysis in Federated Learning

    Authors: Wenqi Wei, Tiansheng Huang, Zachary Yahn, Anoop Singhal, Margaret Loper, Ling Liu

    Abstract: Data poisoning and leakage risks impede the massive deployment of federated learning in the real world. This chapter reveals the truths and pitfalls of understanding two dominating threats: training data privacy intrusion and training data poisoning. We first investigate training data privacy threat and present our observations on when and how training data may be leaked during the cou…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: Chapter of Handbook of Trustworthy Federated Learning

  11. arXiv:2409.12760  [pdf, other]

    cs.CV

    COCO-Occ: A Benchmark for Occluded Panoptic Segmentation and Image Understanding

    Authors: Wenbo Wei, Jun Wang, Abhir Bhalerao

    Abstract: To help address the occlusion problem in panoptic segmentation and image understanding, this paper proposes a new large-scale dataset, COCO-Occ, which is derived from the COCO dataset by manually labelling the COCO images into three perceived occlusion levels. Using COCO-Occ, we systematically assess and quantify the impact of occlusion on panoptic segmentation on samples having different levels o…

    Submitted 19 September, 2024; originally announced September 2024.

  12. arXiv:2409.10071  [pdf, other]

    cs.CV cs.RO

    Towards Physically-Realizable Adversarial Attacks in Embodied Vision Navigation

    Authors: Meng Chen, Jiawei Tu, Chao Qi, Yonghao Dang, Feng Zhou, Wei Wei, Jianqin Yin

    Abstract: The deployment of embodied navigation agents in safety-critical environments raises concerns about their vulnerability to adversarial attacks on deep neural networks. However, current attack methods often lack practicality due to challenges in transitioning from the digital to the physical world, while existing physical attacks for object detection fail to achieve both multi-view effectiveness and…

    Submitted 19 September, 2024; v1 submitted 16 September, 2024; originally announced September 2024.

    Comments: 8 pages, 6 figures, submitted to the 2025 IEEE International Conference on Robotics & Automation (ICRA)

  13. arXiv:2409.09731  [pdf, other]

    eess.IV cs.CV

    Learning Two-factor Representation for Magnetic Resonance Image Super-resolution

    Authors: Weifeng Wei, Heng Chen, Pengxiang Su

    Abstract: Magnetic Resonance Imaging (MRI) requires a trade-off between resolution, signal-to-noise ratio, and scan time, making high-resolution (HR) acquisition challenging. Therefore, super-resolution for MR images is a feasible solution. However, most existing methods face challenges in accurately learning a continuous volumetric representation from low-resolution images or require HR images for supervision…

    Submitted 15 September, 2024; originally announced September 2024.

  14. arXiv:2409.09326  [pdf, other]

    cs.CV

    LawDNet: Enhanced Audio-Driven Lip Synthesis via Local Affine Warping Deformation

    Authors: Deng Junli, Luo Yihao, Yang Xueting, Li Siyou, Wang Wei, Guo Jinyang, Shi Ping

    Abstract: In the domain of photorealistic avatar generation, the fidelity of audio-driven lip motion synthesis is essential for realistic virtual interactions. Existing methods face two key challenges: a lack of vivacity due to limited diversity in generated lip poses and noticeable anamorphose motions caused by poor temporal coherence. To address these issues, we propose LawDNet, a novel deep-learning arch…

    Submitted 14 September, 2024; originally announced September 2024.

  15. arXiv:2409.06748  [pdf, other]

    cs.LG cs.AI

    EasyST: A Simple Framework for Spatio-Temporal Prediction

    Authors: Jiabin Tang, Wei Wei, Lianghao Xia, Chao Huang

    Abstract: Spatio-temporal prediction is a crucial research area in data-driven urban computing, with implications for transportation, public safety, and environmental monitoring. However, scalability and generalization challenges remain significant obstacles. Advanced models often rely on Graph Neural Networks to encode spatial and temporal correlations, but struggle with the increased complexity of large-s…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: Accepted by CIKM'2024, full paper

  16. arXiv:2408.17150  [pdf, other]

    cs.CV cs.AI

    Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning

    Authors: Xiaoye Qu, Jiashuo Sun, Wei Wei, Yu Cheng

    Abstract: Recently, Large Vision-Language Models (LVLMs) have demonstrated impressive capabilities in multi-modal context comprehension. However, they still suffer from hallucination problems, that is, generating outputs that are inconsistent with the image content. To mitigate hallucinations, previous studies mainly focus on retraining LVLMs with custom datasets. Although effective, they inherently come with add…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 13 pages, 7 tables, 7 figures

  17. arXiv:2408.02456  [pdf, other]

    cs.LG cs.AI

    Enhancing Heterogeneous Knowledge Graph Completion with a Novel GAT-based Approach

    Authors: Wanxu Wei, Yitong Song, Bin Yao

    Abstract: Knowledge graphs (KGs) play a vital role in enhancing search results and recommendation systems. With the rapid increase in the size of KGs, they are becoming inaccurate and incomplete. This problem can be solved by knowledge graph completion methods, of which graph attention network (GAT)-based methods stand out owing to their superior performance. However, existing GAT-based knowledge graph…

    Submitted 5 August, 2024; originally announced August 2024.

    Journal ref: ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4, 2024

  18. arXiv:2408.00555  [pdf, ps, other]

    cs.CV cs.AI cs.CL

    Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation

    Authors: Xiaoye Qu, Qiyuan Chen, Wei Wei, Jishuo Sun, Jianfeng Dong

    Abstract: Despite the remarkable ability of large vision-language models (LVLMs) in image comprehension, these models frequently generate plausible yet factually incorrect responses, a phenomenon known as hallucination. Recently, in large language models (LLMs), augmenting LLMs by retrieving information from external knowledge resources has been proven as a promising solution to mitigate hallucinations. Howev…

    Submitted 1 August, 2024; originally announced August 2024.

  19. arXiv:2408.00550  [pdf, other]

    cs.CV cs.AI cs.CL

    Mitigating Multilingual Hallucination in Large Vision-Language Models

    Authors: Xiaoye Qu, Mingyang Song, Wei Wei, Jianfeng Dong, Yu Cheng

    Abstract: While Large Vision-Language Models (LVLMs) have exhibited remarkable capabilities across a wide range of tasks, they suffer from hallucination problems, where models generate plausible yet incorrect answers given the input image-query pair. This hallucination phenomenon is even more severe when querying the image in non-English languages, while existing methods for mitigating hallucinations in LVL…

    Submitted 1 August, 2024; originally announced August 2024.

  20. arXiv:2408.00118  [pdf, other]

    cs.CL cs.AI

    Gemma 2: Improving Open Language Models at a Practical Size

    Authors: Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, et al. (173 additional authors not shown)

    Abstract: In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We al…

    Submitted 2 October, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

  21. arXiv:2407.18921  [pdf, other]

    cs.NI cs.AI cs.LG

    Mobile Edge Intelligence for Large Language Models: A Contemporary Survey

    Authors: Guanqiao Qu, Qiyuan Chen, Wei Wei, Zheng Lin, Xianhao Chen, Kaibin Huang

    Abstract: On-device large language models (LLMs), referring to running LLMs on edge devices, have raised considerable interest owing to their superior privacy, reduced latency, and bandwidth saving. Nonetheless, the capabilities of on-device LLMs are intrinsically constrained by the limited capacity of edge devices compared to the much more powerful cloud centers. To bridge the gap between cloud-based and o…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 37 pages, 13 figures

  22. arXiv:2407.12829  [pdf, other]

    cs.AR cs.ET

    PICO-RAM: A PVT-Insensitive Analog Compute-In-Memory SRAM Macro with In-Situ Multi-Bit Charge Computing and 6T Thin-Cell-Compatible Layout

    Authors: Zhiyu Chen, Ziyuan Wen, Weier Wan, Akhil Reddy Pakala, Yiwei Zou, Wei-Chen Wei, Zengyi Li, Yubei Chen, Kaiyuan Yang

    Abstract: Analog compute-in-memory (CIM) in static random-access memory (SRAM) is promising for accelerating deep learning inference by circumventing the memory wall and exploiting ultra-efficient analog low-precision arithmetic. Latest analog CIM designs attempt bit-parallel schemes for multi-bit analog Matrix-Vector Multiplication (MVM), aiming at higher energy efficiency, throughput, and training simplic…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: This manuscript has been accepted to IEEE Journal of Solid-State Circuits (JSSC)

  23. arXiv:2407.01178  [pdf, other]

    cs.CL cs.AI cs.LG

    $\text{Memory}^3$: Language Modeling with Explicit Memory

    Authors: Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang, Weinan E

    Abstract: The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowled…

    Submitted 1 July, 2024; originally announced July 2024.

    MSC Class: 68T50 ACM Class: I.2.7

  24. arXiv:2407.01067  [pdf, other]

    cs.AI cs.CL cs.CV cs.HC cs.LG

    Human-like object concept representations emerge naturally in multimodal large language models

    Authors: Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, Jinpeng Li, Shuang Qiu, Le Chang, Huiguang He

    Abstract: The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the attractive question of whether these models can also develop human-like object representations through exposure to vas…

    Submitted 1 July, 2024; originally announced July 2024.

  25. arXiv:2406.15480  [pdf, other]

    cs.CL cs.AI cs.LG

    On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

    Authors: Chenghao Fan, Zhenyi Lu, Wei Wei, Jie Tian, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. Can we fine-tune a series of task-specific small models and transfer their…

    Submitted 14 October, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted by NeurIPS 2024

  26. arXiv:2406.15479  [pdf, other]

    cs.CL cs.AI cs.LG

    Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

    Authors: Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these i…

    Submitted 14 October, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 poster

  27. arXiv:2406.13672  [pdf, other]

    cs.CV

    Q-SNNs: Quantized Spiking Neural Networks

    Authors: Wenjie Wei, Yu Liang, Ammar Belatreche, Yichen Xiao, Honglin Cao, Zhenbang Ren, Guoqing Wang, Malu Zhang, Yang Yang

    Abstract: Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to represent information and process them in an asynchronous event-driven manner, offering an energy-efficient paradigm for the next generation of machine intelligence. However, the current focus within the SNN community prioritizes accuracy optimization through the development of large-scale models, limiting their viability in r…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  28. arXiv:2406.13179  [pdf, other]

    cs.SD cs.AI cs.NE eess.AS

    Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting

    Authors: Shuai Wang, Dehao Zhang, Kexin Shi, Yuchen Wang, Wenjie Wei, Jibin Wu, Malu Zhang

    Abstract: Thanks to Deep Neural Networks (DNNs), the accuracy of Keyword Spotting (KWS) has made substantial progress. However, as KWS systems are usually implemented on edge devices, energy efficiency becomes a critical requirement besides performance. Here, we take advantage of spiking neural networks' energy efficiency and propose an end-to-end lightweight KWS model. The model consists of two innovative…

    Submitted 18 June, 2024; originally announced June 2024.

  29. arXiv:2406.12189  [pdf, other]

    cs.DC

    Energy-aware Incremental OTA Update for Flash-based Batteryless IoT Devices

    Authors: Wei Wei, Jishnu Banerjee, Sahidul Islam, Chen Pan, Mimi Xie

    Abstract: Over-the-air (OTA) firmware updates are essential for updating and maintaining IoT devices, especially those batteryless devices reliant on energy harvesting power sources. Flash memory, favored for its low cost and high density, is extensively used for data storage in many IoT devices. However, due to its high energy demands for update operations, there is often insufficient energy for code updat…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6 pages

  30. arXiv:2406.11781  [pdf, other]

    cs.IR

    DiffMM: Multi-Modal Diffusion Model for Recommendation

    Authors: Yangqin Jiang, Lianghao Xia, Wei Wei, Da Luo, Kangyi Lin, Chao Huang

    Abstract: The rise of online multi-modal sharing platforms like TikTok and YouTube has enabled personalized recommender systems to incorporate multiple modalities (such as visual, textual, and acoustic) into user representations. However, addressing the challenge of data sparsity in these systems remains a key issue. To address this limitation, recent research has introduced self-supervised learning techniq…

    Submitted 17 June, 2024; originally announced June 2024.

  31. arXiv:2406.09860  [pdf, other]

    cs.LG cs.AI cs.CV

    Dataset Condensation with Latent Quantile Matching

    Authors: Wei Wei, Tom De Schepper, Kevin Mets

    Abstract: Dataset condensation (DC) methods aim to learn a smaller synthesized dataset with informative data records to accelerate the training of machine learning models. Current distribution matching (DM) based DC methods learn a synthesized dataset by matching the mean of the latent embeddings between the synthetic and the real dataset. However, two distributions with the same mean can still be vastly dif…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPR Workshop 2024: 1st Workshop on Dataset Distillation for Computer Vision

  32. arXiv:2406.09126  [pdf, other]

    cs.CV

    Auto-Vocabulary Segmentation for LiDAR Points

    Authors: Weijie Wei, Osman Ülger, Fatemeh Karimi Nejadasl, Theo Gevers, Martin R. Oswald

    Abstract: Existing perception methods for autonomous driving fall short of recognizing unknown entities not covered in the training data. Open-vocabulary methods offer promising capabilities in detecting any object but are limited by user-specified queries representing target classes. We propose AutoVoc3D, a framework for automatic object class recognition and open-ended segmentation. Evaluation on nuScenes…

    Submitted 25 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPR 2024 OpenSun3D Workshop

  33. arXiv:2406.07001  [pdf, other]

    cs.CL cs.AI

    Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models

    Authors: Zhenyi Lu, Jie Tian, Wei Wei, Xiaoye Qu, Yu Cheng, Wenfeng Xie, Dangyang Chen

    Abstract: Text classification is a crucial task encountered frequently in practical scenarios, yet it is still under-explored in the era of large language models (LLMs). This study shows that LLMs are vulnerable to changes in the number and arrangement of options in text classification. Our extensive empirical analyses reveal that the key bottleneck arises from ambiguous decision boundaries and inherent bia…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  34. arXiv:2406.06559  [pdf, other]

    cs.CL cs.AI cs.LG

    Harnessing Business and Media Insights with Large Language Models

    Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

    Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users…

    Submitted 2 June, 2024; originally announced June 2024.

  35. arXiv:2406.03805  [pdf, other]

    cs.CR

    AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens

    Authors: Lin Lu, Hai Yan, Zenghui Yuan, Jiawen Shi, Wenqi Wei, Pin-Yu Chen, Pan Zhou

    Abstract: Jailbreak attacks in large language models (LLMs) entail inducing the models to generate content that breaches ethical and legal norms through the use of malicious prompts, posing a substantial threat to LLM security. Current strategies for jailbreak attack and defense often focus on optimizing locally within specific algorithmic frameworks, resulting in ineffective optimization and limited scalabi…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 32 pages, 2 figures

  36. arXiv:2406.02002  [pdf, other]

    cs.CL cs.AI

    Position Debiasing Fine-Tuning for Causal Perception in Long-Term Dialogue

    Authors: Shixuan Fan, Wei Wei, Wendi Li, Xian-Ling Mao, Wenfeng Xie, Dangyang Chen

    Abstract: The core of the dialogue system is to generate relevant, informative, and human-like responses based on extensive dialogue history. Recently, the dialogue generation domain has seen mainstream adoption of large language models (LLMs), due to their powerful capability in generating utterances. However, there is a natural deficiency for such models, that is, inherent position bias, which may lead them to…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to IJCAI 2024

  37. arXiv:2406.01988  [pdf, other]

    cs.CL cs.AI

    Personalized Topic Selection Model for Topic-Grounded Dialogue

    Authors: Shixuan Fan, Wei Wei, Xiaofei Wen, Xianling Mao, Jixiong Chen, Dangyang Chen

    Abstract: Recently, the topic-grounded dialogue (TGD) system has become increasingly popular owing to its powerful capability to actively guide users to accomplish specific tasks through topic-guided conversations. Most existing works utilize side information (e.g., topics or personas) in isolation to enhance the topic selection ability. However, due to disregarding the noise within these auxiliary information sour…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  38. arXiv:2406.01425  [pdf, other]

    cs.CV

    Sensitivity-Informed Augmentation for Robust Segmentation

    Authors: Laura Zheng, Wenjie Wei, Tony Wu, Jacob Clements, Shreelekha Revankar, Andre Harrison, Yu Shen, Ming C. Lin

    Abstract: Segmentation is an integral module in many visual computing applications such as virtual try-on, medical imaging, autonomous driving, and agricultural automation. These applications often involve either widespread consumer use or highly variable environments, both of which can degrade the quality of visual sensor data, whether from a common mobile phone or an expensive satellite imaging camera. In…

    Submitted 16 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 10 pages

  39. arXiv:2406.01213  [pdf, other]

    cs.CL cs.AI

    Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition

    Authors: Zhuojun Ding, Wei Wei, Xiaoye Qu, Dangyang Chen

    Abstract: Cross-lingual named entity recognition (NER) aims to train an NER model for the target language leveraging only labeled source language data and unlabeled target language data. Prior approaches either perform label projection on translated source language data or employ a source model to assign pseudo labels for target language data and train a target model on these pseudo-labeled data to generali…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024

  40. arXiv:2406.01027  [pdf, other]

    cs.DB cs.LG

    PRICE: A Pretrained Model for Cross-Database Cardinality Estimation

    Authors: Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou

    Abstract: Cardinality estimation (CardEst) is essential for optimizing query execution plans. Recent ML-based CardEst methods achieve high accuracy but face deployment challenges due to high preparation costs and lack of transferability across databases. In this paper, we propose PRICE, a PRetrained multI-table CardEst model, which addresses these limitations. PRICE takes low-level but transferable features…

    Submitted 3 June, 2024; originally announced June 2024.

  41. Knowledge Enhanced Multi-intent Transformer Network for Recommendation

    Authors: Ding Zou, Wei Wei, Feida Zhu, Chuanyu Xu, Tao Zhang, Chengfu Huo

    Abstract: Incorporating Knowledge Graphs into Recommendation has attracted growing attention in industry, due to the great potential of KG in providing abundant supplementary information and interpretability for the underlying models. However, simply integrating KG into recommendation usually brings in negative feedback in industry, due to ignoring the following two factors: i) users' multiple inten…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by The Web Conf 2024 (WWW 2024) Industry Track. arXiv admin note: text overlap with arXiv:2204.08807

  42. arXiv:2405.18757  [pdf, other]

    cs.RO

    Multi-objective Cross-task Learning via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation

    Authors: Jiawei Fu, Yonghao Long, Kai Chen, Wang Wei, Qi Dou

    Abstract: Surgical robot task automation has been a promising research topic for improving surgical efficiency and quality. Learning-based methods have been recognized as an interesting paradigm and been increasingly investigated. However, existing approaches encounter difficulties in long-horizon goal-conditioned tasks due to the intricate compositional structure, which requires decision-making for a seque…

    Submitted 29 May, 2024; originally announced May 2024.

  43. arXiv:2405.16707  [pdf, other]

    cs.CR

    Visualizing the Shadows: Unveiling Data Poisoning Behaviors in Federated Learning

    Authors: Xueqing Zhang, Junkai Zhang, Ka-Ho Chow, Juntao Chen, Ying Mao, Mohamed Rahouti, Xiang Li, Yuchen Liu, Wenqi Wei

    Abstract: This demo paper examines the susceptibility of Federated Learning (FL) systems to targeted data poisoning attacks, presenting a novel system for visualizing and mitigating such threats. We simulate targeted data poisoning attacks via label flipping and analyze the impact on model performance, employing a five-component system that includes Simulation and Data Generation, Data Collection and Upload…

    Submitted 26 May, 2024; originally announced May 2024.

  44. arXiv:2405.14170  [pdf, other]

    cs.AI cs.CL

    Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning

    Authors: Jiapu Wang, Kai Sun, Linhao Luo, Wei Wei, Yongli Hu, Alan Wee-Chung Liew, Shirui Pan, Baocai Yin

    Abstract: Temporal Knowledge Graph Reasoning (TKGR) is the process of utilizing temporal information to capture complex relations within a Temporal Knowledge Graph (TKG) to infer new knowledge. Conventional methods in TKGR typically depend on deep learning algorithms or temporal logical rules. However, deep learning-based TKGRs often lack interpretability, whereas rule-based TKGRs struggle to effectively le…

    Submitted 23 May, 2024; originally announced May 2024.

  45. arXiv:2405.11333  [pdf, other]

    cs.LG cs.AI

    GinAR: An End-To-End Multivariate Time Series Forecasting Model Suitable for Variable Missing

    Authors: Chengqing Yu, Fei Wang, Zezhi Shao, Tangwen Qian, Zhao Zhang, Wei Wei, Yongjun Xu

    Abstract: Multivariate time series forecasting (MTSF) is crucial for decision-making to precisely forecast the future values/trends, based on the complex relationships identified from historical observations of multiple sequences. Recently, Spatial-Temporal Graph Neural Networks (STGNNs) have gradually become the theme of MTSF model as their powerful capability in mining spatial-temporal dependencies, but a…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 (Research track)

  46. arXiv:2405.08638  [pdf, other]

    cs.LG

    vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement

    Authors: Yiwen Zhu, Jinyi Liu, Wenya Wei, Qianyi Fu, Yujing Hu, Zhou Fang, Bo An, Jianye Hao, Tangjie Lv, Changjie Fan

    Abstract: Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor in the policy improvement p…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024, with appendix

  47. arXiv:2405.04514  [pdf, other]

    quant-ph cs.DC

    Scalable Circuit Cutting and Scheduling in a Resource-constrained and Distributed Quantum System

    Authors: Shuwen Kan, Zefan Du, Miguel Palma, Samuel A Stein, Chenxu Liu, Wenqi Wei, Juntao Chen, Ang Li, Ying Mao

    Abstract: Despite quantum computing's rapid development, current systems remain limited in practical applications due to their limited qubit count and quality. Various technologies, such as superconducting, trapped ions, and neutral atom quantum computing technologies are progressing towards a fault tolerant era, however they all face a diverse set of challenges in scalability and control. Recent efforts ha…

    Submitted 7 May, 2024; originally announced May 2024.

  48. arXiv:2404.17876  [pdf, other]

    cs.CV

    DF-SLAM: Dictionary Factors Representation for High-Fidelity Neural Implicit Dense Visual SLAM System

    Authors: Weifeng Wei, Jie Wang, Shuqi Deng, Jie Liu

    Abstract: We introduce a high-fidelity neural implicit dense visual Simultaneous Localization and Mapping (SLAM) system, termed DF-SLAM. In our work, we employ dictionary factors for scene representation, encoding the geometry and appearance information of the scene as a combination of basis and coefficient factors. Compared to neural implicit dense visual SLAM methods that directly encode scene information…

    Submitted 25 June, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

  49. arXiv:2404.17136  [pdf, other]

    cs.DB cs.AI cs.CL

    Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study

    Authors: Yang Wu, Yao Wan, Hongyu Zhang, Yulei Sui, Wucai Wei, Wei Zhao, Guandong Xu, Hai Jin

    Abstract: The Natural Language to Visualization (NL2Vis) task aims to transform natural-language descriptions into visual representations for a grounded table, enabling users to gain insights from vast amounts of data. Recently, many deep learning-based approaches have been developed for NL2Vis. Despite the considerable efforts made by these approaches, challenges persist in visualizing data sourced from un…

    Submitted 25 April, 2024; originally announced April 2024.

  50. arXiv:2404.03354  [pdf, other]

    cs.IR cs.AI

    A Comprehensive Survey on Self-Supervised Learning for Recommendation

    Authors: Xubin Ren, Wei Wei, Lianghao Xia, Chao Huang

    Abstract: Recommender systems play a crucial role in tackling the challenge of information overload by delivering personalized recommendations based on individual user preferences. Deep learning techniques, such as RNNs, GNNs, and Transformer architectures, have significantly propelled the advancement of recommender systems by enhancing their comprehension of user behaviors and preferences. However, supervi…

    Submitted 7 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.