Showing 1–50 of 97 results for author: Hua, W

  1. arXiv:2410.11843  [pdf, other]

    cs.HC cs.AI cs.DB cs.LG

    From Commands to Prompts: LLM-based Semantic File System for AIOS

    Authors: Zeru Shi, Kai Mei, Mingyu Jin, Yongye Su, Chaoji Zuo, Wenyue Hua, Wujiang Xu, Yujie Ren, Zirui Liu, Mengnan Du, Dong Deng, Yongfeng Zhang

    Abstract: Large language models (LLMs) have demonstrated significant potential in the development of intelligent applications and systems such as LLM-based agents and agent operating systems (AIOS). However, when these applications and systems interact with the underlying file system, the file system still remains the traditional paradigm: reliant on manual navigation through precise commands. This paradigm…

    Submitted 23 September, 2024; originally announced October 2024.

  2. arXiv:2410.04153  [pdf, other]

    cs.AI

    Neuro-Symbolic Entity Alignment via Variational Inference

    Authors: Shengyuan Chen, Qinggang Zhang, Junnan Dong, Wen Hua, Jiannong Cao, Xiao Huang

    Abstract: Entity alignment (EA) aims to merge two knowledge graphs (KGs) by identifying equivalent entity pairs. Existing methods can be categorized into symbolic and neural models. Symbolic models, while precise, struggle with substructure heterogeneity and sparsity, whereas neural models, although effective, generally lack interpretability and cannot handle uncertainty. We propose NeuSymEA, a probabilisti…

    Submitted 5 October, 2024; originally announced October 2024.

  3. arXiv:2410.00079  [pdf, other]

    cs.MA cs.AI cs.CL cs.HC cs.LG

    Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface

    Authors: Wenyue Hua, Mengting Wan, Shashank Vadrevu, Ryan Nadel, Yongfeng Zhang, Chi Wang

    Abstract: Agents, as user-centric tools, are increasingly deployed for human task delegation, assisting with a broad spectrum of requests by generating thoughts, engaging with user proxies, and producing action plans. However, agents based on large language models (LLMs) often face substantial planning latency due to two primary factors: the efficiency limitations of the underlying LLMs due to their large s…

    Submitted 30 September, 2024; originally announced October 2024.

    Comments: 27 pages, 22 figures

  4. arXiv:2409.18924  [pdf]

    cs.CL cs.AI

    AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

    Authors: Huizi Yu, Jiayan Zhou, Lingyao Li, Shan Chen, Jack Gallifant, Anye Shi, Xiang Li, Wenyue Hua, Mingyu Jin, Guang Chen, Yang Zhou, Zhao Li, Trisha Gupte, Ming-Li Chen, Zahra Azizi, Yongfeng Zhang, Themistocles L. Assimes, Xin Ma, Danielle S. Bitterman, Lin Lu, Lizhou Fan

    Abstract: Simulated patient systems play a crucial role in modern medical education and research, providing safe, integrative learning environments and enabling clinical decision-making simulations. Large Language Models (LLM) could advance simulated patient systems by replicating medical conditions and patient-doctor interactions with high fidelity and low cost. However, ensuring the effectiveness and trus…

    Submitted 1 October, 2024; v1 submitted 27 September, 2024; originally announced September 2024.

    Comments: 42 pages, 6 figures, 7 tables

  5. arXiv:2409.06123  [pdf, other]

    cs.LG

    Contrastive Federated Learning with Tabular Data Silos

    Authors: Achmad Ginanjar, Xue Li, Wen Hua

    Abstract: Learning from data silos is a difficult task for organizations that need to obtain knowledge of objects that appeared in multiple independent data silos. Objects in multi-organizations, such as government agents, are referred by different identifiers, such as driver license, passport number, and tax file number. The data distributions in data silos are mostly non-IID (Independently and Identically…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: 18 pages. Submitted to the Artificial Intelligence journal on Jan 29, 2024 (ARTINT-D-24-00098)

    MSC Class: 68A00 ACM Class: I.1.1

  6. arXiv:2407.18957  [pdf, other]

    q-fin.TR cs.AI cs.MA

    When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

    Authors: Chong Zhang, Xinyi Liu, Zhongmou Zhang, Mingyu Jin, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang

    Abstract: Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influence trading behaviors, are critical elements in the quest for maximizing investors' profits. Our work attempts to solve this problem through large langu…

    Submitted 20 September, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 33 pages, 10 figures

  7. arXiv:2407.12821  [pdf, other]

    cs.CL cs.AI cs.LG

    AutoFlow: Automated Workflow Generation for Large Language Model Agents

    Authors: Zelong Li, Shuyuan Xu, Kai Mei, Wenyue Hua, Balaji Rama, Om Raheja, Hao Wang, He Zhu, Yongfeng Zhang

    Abstract: Recent advancements in Large Language Models (LLMs) have shown significant progress in understanding complex natural language. One important application of LLM is LLM-based AI Agent, which leverages the ability of LLM as well as external tools for complex-task solving. To make sure LLM Agents follow an effective and reliable procedure to solve the given task, manually designed workflows are usuall…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Open source code available at https://github.com/agiresearch/AutoFlow

  8. arXiv:2407.11282  [pdf, other]

    cs.CL

    Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

    Authors: Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang

    Abstract: Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial. One commonly used method to assess the reliability of LLMs' responses is uncertainty estimation, which gauges the likelihood of their answers being correct. While many studies focus on improving the accuracy of uncertainty estimations for LLMs, our research investigates…

    Submitted 19 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  9. arXiv:2407.01016  [pdf, other]

    cs.CV

    SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

    Authors: Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-supervised object detection (SSOD), leveraging unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects common in aerial images unexplored. At the same time, the annotation cost of multi-oriented objects is significantly higher than that of their horizontal counterparts. Ther…

    Submitted 1 July, 2024; originally announced July 2024.

  10. arXiv:2406.14711  [pdf, other]

    cs.CL cs.AI cs.MA

    MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate

    Authors: Alfonso Amayuelas, Xianjun Yang, Antonis Antoniades, Wenyue Hua, Liangming Pan, William Wang

    Abstract: Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents, enabling interactions among multiple models to execute complex tasks. Such collaborations offer several advantages, including the use of sp…

    Submitted 26 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  11. arXiv:2406.04428  [pdf, other]

    cs.CL cs.AI

    MoralBench: Moral Evaluation of LLMs

    Authors: Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang

    Abstract: In the rapidly evolving field of artificial intelligence, large language models (LLMs) have emerged as powerful tools for a myriad of applications, from natural language processing to decision-making support systems. However, as these models become increasingly integrated into societal frameworks, the imperative to ensure they operate within ethical and moral boundaries has never been more critica…

    Submitted 6 June, 2024; originally announced June 2024.

  12. arXiv:2406.02787  [pdf, other]

    cs.CL cs.AI cs.LG

    Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities

    Authors: Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Shuhang Lin, Mingyu Jin, Haochen Xue, Zelong Li, JinDong Wang, Yongfeng Zhang

    Abstract: This study intends to systematically disentangle pure logic reasoning and text understanding by investigating the contrast across abstract and contextualized logical problems from a comprehensive set of domains. We explore whether LLMs demonstrate genuine reasoning capabilities across various domains when the underlying logical structure remains constant. We focus on two main questions (1) Can abs…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 22 pages, 9 figures

  13. arXiv:2405.16806  [pdf, other]

    cs.CL cs.AI

    Entity Alignment with Noisy Annotations from Large Language Models

    Authors: Shengyuan Chen, Qinggang Zhang, Junnan Dong, Wen Hua, Qing Li, Xiao Huang

    Abstract: Entity alignment (EA) aims to merge two knowledge graphs (KGs) by identifying equivalent entity pairs. While existing methods heavily rely on human-generated labels, it is prohibitively expensive to incorporate cross-domain experts for annotation in real-world scenarios. The advent of Large Language Models (LLMs) presents new avenues for automating EA with annotations, inspired by their comprehens…

    Submitted 28 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  14. arXiv:2405.03066  [pdf]

    cs.ET

    A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs)

    Authors: Lingyao Li, Jiayan Zhou, Zhenxiang Gao, Wenyue Hua, Lizhou Fan, Huizi Yu, Loni Hagen, Yongfeng Zhang, Themistocles L. Assimes, Libby Hemphill, Siyuan Ma

    Abstract: Electronic Health Records (EHRs) play an important role in the healthcare system. However, their complexity and vast volume pose significant challenges to data interpretation and analysis. Recent advancements in Artificial Intelligence (AI), particularly the development of Large Language Models (LLMs), open up new opportunities for researchers in this domain. Although prior studies have demonstrat…

    Submitted 22 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  15. arXiv:2404.15532  [pdf, other]

    cs.HC cs.AI cs.CL cs.CV cs.MA

    BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis

    Authors: Shuhang Lin, Wenyue Hua, Lingyao Li, Che-Jui Chang, Lizhou Fan, Jianchao Ji, Hang Hua, Mingyu Jin, Jiebo Luo, Yongfeng Zhang

    Abstract: This paper presents BattleAgent, an emulation system that combines the Large Vision-Language Model and Multi-agent System. This novel system aims to simulate complex dynamic interactions among multiple agents, as well as between agents and their environments, over a period of time. It emulates both the decision-making processes of leaders and the viewpoints of ordinary participants, such as soldie…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 26 pages, 14 figures. The data and code for this project are accessible at https://github.com/agiresearch/battleagent

  16. arXiv:2404.07066  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

    Authors: Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang

    Abstract: Large language models (LLMs) have shown remarkable performances across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexities remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexities in different layers, introducing the idea of "Concept Depth" to suggest that more complex concepts are…

    Submitted 16 September, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 16 pages

  17. arXiv:2404.01064  [pdf, other]

    cs.CV

    Roadside Monocular 3D Detection via 2D Detection Prompting

    Authors: Yechi Ma, Shuoquan Wei, Churun Zhang, Wei Hua, Yanan Li, Shu Kong

    Abstract: The problem of roadside monocular 3D detection requires detecting objects of interested classes in a 2D RGB frame and predicting their 3D information such as locations in bird's-eye-view (BEV). It has broad applications in traffic control, vehicle-vehicle communication, and vehicle-infrastructure cooperative perception. To approach this problem, we present a novel and simple method by prompting th…

    Submitted 4 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  18. arXiv:2403.19021  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    IDGenRec: LLM-RecSys Alignment with Textual ID Learning

    Authors: Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang

    Abstract: Generative recommendation based on Large Language Models (LLMs) have transformed the traditional ranking-based recommendation style into a text-to-text generation paradigm. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current research in generative recommendations struggles to effectively encode recommendation items within the text-to-text framework using…

    Submitted 17 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted in SIGIR 2024

  19. arXiv:2403.16303  [pdf]

    cs.DL cs.AI cs.CL cs.SI

    Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis

    Authors: Huizi Yu, Lizhou Fan, Lingyao Li, Jiayan Zhou, Zihui Ma, Lu Xian, Wenyue Hua, Sijia He, Mingyu Jin, Yongfeng Zhang, Ashvin Gandhi, Xin Ma

    Abstract: Large Language Models (LLMs) have rapidly become important tools in Biomedical and Health Informatics (BHI), enabling new ways to analyze data, treat patients, and conduct research. This study aims to provide a comprehensive overview of LLM applications in BHI, highlighting their transformative potential and addressing the associated ethical and practical challenges. We reviewed 1,698 research art…

    Submitted 27 July, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: 62 pages, 9 figures, 5 tables

  20. arXiv:2403.09439  [pdf, other]

    cs.CV cs.AI

    3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

    Authors: Frank Zhang, Yibo Zhang, Quan Zheng, Rui Ma, Wei Hua, Hujun Bao, Weiwei Xu, Changqing Zou

    Abstract: Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However, these methods heavily rely on the outputs of existing models, leading to error accumulation in geometry and appearance that prevent the models from being used i…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  21. arXiv:2403.01777  [pdf, other]

    cs.CL cs.CV

    NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models

    Authors: Lizhou Fan, Wenyue Hua, Xiang Li, Kaijie Zhu, Mingyu Jin, Lingyao Li, Haoyang Ling, Jinkui Chi, Jindong Wang, Xin Ma, Yongfeng Zhang

    Abstract: Understanding the reasoning capabilities of Multimodal Large Language Models (MLLMs) is an important area of research. In this study, we introduce a dynamic benchmark, NPHardEval4V, aimed at addressing the existing gaps in evaluating the pure reasoning abilities of MLLMs. Our benchmark aims to provide a venue to disentangle the effect of various factors such as image recognition and instruction fo…

    Submitted 5 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 16 pages, 10 figures, 2 tables

  22. arXiv:2402.13184  [pdf, other]

    cs.CL

    What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents

    Authors: Mingyu Jin, Beichen Wang, Zhaoqian Xue, Suiyuan Zhu, Wenyue Hua, Hua Tang, Kai Mei, Mengnan Du, Yongfeng Zhang

    Abstract: In this study, we introduce "CosmoAgent," an innovative artificial intelligence framework utilizing Large Language Models (LLMs) to simulate complex interactions between human and extraterrestrial civilizations, with a special emphasis on Stephen Hawking's cautionary advice about not sending radio signals haphazardly into the universe. The goal is to assess the feasibility of peaceful coexistence…

    Submitted 5 August, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  23. arXiv:2402.05868  [pdf, other]

    cs.CL cs.AI cs.CR cs.IR cs.LG

    EmojiCrypt: Prompt Encryption for Secure Communication with Large Language Models

    Authors: Guo Lin, Wenyue Hua, Yongfeng Zhang

    Abstract: Cloud-based large language models (LLMs) such as ChatGPT have increasingly become integral to daily operations, serving as vital tools across various applications. While these models offer substantial benefits in terms of accessibility and functionality, they also introduce significant privacy concerns: the transmission and storage of user data in cloud infrastructures pose substantial risks of da…

    Submitted 12 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 12 pages, 4 figures, 2 tables, comments and suggestions are welcome

  24. arXiv:2402.01586  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA

    TrustAgent: Towards Safe and Trustworthy LLM-based Agents

    Authors: Wenyue Hua, Xianjun Yang, Mingyu Jin, Zelong Li, Wei Cheng, Ruixiang Tang, Yongfeng Zhang

    Abstract: The rise of LLM-based agents shows great potential to revolutionize task planning, capturing significant attention. Given that these agents will be integrated into high-stake domains, ensuring their reliability and safety is crucial. This paper presents an Agent-Constitution-based agent framework, TrustAgent, with a particular focus on improving the LLM-based agent safety. The proposed framework e…

    Submitted 3 October, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: In EMNLP 2024

  25. arXiv:2402.00798  [pdf, other]

    cs.LG cs.AI cs.CL cs.FL

    Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents

    Authors: Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang

    Abstract: Recent advancements on Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks. However, since LLM's content generation process is hardly controllable, current LLM-based agents frequently generate invalid or non-executable plans, which jeopardizes the performance of the generated plans and corrupts users' trust in LLM-based agents…

    Submitted 12 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  26. arXiv:2402.00746  [pdf, other]

    cs.CL

    Health-LLM: Personalized Retrieval-Augmented Disease Prediction System

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Chong Zhang, Lizhou Fan, Wenyue Hua, Suiyuan Zhu, Yanda Meng, Zhenting Wang, Mengnan Du, Yongfeng Zhang

    Abstract: Recent advancements in artificial intelligence (AI), especially large language models (LLMs), have significantly advanced healthcare applications and demonstrated potentials in intelligent medical treatment. However, there are conspicuous challenges such as vast data volumes and inconsistent symptom characterization standards, preventing full integration of healthcare AI systems with individual pa…

    Submitted 30 September, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  27. arXiv:2402.00284  [pdf, other]

    cs.IR cs.AI cs.LG

    PAP-REC: Personalized Automatic Prompt for Recommendation Language Model

    Authors: Zelong Li, Jianchao Ji, Yingqiang Ge, Wenyue Hua, Yongfeng Zhang

    Abstract: Recently emerged prompt-based Recommendation Language Models (RLM) can solve multiple recommendation tasks uniformly. The RLMs make full use of the inherited knowledge learned from the abundant pre-training data to solve the downstream recommendation tasks by prompts, without introducing additional parameters or network training. However, handcrafted prompts require significant expertise and human…

    Submitted 31 January, 2024; originally announced February 2024.

  28. arXiv:2401.17585  [pdf, other]

    cs.CL cs.AI cs.LG stat.ME

    Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks

    Authors: Wenyue Hua, Jiang Guo, Mingwen Dong, Henghui Zhu, Patrick Ng, Zhiguo Wang

    Abstract: Current approaches of knowledge editing struggle to effectively propagate updates to interconnected facts. In this work, we delve into the barriers that hinder the appropriate propagation of updated knowledge within these models for accurate reasoning. To support our analysis, we introduce a novel reasoning-based benchmark -- ReCoE (Reasoning-based Counterfactual Editing dataset) -- which covers s…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 22 pages, 14 figures, 5 tables

  29. arXiv:2401.04925  [pdf, other]

    cs.CL cs.AI

    The Impact of Reasoning Step Length on Large Language Models

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Haiyan Zhao, Wenyue Hua, Yanda Meng, Yongfeng Zhang, Mengnan Du

    Abstract: Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of reasoning steps in prompts remains largely unknown. To shed light on this, we have conducted several empirical experiments to explore the relations. Specifically, we design experiments that expand and compress the ra…

    Submitted 22 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Findings of ACL 2024

  30. arXiv:2401.04361  [pdf, other]

    cs.CL cs.AI

    Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive Learning

    Authors: Jiaan Wang, Jianfeng Qu, Kexin Wang, Zhixu Li, Wen Hua, Ximing Li, An Liu

    Abstract: Knowledge-grounded dialogue (KGD) learns to generate an informative response based on a given dialogue context and external knowledge (e.g., knowledge graphs; KGs). Recently, the emergence of large language models (LLMs) and pre-training techniques has brought great success to knowledge-grounded dialogue. However, when building KGD systems in real applications, there are various real-world…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  31. arXiv:2312.14890  [pdf, other]

    cs.AI cs.CC cs.CL cs.LG

    NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes

    Authors: Lizhou Fan, Wenyue Hua, Lingyao Li, Haoyang Ling, Yongfeng Zhang

    Abstract: Complex reasoning ability is one of the most important features of current LLMs, which has also been leveraged to play an integral role in complex decision-making tasks. Therefore, the investigation into the reasoning capabilities of Large Language Models (LLMs) is critical: numerous benchmarks have been established to assess the reasoning abilities of LLMs. However, current benchmarks are inadequ…

    Submitted 12 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 23 pages, 7 figures, 2 tables

  32. arXiv:2312.10986  [pdf, other]

    cs.CV cs.RO

    Long-Tailed 3D Detection via Multi-Modal Fusion

    Authors: Yechi Ma, Neehar Peri, Shuoquan Wei, Achal Dave, Wei Hua, Yanan Li, Deva Ramanan, Shu Kong

    Abstract: Contemporary autonomous vehicle (AV) benchmarks have advanced techniques for training 3D detectors, particularly on large-scale multi-modal (LiDAR + RGB) data. Surprisingly, although semantic class labels naturally follow a long-tailed distribution, existing benchmarks only focus on a few common classes (e.g., pedestrian and car) and neglect many rare but crucial classes (e.g., emergency vehicle a…

    Submitted 23 September, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: The first two authors contributed equally. Project page: https://mayechi.github.io/lt3d-lf-io/

  33. arXiv:2312.03815  [pdf, other]

    cs.OS cs.AI cs.CL cs.LG

    LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem

    Authors: Yingqiang Ge, Yujie Ren, Wenyue Hua, Shuyuan Xu, Juntao Tan, Yongfeng Zhang

    Abstract: This paper envisions a revolutionary AIOS-Agent ecosystem, where Large Language Model (LLM) serves as the (Artificial) Intelligent Operating System (IOS, or AIOS)--an operating system "with soul". Upon this foundation, a diverse range of LLM-based AI Agent Applications (Agents, or AAPs) are developed, enriching the AIOS-Agent ecosystem and signaling a paradigm shift from the traditional OS-APP eco…

    Submitted 9 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: 35 pages, 4 figures

  34. arXiv:2312.03022  [pdf, other]

    cs.AI cs.CL cs.LG

    Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction

    Authors: Hongbin Ye, Honghao Gui, Aijia Zhang, Tong Liu, Wei Hua, Weiqiang Jia

    Abstract: Knowledge graph construction (KGC) is a multifaceted undertaking involving the extraction of entities, relations, and events. Traditionally, large language models (LLMs) have been viewed as solitary task-solving agents in this complex landscape. However, this paper challenges this paradigm by introducing a novel framework, CooperKGC. Departing from the conventional approach, CooperKGC establishes…

    Submitted 29 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: work in progress; 12 pages

  35. arXiv:2311.17227  [pdf, other]

    cs.AI cs.CL cs.CY

    War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars

    Authors: Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, Yongfeng Zhang

    Abstract: Can we avoid wars at the crossroads of history? This question has been pursued by individuals, scholars, policymakers, and organizations throughout human history. In this research, we attempt to answer the question based on the recent advances of Artificial Intelligence (AI) and Large Language Models (LLMs). We propose WarAgent, an LLM-powered multi-agent AI system, to simulate the partic…

    Submitted 30 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 47 pages, 9 figures, 5 tables

  36. arXiv:2311.11825  [pdf, other]

    cs.CV cs.GR

    Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning

    Authors: Zixuan Xie, Rengan Xie, Rong Li, Kai Huang, Pengju Qiao, Jingsen Zhu, Xu Yin, Qi Ye, Wei Hua, Yuchi Huo, Hujun Bao

    Abstract: In this work, we use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs). Without the requirement of complex equipment, our method only takes simple RGB images captured by a drone as inputs to enable physically based and photorealistic novel-view rendering, relighting, and editing. However, a real-world facade usually h…

    Submitted 8 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

  37. arXiv:2309.13235  [pdf, other]

    cs.CV

    M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

    Authors: Qibo Qiu, Honghui Yang, Wenxiao Wang, Shun Zhang, Haiming Gao, Haochao Ying, Wei Hua, Xiaofei He

    Abstract: Masked point modeling has become a promising scheme of self-supervised pre-training for point clouds. Existing methods reconstruct either the original points or related features as the objective of pre-training. However, considering the diversity of downstream tasks, it is necessary for the model to have both low- and high-level representation modeling capabilities to capture geometric details and…

    Submitted 22 September, 2023; originally announced September 2023.

  38. arXiv:2309.06794  [pdf, other]

    cs.CL cs.AI cs.LG

    Cognitive Mirage: A Review of Hallucinations in Large Language Models

    Authors: Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia

    Abstract: As large language models continue to develop in the field of AI, text generation systems are susceptible to a worrisome phenomenon known as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs. We present a novel taxonomy of hallucinations from various text generation tasks, thus provide theoretical insights, detection methods and improvement approaches…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: work in progress; 21 pages

  39. arXiv:2307.00457  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    GenRec: Large Language Model for Generative Recommendation

    Authors: Jianchao Ji, Zelong Li, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Juntao Tan, Yongfeng Zhang

    Abstract: In recent years, large language models (LLM) have emerged as powerful tools for diverse natural language processing tasks. However, their potential for recommender systems under the generative recommendation paradigm remains relatively unexplored. This paper presents an innovative approach to recommendation systems using large language models (LLMs) based on text data. In this paper, we present a…

    Submitted 4 July, 2023; v1 submitted 1 July, 2023; originally announced July 2023.

  40. arXiv:2306.12317  [pdf, other]

    cs.CL

    Iterated Piecewise Affine (IPA) Approximation for Language Modeling

    Authors: Davood Shamsi, Wen-yu Hua, Brian Williams

    Abstract: In this work, we demonstrate the application of a first-order Taylor expansion to approximate a generic function $F: R^{n \times m} \to R^{n \times m}$ and utilize it in language modeling. To enhance the basic Taylor expansion, we introduce iteration and piecewise modeling, leading us to name the algorithm the Iterative Piecewise Affine (IPA) approximation. The final algorithm exhibits interesting…

    Submitted 1 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  41. arXiv:2306.11134  [pdf, other]

    cs.IR

    OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems

    Authors: Shuyuan Xu, Wenyue Hua, Yongfeng Zhang

    Abstract: In recent years, the integration of Large Language Models (LLMs) into recommender systems has garnered interest among both practitioners and researchers. Despite this interest, the field is still emerging, and the lack of open-source R&D platforms may impede the exploration of LLM-based recommendations. This paper introduces OpenP5, an open-source platform designed as a resource to facilitate the…

    Submitted 10 April, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: In SIGIR 2024 Resource & Reproducibility Track

  42. arXiv:2306.03287  [pdf, other

    cs.CV

    ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

    Authors: Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, et al. (2 additional authors not shown)

    Abstract: Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on the submodules of the structured text extraction scheme. In order to eliminate these problems, we organized the ICDAR 2023 competition on Structured text extracti…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: ICDAR 2023 Competition on SVRD report (To appear in ICDAR 2023)

  43. arXiv:2306.03235  [pdf, other

    cs.LG cs.CR

    Information Flow Control in Machine Learning through Modular Model Architecture

    Authors: Trishita Tiwari, Suchin Gururangan, Chuan Guo, Weizhe Hua, Sanjay Kariyappa, Udit Gupta, Wenjie Xiong, Kiwan Maeng, Hsien-Hsin S. Lee, G. Edward Suh

    Abstract: In today's machine learning (ML) models, any part of the training data can affect the model output. This lack of control for information flow from training data to model output is a major obstacle in training models on sensitive data when access control only allows individual users to access a subset of data. To enable secure machine learning for access-controlled data, we propose the notion of in…

    Submitted 2 July, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Usenix Security 2024 Camera Ready

  44. arXiv:2305.12090  [pdf, other

    cs.IR cs.AI cs.CL cs.LG

    UP5: Unbiased Foundation Model for Fairness-aware Recommendation

    Authors: Wenyue Hua, Yingqiang Ge, Shuyuan Xu, Jianchao Ji, Yongfeng Zhang

    Abstract: Recent advances in Foundation Models such as Large Language Models (LLMs) have propelled them to the forefront of Recommender Systems (RS). Despite their utility, there is a growing concern that LLMs might inadvertently perpetuate societal stereotypes, resulting in unfair recommendations. Since fairness is critical for RS as many users take it for decision-making and demand fulfillment, this paper…

    Submitted 29 May, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: In EACL 2024

  45. A Hybrid 3D Eddy Detection Technique Based on Sea Surface Height and Velocity Field

    Authors: Weiping Hua, Karen Bemis, Dujuan Kang, Sedat Ozer, Deborah Silver

    Abstract: Eddy detection is a critical task for ocean scientists to understand and analyze ocean circulation. In this paper, we introduce a hybrid eddy detection approach that combines sea surface height (SSH) and velocity fields with geometric criteria defining eddy behavior. Our approach searches for SSH minima and maxima, which oceanographers expect to find at the center of eddies. Geometric criteria are…

    Submitted 31 October, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: 8 pages, 14 figures. Accepted by EnvirVis 2023. Project Link: https://github.com/VizlabRutgers/Hybrid-Eddy-detection

  46. arXiv:2305.07498  [pdf, other

    cs.CV

    Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

    Authors: Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

    Abstract: Visual information extraction (VIE), which aims to simultaneously perform OCR and information extraction in a unified framework, has drawn increasing attention due to its essential role in various applications like understanding receipts, goods, and traffic signs. However, as existing benchmark datasets for VIE mainly consist of document images without the adequate diversity of layout structures,…

    Submitted 14 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 15 pages, 6 figures, ICDAR2023

  47. arXiv:2305.06569  [pdf, other

    cs.IR cs.AI cs.CL cs.LG

    How to Index Item IDs for Recommendation Foundation Models

    Authors: Wenyue Hua, Shuyuan Xu, Yingqiang Ge, Yongfeng Zhang

    Abstract: Recommendation foundation model utilizes large language models (LLM) for recommendation by converting recommendation tasks into natural language tasks. It enables generative recommendation which directly generates the item(s) to recommend rather than calculating a ranking score for each and every candidate item as in traditional recommendation models, simplifying the recommendation pipeline from m…

    Submitted 25 September, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted as a full paper by ACM SIGIR-AP 2023

  48. arXiv:2305.06404  [pdf, other

    cs.CL cs.AI

    LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM

    Authors: Wen-Yu Hua, Brian Williams, Davood Shamsi

    Abstract: Text embeddings are useful features for several NLP applications, such as sentence similarity, text clustering, and semantic search. In this paper, we present a Low-rank Adaptation with a Contrastive objective on top of 8-bit Siamese-BLOOM, a multilingual large language model optimized to produce semantically meaningful word embeddings. The innovation is threefold. First, we cast BLOOM weights to…

    Submitted 10 May, 2023; originally announced May 2023.

  49. arXiv:2304.04515  [pdf, other

    cs.CV

    SOOD: Towards Semi-Supervised Oriented Object Detection

    Authors: Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects that are common in aerial images unexplored. This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the m…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023. Code will be available at https://github.com/HamPerdredes/SOOD

  50. arXiv:2304.04370  [pdf, other

    cs.AI cs.CL cs.LG

    OpenAGI: When LLM Meets Domain Experts

    Authors: Yingqiang Ge, Wenyue Hua, Kai Mei, Jianchao Ji, Juntao Tan, Shuyuan Xu, Zelong Li, Yongfeng Zhang

    Abstract: Human Intelligence (HI) excels at combining basic skills to solve complex tasks. This capability is vital for Artificial Intelligence (AI) and should be embedded in comprehensive AI Agents, enabling them to harness expert models for complex task-solving towards Artificial General Intelligence (AGI). Large Language Models (LLMs) show promising learning and reasoning abilities, and can effectively u…

    Submitted 3 November, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

    Comments: In NeurIPS 2023