Showing 1–50 of 237 results for author: Jin, D

  1. arXiv:2410.15553  [pdf, other]

    cs.CL

    Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following

    Authors: Yun He, Di Jin, Chaoqi Wang, Chloe Bi, Karishma Mandyam, Hejia Zhang, Chen Zhu, Ning Li, Tengyu Xu, Hongjiang Lv, Shruti Bhosale, Chenguang Zhu, Karthik Abinav Sankararaman, Eryk Helenowski, Melanie Kambadur, Aditya Tayade, Hao Ma, Han Fang, Sinong Wang

    Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities in various tasks, including instruction following, which is crucial for aligning model outputs with user expectations. However, evaluating LLMs' ability to follow instructions remains challenging due to the complexity and subjectivity of human language. Current benchmarks primarily focus on single-turn, monolingual instructions…

    Submitted 20 October, 2024; originally announced October 2024.

  2. arXiv:2410.10131  [pdf, other]

    cs.SE

    A First Look at Package-to-Group Mechanism: An Empirical Study of the Linux Distributions

    Authors: Dongming Jin, Nianyu Li, Kai Yang, Minghui Zhou, Zhi Jin

    Abstract: Reusing third-party software packages is a common practice in software development. As the scale and complexity of open-source software (OSS) projects continue to grow (e.g., Linux distributions), the number of reused third-party packages has significantly increased. Therefore, maintaining effective package management is critical for developing and evolving OSS projects. To achieve this, a package…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: 11 pages, 11 figures

  3. arXiv:2410.04689  [pdf, other]

    cs.CV

    Low-Rank Continual Pyramid Vision Transformer: Incrementally Segment Whole-Body Organs in CT with Light-Weighted Adaptation

    Authors: Vince Zhu, Zhanghexuan Ji, Dazhou Guo, Puyang Wang, Yingda Xia, Le Lu, Xianghua Ye, Wei Zhu, Dakai Jin

    Abstract: Deep segmentation networks achieve high performance when trained on specific datasets. However, in clinical practice, it is often desirable that pretrained segmentation models can be dynamically extended to enable segmenting new organs without access to previous training datasets or without training from scratch. This would ensure a much more efficient model development and deployment paradigm acc…

    Submitted 6 October, 2024; originally announced October 2024.

    Comments: Accepted by Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024

  4. arXiv:2410.03870  [pdf, other]

    cs.CL

    From Pixels to Personas: Investigating and Modeling Self-Anthropomorphism in Human-Robot Dialogues

    Authors: Yu Li, Devamanyu Hazarika, Di Jin, Julia Hirschberg, Yang Liu

    Abstract: Self-anthropomorphism in robots manifests itself through their display of human-like characteristics in dialogue, such as expressing preferences and emotions. Our study systematically analyzes self-anthropomorphic expression within various dialogue datasets, outlining the contrasts between self-anthropomorphic and non-self-anthropomorphic responses in dialogue systems. We show significant differen…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: Findings of EMNLP 2024, 19 pages

  5. arXiv:2410.03137  [pdf, other]

    cs.CL

    SAG: Style-Aligned Article Generation via Model Collaboration

    Authors: Chenning Xu, Fangxun Shu, Dian Jin, Jinghao Wei, Hao Jiang

    Abstract: Large language models (LLMs) have increased the demand for personalized and stylish content generation. However, closed-source models like GPT-4 present limitations in optimization opportunities, while the substantial training costs and inflexibility of open-source alternatives, such as Qwen-72B, pose considerable challenges. Conversely, small language models (SLMs) struggle with understanding com…

    Submitted 4 October, 2024; originally announced October 2024.

  6. arXiv:2409.20370  [pdf, other]

    cs.LG cs.AI cs.CL

    The Perfect Blend: Redefining RLHF with Mixture of Judges

    Authors: Tengyu Xu, Eryk Helenowski, Karthik Abinav Sankararaman, Di Jin, Kaiyan Peng, Eric Han, Shaoliang Nie, Chen Zhu, Hejia Zhang, Wenxuan Zhou, Zhouhao Zeng, Yun He, Karishma Mandyam, Arya Talabzadeh, Madian Khabsa, Gabriel Cohen, Yuandong Tian, Hao Ma, Sinong Wang, Han Fang

    Abstract: Reinforcement learning from human feedback (RLHF) has become the leading approach for fine-tuning large language models (LLM). However, RLHF has limitations in multi-task learning (MTL) due to challenges of reward hacking and extreme multi-objective optimization (i.e., trade-off of multiple and/or sometimes conflicting objectives). Applying RLHF for MTL currently requires careful tuning of the wei…

    Submitted 30 September, 2024; originally announced September 2024.

    Comments: submitted to conference

  7. arXiv:2409.14745  [pdf, other]

    cs.CC cs.IT

    Some Thoughts on Symbolic Transfer Entropy

    Authors: Dian Jin

    Abstract: Transfer entropy is used to establish a measure of causal relationships between two variables. Symbolic transfer entropy, as an estimation method for transfer entropy, is widely applied due to its robustness against non-stationarity. This paper investigates the embedding dimension parameter in symbolic transfer entropy and proposes optimization methods for high complexity in extreme cases with com…

    Submitted 23 September, 2024; originally announced September 2024.

  8. arXiv:2409.04298  [pdf, other]

    cs.CV

    FS-MedSAM2: Exploring the Potential of SAM2 for Few-Shot Medical Image Segmentation without Fine-tuning

    Authors: Yunhao Bai, Qinji Yu, Boxiang Yun, Dakai Jin, Yingda Xia, Yan Wang

    Abstract: The Segment Anything Model 2 (SAM2) has recently demonstrated exceptional performance in zero-shot prompt segmentation for natural images and videos. However, it faces significant challenges when applied to medical images. Since its release, many attempts have been made to adapt SAM2's segmentation capabilities to the medical imaging domain. These efforts typically involve using a substantial amou…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: 13 pages, 4 figures

  9. arXiv:2409.01588  [pdf, other]

    cs.LG cs.AI cs.CY

    Large-scale Urban Facility Location Selection with Knowledge-informed Reinforcement Learning

    Authors: Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: The facility location problem (FLP) is a classical combinatorial optimization challenge aimed at strategically laying out facilities to maximize their accessibility. In this paper, we propose a reinforcement learning method tailored to solve large-scale urban FLP, capable of producing near-optimal solutions at superfast inference speed. We distill the essential swap operation from local search, an…

    Submitted 6 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

    Comments: Sigspatial2024

    MSC Class: 68T20

  10. arXiv:2408.07486  [pdf, other]

    cs.CV

    OMR: Occlusion-Aware Memory-Based Refinement for Video Lane Detection

    Authors: Dongkwon Jin, Chang-Su Kim

    Abstract: A novel algorithm for video lane detection is proposed in this paper. First, we extract a feature map for a current frame and detect a latent mask for obstacles occluding lanes. Then, we enhance the feature map by developing an occlusion-aware memory-based refinement (OMR) module. It takes the obstacle mask and feature map from the current frame, previous output, and memory information as input, a…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted to ECCV 2024

  11. arXiv:2408.03633  [pdf, other]

    cs.CL

    CARE: A Clue-guided Assistant for CSRs to Read User Manuals

    Authors: Weihong Du, Jia Liu, Zujie Wen, Dingnan Jin, Hongru Liang, Wenqiang Lei

    Abstract: It is time-saving to build a reading assistant for customer service representatives (CSRs) when reading user manuals, especially information-rich ones. Current solutions don't fit the online customer service scenarios well due to the lack of attention to user questions and possible responses. Hence, we propose to develop a time-saving and careful reading assistant for CSRs, named CARE. It can help t…

    Submitted 26 August, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted to The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

  12. arXiv:2408.02450  [pdf, other]

    cs.SE

    An Evaluation of Requirements Modeling for Cyber-Physical Systems via LLMs

    Authors: Dongming Jin, Shengxin Zhao, Zhi Jin, Xiaohong Chen, Chunhui Wang, Zheng Fang, Hongbin Xiao

    Abstract: Cyber-physical systems (CPSs) integrate cyber and physical components and enable them to interact with each other to meet user needs. The needs for CPSs span rich application domains such as healthcare and medicine, smart home, smart building, etc. This indicates that CPSs are all about solving real-world problems. With the increasing abundance of sensing devices and effectors, the problems wanted…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 12 pages, 8 figures

  13. arXiv:2407.16729  [pdf, other]

    cs.LG cs.AI

    PateGail: A Privacy-Preserving Mobility Trajectory Generator with Imitation Learning

    Authors: Huandong Wang, Changzheng Gao, Yuchen Wu, Depeng Jin, Lina Yao, Yong Li

    Abstract: Generating human mobility trajectories is of great importance to solve the lack of large-scale trajectory data in numerous applications, which is caused by privacy concerns. However, existing mobility trajectory generation methods still require real-world human trajectories centrally collected as the training data, where there exists an inescapable risk of privacy leakage. To overcome this limitat…

    Submitted 23 July, 2024; originally announced July 2024.

  14. arXiv:2407.11626  [pdf]

    cs.LG cs.NE

    Dynamic Dimension Wrapping (DDW) Algorithm: A Novel Approach for Efficient Cross-Dimensional Search in Dynamic Multidimensional Spaces

    Authors: Dongnan Jin, Yali Liu, Qiuzhi Song, Xunju Ma, Yue Liu, Dehao Wu

    Abstract: In the real world, as the complexity of optimization problems continues to increase, there is an urgent need to research more efficient optimization methods. Current optimization algorithms excel in solving problems with a fixed number of dimensions. However, their efficiency in searching dynamic multi-dimensional spaces is unsatisfactory. In response to the challenge of cross-dimensional search i…

    Submitted 18 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  15. arXiv:2406.19746  [pdf, other]

    cs.HC

    Voluminous Fur Stroking Experience through Interactive Visuo-Haptic Model in Virtual Reality

    Authors: Juro Hosoi, Du Jin, Yuki Ban, Shin'ichi Warisawa

    Abstract: The tactile sensation of stroking soft fur, known for its comfort and emotional benefits, has numerous applications in virtual reality, animal-assisted therapy, and household products. Previous studies have primarily utilized actual fur to present a voluminous fur experience that poses challenges concerning versatility and flexibility. In this study, we develop a system that integrates a head-moun…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2406.10661  [pdf, other]

    cs.AI cs.LG

    A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Depeng Jin, Yong Li

    Abstract: With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods relying on expert experience to simulation and learning-based decision and optimization methods. Learning-based optimization methods require extensive interactions with highly realistic microscopic traffic simulators. However, existing microscopic traffic simulators a…

    Submitted 2 October, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: Submitted to ICLR2025

  17. arXiv:2405.12520  [pdf, other]

    cs.DC

    MOSS: A Large-scale Open Microscopic Traffic Simulation System

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Can Rong, Depeng Jin, Wei Wu, Yong Li

    Abstract: In the research of Intelligent Transportation Systems (ITS), traffic simulation is a key procedure for the evaluation of new methods and optimization of strategies. However, existing traffic simulation systems face two challenges. First, how to balance simulation scale with realism is a dilemma. Second, it is hard to simulate realistic results, which requires realistic travel demand data and simul…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE ITSC 2024

  18. arXiv:2405.12063  [pdf, other]

    cs.CL

    CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models

    Authors: Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua

    Abstract: Large language models (LLMs) are increasingly used to meet user information needs, but their effectiveness in dealing with user queries that contain various types of ambiguity remains unknown, ultimately risking user trust and satisfaction. To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy. Building upon the taxonomy, we construct ~12K high-quality…

    Submitted 1 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL 2024. Camera Ready. Our dataset is available at https://github.com/zt991211/CLAMBER

  19. arXiv:2405.12059  [pdf, other]

    cs.CL

    STYLE: Improving Domain Transferability of Asking Clarification Questions in Large Language Model Powered Conversational Agents

    Authors: Yue Chen, Chen Huang, Yang Deng, Wenqiang Lei, Dingnan Jin, Jia Liu, Tat-Seng Chua

    Abstract: Equipping a conversational search engine with strategies regarding when to ask clarification questions is becoming increasingly important across various domains. Owing to the context understanding capability of LLMs and their access to domain-specific sources of knowledge, LLM-based clarification strategies feature rapid transfer to various domains in a post-hoc manner. However, they still s…

    Submitted 1 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted to Findings of ACL 2024. Camera Ready

  20. arXiv:2405.09138  [pdf, other]

    cs.CV

    OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality

    Authors: Chao Fan, Saihui Hou, Junhao Liang, Chuanfu Shen, Jingzhe Ma, Dongyang Jin, Yongzhen Huang, Shiqi Yu

    Abstract: Gait recognition, a rapidly advancing vision technology for person identification from a distance, has made significant strides in indoor settings. However, evidence suggests that existing methods often yield unsatisfactory results when applied to newly released real-world gait datasets. Furthermore, conclusions drawn from indoor gait datasets may not easily generalize to outdoor ones. Therefore,…

    Submitted 15 May, 2024; originally announced May 2024.

  21. arXiv:2405.08991   

    cs.CV cs.RO

    Theoretical Analysis for Expectation-Maximization-Based Multi-Model 3D Registration

    Authors: David Jin, Harry Zhang, Kai Chang

    Abstract: We perform a detailed theoretical analysis of an expectation-maximization-based algorithm recently proposed for solving a variation of the 3D registration problem, named multi-model 3D registration. Despite showing superior empirical results, the original work did not theoretically justify the conditions under which the EM approach converges to the ground truth. In this project, we aim to close this gap by es…

    Submitted 24 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: Course project based on a previous submission. Very similar to the submitted conference version. see here: arXiv:2402.10865

  22. arXiv:2405.03256  [pdf, other]

    cs.SE

    MARE: Multi-Agents Collaboration Framework for Requirements Engineering

    Authors: Dongming Jin, Zhi Jin, Xiaohong Chen, Chunhui Wang

    Abstract: Requirements Engineering (RE) is a critical phase in the software development process that generates requirements specifications from stakeholders' needs. Recently, deep learning techniques have been successful in several RE tasks. However, obtaining high-quality requirements specifications requires collaboration across multiple tasks and roles. In this paper, we propose an innovative framework ca…

    Submitted 6 May, 2024; originally announced May 2024.

  23. arXiv:2404.18399  [pdf, other]

    cs.CV

    Semantic Line Combination Detector

    Authors: Jinwon Ko, Dongkwon Jin, Chang-Su Kim

    Abstract: A novel algorithm, called semantic line combination detector (SLCD), to find an optimal combination of semantic lines is proposed in this paper. It processes all lines in each line combination at once to assess the overall harmony of the lines. First, we generate various line combinations from reliable lines. Second, we estimate the score of each line combination and determine the best one. Experi…

    Submitted 1 May, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 accepted

  24. arXiv:2404.03819  [pdf, other]

    cs.CV

    Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer

    Authors: Qinji Yu, Yirui Wang, Ke Yan, Haoshen Li, Dazhou Guo, Li Zhang, Le Lu, Na Shen, Qifeng Wang, Xiaowei Ding, Xianghua Ye, Dakai Jin

    Abstract: Lymph node (LN) assessment is a critical, indispensable yet very challenging task in the routine clinical workflow of radiology and oncology. Accurate LN analysis is essential for cancer diagnosis, staging, and treatment planning. Finding scatteredly distributed, low-contrast clinically relevant LNs in 3D CT is difficult even for experienced physicians under high inter-observer variations. Previou…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Technical report

  25. arXiv:2404.01284  [pdf, other]

    cs.CV

    Large Motion Model for Unified Multi-Modal Motion Generation

    Authors: Mingyuan Zhang, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu

    Abstract: Human motion generation, a cornerstone technique in animation and video production, has widespread applications in various tasks like text-to-motion and music-to-dance. Previous works focus on developing specialist models tailored for each task without scalability. In this work, we present Large Motion Model (LMM), a motion-centric, multi-modal framework that unifies mainstream motion generation t…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Homepage: https://mingyuan-zhang.github.io/projects/LMM.html

  26. arXiv:2403.15063  [pdf, other]

    cs.CV

    Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans

    Authors: Heng Guo, Jianfeng Zhang, Jiaxing Huang, Tony C. W. Mok, Dazhou Guo, Ke Yan, Le Lu, Dakai Jin, Minfeng Xu

    Abstract: Segment anything model (SAM) demonstrates strong generalization ability on natural image segmentation. However, its direct adaption in medical image segmentation tasks shows significant performance drops with inferior accuracy and unstable results. It may also require an excessive number of prompt points to obtain a reasonable accuracy. For segmenting 3D radiological CT or MRI scans, a 2D SAM mod…

    Submitted 22 March, 2024; originally announced March 2024.

  27. arXiv:2403.14274  [pdf, other]

    cs.SE cs.AI

    Multi-role Consensus through LLMs Discussions for Vulnerability Detection

    Authors: Zhenyu Mao, Jialong Li, Dongming Jin, Munan Li, Kenji Tei

    Abstract: Recent advancements in large language models (LLMs) have highlighted the potential for vulnerability detection, a crucial component of software quality assurance. Despite this progress, most studies have been limited to the perspective of a single role, usually testers, lacking diverse viewpoints from different roles in a typical software development life-cycle, including both developers and teste…

    Submitted 18 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  28. arXiv:2403.11202  [pdf, other]

    cs.AR cs.AI cs.PL

    Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework

    Authors: Kaiyan Chang, Kun Wang, Nan Yang, Ying Wang, Dantong Jin, Wenlong Zhu, Zhirong Chen, Cangyuan Li, Hao Yan, Yunhao Zhou, Zhuoliang Zhao, Yuan Cheng, Yudong Pan, Yiqi Liu, Mengdi Wang, Shengwen Liang, Yinhe Han, Huawei Li, Xiaowei Li

    Abstract: Recent advances in large language models have demonstrated their potential for automated generation of hardware description language (HDL) code from high-level prompts. Researchers have utilized fine-tuning to enhance the ability of these large language models (LLMs) in the field of Chip Design. However, the lack of Verilog data hinders further improvement in the quality of Verilog generation by L…

    Submitted 10 July, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: DAC 2024

  29. Rumor Mitigation in Social Media Platforms with Deep Reinforcement Learning

    Authors: Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Social media platforms have become one of the main channels where people disseminate and acquire information, of which the reliability is severely threatened by rumors widespread in the network. Existing approaches such as suspending users or broadcasting real information to combat rumors are either with high cost or disturbing users. In this paper, we introduce a novel rumor mitigation paradigm,…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: WWW24 short

    MSC Class: 68T09

  30. MetroGNN: Metro Network Expansion with Reinforcement Learning

    Authors: Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Selecting urban regions for metro network expansion to meet maximal transportation demands is crucial for urban development, while computationally challenging to solve. The expansion process relies not only on complicated features like urban demographics and origin-destination (OD) flow but is also constrained by the existing metro network and urban geography. In this paper, we introduce a reinfor…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: WWW24 short

    MSC Class: 68T09

  31. arXiv:2403.04712  [pdf, other]

    cs.RO eess.SY

    GMKF: Generalized Moment Kalman Filter for Polynomial Systems with Arbitrary Noise

    Authors: Sangli Teng, Harry Zhang, David Jin, Ashkan Jasour, Maani Ghaffari, Luca Carlone

    Abstract: This paper develops a new filtering approach for state estimation in polynomial systems corrupted by arbitrary noise, which commonly arise in robotics. We first consider a batch setup where we perform state estimation using all data collected from the initial to the current time. We formulate the batch state estimation problem as a Polynomial Optimization Problem (POP) and relax the assumption of…

    Submitted 8 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  32. arXiv:2403.02814  [pdf, other]

    cs.LG cs.AI

    InjectTST: A Transformer Method of Injecting Global Information into Independent Channels for Long Time Series Forecasting

    Authors: Ce Chi, Xing Wang, Kexin Yang, Zhiyan Song, Di Jin, Lin Zhu, Chao Deng, Junlan Feng

    Abstract: Transformer has become one of the most popular architectures for multivariate time series (MTS) forecasting. Recent Transformer-based MTS models generally prefer channel-independent structures with the observation that channel independence can alleviate noise and distribution drift issues, leading to more robustness. Nevertheless, it is essential to note that channel dependency remains an inherent…

    Submitted 5 March, 2024; originally announced March 2024.

  33. arXiv:2402.18933  [pdf, other]

    cs.CV

    Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration

    Authors: Tony C. W. Mok, Zi Li, Yunhao Bai, Jianpeng Zhang, Wei Liu, Yan-Jie Zhou, Ke Yan, Dakai Jin, Yu Shi, Xiaoli Yin, Le Lu, Ling Zhang

    Abstract: Establishing dense anatomical correspondence across distinct imaging modalities is a foundational yet challenging procedure for numerous medical image analysis studies and image-guided radiotherapy. Existing multi-modality image registration algorithms rely on statistical-based similarity measures or local structural image representations. However, the former is sensitive to locally varying noise,…

    Submitted 31 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR2024

  34. arXiv:2402.17161  [pdf, other]

    cs.AI cs.MA

    Large Language Model for Participatory Urban Planning

    Authors: Zhilun Zhou, Yuming Lin, Depeng Jin, Yong Li

    Abstract: Participatory urban planning is the mainstream of modern urban planning that involves the active engagement of residents. However, the traditional participatory paradigm requires experienced planning experts and is often time-consuming and costly. Fortunately, the emerging Large Language Models (LLMs) have shown considerable ability to simulate human-like agents, which can be used to emulate the p…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.01698

  35. arXiv:2402.11922  [pdf, other]

    cs.LG

    Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

    Authors: Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li

    Abstract: Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that heavily rely on common feature extraction or intricate few-shot learning des…

    Submitted 25 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  36. UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction

    Authors: Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li

    Abstract: Urban spatio-temporal prediction is crucial for informed decision-making, such as traffic management, resource optimization, and emergency response. Despite remarkable breakthroughs in pretrained natural language models that enable one model to handle diverse tasks, a universal solution for spatio-temporal prediction remains challenging. Existing prediction approaches are typically tailored for spe…

    Submitted 30 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 2024 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2024

  37. arXiv:2402.11690  [pdf, other]

    cs.CL cs.CV

    Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning

    Authors: Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang

    Abstract: Despite vision-language models' (VLMs) remarkable capabilities as versatile visual assistants, two substantial challenges persist within the existing VLM frameworks: (1) lacking task diversity in pretraining and visual instruction tuning, and (2) annotation error and bias in GPT-4 synthesized instruction tuning data. Both challenges lead to issues such as poor generalizability, hallucination, and…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 8 Pages, visual instruction tuning

  38. arXiv:2402.10865  [pdf, other]

    cs.RO cs.CV

    Multi-Model 3D Registration: Finding Multiple Moving Objects in Cluttered Point Clouds

    Authors: David Jin, Sushrut Karmalkar, Harry Zhang, Luca Carlone

    Abstract: We investigate a variation of the 3D registration problem, named multi-model 3D registration. In the multi-model registration problem, we are given two point clouds picturing a set of objects at different poses (and possibly including points belonging to the background) and we want to simultaneously reconstruct how all objects moved between the two point clouds. This setup generalizes standard 3D…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 8 pages, Accepted by ICRA 2024

  39. arXiv:2402.06646  [pdf]

    physics.ao-ph cs.LG physics.geo-ph

    Diffusion Model-based Probabilistic Downscaling for 180-year East Asian Climate Reconstruction

    Authors: Fenghua Ling, Zeyu Lu, Jing-Jia Luo, Lei Bai, Swadhin K. Behera, Dachao Jin, Baoxiang Pan, Huidong Jiang, Toshio Yamagata

    Abstract: As our planet is entering into the "global boiling" era, understanding regional climate change becomes imperative. Effective downscaling methods that provide localized insights are crucial for this target. Traditional approaches, including computationally-demanding regional dynamical models or statistical downscaling frameworks, are often susceptible to the influence of downscaling uncertainty. He…

    Submitted 5 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  40. arXiv:2401.06176  [pdf, other]

    cs.LG cs.AI

    GOODAT: Towards Test-time Graph Out-of-Distribution Detection

    Authors: Luzhi Wang, Dongxiao He, He Zhang, Yixin Liu, Wenjie Wang, Shirui Pan, Di Jin, Tat-Seng Chua

    Abstract: Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains. While GNNs excel in scenarios where the testing data shares the distribution of their training counterparts (in distribution, ID), they often exhibit incorrect predictions when confronted with samples from an unfamiliar distribution (out-of-distribution, OOD). To identify and reject OOD sa…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures

  41. arXiv:2312.06085  [pdf, other]

    cs.CV

    Robust Geometry and Reflectance Disentanglement for 3D Face Reconstruction from Sparse-view Images

    Authors: Daisheng Jin, Jiangbei Hu, Baixin Xu, Yuxin Dai, Chen Qian, Ying He

    Abstract: This paper presents a novel two-stage approach for reconstructing human faces from sparse-view images, a task made challenging by the unique geometry and complex skin reflectance of each individual. Our method focuses on decomposing key facial attributes, including geometry, diffuse reflectance, and specular reflectance, from ambient light. Initially, we create a general facial template from a div…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 8 pages, 8 figures

  42. arXiv:2311.14986  [pdf, other]

    cs.CV

    SAME++: A Self-supervised Anatomical eMbeddings Enhanced medical image registration framework using stable sampling and regularized transformation

    Authors: Lin Tian, Zi Li, Fengze Liu, Xiaoyu Bai, Jia Ge, Le Lu, Marc Niethammer, Xianghua Ye, Ke Yan, Daikai Jin

    Abstract: Image registration is a fundamental medical image analysis task. Ideally, registration should focus on aligning semantically corresponding voxels, i.e., the same anatomical locations. However, existing methods often optimize similarity measures computed directly on intensities or on hand-crafted features, which lack anatomical semantic information. These similarity measures may lead to sub-optimal…

    Submitted 25 February, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

  43. arXiv:2311.14543  [pdf, other]

    cs.CL cs.AI

    Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language

    Authors: Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, Mahdi Namazifar

    Abstract: Learning from human feedback is a prominent technique to align the output of large language models (LLMs) with human expectations. Reinforcement learning from human feedback (RLHF) leverages human preference signals that are in the form of ranking of response pairs to perform this alignment. However, human preference on LLM outputs can come in much richer forms including natural language, which ma…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted by Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023, Submitted to AAAI 2024

  44. arXiv:2311.13444  [pdf, other]

    cs.CV

    SkeletonGait: Gait Recognition Using Skeleton Maps

    Authors: Chao Fan, Jingzhe Ma, Dongyang Jin, Chuanfu Shen, Shiqi Yu

    Abstract: The choice of the representations is essential for deep gait recognition methods. The binary silhouettes and skeletal coordinates are two dominant representations in recent literature, achieving remarkable advances in many scenarios. However, inherent challenges remain, in which silhouettes are not always guaranteed in unconstrained scenes, and structural cues have not been fully utilized from ske…

    Submitted 18 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

  45. arXiv:2311.12273  [pdf, other]

    cs.NI eess.SY

    How AI-driven Digital Twins Can Empower Mobile Networks

    Authors: Tong Li, Fenyu Jiang, Qiaohong Yu, Wenzhen Huang, Tao Jiang, Depeng Jin

    Abstract: The growing complexity of next-generation networks exacerbates the modeling and algorithmic flaws of conventional network optimization methodology. In this paper, we propose a mobile network digital twin (MNDT) architecture for 6G networks. To address the modeling and algorithmic shortcomings, the MNDT uses a simulation-optimization structure. The feedback from the network simulation engine, which…

    Submitted 20 November, 2023; originally announced November 2023.

  46. arXiv:2311.08302  [pdf, other]

    cs.IR

    Inverse Learning with Extremely Sparse Feedback for Recommendation

    Authors: Guanyu Lin, Chen Gao, Yu Zheng, Yinfeng Li, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li

    Abstract: Modern personalized recommendation services often rely on user feedback, either explicit or implicit, to improve the quality of services. Explicit feedback refers to behaviors like ratings, while implicit feedback refers to behaviors like user clicks. However, in the scenario of full-screen video viewing experiences like TikTok and Reels, the click action is absent, resulting in unclear feedback f…

    Submitted 20 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: WSDM 2024

  47. arXiv:2311.08272  [pdf, other]

    cs.IR cs.LG

    Mixed Attention Network for Cross-domain Sequential Recommendation

    Authors: Guanyu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li, Meng Wang

    Abstract: In modern recommender systems, sequential recommendation leverages chronological user behaviors to make effective next-item suggestions, which suffers from data sparsity issues, especially for new users. One promising line of work is the cross-domain recommendation, which trains models with data across multiple domains to improve the performance in data-scarce domains. Recently proposed cross-domain…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: WSDM 2024

  48. arXiv:2310.10545  [pdf, other]

    stat.ML cs.IT cs.LG eess.SP

    Optimal vintage factor analysis with deflation varimax

    Authors: Xin Bing, Dian Jin, Yuqian Zhang

    Abstract: Vintage factor analysis is one important type of factor analysis that aims to first find a low-dimensional representation of the original data, and then to seek a rotation such that the rotated low-dimensional representation is scientifically meaningful. The most widely used vintage factor analysis is the Principal Component Analysis (PCA) followed by the varimax rotation. Despite its popularity,…

    Submitted 24 September, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  49. arXiv:2310.10467  [pdf, other]

    cs.CL cs.AI

    Stance Detection with Collaborative Role-Infused LLM-Based Agents

    Authors: Xiaochong Lan, Chen Gao, Depeng Jin, Yong Li

    Abstract: Stance detection automatically detects the stance in a text towards a target, vital for content analysis in web and social media research. Despite their promising capabilities, LLMs encounter challenges when directly applied to stance detection. First, stance detection demands multi-aspect knowledge, from deciphering event-related terminologies to understanding the expression styles in social medi…

    Submitted 16 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  50. arXiv:2309.16584  [pdf, other]

    cs.MA cs.ET cs.LG cs.SE

    Collaborative Distributed Machine Learning

    Authors: David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev

    Abstract: Various collaborative distributed machine learning (CDML) systems, including federated learning systems and swarm learning systems, with different key traits were developed to leverage resources for development and use of machine learning (ML) models in a confidentiality-preserving way. To meet use case requirements, suitable CDML systems need to be selected. However, comparison between CDML syste…

    Submitted 21 March, 2024; v1 submitted 28 September, 2023; originally announced September 2023.