Showing 1–50 of 397 results for author: Fu, X

  1. arXiv:2410.14923  [pdf, other]

    cs.CR

    Imprompter: Tricking LLM Agents into Improper Tool Use

    Authors: Xiaohan Fu, Shuheng Li, Zihan Wang, Yihao Liu, Rajesh K. Gupta, Taylor Berg-Kirkpatrick, Earlence Fernandes

    Abstract: Large Language Model (LLM) Agents are an emerging computing paradigm that blends generative machine learning with tools such as code interpreters, web browsing, email, and more generally, external resources. These agent-based systems represent an emerging shift in personal computing. We contribute to the security foundations of agent-based systems and surface a new class of automatically computed…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: website: imprompter.ai code: https://github.com/Reapor-Yurnero/imprompter

  2. arXiv:2410.12771  [pdf, other]

    cond-mat.mtrl-sci cs.AI physics.comp-ph

    Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models

    Authors: Luis Barroso-Luque, Muhammed Shuaibi, Xiang Fu, Brandon M. Wood, Misko Dzamba, Meng Gao, Ammar Rizvi, C. Lawrence Zitnick, Zachary W. Ulissi

    Abstract: The ability to discover new materials with desirable properties is critical for numerous applications from helping mitigate climate change to advances in next generation computing hardware. AI has the potential to accelerate materials discovery and design by more effectively exploring the chemical space compared to other computational methods or by trial-and-error. While substantial progress has b…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 19 pages

  3. arXiv:2410.07192  [pdf, other]

    cs.DC cs.LG

    PipeFill: Using GPUs During Bubbles in Pipeline-parallel LLM Training

    Authors: Daiyaan Arfeen, Zhen Zhang, Xinwei Fu, Gregory R. Ganger, Yida Wang

    Abstract: Training Deep Neural Networks (DNNs) with billions of parameters generally involves pipeline-parallel (PP) execution. Unfortunately, PP model training can use GPUs inefficiently, especially at large scale, due to idle GPU time caused by pipeline bubbles, which are often 15-30% and can exceed 60% of the training job's GPU allocation. To improve the GPU utilization of PP model training, this paper d…

    Submitted 23 September, 2024; originally announced October 2024.

  4. arXiv:2410.06272  [pdf, other]

    cs.CL

    The Mystery of Compositional Generalization in Graph-based Generative Commonsense Reasoning

    Authors: Xiyan Fu, Anette Frank

    Abstract: While LLMs have emerged as performant architectures for reasoning tasks, their compositional generalization capabilities have been questioned. In this work, we introduce a Compositional Generalization Challenge for Graph-based Commonsense Reasoning (CGGC) that goes beyond previous evaluations that are based on sequences or tree structures - and instead involves a reasoning graph: It requires model…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: Accepted Findings at EMNLP 2024

  5. arXiv:2410.00706  [pdf]

    cs.RO cs.CV

    A Low-Cost, High-Speed, and Robust Bin Picking System for Factory Automation Enabled by a Non-Stop, Multi-View, and Active Vision Scheme

    Authors: Xingdou Fu, Lin Miao, Yasuhiro Ohnishi, Yuki Hasegawa, Masaki Suwa

    Abstract: Bin picking systems in factory automation usually face robustness issues caused by sparse and noisy 3D data of metallic objects. Utilizing multiple views, especially with a one-shot 3D sensor and "sensor on hand" configuration, is gaining popularity due to its effectiveness, flexibility, and low cost. While moving the 3D sensor to acquire multiple views for 3D fusion, joint optimization, or ac…

    Submitted 1 October, 2024; originally announced October 2024.

  6. arXiv:2410.00404  [pdf, other]

    eess.IV cs.CV

    3DGR-CAR: Coronary artery reconstruction from ultra-sparse 2D X-ray views with a 3D Gaussians representation

    Authors: Xueming Fu, Yingtai Li, Fenghe Tang, Jun Li, Mingyue Zhao, Gao-Jun Teng, S. Kevin Zhou

    Abstract: Reconstructing 3D coronary arteries is important for coronary artery disease diagnosis, treatment planning and operation navigation. Traditional reconstruction techniques often require many projections, while reconstruction from sparse-view X-ray projections is a potential way of reducing radiation dose. However, the extreme sparsity of coronary arteries in a 3D volume and ultra-limited number of…

    Submitted 1 October, 2024; originally announced October 2024.

    Comments: 10 pages, 5 figures, Accepted at MICCAI 2024

  7. arXiv:2409.19564  [pdf, other]

    cs.DC

    Hamster: A Fast Synchronous Byzantine Fault Tolerance Protocol

    Authors: Ximing Fu, Mo Li, Qingming Zeng, Tianyang Li, Shenghao Yang, Yonghui Guan, Chuanyi Liu

    Abstract: This paper introduces Hamster, a novel synchronous Byzantine Fault Tolerance protocol that achieves better performance and has weaker dependency on synchrony. Specifically, Hamster employs coding techniques to significantly decrease communication complexity and addresses coding related security issues. Consequently, Hamster achieves a throughput gain that increases linearly with the number of node…

    Submitted 29 September, 2024; originally announced September 2024.

  8. arXiv:2409.19422  [pdf, other]

    cs.LG cs.AI stat.ML

    Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures

    Authors: Subash Timilsina, Sagar Shrestha, Xiao Fu

    Abstract: A core task in multi-modal learning is to integrate information from multiple feature spaces (e.g., text and audio), offering modality-invariant essential representations of data. Recent research showed that classical tools such as canonical correlation analysis (CCA) provably identify the shared components up to minor ambiguities, when samples in each modality are generated from a linear m…

    Submitted 1 October, 2024; v1 submitted 28 September, 2024; originally announced September 2024.

  9. arXiv:2409.19286  [pdf, other]

    cs.DC

    IM: Optimizing Byzantine Consensus for High-Performance Distributed Networks

    Authors: Qingming Zeng, Mo Li, Ximing Fu, Chuanyi Liu, Hui Jiang

    Abstract: Byzantine Fault Tolerant (BFT) consensus, a crucial component of blockchains, has made significant advancements. However, the efficiency of existing protocols can still be damaged by certain attacks from faulty nodes and network instability. In this paper, we propose a novel Shared Mempool (SMP) protocol, namely IM, that enhances performance under these attacks. Technically, IM organizing microblo…

    Submitted 28 September, 2024; originally announced September 2024.

    Comments: 16 pages, 5 figures

  10. arXiv:2409.13440  [pdf, other]

    eess.SP cs.AI cs.CR cs.LG

    Differentially Private Multimodal Laplacian Dropout (DP-MLD) for EEG Representative Learning

    Authors: Xiaowen Fu, Bingxin Wang, Xinzhou Guo, Guoqing Liu, Yang Xiang

    Abstract: Recently, multimodal electroencephalogram (EEG) learning has shown great promise in disease detection. At the same time, ensuring privacy in clinical studies has become increasingly crucial due to legal and ethical concerns. One widely adopted scheme for privacy protection is differential privacy (DP) because of its clear interpretation and ease of implementation. Although numerous methods have be…

    Submitted 20 September, 2024; originally announced September 2024.

  11. arXiv:2409.11056  [pdf, other]

    cs.CL

    Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts

    Authors: Teng Wang, Zhenqi He, Wing-Yin Yu, Xiaojin Fu, Xiongwei Han

    Abstract: With the advent of Large Language Models (LLMs), generating rule-based data for real-world applications has become more accessible. Due to the inherent ambiguity of natural language and the complexity of rule sets, especially in long contexts, LLMs often struggle to follow all specified rules, frequently omitting at least one. To enhance the reasoning and understanding of LLMs on long and complex…

    Submitted 17 September, 2024; originally announced September 2024.

  12. arXiv:2409.08733  [pdf, other]

    cs.LG

    Multi-intent Aware Contrastive Learning for Sequential Recommendation

    Authors: Junshu Huang, Zi Long, Xianghua Fu, Yin Chen

    Abstract: Intent is a significant latent factor influencing user-item interaction sequences. Prevalent sequence recommendation models that utilize contrastive learning predominantly rely on single-intent representations to direct the training process. However, this paradigm oversimplifies real-world recommendation scenarios, attempting to encapsulate the diversity of intents within the single-intent level r…

    Submitted 13 September, 2024; originally announced September 2024.

  13. arXiv:2409.07714  [pdf, other]

    cs.CV cs.MA

    CollaMamba: Efficient Collaborative Perception with Cross-Agent Spatial-Temporal State Space Model

    Authors: Yang Li, Quan Yuan, Guiyang Luo, Xiaoyuan Fu, Xuanhan Zhu, Yujia Yang, Rui Pan, Jinglin Li

    Abstract: By sharing complementary perceptual information, multi-agent collaborative perception fosters a deeper understanding of the environment. Recent studies on collaborative perception mostly utilize CNNs or Transformers to learn feature representation and fusion in the spatial dimension, which struggle to handle long-range spatial-temporal features under limited computing and communication resources.…

    Submitted 26 September, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: Submitted to AAAI 2025

  14. arXiv:2409.06936  [pdf]

    cs.DB cs.DL

    An Intelligent Innovation Dataset on Scientific Research Outcomes

    Authors: Xinran Wu, Hui Zou, Yidan Xing, Jingjing Qu, Qiongxiu Li, Renxia Xue, Xiaoming Fu

    Abstract: Various stakeholders, such as researchers, government agencies, businesses, and research laboratories, require a large volume of reliable scientific research outcomes including research articles and patent data to support their work. These data are crucial for a variety of applications, such as advancing scientific research, conducting business evaluations, and undertaking policy analysis. However,…

    Submitted 29 September, 2024; v1 submitted 10 September, 2024; originally announced September 2024.

  15. arXiv:2409.04101  [pdf, other]

    cs.LG

    Ultra-imbalanced classification guided by statistical information

    Authors: Yin Jin, Ningtao Wang, Ruofan Wu, Pengfei Shi, Xing Fu, Weiqiang Wang

    Abstract: Imbalanced data are frequently encountered in real-world classification tasks. Previous works on imbalanced learning mostly focused on learning with a minority class of few samples. However, the notion of imbalance also applies to cases where the minority class contains abundant samples, which is usually the case for industrial applications like fraud detection in the area of financial risk manage…

    Submitted 6 September, 2024; originally announced September 2024.

  16. Learning to Discover Forgery Cues for Face Forgery Detection

    Authors: Jiahe Tian, Peng Chen, Cai Yu, Xiaomeng Fu, Xi Wang, Jiao Dai, Jizhong Han

    Abstract: Locating manipulation maps, i.e., pixel-level annotation of forgery cues, is crucial for providing interpretable detection results in face forgery detection. Related learning objects have also been widely adopted as auxiliary tasks to improve the classification performance of detectors whereas they require comparisons between paired real and forged faces to obtain manipulation maps as supervision.…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: TIFS 2024

  17. arXiv:2409.00843  [pdf, other]

    econ.GN cs.CE cs.CY q-fin.CP stat.ML

    Global Public Sentiment on Decentralized Finance: A Spatiotemporal Analysis of Geo-tagged Tweets from 150 Countries

    Authors: Yuqi Chen, Yifan Li, Kyrie Zhixuan Zhou, Xiaokang Fu, Lingbo Liu, Shuming Bao, Daniel Sui, Luyao Zhang

    Abstract: In the digital era, blockchain technology, cryptocurrencies, and non-fungible tokens (NFTs) have transformed financial and decentralized systems. However, existing research often neglects the spatiotemporal variations in public sentiment toward these technologies, limiting macro-level insights into their global impact. This study leverages Twitter data to explore public attention and sentiment acr…

    Submitted 1 September, 2024; originally announced September 2024.

  18. arXiv:2409.00597  [pdf, other]

    cs.MM cs.CL

    Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model

    Authors: Fuqiang Niu, Zebang Cheng, Xianghua Fu, Xiaojiang Peng, Genan Dai, Yin Chen, Hu Huang, Bowen Zhang

    Abstract: Stance detection, which aims to identify public opinion towards specific targets using social media data, is an important yet challenging task. With the proliferation of diverse multimodal social media content including text and images, multimodal stance detection (MSD) has become a crucial research area. However, existing MSD studies have focused on modeling stance within individual text-image pa…

    Submitted 31 August, 2024; originally announced September 2024.

    Comments: ACM MM2024

  19. arXiv:2409.00017  [pdf, other]

    cs.HC

    Could Micro-Expressions be Quantified? Electromyography Gives Affirmative Evidence

    Authors: Jingting Li, Shaoyuan Lu, Yan Wang, Zizhao Dong, Su-Jing Wang, Xiaolan Fu

    Abstract: Micro-expressions (MEs) are brief, subtle facial expressions that reveal concealed emotions, offering key behavioral cues for social interaction. Characterized by short duration, low intensity, and spontaneity, MEs have been mostly studied through subjective coding, lacking objective, quantitative indicators. This paper explores ME characteristics using facial electromyography (EMG), analyzing dat…

    Submitted 16 August, 2024; originally announced September 2024.

  20. arXiv:2408.17065  [pdf, other]

    cs.CV

    Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning

    Authors: Zhiyuan Yan, Yandan Zhao, Shen Chen, Xinghe Fu, Taiping Yao, Shouhong Ding, Li Yuan

    Abstract: Three key challenges hinder the development of current deepfake video detection: (1) Temporal features can be complex and diverse: how can we identify general temporal artifacts to enhance model generalization? (2) Spatiotemporal models often lean heavily on one type of artifact and ignore the other: how can we ensure balanced learning from both? (3) Videos are naturally resource-intensive: how ca…

    Submitted 30 August, 2024; originally announced August 2024.

  21. arXiv:2408.17053  [pdf, other]

    cs.LG

    Estimating Conditional Average Treatment Effects via Sufficient Representation Learning

    Authors: Pengfei Shi, Wei Zhong, Xinyu Zhang, Ningtao Wang, Xing Fu, Weiqiang Wang, Yin Jin

    Abstract: Estimating the conditional average treatment effects (CATE) is very important in causal inference and has a wide range of applications across many fields. In the estimation process of CATE, the unconfoundedness assumption is typically required to ensure the identifiability of the regression problems. When estimating CATE using high-dimensional data, there have been many variable selection methods…

    Submitted 30 August, 2024; originally announced August 2024.

  22. arXiv:2408.16295  [pdf]

    cs.SI

    IC always bad? : Information Cocooning as a Group Emotional Stabilization Role in Social Networks

    Authors: Jinhu Ren, Tianlong Fan, Xifei Fu, Linyuan Lü

    Abstract: This research aims to investigate the effects of information cocooning on group mood changes caused by information spreading. The simulation of the realistic network evolution process is realized at the structural level by building a network evolution model based on individual viewpoints. Abstracting the accuracy of the real intelligent recommendation process by setting RA (Recommended Accuracy).…

    Submitted 30 August, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  23. arXiv:2408.10519  [pdf, other]

    cs.DC cs.DS

    Almost Optimal Algorithms for Token Collision in Anonymous Networks

    Authors: Sirui Bai, Xinyu Fu, Xudong Wu, Penghui Yao, Chaodong Zheng

    Abstract: In distributed systems, situations often arise where some nodes each hold a collection of tokens, and all nodes collectively need to determine whether all tokens are distinct. For example, if each token represents a logged-in user, the problem corresponds to checking whether there are duplicate logins. Similarly, if each token represents a data object or a timestamp, the problem corresponds to ch…

    Submitted 19 August, 2024; originally announced August 2024.

  24. arXiv:2408.09393  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Graph Learning with Structure Proxy Alignment

    Authors: Xingbo Fu, Zihan Chen, Binchi Zhang, Chen Chen, Jundong Li

    Abstract: Federated Graph Learning (FGL) aims to learn graph learning models over graph data distributed in multiple data owners, which has been applied in various applications such as social recommendation and financial fraud detection. Inherited from generic Federated Learning (FL), FGL similarly has the data heterogeneity issue where the label distribution may vary significantly for distributed graph dat…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: Accepted by KDD 2024

  25. arXiv:2408.08056  [pdf, other]

    cs.LG

    DATTA: Towards Diversity Adaptive Test-Time Adaptation in Dynamic Wild World

    Authors: Chuyang Ye, Dongyan Wei, Zhendong Liu, Yuanyi Pang, Yixi Lin, Jiarong Liao, Qinting Jiang, Xianghua Fu, Qing Li, Jingyan Jiang

    Abstract: Test-time adaptation (TTA) effectively addresses distribution shifts between training and testing data by adjusting models on test samples, which is crucial for improving model inference in real-world applications. However, traditional TTA methods typically follow a fixed pattern to address the dynamic data patterns (low-diversity or high-diversity patterns) often leading to performance degradatio…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: 16 pages, 2 figures

  26. arXiv:2408.06831  [pdf, other]

    cs.CG

    Polynomial 2D Green Coordinates for High-order Cages

    Authors: Shibo Liu, Ligang Liu, Xiao-Ming Fu

    Abstract: We propose conformal polynomial coordinates for 2D closed high-order cages, which consist of polynomial curves of any order. The coordinates enable the transformation of the input polynomial curves into polynomial curves of any order. We extend the classical 2D Green coordinates to define our coordinates, thereby leading to cage-aware conformal harmonic deformations. We extensively test our method…

    Submitted 13 August, 2024; originally announced August 2024.

  27. arXiv:2408.06082  [pdf, ps, other]

    cs.SE

    AutoCheck: Automatically Identifying Variables for Checkpointing by Data Dependency Analysis

    Authors: Xiang Fu, Weiping Zhang, Xin Huang, Shiman Meng, Wubiao Xu, Luanzheng Guo, Kento Sato

    Abstract: Checkpoint/Restart (C/R) has been widely deployed in numerous HPC systems, Clouds, and industrial data centers, which are typically operated by system engineers. Nevertheless, there is no existing approach that helps system engineers without domain expertise, and domain scientists without system fault tolerance knowledge identify those critical variables accounted for correct application execution…

    Submitted 15 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: 11 pages, 7 figures, 4 tables

  28. arXiv:2408.05815  [pdf, other]

    cs.CV

    HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training

    Authors: Fenghe Tang, Ronghao Xu, Qingsong Yao, Xueming Fu, Quan Quan, Heqin Zhu, Zaiyi Liu, S. Kevin Zhou

    Abstract: The generative self-supervised learning strategy exhibits remarkable learning representational capabilities. However, there is limited attention to end-to-end pre-training methods based on a hybrid architecture of CNN and Transformer, which can learn strong local and global representations simultaneously. To address this issue, we propose a generative pre-training strategy called Hybrid Sparse mas…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: Early accept at MICCAI 2024

    ACM Class: I.4.10; I.4.6

  29. arXiv:2408.00278  [pdf, other]

    cs.LG cs.AI cs.NE

    High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures

    Authors: Xiang Fu, Xinpeng Zhang, Jixiang Ma, Peng Zhao, Shuai Lu, Xu T. Liu

    Abstract: Convolution is the core component within deep neural networks and it is computationally intensive and time consuming. Tensor data layouts significantly impact convolution operations in terms of memory access and computational efficiency. Yet, there is still a lack of comprehensive performance characterization on data layouts on SIMD architectures concerning convolution methods. This paper proposes…

    Submitted 1 August, 2024; originally announced August 2024.

  30. arXiv:2407.16990  [pdf, other]

    cs.NI

    Region-based Content Enhancement for Efficient Video Analytics at the Edge

    Authors: Weijun Wang, Liang Mi, Shaowei Cen, Haipeng Dai, Yuanchun Li, Xiaoming Fu, Yunxin Liu

    Abstract: Video analytics is widespread in various applications serving our society. Recent advances of content enhancement in video analytics offer significant benefits for the bandwidth saving and accuracy improvement. However, existing content-enhanced video analytics systems are excessively computationally expensive and provide extremely low throughput. In this paper, we present region-based content enh…

    Submitted 24 July, 2024; originally announced July 2024.

  31. arXiv:2407.14899  [pdf, other]

    cs.LG cs.CV

    Hyperspectral Unmixing Under Endmember Variability: A Variational Inference Framework

    Authors: Yuening Li, Xiao Fu, Junbin Liu, Wing-Kin Ma

    Abstract: This work proposes a variational inference (VI) framework for hyperspectral unmixing in the presence of endmember variability (HU-EV). An EV-accounted noisy linear mixture model (LMM) is considered, and the presence of outliers is also incorporated into the model. Following the marginalized maximum likelihood (MML) principle, a VI algorithmic structure is designed for probabilistic inference for H…

    Submitted 20 July, 2024; originally announced July 2024.

  32. arXiv:2407.13252  [pdf, other]

    cs.CV

    Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models

    Authors: Qiao Li, Xiaomeng Fu, Xi Wang, Jin Liu, Xingyu Gao, Jiao Dai, Jizhong Han

    Abstract: With the rapid advancements of large-scale text-to-image diffusion models, various practical applications have emerged, bringing significant convenience to society. However, model developers may misuse the unauthorized data to train diffusion models. These data are at risk of being memorized by the models, thus potentially violating citizens' privacy rights. Therefore, in order to judge whether a…

    Submitted 18 July, 2024; originally announced July 2024.

  33. arXiv:2407.06902  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective

    Authors: Shahana Ibrahim, Panagiotis A. Traganitis, Xiao Fu, Georgios B. Giannakis

    Abstract: One of the primary catalysts fueling advances in artificial intelligence (AI) and machine learning (ML) is the availability of massive, curated datasets. A commonly used technique to curate such massive datasets is crowdsourcing, where data are dispatched to multiple annotators. The annotator-produced labels are then fused to serve downstream learning and inference tasks. This annotation process o…

    Submitted 9 July, 2024; originally announced July 2024.

  34. arXiv:2407.05047  [pdf, other]

    cs.AI

    MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

    Authors: Min Zhang, Xian Fu, Jianye Hao, Peilong Han, Hao Zhang, Lei Shi, Hongyao Tang, Yan Zheng

    Abstract: In recent years, Multi-modal Foundation Models (MFMs) and Embodied Artificial Intelligence (EAI) have been advancing side by side at an unprecedented pace. The integration of the two has garnered significant attention from the AI research community. In this work, we attempt to provide an in-depth and comprehensive evaluation of the performance of MFMs on embodied task planning, aiming to shed lig…

    Submitted 7 October, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

  35. arXiv:2407.04557  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates

    Authors: Ryotaro Okabe, Mouyang Cheng, Abhijatmedhi Chotrattanapituk, Nguyen Tuan Hung, Xiang Fu, Bowen Han, Yao Wang, Weiwei Xie, Robert J. Cava, Tommi S. Jaakkola, Yongqiang Cheng, Mingda Li

    Abstract: Billions of organic molecules are known, but only a tiny fraction of the functional inorganic materials have been discovered, a particularly relevant problem to the community searching for new quantum materials. Recent advancements in machine-learning-based generative models, particularly diffusion models, show great promise for generating new, stable materials. However, integrating geometric patt…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 512 pages total, 4 main figures + 218 supplementary figures

  36. arXiv:2407.00615  [pdf, other]

    cs.LG

    GC-Bench: An Open and Unified Benchmark for Graph Condensation

    Authors: Qingyun Sun, Ziying Chen, Beining Yang, Cheng Ji, Xingcheng Fu, Sheng Zhou, Hao Peng, Jianxin Li, Philip S. Yu

    Abstract: Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehens…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

  37. arXiv:2406.16066  [pdf, other]

    cs.CE

    Constructing Boundary-identical Microstructures by Guided Diffusion for Fast Multiscale Designs

    Authors: Jingxuan Feng, Lili Wang, Xiaoya Zhai, Kai Chen, Wenming Wu, Ligang Liu, Xiao-Ming Fu

    Abstract: We propose a novel method to construct large-scale boundary-identical microstructure datasets with high attribute coverage for highly efficient multiscale design. Central to our technique is using a deep generative model to generate microstructures under the two conditions, including the specified boundary and homogenized elastic tensor. We achieve the desired dataset by alternately adding microst…

    Submitted 23 June, 2024; originally announced June 2024.

  38. arXiv:2406.14288  [pdf, other]

    cs.LG cs.AI

    Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective

    Authors: Yunfei Liu, Jintang Li, Yuehe Chen, Ruofan Wu, Ericbk Wang, Jing Zhou, Sheng Tian, Shuheng Shen, Xing Fu, Changhua Meng, Weiqiang Wang, Liang Chen

    Abstract: Graph clustering, a fundamental and challenging task in graph mining, aims to classify nodes in a graph into several disjoint clusters. In recent years, graph contrastive learning (GCL) has emerged as a dominant line of research in graph clustering and advances the new state-of-the-art. However, GCL-based methods heavily rely on graph augmentations and contrastive schemes, which may potentially in…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: KDD 2024 research track. Code available at https://github.com/EdisonLeeeee/MAGI

  39. arXiv:2406.13495  [pdf, other]

    cs.CV

    DF40: Toward Next-Generation Deepfake Detection

    Authors: Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Li Yuan, Chengjie Wang, Shouhong Ding, Yunsheng Wu

    Abstract: We propose a new comprehensive benchmark to revolutionize the current deepfake detection field to the next generation. Predominantly, existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset (e.g., FF++) and testing them on other prevalent deepfake datasets. This protocol is often regarded as a "golden compass"…

    Submitted 19 June, 2024; originally announced June 2024.

  40. arXiv:2406.11243  [pdf, other]

    cs.CL cs.AI

    FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation

    Authors: Bangzheng Li, Ben Zhou, Xingyu Fu, Fei Wang, Dan Roth, Muhao Chen

    Abstract: Language models have shown impressive in-context-learning capabilities, which allow them to benefit from input prompts and perform better on downstream end tasks. Existing works investigate the mechanisms behind this observation, and propose label-agnostic prompt metrics that can better estimate end-task performances. One popular approach is using perplexity as a way to measure models' familiarity…

    Submitted 17 June, 2024; originally announced June 2024.

  41. arXiv:2406.09870  [pdf, other]

    cs.LG cs.AI

    IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

    Authors: Jiawen Qin, Haonan Yuan, Qingyun Sun, Lyujin Xu, Jiaqi Yuan, Pengfeng Huang, Zhaonan Wang, Xingcheng Fu, Hao Peng, Jianxin Li, Philip S. Yu

    Abstract: Deep graph learning has gained grand popularity over the past years due to its versatility and success in representing graph data across a wide range of domains. However, the pervasive issue of imbalanced graph data distributions, where certain parts exhibit disproportionally abundant data while others remain sparse, undermines the efficacy of conventional graph learning algorithms, leading to bia…

    Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

  42. arXiv:2406.09700  [pdf, other]

    cs.RO physics.bio-ph

    Jointed Tails Enhance Control of Three-dimensional Body Rotation

    Authors: Xun Fu, Bohao Zhang, Ceri J. Weber, Kimberly L. Cooper, Ram Vasudevan, Talia Y. Moore

    Abstract: Tails used as inertial appendages induce body rotations of animals and robots, a phenomenon that is governed largely by the ratio of the body and tail moments of inertia. However, vertebrate tails have more degrees of freedom (e.g., number of joints, rotational axes) than most current theoretical models and robotic tails. To understand how morphology affects inertial appendage function, we develop…

    Submitted 13 June, 2024; originally announced June 2024.

  43. arXiv:2406.09411  [pdf, other]

    cs.CV cs.AI cs.CL

    MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

    Authors: Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen

    Abstract: We introduce MuirBench, a comprehensive benchmark that focuses on robust multi-image understanding capabilities of multimodal LLMs. MuirBench consists of 12 diverse multi-image tasks (e.g., scene understanding, ordering) that involve 10 categories of multi-image relations (e.g., multiview, temporal relations). Comprising 11,264 images and 2,600 multiple-choice questions, MuirBench is created in a…

    Submitted 1 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: typos corrected, references added, Project Page: https://muirbench.github.io/

  44. arXiv:2406.09403  [pdf, other]

    cs.CV cs.CL

    Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

    Authors: Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna

    Abstract: Humans draw to facilitate reasoning: we draw auxiliary lines when solving geometry problems; we mark and circle when reasoning on maps; we use sketches to amplify our ideas and relieve our limited-capacity working memory. However, such actions are missing in current multimodal language models (LMs). Current chain-of-thought and tool-use paradigms only use text as intermediate reasoning steps. In t…

    Submitted 10 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Project and codes url: https://visualsketchpad.github.io/

  45. arXiv:2406.07951  [pdf, other]

    cs.CV

    DemosaicFormer: Coarse-to-Fine Demosaicing Network for HybridEVS Camera

    Authors: Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha

    Abstract: The Hybrid Event-Based Vision Sensor (HybridEVS) is a novel sensor integrating traditional frame-based and event-based sensors, offering substantial benefits for applications requiring low-light, high dynamic range, and low-latency environments, such as smartphones and wearable devices. Despite its potential, the lack of an image signal processing (ISP) pipeline specifically designed for HybridEVS poses…

    Submitted 12 June, 2024; originally announced June 2024.

  46. arXiv:2406.07546  [pdf, other]

    cs.CV cs.AI cs.CL

    Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

    Authors: Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth

    Abstract: We present a novel task and benchmark for evaluating the ability of text-to-image (T2I) generation models to produce images that align with commonsense in real life, which we call Commonsense-T2I. Given two adversarial text prompts containing an identical set of action words with minor differences, such as "a lightbulb without electricity" vs. "a lightbulb with electricity", we evaluate whether T2…

    Submitted 12 August, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: COLM 2024, Project Url: https://zeyofu.github.io/CommonsenseT2I/

  47. arXiv:2406.04378  [pdf, other]

    cs.LG hep-ex

    TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising

    Authors: J. T. Fry, Aobo Li, Lindley Winslow, Xinyi Hope Fu, Zhenghao Fu, Kaliroe M. W. Pappas

    Abstract: Dark matter makes up approximately 85% of total matter in our universe, yet it has never been directly observed in any laboratory on Earth. The origin of dark matter is one of the most important questions in contemporary physics, and a convincing detection of dark matter would be a Nobel-Prize-level breakthrough in fundamental science. The ABRACADABRA experiment was specifically designed to search…

    Submitted 5 June, 2024; originally announced June 2024.

  48. arXiv:2406.02610  [pdf, other]

    q-bio.QM cs.AI cs.LG

    MoFormer: Multi-objective Antimicrobial Peptide Generation Based on Conditional Transformer Joint Multi-modal Fusion Descriptor

    Authors: Li Wang, Xiangzheng Fu, Jiahao Yang, Xinyi Zhang, Xiucai Ye, Yiping Liu, Tetsuya Sakurai, Xiangxiang Zeng

    Abstract: Deep learning holds great promise for optimizing existing peptides with more desirable properties, a critical step towards accelerating new drug discovery. Despite the recent emergence of several optimized antimicrobial peptide (AMP) generation methods, multi-objective optimization remains quite challenging because of the idealism-realism tradeoff. Here, we establish a multi-objective AMP synthesis…

    Submitted 3 June, 2024; originally announced June 2024.

  49. arXiv:2406.01724  [pdf, other]

    cs.RO

    Predictive Braking on a Nonplanar Road

    Authors: Thomas Fork, Francesco Camozzi, Xiao-Yu Fu, Francesco Borrelli

    Abstract: We present an approach for predictive braking of a four-wheeled vehicle on a nonplanar road. Our main contribution is a methodology to consider friction and road contact safety on general smooth road geometry. We use this to develop an active safety system to preemptively reduce vehicle speed for upcoming road geometry, such as off-camber turns. Our system may be used for human-driven or autonomou…

    Submitted 3 June, 2024; originally announced June 2024.

  50. arXiv:2405.19991  [pdf, other]

    cs.CE math.OC

    OpenTM: An Open-source, Single-GPU, Large-scale Thermal Microstructure Design Framework

    Authors: Yuchen Quan, Xiaoya Zhai, Xiao-Ming Fu

    Abstract: Thermal microstructures are artificially engineered materials designed to manipulate and control heat flow in unconventional ways. This paper presents an educational framework, called OpenTM, to use a single GPU for designing periodic 3D high-resolution thermal microstructures to match the predefined thermal conductivity matrices with volume fraction constraints. Specifically, we use adapti…

    Submitted 30 May, 2024; originally announced May 2024.