Showing 1–50 of 3,688 results for author: Wu, Z

  1. arXiv:2410.16162  [pdf, other]

    cs.CV cs.CL

    Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning

    Authors: Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao

    Abstract: Vision language models (VLMs) have demonstrated impressive performance across a wide range of downstream tasks. However, their proficiency in spatial reasoning remains limited, despite its crucial role in tasks involving navigation and interaction with physical environments. Specifically, much of the spatial reasoning in these tasks occurs in two-dimensional (2D) environments, and our evaluation r…

    Submitted 21 October, 2024; originally announced October 2024.

  2. arXiv:2410.16126  [pdf, other]

    math.GT

    Clock Moves and Alexander Polynomial of Plane Graphs

    Authors: Wenbo Liao, Zhongtao Wu

    Abstract: In this paper, we introduce a notion of clock moves for spanning trees in plane graphs. This enables us to develop a spanning tree model of an Alexander polynomial for a plane graph and prove the unimodal property of its associated coefficient sequence. In particular, this confirms the trapezoidal conjecture for planar singular knots and gives new insights into Fox's original conjecture on alternatin…

    Submitted 21 October, 2024; originally announced October 2024.

  3. arXiv:2410.15930  [pdf, other]

    cs.IR cs.AI

    Centrality-aware Product Retrieval and Ranking

    Authors: Hadeel Saadany, Swapnil Bhosale, Samarth Agrawal, Diptesh Kanojia, Constantin Orasan, Zhe Wu

    Abstract: This paper addresses the challenge of improving user experience on e-commerce platforms by enhancing product ranking relevant to users' search queries. Ambiguity and complexity of user queries often lead to a mismatch between the user's intent and retrieved product titles or documents. Recent approaches have proposed the use of Transformer-based models, which need millions of annotated query-title…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024: Industry track

  4. arXiv:2410.15916  [pdf, other]

    cs.CV

    Leveraging CORAL-Correlation Consistency Network for Semi-Supervised Left Atrium MRI Segmentation

    Authors: Xinze Li, Runlin Huang, Zhenghao Wu, Bohan Yang, Wentao Fan, Chengzhang Zhu, Weifeng Su

    Abstract: Semi-supervised learning (SSL) has been widely used to learn from both a few labeled images and many unlabeled images to overcome the scarcity of labeled samples in medical image segmentation. Most current SSL-based segmentation methods use pixel values directly to identify similar features in labeled and unlabeled data. They usually fail to accurately capture the intricate attachment structures i…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: 5 pages, 3 figures, Accepted by 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2024)

    ACM Class: I.4.6

  5. arXiv:2410.15700  [pdf, other]

    cs.AI cs.CL

    InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems

    Authors: Zijian Wu, Suozhi Huang, Zhejian Zhou, Huaiyuan Ying, Jiayu Wang, Dahua Lin, Kai Chen

    Abstract: Large Language Models (LLMs) have emerged as powerful tools in mathematical theorem proving, particularly when utilizing formal languages such as LEAN. The major learning paradigm is expert iteration, which necessitates a pre-defined dataset comprising numerous mathematical problems. In this process, LLMs attempt to prove problems within the dataset and iteratively refine their capabilities throug…

    Submitted 21 October, 2024; originally announced October 2024.

  6. arXiv:2410.15234  [pdf, other]

    cs.AI

    Bias Amplification: Language Models as Increasingly Biased Media

    Authors: Ze Wang, Zekun Wu, Jeremy Zhang, Navya Jain, Xin Guan, Adriano Koshiyama

    Abstract: As Large Language Models (LLMs) become increasingly integrated into various facets of society, a significant portion of online text consequently becomes synthetic. This raises concerns about bias amplification, a phenomenon where models trained on synthetic data amplify the pre-existing biases over successive training iterations. Previous literature seldom discusses bias amplification as an indepen…

    Submitted 19 October, 2024; originally announced October 2024.

    Comments: Submitted to ARR Rolling Review October

  7. arXiv:2410.15164  [pdf, other]

    cs.AI

    SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

    Authors: Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Kaiwen Zhou, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao

    Abstract: Smartphone agents are increasingly important for helping users control devices efficiently, with (Multimodal) Large Language Model (MLLM)-based approaches emerging as key contenders. Fairly comparing these agents is essential but challenging, requiring a varied task scope, the integration of agents with different implementations, and a generalisable evaluation pipeline to assess their strengths an…

    Submitted 19 October, 2024; originally announced October 2024.

  8. arXiv:2410.15155  [pdf, other]

    cs.LG cs.AR math.OC

    Pipeline Gradient-based Model Training on Analog In-memory Accelerators

    Authors: Zhaoxian Wu, Quan Xiao, Tayfun Gokmen, Hsinyu Tsai, Kaoutar El Maghraoui, Tianyi Chen

    Abstract: Aiming to accelerate the training of large deep neural models (DNNs) in an energy-efficient way, an analog in-memory computing (AIMC) accelerator emerges as a solution with immense potential. In AIMC accelerators, trainable weights are kept in memory without the need to move from memory to processors during the training, which substantially reduces overhead. However, although the in-memory feature enables…

    Submitted 19 October, 2024; originally announced October 2024.

  9. arXiv:2410.15052  [pdf, other]

    cs.AI

    Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization

    Authors: Zihui Wu, Haichang Gao, Ping Wang, Shudong Zhang, Zhaoxiang Liu, Shiguo Lian

    Abstract: Glitch tokens in Large Language Models (LLMs) can trigger unpredictable behaviors, compromising model reliability and safety. Existing detection methods often rely on manual observation to infer the prior distribution of glitch tokens, which is inefficient and lacks adaptability across diverse model architectures. To address these limitations, we introduce GlitchMiner, a gradient-based discrete op…

    Submitted 19 October, 2024; originally announced October 2024.

  10. arXiv:2410.15027  [pdf, other]

    cs.CV

    Group Diffusion Transformers are Unsupervised Multitask Learners

    Authors: Lianghua Huang, Wei Wang, Zhi-Fan Wu, Huanzhang Dou, Yupeng Shi, Yutong Feng, Chen Liang, Yu Liu, Jingren Zhou

    Abstract: While large language models (LLMs) have revolutionized natural language processing with their task-agnostic capabilities, visual generation tasks such as image translation, style transfer, and character customization still rely heavily on supervised, task-specific datasets. In this work, we introduce Group Diffusion Transformers (GDTs), a novel framework that unifies diverse visual generation task…

    Submitted 19 October, 2024; originally announced October 2024.

  11. arXiv:2410.14803  [pdf, other]

    cs.LG cs.AI cs.DC eess.SY

    DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

    Authors: Taiyi Wang, Zhihao Wu, Jianheng Liu, Jianye Hao, Jun Wang, Kun Shao

    Abstract: On-device control agents, especially on mobile devices, are responsible for operating mobile devices to fulfill users' requests, enabling seamless and intuitive interactions. Integrating Multimodal Large Language Models (MLLMs) into these agents enhances their ability to understand and execute complex commands, thereby improving user experience. However, fine-tuning MLLMs for on-device control pre…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: Paper and Appendix, 24 pages

  12. arXiv:2410.14493  [pdf, other]

    cs.CR

    Safeguarding Blockchain Ecosystem: Understanding and Detecting Attack Transactions on Cross-chain Bridges

    Authors: Jiajing Wu, Kaixin Lin, Dan Lin, Bozhao Zhang, Zhiying Wu, Jianzhong Su

    Abstract: Cross-chain bridges are essential decentralized applications (DApps) to facilitate interoperability between different blockchain networks. Unlike regular DApps, the functionality of cross-chain bridges relies on the collaboration of information both on and off the chain, which exposes them to a wider risk of attacks. According to our statistics, attacks on cross-chain bridges have resulted in loss…

    Submitted 18 October, 2024; originally announced October 2024.

  13. arXiv:2410.14238  [pdf, other]

    cs.CV

    Storyboard guided Alignment for Fine-grained Video Action Recognition

    Authors: Enqi Liu, Liyuan Pan, Yan Yang, Yiran Zhong, Zhijing Wu, Xinxiao Wu, Liu Liu

    Abstract: Fine-grained video action recognition can be conceptualized as a video-text matching problem. Previous approaches often rely on global video semantics to consolidate video embeddings, which can lead to misalignment in video-text pairs due to a lack of understanding of action semantics at an atomic granularity level. To tackle this challenge, we propose a multi-granularity framework based on two ob…

    Submitted 18 October, 2024; originally announced October 2024.

  14. arXiv:2410.14169  [pdf, other]

    cs.CV

    DaRePlane: Direction-aware Representations for Dynamic Scene Reconstruction

    Authors: Ange Lou, Benjamin Planche, Zhongpai Gao, Yamin Li, Tianyu Luan, Hao Ding, Meng Zheng, Terrence Chen, Ziyan Wu, Jack Noble

    Abstract: Numerous recent approaches to modeling and re-rendering dynamic scenes leverage plane-based explicit representations, addressing slow training times associated with models like neural radiance fields (NeRF) and Gaussian splatting (GS). However, merely decomposing 4D dynamic scenes into multiple 2D plane-based representations is insufficient for high-fidelity re-rendering of scenes with complex mot…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.02265

  15. arXiv:2410.13974  [pdf, other]

    cs.LG cs.CR

    Trojan Prompt Attacks on Graph Neural Networks

    Authors: Minhua Lin, Zhiwei Zhang, Enyan Dai, Zongyu Wu, Yilong Wang, Xiang Zhang, Suhang Wang

    Abstract: Graph Prompt Learning (GPL) has been introduced as a promising approach that uses prompts to adapt pre-trained GNN models to specific downstream tasks without requiring fine-tuning of the entire model. Despite the advantages of GPL, little attention has been given to its vulnerability to backdoor attacks, where an adversary can manipulate the model's behavior by embedding hidden triggers. Existing…

    Submitted 17 October, 2024; originally announced October 2024.

  16. arXiv:2410.13848  [pdf, other]

    cs.CV cs.AI cs.CL

    Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

    Authors: Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, Ping Luo

    Abstract: In this paper, we introduce Janus, an autoregressive framework that unifies multimodal understanding and generation. Prior research often relies on a single visual encoder for both tasks, such as Chameleon. However, due to the differing levels of information granularity required by multimodal understanding and generation, this approach can lead to suboptimal performance, particularly in multimodal…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Technical Report

  17. arXiv:2410.13748  [pdf, other]

    hep-ex

    Test of lepton flavour universality with $B_s^0 \rightarrow φ\ell^+\ell^-$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1124 additional authors not shown)

    Abstract: Lepton flavour universality in rare $b\rightarrow s$ transitions is tested for the first time using $B_s^0$ meson decays. The measurements are performed using $pp$ collision data collected by the LHCb experiment between 2011 and 2018, corresponding to a total integrated luminosity of 9$\,{\rm fb}^{-1}$. Branching fraction ratios between the $B_s^0 \rightarrow φe^+e^-$ and…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3513/ (LHCb public pages)

    Report number: LHCb-PAPER-2024-032, CERN-EP-2024-255

  18. arXiv:2410.13743  [pdf, other]

    cs.LG

    Single-Timescale Multi-Sequence Stochastic Approximation Without Fixed Point Smoothness: Theories and Applications

    Authors: Yue Huang, Zhaoxian Wu, Shiqian Ma, Qing Ling

    Abstract: Stochastic approximation (SA) that involves multiple coupled sequences, known as multiple-sequence SA (MSSA), finds diverse applications in the fields of signal processing and machine learning. However, existing theoretical understandings of MSSA are limited: the multi-timescale analysis implies a slow convergence rate, whereas the single-timescale analysis relies on a stringent fixed point smoo…

    Submitted 17 October, 2024; originally announced October 2024.

  19. arXiv:2410.13515  [pdf, other]

    hep-ex hep-lat hep-ph nucl-ex

    Observation of a rare beta decay of the charmed baryon with a Graph Neural Network

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (637 additional authors not shown)

    Abstract: The study of beta decay of the charmed baryon provides unique insights into the fundamental mechanism of the strong and electro-weak interactions. The $Λ_c^+$, being the lightest charmed baryon, undergoes disintegration solely through the charm quark weak decay. Its beta decay provides an ideal laboratory for investigating non-perturbative effects in quantum chromodynamics and for constraining the…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 28 pages, 6 figures

  20. arXiv:2410.13478  [pdf, other]

    hep-ex

    Observation of $χ_{c0}\toΣ^{+}\barΣ^{-}η$ and evidence for $χ_{c1,2}\toΣ^{+}\barΣ^{-}η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, the decay $χ_{c0}\toΣ^{+}\barΣ^{-}η$ is observed for the first time with a statistical significance of $7.0σ$, and evidence for $χ_{c1}\toΣ^{+}\barΣ^{-}η$ and $χ_{c2}\toΣ^{+}\barΣ^{-}η$ is found with statistical significances of $4.3σ$ and $4.6σ$, respectively. The branching fractions are determined to be…

    Submitted 17 October, 2024; originally announced October 2024.

  21. arXiv:2410.13368  [pdf, other]

    hep-ex hep-ph

    Observation of the Singly Cabibbo-Suppressed Decay $Λ_c^{+}\to pπ^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Utilizing 4.5${~\rm{fb}}^{-1}$ of $e^+e^-$ annihilation data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 4.600 and 4.699 GeV, the first observation of the singly Cabibbo-suppressed decay $Λ_c^{+}\to pπ^0$ is presented, with a statistical significance of $5.4σ$. The ratio of the branching fractions of $Λ_c^{+}\to pπ^0$ and $Λ_c^{+}\to pη$ is measured…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 9 pages, 4 figures

  22. arXiv:2410.12934  [pdf, other]

    cs.CL

    Enhancing Mathematical Reasoning in LLMs by Stepwise Correction

    Authors: Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang

    Abstract: Best-of-N decoding methods instruct large language models (LLMs) to generate multiple solutions, score each using a scoring function, and select the highest scored as the final answer to mathematical reasoning problems. However, this repeated independent process often leads to the same mistakes, making the selected solution still incorrect. We propose a novel prompting method named Stepwise Correc…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: under review

  23. arXiv:2410.12620  [pdf, other]

    hep-ex

    Search for $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ at center-of-mass energies from 4.47 to 4.95 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

    Abstract: Utilizing a data set of $6.7$ fb$^{-1}$ from electron-positron collisions recorded by the BESIII detector at the BEPCII storage ring, a search is conducted for the processes $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ across center-of-mass energies from 4.47 to 4.95 GeV. In the absence of any significant signals, upper limits are set. These include limits on the Born cross sections for…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 14 pages, 6 figures

  24. arXiv:2410.12214  [pdf, other]

    cs.CV cs.AI

    Order-aware Interactive Segmentation

    Authors: Bin Wang, Anwesa Choudhuri, Meng Zheng, Zhongpai Gao, Benjamin Planche, Andong Deng, Qin Liu, Terrence Chen, Ulas Bagci, Ziyan Wu

    Abstract: Interactive segmentation aims to accurately segment target objects with minimal user interactions. However, current methods often fail to accurately separate target objects from the background, due to a limited understanding of order, the relative depth between objects in a scene. To address this issue, we propose OIS: order-aware interactive segmentation, where we explicitly encode the relative d…

    Submitted 17 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

    Comments: Interactive demo can be found in project page: https://ukaukaaaa.github.io/projects/OIS/index.html

  25. arXiv:2410.12207  [pdf, other]

    cs.AI cs.LG

    Divide-Verify-Refine: Aligning LLM Responses with Complex Instructions

    Authors: Xianren Zhang, Xianfeng Tang, Hui Liu, Zongyu Wu, Qi He, Dongwon Lee, Suhang Wang

    Abstract: Recent studies show that LLMs, particularly open-source models, struggle to follow complex instructions with multiple constraints. Despite the importance, methods to improve LLMs' adherence to such constraints remain unexplored, and current research focuses on evaluating this ability rather than developing solutions. While a few studies enhance constraint adherence through model tuning, this appro…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: Under review

  26. arXiv:2410.11607  [pdf, other]

    hep-ex

    Observation of $χ_{cJ}\to p \bar p K^0_S K^- π^+ + c.c.$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (648 additional authors not shown)

    Abstract: By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decays of $χ_{cJ} \to p \bar{p} K^0_S K^- π^+ +c.c.(J=0, 1, 2)$ are observed for the first time with statistical significances greater than $10σ$. The branching fractions of these decays are determined to be…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: 12 pages, 5 figures

  27. arXiv:2410.11059  [pdf, other]

    cs.CL cs.AI

    Assessing Bias in Metric Models for LLM Open-Ended Generation Bias Benchmarks

    Authors: Nathaniel Demchak, Xin Guan, Zekun Wu, Ziyi Xu, Adriano Koshiyama, Emre Kazim

    Abstract: Open-generation bias benchmarks evaluate social biases in Large Language Models (LLMs) by analyzing their outputs. However, the classifiers used in analysis often have inherent biases, leading to unfair conclusions. This study examines such biases in open-generation benchmarks like BOLD and SAGED. Using the MGSD dataset, we conduct two experiments. The first uses counterfactuals to measure predict…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: NeurIPS 2024 EvalEval Workshop

  28. arXiv:2410.10926  [pdf, other]

    cs.LG cs.AI cs.CL

    Federated Data-Efficient Instruction Tuning for Large Language Models

    Authors: Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng

    Abstract: Instruction tuning helps improve pretrained large language models (LLMs) in terms of the responsiveness to human instructions, which benefits from diversified instruction data. Federated learning extends the sources of instruction data by exploiting the diversified client-side data, making it increasingly popular for tuning LLMs. Existing approaches of federated LLM tuning typically traverse a…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 11 pages. Ongoing work

  29. arXiv:2410.10664  [pdf]

    quant-ph physics.atom-ph physics.optics physics.pop-ph

    Tunable Einstein-Bohr recoiling-slit gedankenexperiment at the quantum limit

    Authors: Yu-Chen Zhang, Hao-Wen Cheng, Zhao-Qiu Zengxu, Zhan Wu, Rui Lin, Yu-Cheng Duan, Jun Rui, Ming-Cheng Chen, Chao-Yang Lu, Jian-Wei Pan

    Abstract: In 1927, during the fifth Solvay Conference, Einstein and Bohr described a double-slit interferometer with a "movable slit" that can detect the momentum recoil of one photon. Here, we report a faithful realization of the Einstein-Bohr interferometer using a single atom in an optical tweezer, cooled to the motional ground state in three dimensions. The single atom has an intrinsic momentum uncertai…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 18 pages, 4 figures

  30. arXiv:2410.10481  [pdf, other]

    cs.LG cs.AI cs.CR

    Model-Based Differentially Private Knowledge Transfer for Large Language Models

    Authors: Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang

    Abstract: As large language models (LLMs) become increasingly prevalent in web services, effectively leveraging domain-specific knowledge while ensuring privacy has become critical. Existing methods, such as retrieval-augmented generation (RAG) and differentially private data synthesis, often compromise either the utility of domain knowledge or the privacy of sensitive data, limiting their applicability in…

    Submitted 14 October, 2024; originally announced October 2024.

  31. arXiv:2410.10398  [pdf, other]

    cs.CE cs.AI

    FairMindSim: Alignment of Behavior, Emotion, and Belief in Humans and LLM Agents Amid Ethical Dilemmas

    Authors: Yu Lei, Hao Liu, Chengxing Xie, Songjia Liu, Zhiyu Yin, Canyu Chen, Guohao Li, Philip Torr, Zhen Wu

    Abstract: AI alignment is a pivotal issue concerning AI control and safety. It should consider not only value-neutral human preferences but also moral and ethical considerations. In this study, we introduced FairMindSim, which simulates the moral dilemma through a series of unfair scenarios. We used LLM agents to simulate human behavior, ensuring alignment across various stages. To explore the various socio…

    Submitted 17 October, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

  32. arXiv:2410.10312  [pdf, other]

    cs.IT

    Achievable Second-Order Asymptotics for MAC and RAC with Additive Non-Gaussian Noise

    Authors: Yiming Wang, Lin Bai, Zhuangfei Wu, Lin Zhou

    Abstract: We first study the two-user additive noise multiple access channel (MAC) where the noise distribution is arbitrary. For such a MAC, we use spherical codebooks and either joint nearest neighbor (JNN) or successive interference cancellation (SIC) decoding. Under both decoding methods, we derive second-order achievable rate regions and compare the finite blocklength performance between JNN and SIC de…

    Submitted 14 October, 2024; originally announced October 2024.

  33. arXiv:2410.10083  [pdf, other]

    cs.AI

    Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?

    Authors: Yifan Feng, Chengwu Yang, Xingliang Hou, Shaoyi Du, Shihui Ying, Zongze Wu, Yue Gao

    Abstract: Existing benchmarks like NLGraph and GraphQA evaluate LLMs on graphs by focusing mainly on pairwise relationships, overlooking the high-order correlations found in real-world data. Hypergraphs, which can model complex beyond-pairwise relationships, offer a more robust framework but are still underexplored in the context of LLMs. To address this gap, we introduce LLM4Hypergraph, the first comprehen…

    Submitted 16 October, 2024; v1 submitted 13 October, 2024; originally announced October 2024.

  34. arXiv:2410.09732  [pdf, other]

    cs.CV

    LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

    Authors: Junyan Ye, Baichuan Zhou, Zilong Huang, Junan Zhang, Tianyi Bai, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He, Weijia Li

    Abstract: With the rapid development of AI-generated content, the future internet may be inundated with synthetic data, making the discrimination of authentic and credible multimodal data increasingly challenging. Synthetic data detection has thus garnered widespread attention, and the performance of large multimodal models (LMMs) in this task has attracted significant interest. LMMs can provide natural lan…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: 79 pages, 63 figures

  35. arXiv:2410.09674  [pdf, other]

    eess.IV cs.CV cs.LG cs.NE

    EG-SpikeFormer: Eye-Gaze Guided Transformer on Spiking Neural Networks for Medical Image Analysis

    Authors: Yi Pan, Hanqi Jiang, Junhao Chen, Yiwei Li, Huaqin Zhao, Yifan Zhou, Peng Shu, Zihao Wu, Zhengliang Liu, Dajiang Zhu, Xiang Li, Yohannes Abate, Tianming Liu

    Abstract: Neuromorphic computing has emerged as a promising energy-efficient alternative to traditional artificial intelligence, predominantly utilizing spiking neural networks (SNNs) implemented on neuromorphic hardware. Significant advancements have been made in SNN-based convolutional neural networks (CNNs) and Transformer architectures. However, their applications in the medical imaging domain remain un…

    Submitted 12 October, 2024; originally announced October 2024.

  36. arXiv:2410.09302  [pdf, other]

    cs.LG cs.AI cs.CL

    Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

    Authors: Guanlin Liu, Kaixuan Ji, Renjie Zheng, Zheng Wu, Chen Dun, Quanquan Gu, Lin Yan

    Abstract: Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs) with human preferences and improving their ability to perform complex tasks. However, current approaches either require significant computational resources due to the use of multiple models and extensive online sampling for training (e.g., PPO) or are framed as bandit problems (e.g., DPO, DRO), which often st…

    Submitted 11 October, 2024; originally announced October 2024.

  37. arXiv:2410.08794  [pdf, other]

    cs.LG cs.AI

    M$^3$-Impute: Mask-guided Representation Learning for Missing Value Imputation

    Authors: Zhongyi Yu, Zhenghao Wu, Shuhan Zhong, Weifeng Su, S. -H. Gary Chan, Chul-Ho Lee, Weipeng Zhuo

    Abstract: Missing values are a common problem that poses significant challenges to data analysis and machine learning. This problem necessitates the development of an effective imputation method to fill in the missing values accurately, thereby enhancing the overall quality and utility of the datasets. Existing imputation methods, however, fall short of explicitly considering the 'missingness' information i…

    Submitted 11 October, 2024; originally announced October 2024.

  38. arXiv:2410.08603  [pdf, other]

    hep-ex

    Observation of $D^+\toη^\primeμ^+ν_μ$ and First Study of $D^+\to η^\prime \ell^+ν_\ell$ Decay Dynamics

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $20.3\,\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy 3.773 GeV with the BESIII detector, we report the first observation of the semileptonic decay $D^+\to η^\prime μ^+ν_μ$ with a significance of $8.6σ$ including systematic uncertainties, and an improved measurement of $D^+\to η^\prime e^+ν_e$. The branching fractions of $D^+\to η^\prime μ^+ν_μ$ and…

    Submitted 11 October, 2024; originally announced October 2024.

  39. arXiv:2410.08189  [pdf, other]

    cs.CV cs.RO

    SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation

    Authors: Hang Yin, Xiuwei Xu, Zhenyu Wu, Jie Zhou, Jiwen Lu

    Abstract: In this paper, we propose a new framework for zero-shot object navigation. Existing zero-shot object navigation methods prompt an LLM with the text of spatially close objects, which lacks enough scene context for in-depth reasoning. To better preserve the information of the environment and fully exploit the reasoning ability of the LLM, we propose to represent the observed scene with a 3D scene graph. The sce…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Accepted to NeurIPS 2024. Project page: https://bagh2178.github.io/SG-Nav/

  40. arXiv:2410.07626  [pdf, other]

    hep-ex

    Precision Measurement of the Branching Fraction of $D^{+}\to μ^{+}ν_μ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $20.3~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at a center-of-mass energy of $E_{\rm cm}=3.773$ GeV with the BESIII detector operating at the BEPCII collider, we determine the branching fraction of the leptonic decay $D^+\toμ^+ν_μ$ to be $(3.981\pm0.079_{\rm stat}\pm0.040_{\rm syst})\times10^{-4}$. Interpreting our measurement with knowledge of the Fermi coupling constant…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 9 pages, 2 figures

  41. arXiv:2410.07577  [pdf, other]

    cs.CV

    3D Vision-Language Gaussian Splatting

    Authors: Qucheng Peng, Benjamin Planche, Zhongpai Gao, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Chen Chen, Ziyan Wu

    Abstract: Recent advancements in 3D reconstruction methods and vision-language models have propelled the development of multi-modal 3D scene understanding, which has vital applications in robotics, autonomous driving, and virtual/augmented reality. However, current multi-modal scene understanding approaches have naively embedded semantic representations into 3D reconstruction methods without striking a bala…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: main paper + supplementary material

  42. arXiv:2410.07556  [pdf, ps, other]

    math.GR

    On The Largest Character Degree And Solvable Subgroups Of Finite Groups

    Authors: Zongshu Wu, Yong Yang

    Abstract: Let $G$ be a finite group, and $π$ be a set of primes. The $π$-core $\mathbf{O}_π(G)$ is the unique maximal normal $π$-subgroup of $G$, and $b(G)$ is the largest irreducible character degree of $G$. In 2017, Qian and Yang proved that if $H$ is a solvable $π$-subgroup of $G$, then $|H\mathbf{O}_π(G)/\mathbf{O}_π(G)|\le b(G)^3$. In this paper, we improve the exponent of $3$ to…

    Submitted 10 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

  43. arXiv:2410.07219  [pdf, other]

    cs.IT

    CKMImageNet: A Comprehensive Dataset to Enable Channel Knowledge Map Construction via Computer Vision

    Authors: Di Wu, Zijian Wu, Yuelong Qiu, Shen Fu, Yong Zeng

    Abstract: Environment-aware communication and sensing is one of the promising paradigm shifts towards 6G, which fully leverages prior information of the local wireless environment to optimize network performance. One of the key enablers for environment-aware communication and sensing is the channel knowledge map (CKM), which provides location-specific channel knowledge that is crucial for channel state informat…

    Submitted 29 September, 2024; originally announced October 2024.

  44. arXiv:2410.07133  [pdf, other]

    cs.CV

    EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

    Authors: Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, Yuchao Gu, Lingmin Ran, Xiang Wang, Zhangjie Wu, Junhao Zhang, Yingya Zhang, Mike Zheng Shou

    Abstract: Recent advancements in generation models have showcased remarkable capabilities in generating fantastic content. However, most of them are trained on proprietary high-quality data, and some models withhold their parameters and only provide accessible application programming interfaces (APIs), limiting their benefits for downstream tasks. To explore the feasibility of training a text-to-image gener…

    Submitted 10 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

  45. arXiv:2410.07098  [pdf, ps, other]

    math.CO

    Clique density vs blowups

    Authors: Domagoj Bradač, Hong Liu, Zhuo Wu, Zixiang Xu

    Abstract: A well-known theorem of Nikiforov asserts that any graph with a positive $K_{r}$-density contains a logarithmic blowup of $K_r$. In this paper, we explore variants of Nikiforov's result in the following form. Given $r,t\in\mathbb{N}$, when does a positive $K_{r}$-density imply the existence of a significantly larger (with almost linear size) blowup of $K_t$? Our results include: For an $n$-vertex o…

    Submitted 9 October, 2024; originally announced October 2024.

  46. arXiv:2410.06500  [pdf, other]

    hep-ex

    Search for the radiative decays $D^+\toγρ^+$ and $D^+\toγK^{*+}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (648 additional authors not shown)

    Abstract: We search for the radiative decays $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ using 20.3~fb$^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$ GeV by the BESIII detector operating at the BEPCII collider. No significant signals are observed, and the upper limits on the branching fractions of $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ at 90\% confidence level ar…

    Submitted 8 October, 2024; originally announced October 2024.

  47. arXiv:2410.05736  [pdf, ps, other]

    hep-ex

    Observation of an axial-vector state in the study of $ψ(3686) \to φηη'$ decay

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (625 additional authors not shown)

    Abstract: Using (2712.4 $\pm$ 14.3)$\times 10^{6}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, a partial wave analysis of the decay $ψ(3686) \to φηη'$ is performed with the covariant tensor approach. An axial-vector state with a mass near 2.3 $\rm GeV/c^2$ is observed for the first time. Its mass and width are measured to be 2316…

    Submitted 8 October, 2024; originally announced October 2024.

  48. arXiv:2410.05624  [pdf, other]

    cs.CV cs.LG

    Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion

    Authors: Yice Cao, Chenchen Liu, Zhenhua Wu, Wenxin Yao, Liu Xiong, Jie Chen, Zhixiang Huang

    Abstract: As remote sensing imaging technology continues to advance and evolve, processing high-resolution and diversified satellite imagery to improve segmentation accuracy and enhance interpretation efficiency emerges as a pivotal area of investigation within the realm of remote sensing. Although segmentation algorithms based on CNNs and Transformers achieve significant progress in performance, balancing se…

    Submitted 7 October, 2024; originally announced October 2024.

  49. arXiv:2410.05410  [pdf, other]

    cs.CV eess.IV

    Enhanced Super-Resolution Training via Mimicked Alignment for Real-World Scenes

    Authors: Omar Elezabi, Zongwei Wu, Radu Timofte

    Abstract: Image super-resolution methods have made significant strides with deep learning techniques and ample training data. However, they face challenges due to inherent misalignment between low-resolution (LR) and high-resolution (HR) pairs in real-world datasets. In this study, we propose a novel plug-and-play module designed to mitigate these misalignment issues by aligning LR inputs with HR images dur…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Accepted by ACCV 2024

  50. arXiv:2410.04974  [pdf, other]

    cs.CV cs.AI

    6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering

    Authors: Zhongpai Gao, Benjamin Planche, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Ziyan Wu

    Abstract: Novel view synthesis has advanced significantly with the development of neural radiance fields (NeRF) and 3D Gaussian splatting (3DGS). However, achieving high quality without compromising real-time rendering remains challenging, particularly for physically-based ray tracing with view-dependent effects. Recently, N-dimensional Gaussians (N-DG) introduced a 6D spatial-angular representation to bett…

    Submitted 10 October, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: Project: https://gaozhongpai.github.io/6dgs/ and fixed iteration typos