
Showing 1–50 of 2,530 results for author: Liang, Y

  1. arXiv:2410.15636  [pdf, other]

    cs.CV

    LucidFusion: Generating 3D Gaussians with Arbitrary Unposed Images

    Authors: Hao He, Yixun Liang, Luozhou Wang, Yuanhao Cai, Xinli Xu, Hao-Xiang Guo, Xiang Wen, Yingcong Chen

    Abstract: Recent large reconstruction models have made notable progress in generating high-quality 3D objects from single images. However, these methods often struggle with controllability, as they lack information from multiple views, leading to incomplete or inconsistent 3D reconstructions. To address this limitation, we introduce LucidFusion, a flexible end-to-end feed-forward framework that leverages th…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: 17 pages, 12 figures, project page: coming soon

  2. arXiv:2410.14911  [pdf]

    cs.CV cs.AI cs.CL

    A Hybrid Defense Strategy for Boosting Adversarial Robustness in Vision-Language Models

    Authors: Yuhan Liang, Yijun Li, Yumeng Niu, Qianhe Shen, Hangyu Liu

    Abstract: The robustness of Vision-Language Models (VLMs) such as CLIP is critical for their deployment in safety-critical applications like autonomous driving, healthcare diagnostics, and security systems, where accurate interpretation of visual and textual data is essential. However, these models are highly susceptible to adversarial attacks, which can severely compromise their performance and reliability…

    Submitted 18 October, 2024; originally announced October 2024.

  3. arXiv:2410.14770  [pdf, other]

    cs.CV cs.GR

    A Survey on Computational Solutions for Reconstructing Complete Objects by Reassembling Their Fractured Parts

    Authors: Jiaxin Lu, Yongqing Liang, Huijun Han, Jiacheng Hua, Junfeng Jiang, Xin Li, Qixing Huang

    Abstract: Reconstructing a complete object from its parts is a fundamental problem in many scientific domains. The purpose of this article is to provide a systematic survey on this topic. The reassembly problem requires understanding the attributes of individual pieces and establishing matches between different pieces. Many approaches also model priors of the underlying complete object. Existing approaches…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: 36 pages, 22 figures

  4. arXiv:2410.13854  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY

    Can MLLMs Understand the Deep Implication Behind Chinese Images?

    Authors: Chenhao Zhang, Xi Feng, Yuelin Bai, Xinrun Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni

    Abstract: As the capabilities of Multimodal Large Language Models (MLLMs) continue to improve, the need for higher-order capability evaluation of MLLMs is increasing. However, there is a lack of work evaluating MLLM for higher-order perception and understanding of Chinese visual content. To fill the gap, we introduce the **C**hinese **I**mage **I**mplication understanding **Bench**mark, **CII-Bench**, which…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 32 pages, 18 figures. Project Page: https://cii-bench.github.io/ Code: https://github.com/MING_X/CII-Bench Dataset: https://huggingface.co/datasets/m-a-p/CII-Bench

  5. arXiv:2410.13761  [pdf, other]

    cs.LG

    GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning

    Authors: Guibin Zhang, Haonan Dong, Yuchen Zhang, Zhixun Li, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang

    Abstract: Training high-quality deep models necessitates vast amounts of data, resulting in overwhelming computational and memory demands. Recently, data pruning, distillation, and coreset selection have been developed to streamline data volume by retaining, synthesizing, or selecting a small yet informative subset from the full set. Among these methods, data pruning incurs the least additional training cos…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: NeurIPS 2024

  6. arXiv:2410.13746  [pdf, other]

    cs.LG stat.ML

    Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers

    Authors: Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff

    Abstract: The denoising diffusion model has recently emerged as a powerful generative technique, capable of transforming noise into meaningful data. While theoretical convergence guarantees for diffusion models are well established when the target distribution aligns with the training distribution, practical scenarios often present mismatches. One common case is in zero-shot conditional diffusion sampling,…

    Submitted 17 October, 2024; originally announced October 2024.

  7. arXiv:2410.13674  [pdf, other]

    cs.CV cs.AI

    Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion

    Authors: Yijun Liang, Shweta Bhardwaj, Tianyi Zhou

    Abstract: Low-quality or scarce data has posed significant challenges for training deep neural networks in practice. While classical data augmentation cannot contribute very different new data, diffusion models open up a new door to build self-evolving AI by generating high-quality and diverse synthetic data through text-guided prompts. However, text-only guidance cannot control synthetic images' proximity…

    Submitted 17 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 23 pages, including references and appendix. Code is available at http://github.com/tianyi-lab/DisCL

  8. arXiv:2410.13607  [pdf, other]

    cs.CV

    DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering

    Authors: Jiahao Lu, Jiacheng Deng, Ruijie Zhu, Yanzhe Liang, Wenfei Yang, Tianzhu Zhang, Xu Zhou

    Abstract: Dynamic scene rendering is an intriguing yet challenging problem. Although current methods based on NeRF have achieved satisfactory performance, they still cannot reach real-time levels. Recently, 3D Gaussian Splatting (3DGS) has garnered researchers' attention due to its outstanding rendering quality and real-time speed. Therefore, a new paradigm has been proposed: defining a canonical 3D gau…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS 2024

  9. arXiv:2410.13515  [pdf, other]

    hep-ex hep-lat hep-ph nucl-ex

    Observation of a rare beta decay of the charmed baryon with a Graph Neural Network

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (637 additional authors not shown)

    Abstract: The study of beta decay of the charmed baryon provides unique insights into the fundamental mechanism of the strong and electro-weak interactions. The $Λ_c^+$, being the lightest charmed baryon, undergoes disintegration solely through the charm quark weak decay. Its beta decay provides an ideal laboratory for investigating non-perturbative effects in quantum chromodynamics and for constraining the…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 28 pages, 6 figures

  10. arXiv:2410.13478  [pdf, other]

    hep-ex

    Observation of $χ_{c0}\toΣ^{+}\barΣ^{-}η$ and evidence for $χ_{c1,2}\toΣ^{+}\barΣ^{-}η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, the decay $χ_{c0}\toΣ^{+}\barΣ^{-}η$ is observed for the first time with a statistical significance of $7.0σ$, and evidence for $χ_{c1}\toΣ^{+}\barΣ^{-}η$ and $χ_{c2}\toΣ^{+}\barΣ^{-}η$ is found with statistical significances of $4.3σ$ and $4.6σ$, respectively. The branching fractions are determined to be…

    Submitted 17 October, 2024; originally announced October 2024.

  11. arXiv:2410.13368  [pdf, other]

    hep-ex hep-ph

    Observation of the Singly Cabibbo-Suppressed Decay $Λ_c^{+}\to pπ^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Utilizing 4.5${~\rm{fb}}^{-1}$ of $e^+e^-$ annihilation data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 4.600 and 4.699 GeV, the first observation of the singly Cabibbo-suppressed decay $Λ_c^{+}\to pπ^0$ is presented, with a statistical significance of $5.4σ$. The ratio of the branching fractions of $Λ_c^{+}\to pπ^0$ and $Λ_c^{+}\to pη$ is measured…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 9 pages, 4 figures

  12. arXiv:2410.13177  [pdf, ps, other]

    astro-ph.SR astro-ph.GA

    Chemical abundances of 20 barium stars from the OHP spectra

    Authors: Guochao Yang, Jingkun Zhao, Yanchun Liang, Monique Spite, Francois Spite, Jianrong Shi, Shuai Liu, Nian Liu, Wenyuan Cui, Gang Zhao

    Abstract: Based on the high resolution and high signal-to-noise spectra, we derived the chemical abundances of 20 elements for 20 barium (Ba-) stars. For the first time, the detailed abundances of four sample stars, namely HD 92482, HD 150430, HD 151101 and HD 177304 have been analyzed. Additionally, Ba element abundance has been measured using high resolution spectra for the first time in six of the other…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 16 pages, 12 figures, accepted for publication in MNRAS

  13. arXiv:2410.12620  [pdf, other]

    hep-ex

    Search for $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ at center-of-mass energies from 4.47 to 4.95 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

    Abstract: Utilizing a data set of $6.7$ fb$^{-1}$ from electron-positron collisions recorded by the BESIII detector at the BEPCII storage ring, a search is conducted for the processes $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ across center-of-mass energies from 4.47 to 4.95 GeV. In the absence of any significant signals, upper limits are set. These include limits on the Born cross sections for…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 14 pages, 6 figures

  14. arXiv:2410.12593  [pdf, other]

    cs.LG cs.AI

    Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting

    Authors: Wei Chen, Yuxuan Liang

    Abstract: The widespread deployment of sensing devices leads to a surge in data for spatio-temporal forecasting applications such as traffic flow, air quality, and wind energy. Although spatio-temporal graph neural networks have achieved success in modeling various static spatio-temporal forecasting scenarios, real-world spatio-temporal data are typically received in a streaming manner, and the network cont…

    Submitted 16 October, 2024; originally announced October 2024.

  15. arXiv:2410.12360  [pdf, other]

    cs.LG cs.AI

    Towards Neural Scaling Laws for Time Series Foundation Models

    Authors: Qingren Yao, Chao-Han Huck Yang, Renhe Jiang, Yuxuan Liang, Ming Jin, Shirui Pan

    Abstract: Scaling laws offer valuable insights into the design of time series foundation models (TSFMs). However, previous research has largely focused on the scaling laws of TSFMs for in-distribution (ID) data, leaving their out-of-distribution (OOD) scaling behavior and the influence of model architectures less explored. In this work, we examine two common TSFM architectures, encoder-only and decoder-only…

    Submitted 16 October, 2024; originally announced October 2024.

  16. arXiv:2410.12259  [pdf]

    cs.CV cs.LG

    Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm

    Authors: Guanming Huang, Aoran Shen, Yuxiang Hu, Junliang Du, Jiacheng Hu, Yingbin Liang

    Abstract: This paper explores the application of knowledge distillation technology in target detection tasks, especially the impact of different distillation temperatures on the performance of student models. By using YOLOv5l as the teacher network and a smaller YOLOv5s as the student network, we found that with the increase of distillation temperature, the student's detection accuracy gradually improved, a…

    Submitted 16 October, 2024; originally announced October 2024.

  17. arXiv:2410.12198  [pdf, other]

    astro-ph.GA

    The Physical Origin of Extreme Emission Line Galaxies at High redshifts: Strong {\sc [Oiii]} Emission Lines Produced by Obscured AGNs

    Authors: Chenghao Zhu, Yuichi Harikane, Masami Ouchi, Yoshiaki Ono, Masato Onodera, Shenli Tang, Yuki Isobe, Yoshiki Matsuoka, Toshihiro Kawaguchi, Hiroya Umeda, Kimihiko Nakajima, Yongming Liang, Yi Xu, Yechi Zhang, Dongsheng Sun, Kazuhiro Shimasaku, Jenny Greene, Kazushi Iwasawa, Kotaro Kohno, Tohru Nagao, Andreas Schulze, Takatoshi Shibuya, Miftahul Hilmi, Malte Schramm

    Abstract: We present deep Subaru/FOCAS spectra for two extreme emission line galaxies (EELGs) at $z\sim 1$ with strong {\sc[Oiii]}$λ$5007 emission lines, exhibiting equivalent widths (EWs) of $2905^{+946}_{-578}$ Å and $2000^{+188}_{-159}$ Å, comparable to those of EELGs at high redshifts that are now routinely identified with JWST spectroscopy. Adding a similarly large {\sc [Oiii]} EW (…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: submitted to ApJ

  18. arXiv:2410.11720  [pdf, other]

    cs.DC cs.LG

    Light-Weight Fault Tolerant Attention for Large Language Model Training

    Authors: Yuhang Liang, Xinyi Li, Jie Ren, Ang Li, Bo Fang, Jieyang Chen

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance in various natural language processing tasks. However, the training of these models is computationally intensive and susceptible to faults, particularly in the attention mechanism, which is a critical component of transformer-based LLMs. In this paper, we investigate the impact of faults on LLM training, focusing on INF, NaN, an…

    Submitted 16 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    ACM Class: C.1.4; B.2.3; I.2.7

  19. arXiv:2410.11607  [pdf, other]

    hep-ex

    Observation of $χ_{cJ}\to p \bar p K^0_S K^- π^+ + c.c.$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (648 additional authors not shown)

    Abstract: By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decays of $χ_{cJ} \to p \bar{p} K^0_S K^- π^+ +c.c.(J=0, 1, 2)$ are observed for the first time with statistical significances greater than $10σ$. The branching fractions of these decays are determined to be…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: 12 pages, 5 figures

  20. arXiv:2410.11279  [pdf, other]

    cs.LG cs.AI math.NA

    Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study

    Authors: Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

    Abstract: Recent empirical studies have identified fixed point iteration phenomena in deep neural networks, where the hidden state tends to stabilize after several layers, showing minimal change in subsequent layers. This observation has spurred the development of practical methodologies, such as accelerating inference by bypassing certain layers once the hidden state stabilizes, selectively fine-tuning lay…

    Submitted 15 October, 2024; originally announced October 2024.

  21. arXiv:2410.11268  [pdf, other]

    cs.LG cs.AI

    Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

    Authors: Bo Chen, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

    Abstract: In-context learning has been recognized as a key factor in the success of Large Language Models (LLMs). It refers to the model's ability to learn patterns on the fly from provided in-context examples in the prompt during inference. Previous studies have demonstrated that the Transformer architecture used in LLMs can implement a single-step gradient descent update by processing in-context examples…

    Submitted 15 October, 2024; originally announced October 2024.

  22. arXiv:2410.11261  [pdf, other]

    cs.LG cs.AI cs.CL

    Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix

    Authors: Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants. However, their growing capabilities come at the cost of extremely large model sizes, making deployment on edge devices challenging due to memory and computational constraints. This paper introduces a novel approach to LLM weight pruning that…

    Submitted 15 October, 2024; originally announced October 2024.

  23. arXiv:2410.10469  [pdf, other]

    cs.LG stat.ML

    Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts

    Authors: Xu Liu, Juncheng Liu, Gerald Woo, Taha Aksu, Yuxuan Liang, Roger Zimmermann, Chenghao Liu, Silvio Savarese, Caiming Xiong, Doyen Sahoo

    Abstract: Time series foundation models have demonstrated impressive performance as zero-shot forecasters. However, achieving effectively unified training on time series remains an open challenge. Existing approaches introduce some level of model specialization to account for the highly heterogeneous nature of time series data. For instance, Moirai pursues unified training by employing multiple input/output…

    Submitted 14 October, 2024; originally announced October 2024.

  24. arXiv:2410.10208  [pdf, other]

    quant-ph

    Floquet Engineering of Anisotropic Transverse Interactions in Superconducting Qubits

    Authors: Yongqi Liang, Wenhui Huang, Libo Zhang, Ziyu Tao, Kai Tang, Ji Chu, Jiawei Qiu, Xuandong Sun, Yuxuan Zhou, Jiawei Zhang, Jiajian Zhang, Weijie Guo, Yang Liu, Yuanzhen Chen, Song Liu, Youpeng Zhong, Jingjing Niu, Dapeng Yu

    Abstract: Superconducting transmon qubits have been established as a leading candidate for quantum computation, as well as a flexible platform for exploring exotic quantum phases and dynamics. However, physical coupling naturally yields isotropic transverse interactions between qubits, restricting their access to diverse quantum phases that require spatially dependent interactions. Here, we demonstrate the simul…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 7+14 pages; 4+12 figures

  25. arXiv:2410.10165  [pdf, other]

    cs.LG cs.AI cs.CL

    HSR-Enhanced Sparse Attention Acceleration

    Authors: Bo Chen, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various applications, but their performance on long-context tasks is often limited by the computational complexity of attention mechanisms. This paper introduces a novel approach to accelerate attention computation in LLMs, particularly for long-context scenarios. We leverage the inherent sparsity within attention mechan…

    Submitted 14 October, 2024; originally announced October 2024.

  26. arXiv:2410.10016  [pdf, ps, other]

    math.AP math-ph

    Stability for inverse random source problems of the polyharmonic wave equation

    Authors: Peijun Li, Zhenqian Li, Ying Liang

    Abstract: This paper investigates stability estimates for inverse source problems in the stochastic polyharmonic wave equation, where the source is represented by white noise. The study examines the well-posedness of the direct problem and derives stability estimates for identifying the strength of the random source. Assuming a priori information of the regularity and support of the source strength, the Höl…

    Submitted 13 October, 2024; originally announced October 2024.

    MSC Class: 35R30; 35R60

  27. arXiv:2410.09714  [pdf, other]

    cs.CV cs.LG

    AM-SAM: Automated Prompting and Mask Calibration for Segment Anything Model

    Authors: Yuchen Li, Li Zhang, Youwei Liang, Pengtao Xie

    Abstract: Segment Anything Model (SAM) has gained significant recognition in the field of semantic segmentation due to its versatile capabilities and impressive performance. Despite its success, SAM faces two primary limitations: (1) it relies heavily on meticulous human-provided prompts like key points, bounding boxes or text messages, which is labor-intensive; (2) the mask decoder's feature representation…

    Submitted 12 October, 2024; originally announced October 2024.

  28. arXiv:2410.09605  [pdf, other]

    cs.LG cs.CL

    Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis

    Authors: Hongru Yang, Bhavya Kailkhura, Zhangyang Wang, Yingbin Liang

    Abstract: Understanding the training dynamics of transformers is important to explain the impressive capabilities behind large language models. In this work, we study the dynamics of training a shallow transformer on a task of recognizing co-occurrence of two designated words. In the literature of studying training dynamics of transformers, several simplifications are commonly adopted such as weight reparam…

    Submitted 12 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS 2024

  29. arXiv:2410.09503  [pdf, other]

    eess.AS cs.SD

    SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs

    Authors: Wenxi Chen, Ziyang Ma, Xiquan Li, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Kai Yu, Xie Chen

    Abstract: Automated Audio Captioning (AAC) aims to generate natural textual descriptions for input audio signals. Recent progress in audio pre-trained models and large language models (LLMs) has significantly enhanced audio understanding and textual reasoning capabilities, making improvements in AAC possible. In this paper, we propose SLAM-AAC to further enhance AAC with paraphrasing augmentation and CLAP-R…

    Submitted 12 October, 2024; originally announced October 2024.

  30. arXiv:2410.09472  [pdf, other]

    cs.SD cs.AI eess.AS

    DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning

    Authors: Xiquan Li, Wenxi Chen, Ziyang Ma, Xuenan Xu, Yuzhe Liang, Zhisheng Zheng, Qiuqiang Kong, Xie Chen

    Abstract: While automated audio captioning (AAC) has made notable progress, traditional fully supervised AAC models still face two critical challenges: the need for expensive audio-text pair data for training and performance degradation when transferring across domains. To overcome these limitations, we present DRCap, a data-efficient and flexible zero-shot audio captioning system that requires text-only da…

    Submitted 12 October, 2024; originally announced October 2024.

  31. arXiv:2410.09397  [pdf, other]

    cs.LG cs.AI cs.CC cs.CL

    Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes

    Authors: Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in processing long-context information. However, the quadratic complexity of attention computation with respect to sequence length poses significant computational challenges, and I/O aware algorithms have been proposed. This paper presents a comprehensive analysis of the I/O complexity for attention mechanisms, focusing on back…

    Submitted 12 October, 2024; originally announced October 2024.

  32. arXiv:2410.09375  [pdf, ps, other]

    cs.LG cs.AI cs.CC

    Looped ReLU MLPs May Be All You Need as Practical Programmable Computers

    Authors: Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Previous work has demonstrated that attention mechanisms are Turing complete. More recently, it has been shown that a looped 13-layer Transformer can function as a universal programmable computer. In contrast, the multi-layer perceptron with $\mathsf{ReLU}$ activation ($\mathsf{ReLU}$-$\mathsf{MLP}$), one of the most fundamental components of neural networks, is known to be expressive; specifical…

    Submitted 12 October, 2024; originally announced October 2024.

  33. arXiv:2410.09113  [pdf, other]

    cs.AR cs.AI

    M$^2$-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization

    Authors: Yanbiao Liang, Huihong Shi, Zhongfeng Wang

    Abstract: Although Vision Transformers (ViTs) have achieved significant success, their intensive computations and substantial memory overheads challenge their deployment on edge devices. To address this, efficient ViTs have emerged, typically featuring Convolution-Transformer hybrid architectures to enhance both accuracy and hardware efficiency. While prior work has explored quantization for efficient ViTs…

    Submitted 10 October, 2024; originally announced October 2024.

  34. arXiv:2410.08898  [pdf, other]

    cs.LG

    Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization

    Authors: Yang Chen, Yitao Liang, Zhouchen Lin

    Abstract: Low-Dimension-to-High-Dimension (LDHD) generalization is a special case of Out-of-Distribution (OOD) generalization, where the training data are restricted to a low-dimensional subspace of the high-dimensional testing space. Assuming that each instance is generated from a latent variable and the dimension of the latent variable reflects the problem scale, the inherent scaling challenge in length g…

    Submitted 11 October, 2024; originally announced October 2024.

  35. arXiv:2410.08603  [pdf, other]

    hep-ex

    Observation of $D^+\toη^\primeμ^+ν_μ$ and First Study of $D^+\to η^\prime \ell^+ν_\ell$ Decay Dynamics

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $20.3\,\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy 3.773\,GeV with the BESIII detector, we report the first observation of the semileptonic decay $D^+\to η^\prime μ^+ν_μ$ with significance of $8.6σ$ including systematic uncertainties, and an improved measurement of $D^+\to η^\prime e^+ν_e$. The branching fractions of $D^+\to η^\prime μ^+ν_μ$ and…

    Submitted 11 October, 2024; originally announced October 2024.

  36. arXiv:2410.08126  [pdf, other]

    cs.LG cs.AI cs.CL

    Mars: Situated Inductive Reasoning in an Open-World Environment

    Authors: Xiaojuan Tang, Jiaqi Li, Yitao Liang, Song-chun Zhu, Muhan Zhang, Zilong Zheng

    Abstract: Large Language Models (LLMs) trained on massive corpora have shown remarkable success in knowledge-intensive tasks. Yet, most of them rely on pre-stored knowledge. Inducing new general knowledge from a specific environment and performing reasoning with the acquired knowledge -- \textit{situated inductive reasoning}, is crucial and challenging for machine intelligence. In this paper, we design Mars…

    Submitted 10 October, 2024; originally announced October 2024.

  37. arXiv:2410.07961  [pdf, other]

    quant-ph cs.DS cs.LG

    QCircuitNet: A Large-Scale Hierarchical Dataset for Quantum Algorithm Design

    Authors: Rui Yang, Yuntian Gu, Ziruo Wang, Yitao Liang, Tongyang Li

    Abstract: Quantum computing is an emerging field recognized for the significant speedup it offers over classical computing through quantum algorithms. However, designing and implementing quantum algorithms pose challenges due to the complex nature of quantum mechanics and the necessity for precise control over quantum states. Despite the significant advancements in AI, there has been a lack of datasets spec…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 35 pages, 7 figures, 4 tables, GitHub repository: https://github.com/EstelYang/QCircuitNet_Dataset

  38. arXiv:2410.07938  [pdf, ps, other]

    math.AP

    Stability estimates of inverse random source problems for the wave equations by using correlation-based data

    Authors: Peijun Li, Ying Liang, Xu Wang

    Abstract: This paper focuses on stability estimates of the inverse random source problems for the polyharmonic, electromagnetic, and elastic wave equations. The source is represented as a microlocally isotropic Gaussian random field, which is defined by its covariance operator in the form of a classical pseudo-differential operator. The inverse problem is to determine the strength function of the principal…

    Submitted 10 October, 2024; originally announced October 2024.

    MSC Class: 35R30; 35R60

  39. arXiv:2410.07707  [pdf, other]

    cs.CV cs.GR cs.LG

    MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting

    Authors: Ruijie Zhu, Yanzhe Liang, Hanzhi Chang, Jiacheng Deng, Jiahao Lu, Wenfei Yang, Tianzhu Zhang, Yongdong Zhang

    Abstract: Dynamic scene reconstruction is a long-term challenge in the field of 3D vision. Recently, the emergence of 3D Gaussian Splatting has provided new insights into this problem. Although subsequent efforts rapidly extend static 3D Gaussian to dynamic scenes, they often lack explicit constraints on object motion, leading to optimization difficulties and performance degradation. To address the above is…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS 2024. 21 pages, 14 figures, 7 tables

  40. arXiv:2410.07626  [pdf, other]

    hep-ex

    Precision Measurement of the Branching Fraction of $D^{+}\to μ^{+}ν_μ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $20.3~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at a center-of-mass energy of $E_{\rm cm}=3.773$ GeV with the BESIII detector operating at the BEPCII collider, we determine the branching fraction of the leptonic decay $D^+\toμ^+ν_μ$ to be $(3.981\pm0.079_{\rm stat}\pm0.040_{\rm syst})\times10^{-4}$. Interpreting our measurement with knowledge of the Fermi coupling constant…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 9 pages, 2 figures

  41. arXiv:2410.07517  [pdf, other]

    physics.med-ph

    A 3D-Printed Table for Hybrid X-ray CT and Optical Imaging of a Live Mouse

    Authors: Wenxuan Xue, Yuxuan Liang, Mengzhou Li, Shan Gao, Xavier R. Intes, Ge Wang

    Abstract: Multimodal imaging has shown great potential in cancer research by concurrently providing anatomical, functional, and molecular information in live, intact animals. During preclinical imaging of small animals like mice, anesthesia is required to prevent movement and improve image quality. However, their high surface area-to-body weight ratio predisposes mice, particularly nude mice, to hypothermia…

    Submitted 9 October, 2024; originally announced October 2024.

  42. arXiv:2410.06794  [pdf, ps, other]

    math.CA

    A signal recovery guarantee with Restricted Isometry Property and Null Space Property for weighted $\ell_1$ minimization

    Authors: Xiaotong Liu, Yiyu Liang

    Abstract: Signal reconstruction is a crucial aspect of compressive sensing. In weighted cases, there are two common types of weights. In order to establish a unified framework for handling various types of weights, the sparse function is introduced. By employing this sparse function, a generalized form of the weighted null space property is developed, which is sufficient and necessary to exact recovery thro…

    Submitted 9 October, 2024; originally announced October 2024.

    MSC Class: 15A12; 94A12; 47A52

  43. arXiv:2410.06651  [pdf, other]

    cs.LG cs.AI

    Toward Physics-guided Time Series Embedding

    Authors: Jiaxi Hu, Bowen Zhang, Qingsong Wen, Fugee Tsung, Yuxuan Liang

    Abstract: In various scientific and engineering fields, the primary research areas have revolved around physics-based dynamical systems modeling and data-driven time series analysis. According to the embedding theory, dynamical systems and time series can be mutually transformed using observation functions and physical reconstruction techniques. Based on this, we propose Embedding Duality Theory, where the…

    Submitted 9 October, 2024; originally announced October 2024.

  44. arXiv:2410.06610  [pdf, other]

    quant-ph

    Experimental single-copy distillation of quantumness from higher-dimensional entanglement

    Authors: Xiao-Xu Fang, Gelo Noel M. Tabia, Kai-Siang Chen, Yeong-Cherng Liang, He Lu

    Abstract: Entanglement is at the heart of quantum theory and is responsible for various quantum-enabling technologies. In practice, during its preparation, storage, and distribution to the intended recipients, this valuable quantum resource may suffer from noisy interactions that reduce its usefulness for the desired information-processing tasks. Conventional schemes of entanglement distillation aim to alle…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: 16 pages, 15 figures

  45. arXiv:2410.06593  [pdf, other]

    cs.CV

    Towards Natural Image Matting in the Wild via Real-Scenario Prior

    Authors: Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Qianru Sun, Yang Tang, Bo Li, Pan Zhou

    Abstract: Recent approaches attempt to adapt powerful interactive segmentation models, such as SAM, to interactive matting and fine-tune the models based on synthetic matting datasets. However, models trained on synthetic data fail to generalize to complex and occlusion scenes. We address this challenge by proposing a new matting dataset based on the COCO dataset, namely COCO-Matting. Specifically, the cons…

    Submitted 9 October, 2024; originally announced October 2024.

  46. arXiv:2410.06500  [pdf, other]

    hep-ex

    Search for the radiative decays $D^+\toγρ^+$ and $D^+\toγK^{*+}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (648 additional authors not shown)

    Abstract: We search for the radiative decays $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ using 20.3~fb$^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$ GeV by the BESIII detector operating at the BEPCII collider. No significant signals are observed, and the upper limits on the branching fractions of $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ at 90\% confidence level ar…

    Submitted 8 October, 2024; originally announced October 2024.

  47. arXiv:2410.06203  [pdf, other]

    cs.CL cs.AI

    Integrating Planning into Single-Turn Long-Form Text Generation

    Authors: Yi Liang, You Wu, Honglei Zhuang, Li Chen, Jiaming Shen, Yiling Jia, Zhen Qin, Sumit Sanghai, Xuanhui Wang, Carl Yang, Michael Bendersky

    Abstract: Generating high-quality, in-depth textual documents, such as academic papers, news articles, Wikipedia entries, and books, remains a significant challenge for Large Language Models (LLMs). In this paper, we propose to use planning to generate long form content. To achieve our goal, we generate intermediate steps via an auxiliary task that teaches the LLM to plan, reason and structure before genera…

    Submitted 8 October, 2024; originally announced October 2024.

  48. arXiv:2410.06101  [pdf, other]

    cs.AI cs.MA

    Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

    Authors: Hao Ma, Tianyi Hu, Zhiqiang Pu, Boyin Liu, Xiaolin Ai, Yanyan Liang, Min Chen

    Abstract: Reinforcement learning (RL) has emerged as a pivotal technique for fine-tuning large language models (LLMs) on specific tasks. However, prevailing RL fine-tuning methods predominantly rely on PPO and its variants. Though these algorithms are effective in general RL settings, they often exhibit suboptimal performance and vulnerability to distribution collapse when applied to the fine-tuning of LLMs…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 28 pages, 26 images

  49. arXiv:2410.05939  [pdf, other]

    cs.IR

    RLRF4Rec: Reinforcement Learning from Recsys Feedback for Enhanced Recommendation Reranking

    Authors: Chao Sun, Yaobo Liang, Yaming Yang, Shilin Xu, Tianmeng Yang, Yunhai Tong

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across diverse domains, prompting researchers to explore their potential for use in recommendation systems. Initial attempts have leveraged the exceptional capabilities of LLMs, such as rich knowledge and strong generalization through In-context Learning, which involves phrasing the recommendation task as prompts. Nevertheless,…

    Submitted 8 October, 2024; originally announced October 2024.

  50. arXiv:2410.05736  [pdf, ps, other]

    hep-ex

    Observation of an axial-vector state in the study of $ψ(3686) \to φηη'$ decay

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (625 additional authors not shown)

    Abstract: Using (2712.4 $\pm$ 14.3)$\times 10^{6}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, a partial wave analysis of the decay $ψ(3686) \to φηη' $ is performed with the covariant tensor approach. An axial-vector state with a mass near 2.3 $\rm GeV/c^2$ is observed for the first time. Its mass and width are measured to be 2316…

    Submitted 8 October, 2024; originally announced October 2024.