IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems

Authors: Yihuan Mao, Yipeng Kang, Peilun Li, Ning Zhang, Wei Xu, Chongjie Zhang

Abstract: As large language model (LLM) agents increasingly integrate into our infrastructure, their robust coordination and message synchronization become vital. The Byzantine Generals Problem (BGP) is a critical model for constructing resilient multi-agent systems (MAS) under adversarial attacks. It describes a scenario where malicious agents with unknown identities exist in the system, situations that, in our context, could result from LLM agents' hallucinations or external attacks. In BGP, the objective of the entire system is to reach a consensus on the action to be taken. Traditional BGP requires global consensus among all agents; however, in practical scenarios, global consensus is not always necessary and can even be inefficient. Therefore, there is a pressing need to explore a refined version of BGP that aligns with the local coordination patterns observed in MAS. We refer to this refined version as Imperfect BGP (IBGP) in our research, aiming to address this discrepancy. To tackle this issue, we propose a framework that leverages consensus protocols within general MAS settings, providing provable resilience against communication attacks and adaptability to changing environments, as validated by empirical results. Additionally, we present a case study in a sensor network environment to illustrate the practical application of our protocol.

Submitted 21 October, 2024; originally announced October 2024.

arXiv:2410.15768 [pdf, other]

Learning to Synthesize Graphics Programs for Geometric Artworks

Authors: Qi Bing, Chaoyi Zhang, Weidong Cai

Abstract: Creating and understanding art has long been a hallmark of human ability. When presented with finished digital artwork, professional graphic artists can intuitively deconstruct and replicate it using various drawing tools, such as the line tool, paint bucket, and layer features, including opacity and blending modes. While most recent research in this field has focused on art generation, proposing a range of methods, these often rely on the concept of artwork being represented as a final image. To bridge the gap between pixel-level results and the actual drawing process, we present an approach that treats a set of drawing tools as executable programs. This method predicts a sequence of steps to achieve the final image, allowing for understandable and resolution-independent reproductions using a set of drawing commands. Our experiments demonstrate that our program synthesizer, Art2Prog, can comprehensively understand complex input images and reproduce them using high-quality executable programs. The experimental results evidence the potential of machines to grasp higher-level information from images and generate compact program-level descriptions.

Submitted 21 October, 2024; originally announced October 2024.

Comments: ICPR 2024

arXiv:2410.15760 [pdf, other]

DeepIcon: A Hierarchical Network for Layer-wise Icon Vectorization

Authors: Qi Bing, Chaoyi Zhang, Weidong Cai

Abstract: In contrast to the well-established technique of rasterization, vectorization of images poses a significant challenge in the field of computer graphics. Recent learning-based methods for converting raster images to vector formats frequently suffer from incomplete shapes, redundant path prediction, and a lack of accuracy in preserving the semantics of the original content. These shortcomings severely hinder the utility of these methods for further editing and manipulation of images. To address these challenges, we present DeepIcon, a novel hierarchical image vectorization network specifically tailored for generating variable-length icon vector graphics based on the raster image input. Our experimental results indicate that DeepIcon can efficiently produce Scalable Vector Graphics (SVGs) directly from raster images, bypassing the need for a differentiable rasterizer while also demonstrating a profound understanding of the image contents.

Submitted 21 October, 2024; originally announced October 2024.

Comments: Accepted as Oral Presentation at DICTA 2024

arXiv:2410.15526 [pdf, other]

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

Authors: Jinda Jia, Cong Xie, Hanlin Lu, Daoce Wang, Hao Feng, Chengming Zhang, Baixi Sun, Haibin Lin, Zhi Zhang, Xin Liu, Dingwen Tao

Abstract: Recent years have witnessed a clear trend towards language models with an ever-increasing number of parameters, as well as the growing training overhead and memory usage. Distributed training, particularly through Sharded Data Parallelism (ShardedDP) which partitions optimizer states among workers, has emerged as a crucial technique to mitigate training time and memory usage. Yet, a major challenge in the scalability of ShardedDP is the intensive communication of weights and gradients. While compression techniques can alleviate this issue, they often result in worse accuracy. Driven by this limitation, we propose SDP4Bit (Toward 4Bit Communication Quantization in Sharded Data Parallelism for LLM Training), which effectively reduces the communication of weights and gradients to nearly 4 bits via two novel techniques: quantization on weight differences, and two-level gradient smooth quantization. Furthermore, SDP4Bit presents an algorithm-system co-design with runtime optimization to minimize the computation overhead of compression. In addition to the theoretical guarantees of convergence, we empirically evaluate the accuracy of SDP4Bit on the pre-training of GPT models with up to 6.7 billion parameters, and the results demonstrate a negligible impact on training loss. Furthermore, speed experiments show that SDP4Bit achieves up to 4.08$\times$ speedup in end-to-end throughput on a scale of 128 GPUs.

Submitted 20 October, 2024; originally announced October 2024.

Comments: Accepted by NeurIPS 2024

arXiv:2410.15336 [pdf, other]

Diffusion-PINN Sampler

Authors: Zhekun Shi, Longlin Yu, Tianyu Xie, Cheng Zhang

Abstract: Recent success of diffusion models has inspired a surge of interest in developing sampling techniques using reverse diffusion processes. However, accurately estimating the drift term in the reverse stochastic differential equation (SDE) solely from the unnormalized target density poses significant challenges, hindering existing methods from achieving state-of-the-art performance. In this paper, we introduce the Diffusion-PINN Sampler (DPS), a novel diffusion-based sampling algorithm that estimates the drift term by solving the governing partial differential equation of the log-density of the underlying SDE marginals via physics-informed neural networks (PINN). We prove that the error of log-density approximation can be controlled by the PINN residual loss, enabling us to establish convergence guarantees of DPS. Experiments on a variety of sampling tasks demonstrate the effectiveness of our approach, particularly in accurately identifying mixing proportions when the target contains isolated components.

Submitted 20 October, 2024; originally announced October 2024.

Comments: 33 pages, 7 figures

arXiv:2410.15099 [pdf]

A new approach to N-doped di-molybdenum carbide with enhanced superconductivity via Urea

Authors: Longfu Li, Lei Shi, Lingyong Zeng, Kuan Li, Peifeng Yu, Kangwang Wang, Chao Zhang, Rui Chen, Zaichen Xiang, Yunwei Zhang, Huixia Luo

Abstract: Chemical doping is a critical factor in the development of new superconductors or in optimizing the superconducting transition temperature (Tc) of the parent superconducting materials. Herein, a new simple urea approach is developed to synthesize the N-doped α-Mo2C. Benefiting from the simple urea method, a broad superconducting dome is found in the Mo2C1-xNx compositions. XRD results show that the structure of α-Mo2C remains unchanged and that there is a variation of lattice parameters with nitrogen doping. Resistivity, magnetic susceptibility, and heat capacity measurement results confirm that the superconducting transition temperature (Tc) was strongly increased from 2.68 K (x = 0) to 7.05 K (x = 0.49). First-principles calculations and our analysis indicate that increasing nitrogen doping leads to a rise in the density of states at the Fermi level and doping-induced phonon softening, which enhances electron-phonon coupling. This results in an increase in Tc and a sharp rise in the upper critical field. Our findings provide a promising strategy for fabricating transition metal carbonitrides and provide a material platform for further study of the superconductivity of transition metal carbides.

Submitted 19 October, 2024; originally announced October 2024.

Comments: 15 pages, 6 Figures, 1 Table

Journal ref: Chin. Phys. Lett. 2024

arXiv:2410.14309 [pdf, other]

LoGU: Long-form Generation with Uncertainty Expressions

Authors: Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang

Abstract: While Large Language Models (LLMs) demonstrate impressive capabilities, they still struggle with generating factually incorrect content (i.e., hallucinations). A promising approach to mitigate this issue is enabling models to express uncertainty when unsure. Previous research on uncertainty modeling has primarily focused on short-form QA, but real-world applications often require much longer responses. In this work, we introduce the task of Long-form Generation with Uncertainty (LoGU). We identify two key challenges: Uncertainty Suppression, where models hesitate to express uncertainty, and Uncertainty Misalignment, where models convey uncertainty inaccurately. To tackle these challenges, we propose a refinement-based data collection framework and a two-stage training pipeline. Our framework adopts a divide-and-conquer strategy, refining uncertainty based on atomic claims. The collected data are then used in training through supervised fine-tuning (SFT) and direct preference optimization (DPO) to enhance uncertainty expression. Extensive experiments on three long-form instruction-following datasets show that our method significantly improves accuracy, reduces hallucinations, and maintains the comprehensiveness of responses.

Submitted 18 October, 2024; originally announced October 2024.

arXiv:2410.14268 [pdf, other]

MoDification: Mixture of Depths Made Easy

Authors: Chen Zhang, Meizhi Zhong, Qimeng Wang, Xuantao Lu, Zheyu Ye, Chengqiang Lu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang, Dawei Song

Abstract: Long-context efficiency has recently become a trending topic in serving large language models (LLMs). Mixture of depths (MoD) has been proposed as a perfect fit to bring down both latency and memory. In this paper, however, we discover that MoD can barely transform existing LLMs without costly training over an extensive number of tokens. To enable the transformation of any LLM into an MoD one, we show that the top-k operator in MoD should be promoted to a threshold-p operator, and that refinements to architecture and data should be crafted alongside. All these designs form our method termed MoDification. Through a comprehensive set of experiments covering model scales from 3B to 70B, we show that MoDification strikes an excellent balance between efficiency and effectiveness. MoDification can achieve up to ~1.2x speedup in latency and ~1.8x reduction in memory compared to original LLMs, especially in long-context applications.

Submitted 18 October, 2024; originally announced October 2024.

Comments: 12 pages, 9 figures, 5 tables, work in progress

arXiv:2410.14215 [pdf, other]

Jamming Detection and Channel Estimation for Spatially Correlated Beamspace Massive MIMO

Authors: Pengguang Du, Cheng Zhang, Yindi Jing, Chao Fang, Zhilei Zhang, Yongming Huang

Abstract: In this paper, we investigate the problem of jamming detection and channel estimation during multi-user uplink beam training under random pilot jamming attacks in beamspace massive multi-input-multi-output (MIMO) systems. For jamming detection, we distinguish the signals from the jammer and the user by projecting the observation signals onto the pilot space. By using the multiple projected observation vectors corresponding to the unused pilots, we propose a jamming detection scheme based on the locally most powerful test (LMPT) for systems with general channel conditions. Analytical expressions for the probability of detection and false alarms are derived using the second-order statistics and likelihood functions of the projected observation vectors. For the detected jammer along with users, we propose a two-step minimum mean square error (MMSE) channel estimation using the projected observation vectors. As a part of the channel estimation, we develop schemes to estimate the norm and the phase of the inner-product of the legitimate pilot vector and the random jamming pilot vector, which can be obtained using linear MMSE estimation and a bilinear form of the multiple projected observation vectors. From simulations under different system parameters, we observe that the proposed technique improves the detection probability by 32.22% compared to the baseline at medium channel correlation level, and the channel estimation achieves a mean square error of -15.93 dB.

Submitted 18 October, 2024; originally announced October 2024.

Comments: 13 pages, 9 figures. The paper has been submitted to an IEEE journal for possible publication

arXiv:2410.14144 [pdf, other]

A Lightweight Multi Aspect Controlled Text Generation Solution For Large Language Models

Authors: Chenyang Zhang, Jiayi Lin, Haibo Tong, Bingxuan Hou, Dongyu Zhang, Jialin Li, Junli Wang

Abstract: Large language models (LLMs) show remarkable abilities with instruction tuning. However, they fall short on target tasks that lack high-quality instruction tuning data. Multi-Aspect Controllable Text Generation (MCTG) is a representative task for this dilemma, where aspect datasets are usually biased and correlated. Existing work exploits additional model structures and strategies for solutions, limiting adaptability to LLMs. To activate the MCTG ability of LLMs, we propose a lightweight MCTG pipeline based on data augmentation. We analyze bias and correlations in traditional datasets, and address these concerns with augmented control attributes and sentences. Augmented datasets are feasible for instruction tuning. In our experiments, LLMs perform better in MCTG after data augmentation, with a 20% accuracy rise and fewer aspect correlations.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.13854 [pdf, other]

Can MLLMs Understand the Deep Implication Behind Chinese Images?

Authors: Chenhao Zhang, Xi Feng, Yuelin Bai, Xinrun Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni

Abstract: As the capabilities of Multimodal Large Language Models (MLLMs) continue to improve, the need for higher-order capability evaluation of MLLMs is increasing. However, there is a lack of work evaluating MLLM for higher-order perception and understanding of Chinese visual content. To fill the gap, we introduce the **C**hinese **I**mage **I**mplication understanding **Bench**mark, **CII-Bench**, which aims to assess the higher-order perception and understanding capabilities of MLLMs for Chinese images. CII-Bench stands out in several ways compared to existing benchmarks. Firstly, to ensure the authenticity of the Chinese context, images in CII-Bench are sourced from the Chinese Internet and manually reviewed, with corresponding answers also manually crafted. Additionally, CII-Bench incorporates images that represent Chinese traditional culture, such as famous Chinese traditional paintings, which can deeply reflect the model's understanding of Chinese traditional culture. Through extensive experiments on CII-Bench across multiple MLLMs, we have made significant findings. Initially, a substantial gap is observed between the performance of MLLMs and humans on CII-Bench. The highest accuracy of MLLMs attains 64.4%, whereas human accuracy averages 78.2%, peaking at an impressive 81.0%. Subsequently, MLLMs perform worse on Chinese traditional culture images, suggesting limitations in their ability to understand high-level semantics and their lack of a deep knowledge base of Chinese traditional culture. Finally, it is observed that most models exhibit enhanced accuracy when image emotion hints are incorporated into the prompts. We believe that CII-Bench will enable MLLMs to gain a better understanding of Chinese semantics and Chinese-specific images, advancing the journey towards expert artificial general intelligence (AGI). Our project is publicly available at https://cii-bench.github.io/.

Submitted 17 October, 2024; originally announced October 2024.

Comments: 32 pages, 18 figures. Project Page: https://cii-bench.github.io/ Code: https://github.com/MING_X/CII-Bench Dataset: https://huggingface.co/datasets/m-a-p/CII-Bench

arXiv:2410.13837 [pdf, other]

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization

Authors: Chen Bo Calvin Zhang, Zhang-Wei Hong, Aldo Pacchiano, Pulkit Agrawal

Abstract: Reward shaping is a critical component in reinforcement learning (RL), particularly for complex tasks where sparse rewards can hinder learning. While shaping rewards have been introduced to provide additional guidance, selecting effective shaping functions remains challenging and computationally expensive. This paper introduces Online Reward Selection and Policy Optimization (ORSO), a novel approach that frames shaping reward selection as an online model selection problem. ORSO employs principled exploration strategies to automatically identify promising shaping reward functions without human intervention, balancing exploration and exploitation with provable regret guarantees. We demonstrate ORSO's effectiveness across various continuous control tasks using the Isaac Gym simulator. Compared to traditional methods that fully evaluate each shaping reward function, ORSO significantly improves sample efficiency, reduces computational time, and consistently identifies high-quality reward functions that produce policies comparable to those generated by domain experts through hand-engineered rewards.

Submitted 19 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

Comments: preprint, 35 pages, 23 figures

arXiv:2410.13748 [pdf, other]

Test of lepton flavour universality with $B_s^0 \rightarrow φ\ell^+\ell^-$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1124 additional authors not shown)

Abstract: Lepton flavour universality in rare $b\rightarrow s$ transitions is tested for the first time using $B_s^0$ meson decays. The measurements are performed using $pp$ collision data collected by the LHCb experiment between 2011 and 2018, corresponding to a total integrated luminosity of 9$\,{\rm fb}^{-1}$. Branching fraction ratios between the $B_s^0 \rightarrow φe^+e^-$ and $B_s^0 \rightarrow φμ^+μ^-$ decays are measured in three regions of dilepton mass squared, $q^2$, with $0.1 < q^2 < 1.1$, $1.1 < q^2 < 6.0$, and $15 < q^2 < 19\,{\rm GeV}^2/c^4$. The results agree with the Standard Model expectation of lepton flavour universality.

Submitted 17 October, 2024; originally announced October 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3513/ (LHCb public pages)

Report number: LHCb-PAPER-2024-032, CERN-EP-2024-255

arXiv:2410.13515 [pdf, other]

Observation of a rare beta decay of the charmed baryon with a Graph Neural Network

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (637 additional authors not shown)

Abstract: The study of beta decay of the charmed baryon provides unique insights into the fundamental mechanism of the strong and electro-weak interactions. The $Λ_c^+$, being the lightest charmed baryon, undergoes disintegration solely through the charm quark weak decay. Its beta decay provides an ideal laboratory for investigating non-perturbative effects in quantum chromodynamics and for constraining the fundamental parameters of the Cabibbo-Kobayashi-Maskawa matrix in weak interaction theory. This article presents the first observation of the Cabibbo-suppressed $Λ_c^+$ beta decay into a neutron $Λ_c^+ \rightarrow n e^+ ν_{e}$, based on $4.5~\mathrm{fb}^{-1}$ of electron-positron annihilation data collected with the BESIII detector in the energy region above the $Λ^+_c\barΛ^-_c$ threshold. A novel machine learning technique, leveraging Graph Neural Networks, has been utilized to effectively separate signals from dominant backgrounds, particularly $Λ_c^+ \rightarrow Λe^+ ν_{e}$. This approach has yielded a statistical significance of more than $10σ$. The absolute branching fraction of $Λ_c^+ \rightarrow n e^+ ν_{e}$ is measured to be $(3.57\pm0.34_{\mathrm{stat}}\pm0.14_{\mathrm{syst}})\times 10^{-3}$. For the first time, the CKM matrix element $\left|V_{cd}\right|$ is extracted via a charmed baryon decay to be $0.208\pm0.011_{\rm exp.}\pm0.007_{\rm LQCD}\pm0.001_{τ_{Λ_c^+}}$. This study provides a new probe to further understand fundamental interactions in the charmed baryon sector, and demonstrates the power of modern machine learning techniques in enhancing experimental capability in high energy physics research.

Submitted 17 October, 2024; originally announced October 2024.

Comments: 28 pages, 6 figures

arXiv:2410.13478 [pdf, other]

Observation of $χ_{c0}\toΣ^{+}\barΣ^{-}η$ and evidence for $χ_{c1,2}\toΣ^{+}\barΣ^{-}η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, the decay $χ_{c0}\toΣ^{+}\barΣ^{-}η$ is observed for the first time with a statistical significance of $7.0σ$, and evidence for $χ_{c1}\toΣ^{+}\barΣ^{-}η$ and $χ_{c2}\toΣ^{+}\barΣ^{-}η$ is found with statistical significances of $4.3σ$ and $4.6σ$, respectively. The branching fractions are determined to be $\mathcal{B}(χ_{c0}\toΣ^{+}\barΣ^{-}η)=({1.26 \pm 0.20 \pm 0.13}) \times 10^{-4}, ~\mathcal{B}(χ_{c1}\toΣ^{+}\barΣ^{-}η)=({5.10 \pm 1.21 \pm 0.67}) \times 10^{-5}$, and $\mathcal{B}(χ_{c2}\toΣ^{+}\barΣ^{-}η)=({5.46 \pm 1.18 \pm 0.50}) \times 10^{-5}$, where the first uncertainties are statistical, and the second ones are systematic.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.13368 [pdf, other]

Observation of the Singly Cabibbo-Suppressed Decay $Λ_c^{+}\to pπ^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: Utilizing 4.5${~\rm{fb}}^{-1}$ of $e^+e^-$ annihilation data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 4.600 and 4.699 GeV, the first observation of the singly Cabibbo-suppressed decay $Λ_c^{+}\to pπ^0$ is presented, with a statistical significance of $5.4σ$. The ratio of the branching fractions of $Λ_c^{+}\to pπ^0$ and $Λ_c^{+}\to pη$ is measured as $\mathcal{B}(Λ_c^{+}\to pπ^0)/\mathcal{B}(Λ_c^{+}\to pη)=(0.120\pm0.026_{\rm stat.}\pm0.007_{\rm syst.})$. This result resolves the longstanding discrepancy between earlier experimental searches, providing both a decisive conclusion and valuable input for QCD-inspired theoretical models. A sophisticated deep learning approach using a Transformer-based architecture is employed to distinguish the signal from the prevalent hadronic backgrounds, complemented by thorough validation and systematic uncertainty quantification.

Submitted 17 October, 2024; originally announced October 2024.

Comments: 9 pages, 4 figures

arXiv:2410.13306 [pdf, other]

The cloud cover and meteorological parameters at the Lenghu site on the Tibetan Plateau

Authors: Ruiyue Li, Fei He, Licai Deng, Xiaodian Chen, Fan Yang, Yong Zhao, Bo Zhang, Chunguang Zhang, Chen Yang, Tian Lan

Abstract: The cloud cover and meteorological parameters serve as fundamental criteria for the qualification of an astronomical observatory working in optical and infrared wavelengths. In this paper, we present a systematic assessment of key meteorological parameters at the Lenghu site. The datasets adopted in this study include the meteorological parameters collected at the local weather stations at the site and in Lenghu Town, the sky brightness at the local zenith acquired by the Sky Quality Meters and night-sky all-sky images from a digital camera, the ERA5 reanalysis database, and global climate monitoring data. From 2019 to 2023, the fractional observable time of photometric condition is 69.70%, 74.97%, 70.26%, 74.27% and 65.12%, respectively. The fractional observing time is inversely correlated with surface air temperature, relative humidity, precipitable water vapor, and dew temperature, demonstrating that the observing conditions are influenced by these meteorological parameters. Large-scale air-sea interactions affect the climate at the Lenghu site, which offers a clue to understanding the irregularity of 2023. Specifically, precipitable water vapor at the Lenghu site is correlated with both the westerly wind index and the summer North Atlantic Oscillation index, the yearly average temperature of the Lenghu site is observed to increase significantly during the occurrence of a strong El Niño event, and the relative humidity anomaly at the Lenghu site is correlated with the Pacific Decadal Oscillation index.
The decrease of fractional observing time in 2023 was due to the ongoing strong El Niño event and relevant global climate change. We underscore the substantial role of global climate change in regulating astronomical observing conditions and the necessity for long-term continuous monitoring of the astronomical meteorological parameters at the Lenghu site.

Submitted 17 October, 2024; originally announced October 2024.

Comments: accepted for publication in MNRAS

arXiv:2410.13285 [pdf, other]

Composing Novel Classes: A Concept-Driven Approach to Generalized Category Discovery

Authors: Chuyu Zhang, Peiyan Gu, Xueyang Yu, Xuming He

Abstract: We tackle the generalized category discovery (GCD) problem, which aims to discover novel classes in unlabeled datasets by leveraging the knowledge of known classes. Previous works utilize the known class knowledge through shared representation spaces. Despite their progress, our analysis experiments show that novel classes can achieve impressive clustering results on the feature space of a known class pre-trained model, suggesting that existing methods may not fully utilize known class knowledge. To address it, we introduce a novel concept learning framework for GCD, named ConceptGCD, that categorizes concepts into two types: derivable and underivable from known class concepts, and adopts a stage-wise learning strategy to learn them separately. Specifically, our framework first extracts known class concepts by a known class pre-trained model and then produces derivable concepts from them by a generator layer with a covariance-augmented loss. Subsequently, we expand the generator layer to learn underivable concepts in a balanced manner ensured by a concept score normalization strategy and integrate a contrastive loss to preserve previously learned concepts. Extensive experiments on various benchmark datasets demonstrate the superiority of our approach over the previous state-of-the-art methods. Code will be available soon.

Submitted 17 October, 2024; originally announced October 2024.

Comments: Under review. The first two authors contribute equally

arXiv:2410.13260 [pdf, other]

Cyber Attacks Prevention Towards Prosumer-based EV Charging Stations: An Edge-assisted Federated Prototype Knowledge Distillation Approach

Authors: Luyao Zou, Quang Hieu Vo, Kitae Kim, Huy Q. Le, Chu Myaet Thwal, Chaoning Zhang, Choong Seon Hong

Abstract: In this paper, cyber-attack prevention for the prosumer-based electric vehicle (EV) charging stations (EVCSs) is investigated, which covers two aspects: 1) cyber-attack detection on prosumers' network traffic (NT) data, and 2) cyber-attack intervention. To establish an effective prevention mechanism, several challenges need to be tackled, for instance, the NT data per prosumer may be non-independent and identically distributed (non-IID), and the boundary between benign and malicious traffic becomes blurred. To this end, we propose an edge-assisted federated prototype knowledge distillation (E-FPKD) approach, where each client is deployed on a dedicated local edge server (DLES) and can report its availability for joining the federated learning (FL) process. Prior to the E-FPKD approach, to enhance accuracy, the Pearson Correlation Coefficient is adopted for feature selection. Regarding the proposed E-FPKD approach, we integrate the knowledge distillation and prototype aggregation technique into FL to deal with the non-IID challenge. To address the boundary issue, instead of directly calculating the distance between benign and malicious traffic, we consider maximizing the overall detection correctness of all prosumers (ODC), which can mitigate the computational cost compared with the former way. After detection, a rule-based method will be triggered at each DLES for cyber-attack intervention.
Experimental analysis demonstrates that the proposed E-FPKD can achieve the largest ODC on NSL-KDD, UNSW-NB15, and IoTID20 datasets in both binary and multi-class classification, compared with baselines. For instance, the ODC for IoTID20 obtained via the proposed method is 0.3782% and 4.4471% greater than that of FedProto and FedAU, respectively, in multi-class classification.

Submitted 18 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

Comments: 27 pages, 12 figures

arXiv:2410.13247 [pdf, other]

Enhancing Sentiment Analysis with Collaborative AI: Architecture, Predictions, and Deployment Strategies

Authors: Chaofeng Zhang, Jia Hou, Xueting Tan, Caijuan Chen, Hiroshi Hashimoto

Abstract: The advancement of large language model (LLM) based artificial intelligence technologies has been a game-changer, particularly in sentiment analysis. This progress has enabled a shift from highly specialized research environments to practical, widespread applications within the industry. However, integrating diverse AI models for processing complex multimodal data and the associated high costs of feature extraction present significant challenges. Motivated by marketing-oriented software development needs, our study introduces a collaborative AI framework designed to efficiently distribute and resolve tasks across various AI systems to address these issues. Initially, we elucidate the key solutions derived from our development process, highlighting the role of generative AI models like ChatGPT and Google Gemini in simplifying intricate sentiment analysis tasks into manageable, phased objectives. Furthermore, we present a detailed case study utilizing our collaborative AI system in edge and cloud, showcasing its effectiveness in analyzing sentiments across diverse online media channels.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.13246 [pdf, other]

Atomic Calibration of LLMs in Long-Form Generations

Authors: Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier

Abstract: Large language models (LLMs) often suffer from hallucinations, posing significant challenges for real-world applications. Confidence calibration, which estimates the underlying uncertainty of model predictions, is essential to enhance the LLMs' trustworthiness. Existing research on LLM calibration has primarily focused on short-form tasks, providing a single confidence score at the response level (macro calibration). However, this approach is insufficient for long-form generations, where responses often contain more complex statements and may include both accurate and inaccurate information. Therefore, we introduce atomic calibration, a novel approach that evaluates factuality calibration at a fine-grained level by breaking down long responses into atomic claims. We classify confidence elicitation methods into discriminative and generative types and demonstrate that their combination can enhance calibration. Our extensive experiments on various LLMs and datasets show that atomic calibration is well-suited for long-form generation and can also improve macro calibration results. Additionally, atomic calibration reveals insightful patterns in LLM confidence throughout the generation process.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.12896 [pdf, other]

A Survey on Data Synthesis and Augmentation for Large Language Models

Authors: Ke Wang, Jiahui Zhu, Minjie Ren, Zeming Liu, Shiwei Li, Zongye Zhang, Chenkai Zhang, Xiaoyu Wu, Qiqi Zhan, Qingjie Liu, Yunhong Wang

Abstract: The success of Large Language Models (LLMs) is inherently linked to the availability of vast, diverse, and high-quality data for training and evaluation. However, the growth rate of high-quality data is significantly outpaced by the expansion of training datasets, leading to a looming data exhaustion crisis. This underscores the urgent need to enhance data efficiency and explore new data sources. In this context, synthetic data has emerged as a promising solution. Currently, data generation primarily consists of two major approaches: data augmentation and synthesis. This paper comprehensively reviews and summarizes data generation techniques throughout the lifecycle of LLMs, including data preparation, pre-training, fine-tuning, instruction-tuning, preference alignment, and applications. Furthermore, we discuss the current constraints faced by these methods and investigate potential pathways for future development and research. Our aspiration is to equip researchers with a clear understanding of these methodologies, enabling them to swiftly identify appropriate data generation strategies in the construction of LLMs, while providing valuable insights for future exploration.

Submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.12790 [pdf, other]

Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models

Authors: Ce Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie

Abstract: Test-time adaptation, which enables models to generalize to diverse data with unlabeled test samples, holds significant value in real-world scenarios. Recently, researchers have applied this setting to advanced pre-trained vision-language models (VLMs), developing approaches such as test-time prompt tuning to further extend their practical applicability. However, these methods typically focus solely on adapting VLMs from a single modality and fail to accumulate task-specific knowledge as more samples are processed. To address this, we introduce Dual Prototype Evolving (DPE), a novel test-time adaptation approach for VLMs that effectively accumulates task-specific knowledge from multi-modalities. Specifically, we create and evolve two sets of prototypes--textual and visual--to progressively capture more accurate multi-modal representations for target classes during test time. Moreover, to promote consistent multi-modal representations, we introduce and optimize learnable residuals for each test sample to align the prototypes from both modalities. Extensive experimental results on 15 benchmark datasets demonstrate that our proposed DPE consistently outperforms previous state-of-the-art methods while also exhibiting competitive computational efficiency. Code is available at https://github.com/zhangce01/DPE-CLIP.

Submitted 16 October, 2024; originally announced October 2024.

Comments: Accepted by NeurIPS 2024. Project page: https://zhangce01.github.io/DPE-CLIP

arXiv:2410.12620 [pdf, other]

Search for $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ at center-of-mass energies from 4.47 to 4.95 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

Abstract: Utilizing a data set of $6.7$ fb$^{-1}$ from electron-positron collisions recorded by the BESIII detector at the BEPCII storage ring, a search is conducted for the processes $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ across center-of-mass energies from 4.47 to 4.95 GeV. In the absence of any significant signals, upper limits are set. These include limits on the Born cross sections for $e^{+}e^{-} \to φχ_{c0}$, as well as the product of the Born cross section for $e^{+}e^{-} \to φη_{c2}(1D)$ and a sum of five branching fractions. Furthermore, the product of the electronic width of $Y(4660)$ and the branching fraction of the $Y(4660) \to φχ_{c0}$, denoted as $Γ^{Y(4660)}_{e^{+}e^{-}} \mathcal{B}_{Y(4660) \to φχ_{c0}}$, is determined to be $< 0.40$ eV at the 90\% confidence level.

Submitted 16 October, 2024; originally announced October 2024.

Comments: 14 pages, 6 figures

arXiv:2410.12496 [pdf, other]

doi 10.1145/3698810

Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs

Authors: Wenjing Deng, Qiuyang Mang, Chengyu Zhang, Manuel Rigger

Abstract: Spatial Database Management Systems (SDBMSs) aim to store, manipulate, and retrieve spatial data. SDBMSs are employed in various modern applications, such as geographic information systems, computer-aided design tools, and location-based services. However, the presence of logic bugs in SDBMSs can lead to incorrect results, substantially undermining the reliability of these applications. Detecting logic bugs in SDBMSs is challenging due to the lack of ground truth for identifying incorrect results. In this paper, we propose an automated geometry-aware generator to generate high-quality SQL statements for SDBMSs and a novel concept named Affine Equivalent Inputs (AEI) to validate the results of SDBMSs. We implemented them as a tool named Spatter (Spatial DBMSs Tester) for finding logic bugs in four popular SDBMSs: PostGIS, DuckDB Spatial, MySQL, and SQL Server. Our testing campaign detected 34 previously unknown and unique bugs in these SDBMSs, of which 30 have been confirmed, and 18 have already been fixed. Our testing efforts have been well appreciated by the developers. Experimental results demonstrate that the geometry-aware generator significantly outperforms a naive random-shape generator in detecting unique bugs, and AEI can identify 14 logic bugs in SDBMSs that were overlooked by previous methodologies.

Submitted 17 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.12474 [pdf, other]

Mind the Gap Between Prototypes and Images in Cross-domain Finetuning

Authors: Hongduan Tian, Feng Liu, Zhanke Zhou, Tongliang Liu, Chengqi Zhang, Bo Han

Abstract: In cross-domain few-shot classification (CFC), recent works mainly focus on adapting a simple transformation head on top of a frozen pre-trained backbone with few labeled data to project embeddings into a task-specific metric space where classification can be performed by measuring similarities between image instance and prototype representations. Technically, an assumption implicitly adopted in such a framework is that the prototype and image instance embeddings share the same representation transformation. However, in this paper, we find that there naturally exists a gap, which resembles the modality gap, between the prototype and image instance embeddings extracted from the frozen pre-trained backbone, and simply applying the same transformation during the adaptation phase constrains exploring the optimal representations and shrinks the gap between prototype and image representations. To solve this problem, we propose a simple yet effective method, contrastive prototype-image adaptation (CoPA), to adapt different transformations respectively for prototypes and images similarly to CLIP by treating prototypes as text prompts. Extensive experiments on Meta-Dataset demonstrate that CoPA achieves the state-of-the-art performance more efficiently. Meanwhile, further analyses also indicate that CoPA can learn better representation clusters, enlarge the gap, and achieve minimal validation loss at the enlarged gap.

Submitted 20 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.12444 [pdf, other]

Expanding Chatbot Knowledge in Customer Service: Context-Aware Similar Question Generation Using Large Language Models

Authors: Mengze Hong, Yuanfeng Song, Di Jiang, Lu Wang, Zichang Guo, Chen Jason Zhang

Abstract: Reliable responses of service chatbots are often achieved by employing retrieval-based methods that restrict answers to a knowledge base comprising predefined question-answer pairs (QA pairs). To accommodate potential variations in how a customer's query may be expressed, the favored solution is to augment these QA pairs with similar questions that are diverse while remaining semantically consistent. This augmentation task is known as Similar Question Generation (SQG). Traditional methods that heavily rely on human efforts or rule-based techniques suffer from limited diversity or significant semantic deviation from the source question, and are only capable of producing a finite number of useful questions. To address these limitations, we propose an SQG approach based on Large Language Models (LLMs), capable of producing a substantial number of diverse questions while maintaining semantic consistency with the source QA pair. This is achieved by leveraging LLMs' natural language understanding capability through fine-tuning with specially designed prompts. The experiments conducted on a real customer-service dataset demonstrate that our method surpasses baseline methods by a significant margin in terms of semantic diversity. Human evaluation further confirms that integrating the answer that reflects the customer's intention is crucial for increasing the number of generated questions that meet business requirements.

Submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.12428 [pdf, other]

Conformity in Large Language Models

Authors: Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos

Abstract: The conformity effect describes the tendency of individuals to align their responses with the majority. Studying this bias in large language models (LLMs) is crucial, as LLMs are increasingly used in various information-seeking and decision-making tasks as conversation partners to improve productivity. Thus, conformity to incorrect responses can compromise their effectiveness. In this paper, we adapt psychological experiments to examine the extent of conformity in state-of-the-art LLMs. Our findings reveal that all models tested exhibit varying levels of conformity toward the majority, regardless of their initial choice or correctness, across different knowledge domains. Notably, we are the first to show that LLMs are more likely to conform when they are more uncertain in their own prediction. We further explore factors that influence conformity, such as training paradigms and input characteristics, finding that instruction-tuned models are less susceptible to conformity, while increasing the naturalness of majority tones amplifies conformity. Finally, we propose two interventions--Devil's Advocate and Question Distillation--to mitigate conformity, providing insights into building more robust language models.

Submitted 16 October, 2024; originally announced October 2024.

Comments: 16 pages (8 pages main body), 14 figures

arXiv:2410.12347 [pdf, ps, other]

Guaranteeing MMS for All but One Agent When Allocating Indivisible Chores

Authors: Jiawei Qiu, Xiaowei Wu, Cong Zhang, Shengwei Zhou

Abstract: We study the problem of allocating $m$ indivisible chores to $n$ agents with additive cost functions under the fairness notion of maximin share (MMS). In this work, we propose a notion called $α$-approximate all-but-one maximin share ($α$-AMMS) which is a stronger version of $α$-approximate MMS. An allocation is called $α$-AMMS if $n-1$ agents are guaranteed their MMS values and the remaining agent is guaranteed $α$-approximation of her MMS value. We show that there exist $α$-AMMS allocations, with $α= 9/8$ for three agents; $α= 4/3$ for four agents; and $α= (n+1)^2/4n$ for $n\geq 5$ agents.

Submitted 16 October, 2024; originally announced October 2024.

Comments: 18 pages, 7 figures

arXiv:2410.12089 [pdf, other]

BICEP/Keck XVIII: Measurement of BICEP3 polarization angles and consequences for constraining cosmic birefringence and inflation

Authors: BICEP/Keck Collaboration, P. A. R. Ade, Z. Ahmed, M. Amiri, D. Barkats, R. Basu Thakur, C. A. Bischoff, D. Beck, J. J. Bock, H. Boenish, V. Buza, J. R. Cheshire IV, J. Connors, J. Cornelison, M. Crumrine, A. J. Cukierman, E. Denison, L. Duband, M. Eiben, B. D. Elwood, S. Fatigoni, J. P. Filippini, A. Fortes, M. Gao, et al. (60 additional authors not shown)

Abstract: We use a custom-made calibrator to measure individual detectors' polarization angles of BICEP3, a small aperture telescope observing the cosmic microwave background (CMB) at 95 GHz from the South Pole. We describe our calibration strategy and the statistical and systematic uncertainties associated with the measurement. We reach an unprecedented precision for such measurement on a CMB experiment, with a repeatability for each detector pair of $0.02°$. We show that the relative angles measured using this method are in excellent agreement with those extracted from CMB data. Because the absolute measurement is currently limited by a systematic uncertainty, we do not derive cosmic birefringence constraints from BICEP3 data in this work. Rather, we forecast the sensitivity of BICEP3 sky maps for such analysis. We investigate the relative contributions of instrument noise, lensing, and dust, as well as astrophysical and instrumental systematics. We also explore the constraining power of different angle estimators, depending on analysis choices. We establish that the BICEP3 2-year dataset (2017--2018) has an on-sky sensitivity to the cosmic birefringence angle of $σ= 0.078°$, which could be improved to $σ= 0.055°$ by adding all of the existing BICEP3 data (through 2023). Furthermore, we emphasize the possibility of using the BICEP3 sky patch as a polarization calibration source for CMB experiments, which with the present data could reach a precision of $0.035°$.
Finally, in the context of inflation searches, we investigate the impact of detector-to-detector variations in polarization angles as they may bias the tensor-to-scalar ratio r. We show that while the effect is expected to remain subdominant to other sources of systematic uncertainty, it can be reliably calibrated using polarization angle measurements such as the ones we present in this paper.

Submitted 15 October, 2024; originally announced October 2024.

Comments: 29 Pages, 17 Figures, 6 Tables, as submitted to PRD

arXiv:2410.11710 [pdf, other]

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

Authors: Pei Wang, Yanan Wu, Zekun Wang, Jiaheng Liu, Xiaoshuai Song, Zhongyuan Peng, Ken Deng, Chenchen Zhang, Jiakai Wang, Junran Peng, Ge Zhang, Hangyu Guo, Zhaoxiang Zhang, Wenbo Su, Bo Zheng

Abstract: Large Language Models (LLMs) have displayed massive improvements in reasoning and decision-making skills and can hold natural conversations with users. Recently, many tool-use benchmark datasets have been proposed. However, existing datasets have the following limitations: (1) insufficient evaluation scenarios (e.g., only cover limited tool-use scenes), and (2) extensive evaluation costs (e.g., GPT API costs). To address these limitations, in this work, we propose a multi-granularity tool-use benchmark for large language models called MTU-Bench. For the "multi-granularity" property, our MTU-Bench covers five tool usage scenes (i.e., single-turn and single-tool, single-turn and multiple-tool, multiple-turn and single-tool, multiple-turn and multiple-tool, and out-of-distribution tasks). Besides, all evaluation metrics of our MTU-Bench are based on the prediction results and the ground truth without using any GPT or human evaluation metrics. Moreover, our MTU-Bench is collected by transforming existing high-quality datasets to simulate real-world tool usage scenarios, and we also propose an instruction dataset called MTU-Instruct data to enhance the tool-use abilities of existing LLMs. Comprehensive experimental results demonstrate the effectiveness of our MTU-Bench. Code and data will be released at https://github.com/MTU-Bench-Team/MTU-Bench.git.

Submitted 15 October, 2024; originally announced October 2024.

arXiv:2410.11607 [pdf, other]

Observation of $χ_{cJ}\to p \bar p K^0_S K^- π^+ + c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (648 additional authors not shown)

Abstract: By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decays of $χ_{cJ} \to p \bar{p} K^0_S K^- π^+ +c.c.(J=0, 1, 2)$ are observed for the first time with statistical significances greater than $10σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(2.61\pm0.27\pm0.32)\times10^{-5},$ $\mathcal{B}(χ_{c1}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(4.16\pm0.24\pm0.46)\times10^{-5},$ and $\mathcal{B}(χ_{c2}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(5.63\pm0.28\pm0.46)\times10^{-5}$, respectively. The processes $χ_{c1,2} \to \bar{p} Λ(1520) K^0_S π^{+} + c.c.$ are also observed, with statistical significances of 5.7$σ$ and 7.0$σ$, respectively. Evidence for $χ_{c0} \to\bar{p} Λ(1520) K^0_S π^{+} + c.c.$ is found with a statistical significance of 3.3$σ$. The corresponding branching fractions are determined to be $\mathcal{B}(χ_{c0}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.) =(1.61^{+0.68}_{-0.64}\pm0.23)\times10^{-5}$, $\mathcal{B}(χ_{c1}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.)=(4.06^{+0.80}_{-0.76}\pm0.52)\times10^{-5}$, and $\mathcal{B}(χ_{c2}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.)=(4.09^{+0.87}_{-0.84}\pm0.42)\times10^{-5}$. Here, the first uncertainties are statistical and the second ones are systematic.

Submitted 15 October, 2024; originally announced October 2024.

Comments: 12 pages, 5 figures

arXiv:2410.11576 [pdf, other]

The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection

Authors: Qingyang Zhang, Qiuxuan Feng, Joey Tianyi Zhou, Yatao Bian, Qinghua Hu, Changqing Zhang

Abstract: Out-of-distribution (OOD) detection is essential for model trustworthiness: it aims to sensitively identify semantic OOD samples while generalizing robustly to covariate-shifted OOD samples. However, we discover that the superior OOD detection performance of state-of-the-art methods is achieved by secretly sacrificing the OOD generalization ability. Specifically, the classification accuracy of these models could deteriorate dramatically when they encounter even minor noise. This phenomenon contradicts the goal of model trustworthiness and severely restricts their applicability in real-world scenarios. What is the hidden reason behind such a limitation? In this work, we theoretically demystify the "sensitive-robust" dilemma that lies in many existing OOD detection methods. Consequently, a theory-inspired algorithm is induced to overcome such a dilemma. By decoupling the uncertainty learning objective from a Bayesian perspective, the conflict between OOD detection and OOD generalization is naturally harmonized and a dual-optimal performance could be expected. Empirical studies show that our method achieves superior performance on standard benchmarks. To the best of our knowledge, this work is the first principled OOD detection method that achieves state-of-the-art OOD detection performance without compromising OOD generalization ability. Our code is available at https://github.com/QingyangZhang/DUL.

Submitted 12 October, 2024; originally announced October 2024.

Comments: Accepted by NeurIPS 2024. Code is available at https://github.com/QingyangZhang/DUL

arXiv:2410.11560 [pdf, other]

PSVMA+: Exploring Multi-granularity Semantic-visual Adaption for Generalized Zero-shot Learning

Authors: Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Meng Wang, Tat-Seng Chua, Yao Zhao

Abstract: Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features. However, GZSL suffers from insufficient visual-semantic correspondences due to the attribute diversity and instance diversity. Attribute diversity refers to varying semantic granularity in attribute descriptions, ranging from low-level (specific, directly observable) to high-level (abstract, highly generic) characteristics. This diversity challenges the collection of adequate visual cues for attributes under a uni-granularity. Additionally, diverse visual instances corresponding to the same sharing attributes introduce semantic ambiguity, leading to vague visual patterns. To tackle these problems, we propose a multi-granularity progressive semantic-visual mutual adaption (PSVMA+) network, where sufficient visual elements across granularity levels can be gathered to remedy the granularity inconsistency. PSVMA+ explores semantic-visual interactions at different granularity levels, enabling awareness of multi-granularity in both visual and semantic elements. At each granularity level, the dual semantic-visual transformer module (DSVTM) recasts the sharing attributes into instance-centric attributes and aggregates the semantic-related visual regions, thereby learning unambiguous visual features to accommodate various instances. Given the diverse contributions of different granularities, PSVMA+ employs selective cross-granularity learning to leverage knowledge from reliable granularities and adaptively fuses multi-granularity features for comprehensive representations. Experimental results demonstrate that PSVMA+ consistently outperforms state-of-the-art methods.

Submitted 15 October, 2024; originally announced October 2024.

Comments: Accepted to TPAMI 2024. arXiv admin note: text overlap with arXiv:2303.15322

arXiv:2410.11285 [pdf, other]

Scalable Indoor Novel-View Synthesis using Drone-Captured 360 Imagery with 3D Gaussian Splatting

Authors: Yuanbo Chen, Chengyu Zhang, Jason Wang, Xuefan Gao, Avideh Zakhor

Abstract: Scene reconstruction and novel-view synthesis for large, complex, multi-story, indoor scenes is a challenging and time-consuming task. Prior methods have utilized drones for data capture and radiance fields for scene reconstruction, both of which present certain challenges. First, in order to capture diverse viewpoints with the drone's front-facing camera, some approaches fly the drone in an unstable zig-zag fashion, which hinders drone-piloting and generates motion blur in the captured data. Second, most radiance field methods do not easily scale to an arbitrarily large number of images. This paper proposes an efficient and scalable pipeline for indoor novel-view synthesis from drone-captured 360 videos using 3D Gaussian Splatting. 360 cameras capture a wide set of viewpoints, allowing for comprehensive scene capture under a simple, straightforward drone trajectory. To scale our method to large scenes, we devise a divide-and-conquer strategy to automatically split the scene into smaller blocks that can be reconstructed individually and in parallel. We also propose a coarse-to-fine alignment strategy to seamlessly match these blocks together to compose the entire scene. Our experiments demonstrate marked improvement in both reconstruction quality, i.e. PSNR and SSIM, and computation time compared to prior approaches.

Submitted 15 October, 2024; originally announced October 2024.

Comments: Accepted to ECCV 2024 S3DSGR Workshop
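
The abstract above reports reconstruction quality in terms of PSNR. As a reminder of what that metric measures, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), on flat lists of pixel intensities. The function name `psnr` and the 8-bit peak value of 255 are assumptions for illustration; this is not the paper's evaluation code.

```python
import math

def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Illustrative helper (hypothetical name); assumes 8-bit intensities by default.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)

identical = psnr([10, 20, 30], [10, 20, 30])
worst_case = psnr([0, 0], [255, 255])   # MSE equals MAX^2, so 0 dB
off_by_one = psnr([10, 20], [11, 21])   # MSE equals 1
```

Higher values mean the rendered view is closer to the reference; SSIM, the other metric named in the abstract, is a separate structural measure and is not sketched here.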

arXiv:2410.10894 [pdf, other]

COME: Test-time adaption by Conservatively Minimizing Entropy

Authors: Qingyang Zhang, Yatao Bian, Xinke Kong, Peilin Zhao, Changqing Zhang

Abstract: Machine learning models must continuously adjust themselves to novel data distributions in the open world. As the predominant principle, entropy minimization (EM) has been proven to be a simple yet effective cornerstone in existing test-time adaption (TTA) methods. Unfortunately, its fatal limitation (i.e., overconfidence) tends to result in model collapse. To address this issue, we propose to Conservatively Minimize the Entropy (COME), which is a simple drop-in replacement of traditional EM to elegantly address the limitation. In essence, COME explicitly models the uncertainty by characterizing a Dirichlet prior distribution over model predictions during TTA. By doing so, COME naturally regularizes the model to favor conservative confidence on unreliable samples. Theoretically, we provide a preliminary analysis to reveal the ability of COME in enhancing the optimization stability by introducing a data-adaptive lower bound on the entropy. Empirically, our method achieves state-of-the-art performance on commonly used benchmarks, showing significant improvements in terms of classification accuracy and uncertainty estimation under various settings including standard, life-long and open-world TTA, i.e., up to $34.5\%$ improvement on accuracy and $15.1\%$ on false positive rate.

Submitted 12 October, 2024; originally announced October 2024.

Comments: Ongoing work
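
The abstract above builds on entropy minimization (EM), where a model is adapted at test time by lowering the entropy of its own predictions. The following minimal sketch computes that quantity, the Shannon entropy of a softmax distribution, for a single sample. It illustrates only the plain EM objective; COME's Dirichlet-based formulation is not reproduced here, and the function names are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def prediction_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over the logits."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Uniform logits give the maximum entropy log(K); a peaked prediction gives a
# value close to zero, which is the direction plain EM pushes the model in.
uniform = prediction_entropy([0.0, 0.0, 0.0, 0.0])
confident = prediction_entropy([10.0, 0.0, 0.0, 0.0])
```

Because this objective is minimized by ever more peaked predictions regardless of whether they are correct, it is consistent with the overconfidence failure mode the abstract describes.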

arXiv:2410.10551 [pdf, other]

Preserving Cardiac Integrity: A Topology-Infused Approach to Whole Heart Segmentation

Authors: Chenyu Zhang, Wenxue Guan, Xiaodan Xing, Guang Yang

Abstract: Whole heart segmentation (WHS) supports cardiovascular disease (CVD) diagnosis, disease monitoring, treatment planning, and prognosis. Deep learning has become the most widely used method for WHS applications in recent years. However, segmentation of whole-heart structures faces numerous challenges including heart shape variability during the cardiac cycle, clinical artifacts like motion and poor contrast-to-noise ratio, domain shifts in multi-center data, and the distinct modalities of CT and MRI. To address these limitations and improve segmentation quality, this paper introduces a new topology-preserving module that is integrated into deep neural networks. The implementation achieves anatomically plausible segmentation by using learned topology-preserving fields, which are based entirely on 3D convolution and are therefore very effective for 3D voxel data. We incorporate natural constraints between structures into the end-to-end training and enrich the feature representation of the neural network. The effectiveness of the proposed method is validated on an open-source medical heart dataset, specifically using the WHS++ data. The results demonstrate that the architecture performs exceptionally well, achieving a Dice coefficient of 0.939 during testing. This indicates full topology preservation for individual structures and significantly outperforms other baselines in preserving the overall scene topology.

Submitted 17 October, 2024; v1 submitted 14 October, 2024; originally announced October 2024.
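
The entry above reports a Dice coefficient of 0.939. For readers unfamiliar with the metric, this is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name and the convention of returning 1.0 for two empty masks are assumptions for illustration, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flattened binary masks of equal length.

    Illustrative helper (hypothetical name). Returns 1.0 when both masks are
    empty, which is one common convention.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total

partial = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])   # 2*1 / (2+1)
perfect = dice_coefficient([0, 1, 1], [0, 1, 1])
disjoint = dice_coefficient([1, 0], [0, 1])
```

Dice measures voxel overlap only; a high score does not by itself guarantee that a predicted structure is connected or hole-free.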

arXiv:2410.10298 [pdf, other]

ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object

Authors: Jiwei Chen, Laiyan Ding, Chi Zhang, Feifei Li, Rui Huang

Abstract: Vision-based BEV (Bird-Eye-View) 3D object detection has recently become popular in autonomous driving. However, objects with a high similarity to the background from a camera perspective cannot be detected well by existing methods. In this paper, we propose 2D Region-oriented Attention for a BEV-based 3D Object Detection Network (ROA-BEV), which can make the backbone focus more on feature learning in areas where objects may exist. Moreover, our method increases the information content of ROA through a multi-scale structure. In addition, every block of ROA utilizes a large kernel to ensure that the receptive field is large enough to catch large objects' information. Experiments on nuScenes show that ROA-BEV improves the performance based on BEVDet and BEVDepth. The code will be released soon.

Submitted 14 October, 2024; originally announced October 2024.

arXiv:2410.09828 [pdf, other]

The semiclassical propagator for coherent state on twisted geometry

Authors: Gaoping Long, Hongguang Liu, Cong Zhang

Abstract: A new set of twisted geometric variables is introduced to parametrize the holonomy-flux phase space in loop quantum gravity (LQG). It is verified that these new geometric variables, after symplectic reduction with respect to the Gauss constraint, form a Poisson algebra which is analogous to that in quantum mechanics. This property ensures that these new geometric variables provide a simple path measure, upon which a new formulation of the coherent state path integral based on twisted geometry coherent states is established in loop quantum gravity. In particular, this path integral is analytically computable by expanding the corresponding effective action around the complex evolution trajectories to second order, and the result gives the semi-classical approximation of the quantum propagator between twisted geometry coherent states in LQG.

Submitted 13 October, 2024; originally announced October 2024.

arXiv:2410.09591 [pdf, other]

Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy

Authors: Yangsibo Huang, Daogao Liu, Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Milad Nasr, Amer Sinha, Chiyuan Zhang

Abstract: Machine unlearning algorithms, designed for selective removal of training data from models, have emerged as a promising approach to growing privacy concerns. In this work, we expose a critical yet underexplored vulnerability in the deployment of unlearning systems: the assumption that the data requested for removal is always part of the original training set. We present a threat model where an attacker can degrade model accuracy by submitting adversarial unlearning requests for data not present in the training set. We propose white-box and black-box attack algorithms and evaluate them through a case study on image classification tasks using the CIFAR-10 and ImageNet datasets, targeting a family of widely used unlearning methods. Our results show extremely poor test accuracy following the attack: 3.6% on CIFAR-10 and 0.4% on ImageNet for white-box attacks, and 8.5% on CIFAR-10 and 1.3% on ImageNet for black-box attacks. Additionally, we evaluate various verification mechanisms to detect the legitimacy of unlearning requests and reveal the challenges in verification, as most of the mechanisms fail to detect stealthy attacks without severely impairing their ability to process valid requests. These findings underscore the urgent need for research on more robust request verification methods and unlearning protocols, should the deployment of machine unlearning systems become more prevalent in the future.

Submitted 12 October, 2024; originally announced October 2024.

arXiv:2410.09151 [pdf, other]

A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1758 additional authors not shown)

Abstract: The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50% (90%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.

Submitted 11 October, 2024; originally announced October 2024.

Comments: 15 pages of text including references, 4 figures, 5 tables

Report number: LIGO-P2400192

arXiv:2410.08871 [pdf, other]

Adaptive optimization of wave energy conversion in oscillatory wave surge converters via SPH simulation and deep reinforcement learning

Authors: Mai Ye, Chi Zhang, Yaru Ren, Ziyuan Liu, Oskar J. Haidn, Xiangyu Hu

Abstract: The nonlinear damping characteristics of the oscillating wave surge converter (OWSC) significantly impact the performance of the power take-off (PTO) system. This study presents a framework that integrates deep reinforcement learning (DRL) with numerical simulations of the OWSC to identify the optimal adaptive damping policy under varying wave conditions, thereby enhancing wave energy harvesting efficiency. Firstly, the open-source multiphysics libraries SPHinXsys and Simbody are employed to establish the numerical environment for wave interaction with OWSCs. Subsequently, a comparative analysis of three DRL algorithms, namely proximal policy optimization (PPO), twin delayed deep deterministic policy gradient (TD3), and soft actor-critic (SAC), is conducted using the two-dimensional (2D) numerical study of an OWSC interacting with regular waves. The results reveal that artificial neural networks capture the nonlinear characteristics of wave-structure interactions and provide efficient PTO policies. Notably, the SAC algorithm demonstrates exceptional robustness and accuracy, achieving a 10.61% improvement in wave energy harvesting. Furthermore, policies trained in a 2D environment are successfully applied to the three-dimensional (3D) study, with an improvement of 22.54% in energy harvesting. Additionally, the study shows that energy harvesting is improved by 6.42% for complex irregular waves. However, for the complex dual OWSC system, optimizing the damping characteristics alone is insufficient to enhance energy harvesting.

Submitted 11 October, 2024; originally announced October 2024.

Comments: 67 pages and 25 figures

arXiv:2410.08603 [pdf, other]

Observation of $D^+\toη^\primeμ^+ν_μ$ and First Study of $D^+\to η^\prime \ell^+ν_\ell$ Decay Dynamics

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Using $20.3\,\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy 3.773\,GeV with the BESIII detector, we report the first observation of the semileptonic decay $D^+\to η^\prime μ^+ν_μ$ with a significance of $8.6σ$ including systematic uncertainties, and an improved measurement of $D^+\to η^\prime e^+ν_e$. The branching fractions of $D^+\to η^\prime μ^+ν_μ$ and $D^+\to η^\prime e^+ν_e$ are determined to be $(1.92\pm0.28_{\rm stat}\pm 0.08_{\rm syst})\times 10^{-4}$ and $(1.79\pm0.19_{\rm stat}\pm 0.07_{\rm syst})\times 10^{-4}$, respectively. From an analysis of the $D^+\to η^\prime \ell^+ν_\ell$ decay dynamics, the product of the hadronic form factor $f_+^{η^{\prime}}(0)$ and the CKM matrix element $|V_{cd}|$ is measured for the first time, giving $f^{η^\prime}_+(0)|V_{cd}| = (5.92\pm0.56_{\rm stat}\pm0.13_{\rm syst})\times 10^{-2}$. No evidence for violation of $μ-e$ lepton-flavor universality is found in either the full range or several bins of $\ell^+ν_\ell$ four-momentum transfer. The $η-η^\prime$ mixing angle in the quark flavor basis is determined to be $φ_{\rm P} =(39.8\pm0.8_{\rm stat}\pm0.3_{\rm syst})^\circ$.

Submitted 11 October, 2024; originally announced October 2024.

arXiv:2410.08582 [pdf, ps, other]

DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention

Authors: Nguyen Huu Bao Long, Chenyu Zhang, Yuzhi Shi, Tsubasa Hirakawa, Takayoshi Yamashita, Tohgoroh Matsui, Hironobu Fujiyoshi

Abstract: Vision Transformers with various attention modules have demonstrated superior performance on vision tasks. While using sparsity-adaptive attention, such as in DAT, has yielded strong results in image classification, the key-value pairs selected by deformable points lack semantic relevance when fine-tuning for semantic segmentation tasks. The query-aware sparsity attention in BiFormer seeks to focus each query on top-k routed regions. However, during attention calculation, the selected key-value pairs are influenced by too many irrelevant queries, reducing attention on the more important ones. To address these issues, we propose the Deformable Bi-level Routing Attention (DBRA) module, which optimizes the selection of key-value pairs using agent queries and enhances the interpretability of queries in attention maps. Based on this, we introduce the Deformable Bi-level Routing Attention Transformer (DeBiFormer), a novel general-purpose vision transformer built with the DBRA module. DeBiFormer has been validated on various computer vision tasks, including image classification, object detection, and semantic segmentation, providing strong evidence of its effectiveness. Code is available at https://github.com/maclong01/DeBiFormer.

Submitted 11 October, 2024; originally announced October 2024.

Comments: 20 pages, 7 figures. arXiv admin note: text overlap with arXiv:2303.08810 by other authors

Journal ref: ACCV 2024

arXiv:2410.08478 [pdf, other]

Personalized Item Representations in Federated Multimodal Recommendation

Authors: Zhiwei Li, Guodong Long, Jing Jiang, Chengqi Zhang

Abstract: Federated recommendation systems are essential for providing personalized recommendations while protecting user privacy. However, current methods mainly rely on ID-based item embeddings, neglecting the rich multimodal information of items. To address this, we propose a Federated Multimodal Recommendation System, called FedMR. FedMR uses a foundation model on the server to encode multimodal item data, such as images and text. To handle data heterogeneity caused by user preference differences, FedMR introduces a Mixing Feature Fusion Module on each client, which adjusts fusion strategy weights based on user interaction history to generate personalized item representations that capture users' fine-grained preferences. FedMR is compatible with existing ID-based federated recommendation systems, improving performance without modifying the original framework. Experiments on four real-world multimodal datasets demonstrate FedMR's effectiveness. The code is available at https://anonymous.4open.science/r/FedMR.

Submitted 14 October, 2024; v1 submitted 10 October, 2024; originally announced October 2024.

Comments: 12 pages, 4 figures, 5 tables, conference

arXiv:2410.08102 [pdf, other]

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining

Authors: Tianyi Bai, Ling Yang, Zhen Hao Wong, Jiahui Peng, Xinlin Zhuang, Chi Zhang, Lijun Wu, Jiantao Qiu, Wentao Zhang, Binhang Yuan, Conghui He

Abstract: Efficient data selection is crucial to accelerate the pretraining of large language models (LLMs). While various methods have been proposed to enhance data efficiency, limited research has addressed the inherent conflicts between these approaches to achieve optimal data selection for LLM pretraining. To tackle this problem, we propose a novel multi-agent collaborative data selection mechanism. In this framework, each data selection method serves as an independent agent, and an agent console is designed to dynamically integrate the information from all agents throughout the LLM training process. We conduct extensive empirical studies to evaluate our multi-agent framework. The experimental results demonstrate that our approach significantly improves data efficiency, accelerates convergence in LLM training, and achieves an average performance gain of up to 10.5% across multiple language model benchmarks compared to the state-of-the-art methods.

Submitted 14 October, 2024; v1 submitted 10 October, 2024; originally announced October 2024.

arXiv:2410.07983 [pdf, other]

Characterizing Quantum Codes via the Coefficients in Knill-Laflamme Conditions

Authors: Mengxin Du, Chao Zhang, Yiu-Tung Poon, Bei Zeng

Abstract: Quantum error correction (QEC) is essential for protecting quantum information against noise, yet understanding the structure of the Knill-Laflamme (KL) coefficients $λ_{ij}$ from the condition $PE_i^\dagger E_j P = λ_{ij} P$ remains challenging, particularly for nonadditive codes. In this work, we introduce the signature vector $\vec{λ}(P)$, composed of the off-diagonal KL coefficients $λ_{ij}$, where each coefficient corresponds to equivalence classes of errors counted only once. We define its Euclidean norm $λ^*(P)$ as a scalar measure representing the total strength of error correlations within the code subspace defined by the projector $P$. We parameterize $P$ on a Stiefel manifold and formulate an optimization problem based on the KL conditions to systematically explore possible values of $λ^*$. Moreover, we show that, for $((n,K,d))$ codes, $λ^*$ is invariant under local unitary transformations. Applying our approach to the $((6, 2, 3))$ quantum code, we find that $λ^*_{\text{min}} = \sqrt{0.6}$ and $λ^*_{\text{max}} = 1$, with $λ^* = 1$ corresponding to a known degenerate stabilizer code. We construct continuous families of new nonadditive codes parameterized by vectors in $\mathbb{R}^5$, with $λ^*$ varying over the interval $[\sqrt{0.6}, 1]$. For the $((7, 2, 3))$ code, we identify $λ^*_{\text{min}} = 0$ (corresponding to the non-degenerate Steane code) and $λ^*_{\text{max}} = \sqrt{7}$ (corresponding to the permutation-invariant code by Pollatsek and Ruskai), and we demonstrate continuous paths connecting these extremes via cyclic codes characterized solely by $λ^*$. Our findings provide new insights into the structure of quantum codes, advance the theoretical foundations of QEC, and open new avenues for investigating intricate relationships between code subspaces and error correlations.

Submitted 10 October, 2024; originally announced October 2024.

Comments: 18 pages, 2 figures

arXiv:2410.07626 [pdf, other]

Precision Measurement of the Branching Fraction of $D^{+}\to μ^{+}ν_μ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Using $20.3~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at a center-of-mass energy of $E_{\rm cm}=3.773$ GeV with the BESIII detector operating at the BEPCII collider, we determine the branching fraction of the leptonic decay $D^+\toμ^+ν_μ$ to be $(3.981\pm0.079_{\rm stat}\pm0.040_{\rm syst})\times10^{-4}$. Interpreting our measurement with knowledge of the Fermi coupling constant $G_F$, the masses of the $D^+$ and $μ^+$ as well as the lifetime of the $D^+$, we determine $f_{D^+}|V_{cd}|=(47.53\pm0.48_{\rm stat}\pm0.24_{\rm syst}\pm0.12_{\rm input})~\mathrm{MeV}$. This result is a factor of 2.3 more precise than the previous best measurement. Using the value of the magnitude of the Cabibbo-Kobayashi-Maskawa matrix element $|V_{cd}|$ given by the global standard model fit, we obtain the $D^+$ decay constant $f_{D^+}=(211.5\pm2.3_{\rm stat}\pm1.1_{\rm syst}\pm0.8_{\rm input})$ MeV. Alternatively, using the value of $f_{D^+}$ from a precise lattice quantum chromodynamics calculation, we extract $|V_{cd}|=0.2242\pm0.0023_{\rm stat}\pm0.0011_{\rm syst}\pm0.0009_{\rm input}$.

Submitted 10 October, 2024; originally announced October 2024.

Comments: 9 pages, 2 figures

arXiv:2410.07538

Rank Aggregation in Crowdsourcing for Listwise Annotations

Authors: Wenshui Luo, Haoyu Liu, Yongliang Ding, Tao Zhou, Sheng wan, Runze Wu, Minmin Lin, Cong Zhang, Changjie Fan, Chen Gong

Abstract: Rank aggregation through crowdsourcing has recently gained significant attention, particularly in the context of listwise ranking annotations. However, existing methods primarily focus on a single problem and partial ranks, while the aggregation of listwise full ranks across numerous problems remains largely unexplored. This scenario finds relevance in various applications, such as model quality assessment and reinforcement learning with human feedback. In light of practical needs, we propose LAC, a Listwise rank Aggregation method in Crowdsourcing, where the global position information is carefully measured and included. In our design, an especially proposed annotation quality indicator is employed to measure the discrepancy between the annotated rank and the true rank. We also take the difficulty of the ranking problem itself into consideration, as it directly impacts the performance of annotators and consequently influences the final results. To our knowledge, LAC is the first work to directly deal with the full rank aggregation problem in listwise crowdsourcing, and simultaneously infer the difficulty of problems, the ability of annotators, and the ground-truth ranks in an unsupervised way. To evaluate our method, we collect a real-world business-oriented dataset for paragraph ranking. Experimental results on both synthetic and real-world benchmark datasets demonstrate the effectiveness of our proposed LAC method.

Submitted 9 October, 2024; originally announced October 2024.

Comments: 19 pages

arXiv:2410.07484

WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents

Authors: Siyu Zhou, Tianyi Zhou, Yijun Yang, Guodong Long, Deheng Ye, Jing Jiang, Chengqi Zhang

Abstract: Can large language models (LLMs) directly serve as powerful world models for model-based agents? While the gaps between the prior knowledge of LLMs and the specified environment's dynamics do exist, our study reveals that the gaps can be bridged by aligning an LLM with its deployed environment and such "world alignment" can be efficiently achieved by rule learning on LLMs. Given the rich prior knowledge of LLMs, only a few additional rules suffice to align LLM predictions with the specified environment dynamics. To this end, we propose a neurosymbolic approach to learn these rules gradient-free through LLMs, by inducing, updating, and pruning rules based on comparisons of agent-explored trajectories and world model predictions. The resulting world model is composed of the LLM and the learned rules. Our embodied LLM agent "WALL-E" is built upon model-predictive control (MPC). By optimizing look-ahead actions based on the precise world model, MPC significantly improves exploration and learning efficiency. Compared to existing LLM agents, WALL-E's reasoning only requires a few principal rules rather than verbose buffered trajectories being included in the LLM input. On open-world challenges in Minecraft and ALFWorld, WALL-E achieves higher success rates than existing methods, with lower costs on replanning time and the number of tokens used for reasoning. In Minecraft, WALL-E exceeds baselines by 15-30% in success rate while costing 8-20 fewer replanning rounds and only 60-80% of tokens. In ALFWorld, its success rate surges to a new record high of 95% only after 6 iterations.

Submitted 11 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

Comments: 35 pages, including references and appendix. Code is available at https://github.com/elated-sawyer/WALL-E

Showing 1–50 of 7,745 results for author: Zhang, C