Showing 1–50 of 1,357 results for author: Ma, W

  1. arXiv:2410.16162  [pdf, other]

    cs.CV cs.CL

    Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning

    Authors: Yihong Tang, Ao Qu, Zhaokai Wang, Dingyi Zhuang, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao

    Abstract: Vision language models (VLMs) have demonstrated impressive performance across a wide range of downstream tasks. However, their proficiency in spatial reasoning remains limited, despite its crucial role in tasks involving navigation and interaction with physical environments. Specifically, much of the spatial reasoning in these tasks occurs in two-dimensional (2D) environments, and our evaluation r…

    Submitted 21 October, 2024; originally announced October 2024.

  2. arXiv:2410.16024  [pdf, other]

    cs.AI

    A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models

    Authors: Yue Deng, Weiyu Ma, Yuxin Fan, Yin Zhang, Haifeng Zhang, Jian Zhao

    Abstract: StarCraft Multi-Agent Challenge (SMAC) is one of the most commonly used experimental environments in multi-agent reinforcement learning (MARL), where the specific task is to control a set number of allied units to defeat enemy forces. Traditional MARL algorithms often require interacting with the environment for up to 1 million steps to train a model, and the resulting policies are typically non-i…

    Submitted 21 October, 2024; originally announced October 2024.

  3. arXiv:2410.15676  [pdf, ps, other]

    nlin.SI

    Inverse scattering transform for the defocusing-defocusing coupled Hirota equations with non-parallel boundary conditions at infinity

    Authors: Peng-Fei Han, Wen-Xiu Ma, Yi Zhang

    Abstract: The inverse scattering transform for the defocusing-defocusing coupled Hirota equations is strictly discussed with non-zero boundary conditions at infinity including non-parallel boundary conditions, specifically referring to the asymptotic polarization vectors. To address the non-analyticity encountered in some of the Jost eigenfunctions, the "adjoint" Lax pair is employed. The inverse problem is…

    Submitted 21 October, 2024; originally announced October 2024.

  4. arXiv:2410.15665  [pdf, other]

    cs.AI cs.LG

    Long Term Memory: The Foundation of AI Self-Evolution

    Authors: Xun Jiang, Feng Li, Han Zhao, Jiaying Wang, Jun Shao, Shihao Xu, Shu Zhang, Weiling Chen, Xavier Tang, Yize Chen, Mengyue Wu, Weizhi Ma, Mengdi Wang, Tianqiao Chen

    Abstract: Large language models (LLMs) like GPTs, trained on vast datasets, have demonstrated impressive capabilities in language understanding, reasoning, and planning, achieving human-level performance in various tasks. Most studies focus on enhancing these models by training on ever-larger datasets to build more powerful foundation models. While training stronger models is important, enabling models to e…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: 56 pages, 13 figures

  5. arXiv:2410.13298  [pdf, other]

    cs.CL cs.AI

    Advancing Large Language Model Attribution through Self-Improving

    Authors: Lei Huang, Xiaocheng Feng, Weitao Ma, Liang Zhao, Yuchun Fan, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin

    Abstract: Teaching large language models (LLMs) to generate text with citations to evidence sources can mitigate hallucinations and enhance verifiability in information-seeking systems. However, improving this capability requires high-quality attribution data, which is costly and labor-intensive. Inspired by recent advances in self-improvement that enhance LLMs without manual annotation, we present START, a…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Accepted by EMNLP 2024 Main Conference

  6. arXiv:2410.13263  [pdf, other]

    cs.AI cs.LG

    A Simplifying and Learnable Graph Convolutional Attention Network for Unsupervised Knowledge Graphs Alignment

    Authors: Weishan Cai, Wenjun Ma, Yuncheng Jiang

    Abstract: The success of current Entity Alignment (EA) task depends largely on the supervision information provided by labeled data. Considering the cost of labeled data, most supervised methods are difficult to apply in practical scenarios. Therefore, more and more works based on contrastive learning, active learning or other deep learning techniques have been developed, to solve the performance bottleneck…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 14 pages, 3 figures

  7. arXiv:2410.11225  [pdf, other]

    math.ST stat.ML

    Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-Computational Gaps

    Authors: Wanteng Ma, Dong Xia

    Abstract: This paper presents a simple yet efficient method for statistical inference of tensor linear forms with incomplete and noisy observations. Under the Tucker low-rank tensor model, we utilize an appropriate initial estimate, along with a debiasing technique followed by a one-step power iteration, to construct an asymptotic normal test statistic. This method is suitable for various statistical infere…

    Submitted 14 October, 2024; originally announced October 2024.

  8. arXiv:2410.10516  [pdf, other]

    cs.LG cs.AI q-bio.BM

    UniGEM: A Unified Approach to Generation and Property Prediction for Molecules

    Authors: Shikun Feng, Yuyan Ni, Yan Lu, Zhi-Ming Ma, Wei-Ying Ma, Yanyan Lan

    Abstract: Molecular generation and molecular property prediction are both crucial for drug discovery, but they are often developed independently. Inspired by recent studies, which demonstrate that diffusion model, a prominent generative approach, can learn meaningful data representations that enhance predictive tasks, we explore the potential for developing a unified generative model in the molecular domain…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 11 pages, 5 figures

  9. arXiv:2410.10212  [pdf, other]

    cs.AI cs.LG

    Large Language Model-Enhanced Reinforcement Learning for Generic Bus Holding Control Strategies

    Authors: Jiajie Yu, Yuhong Wang, Wei Ma

    Abstract: Bus holding control is a widely-adopted strategy for maintaining stability and improving the operational efficiency of bus systems. Traditional model-based methods often face challenges with the low accuracy of bus state prediction and passenger demand estimation. In contrast, Reinforcement Learning (RL), as a data-driven approach, has demonstrated great potential in formulating bus holding strate…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 41 pages, 15 figures

  10. arXiv:2410.10056  [pdf, other]

    cs.LG cs.AI stat.ML

    The Epochal Sawtooth Effect: Unveiling Training Loss Oscillations in Adam and Other Optimizers

    Authors: Qi Liu, Wanjing Ma

    Abstract: In this paper, we identify and analyze a recurring training loss pattern, which we term the Epochal Sawtooth Effect (ESE), commonly observed during training with adaptive gradient-based optimizers, particularly the Adam optimizer. This pattern is characterized by a sharp drop in loss at the beginning of each epoch, followed by a gradual increase, resulting in a sawtooth-shaped loss curve. Thr…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: 15 pages, 21 figures

  11. arXiv:2410.09543  [pdf, other]

    cs.CE cs.AI q-bio.BM

    Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

    Authors: Xiaoran Jiao, Weian Mao, Wengong Jin, Peiyuan Yang, Hao Chen, Chunhua Shen

    Abstract: Predicting the change in binding free energy ($ΔΔG$) is crucial for understanding and modulating protein-protein interactions, which are critical in drug design. Due to the scarcity of experimental $ΔΔG$ data, existing methods focus on pre-training, while neglecting the importance of alignment. In this work, we propose the Boltzmann Alignment technique to transfer knowledge from pre-trained invers…

    Submitted 12 October, 2024; originally announced October 2024.

  12. arXiv:2410.08901  [pdf, other]

    cs.RO

    SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation

    Authors: Haosheng Li, Weixin Mao, Weipeng Deng, Chenyu Meng, Rui Zhang, Fan Jia, Tiancai Wang, Haoqiang Fan, Hongan Wang, Xiaoming Deng

    Abstract: Task-oriented grasping, which involves grasping specific parts of objects based on their functions, is crucial for developing advanced robotic systems capable of performing complex tasks in dynamic environments. In this paper, we propose a training-free framework that incorporates both semantic and geometric priors for zero-shot task-oriented grasp generation. The proposed framework, SegGrasp, fir…

    Submitted 14 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: 7 pages, 6 figures

  13. arXiv:2410.08421  [pdf, other]

    cs.LG

    Generalizable autoregressive modeling of time series through functional narratives

    Authors: Ran Liu, Wenrui Ma, Ellen Zippi, Hadi Pouransari, Jingyun Xiao, Chris Sandino, Behrooz Mahasseni, Juri Minxha, Erdrin Azemi, Eva L. Dyer, Ali Moin

    Abstract: Time series data are inherently functions of time, yet current transformers often learn time series by modeling them as mere concatenations of time periods, overlooking their functional properties. In this work, we propose a novel objective for transformers that learn time series by re-interpreting them as temporal functions. We build an alternative sequence of time series by constructing degradat…

    Submitted 10 October, 2024; originally announced October 2024.

  14. arXiv:2410.07745  [pdf, other]

    cs.CL

    StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs

    Authors: Yuanqing Yu, Zhefan Wang, Weizhi Ma, Zhicheng Guo, Jingtao Zhan, Shuai Wang, Chuhan Wu, Zhiqiang Guo, Min Zhang

    Abstract: Despite having powerful reasoning and inference capabilities, Large Language Models (LLMs) still need external tools to acquire real-time information retrieval or domain-specific expertise to solve complex tasks, which is referred to as tool learning. Existing tool learning methods primarily rely on tuning with expert trajectories, focusing on token-sequence learning from a linguistic perspective.…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Ongoing Work

  15. arXiv:2410.05180  [pdf, other]

    cs.CL

    Mitigating the Risk of Health Inequity Exacerbated by Large Language Models

    Authors: Yuelyu Ji, Wenhe Ma, Sonish Sivarajkumar, Hang Zhang, Eugene Mathew Sadhu, Zhuochun Li, Xizhi Wu, Shyam Visweswaran, Yanshan Wang

    Abstract: Recent advancements in large language models have demonstrated their potential in numerous medical applications, particularly in automating clinical trial matching for translational research and enhancing medical question answering for clinical decision support. However, our study shows that incorporating non-decisive sociodemographic factors such as race, sex, income level, LGBT+ status, homeless…

    Submitted 14 October, 2024; v1 submitted 7 October, 2024; originally announced October 2024.

  16. arXiv:2410.04555  [pdf, other]

    cs.LG cs.CY

    $\texttt{dattri}$: A Library for Efficient Data Attribution

    Authors: Junwei Deng, Ting-Wei Li, Shiyuan Zhang, Shixuan Liu, Yijun Pan, Hao Huang, Xinhe Wang, Pingbang Hu, Xingjian Zhang, Jiaqi W. Ma

    Abstract: Data attribution methods aim to quantify the influence of individual training samples on the prediction of artificial intelligence (AI) models. As training data plays an increasingly crucial role in the modern development of large-scale AI models, data attribution has found broad applications in improving AI performance and safety. However, despite a surge of new data attribution methods being dev…

    Submitted 6 October, 2024; originally announced October 2024.

  17. arXiv:2410.04087  [pdf, other]

    cs.CL cs.AI

    GlobeSumm: A Challenging Benchmark Towards Unifying Multi-lingual, Cross-lingual and Multi-document News Summarization

    Authors: Yangfan Ye, Xiachong Feng, Xiaocheng Feng, Weitao Ma, Libo Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin

    Abstract: News summarization in today's global scene can be daunting with its flood of multilingual content and varied viewpoints from different sources. However, current studies often neglect such real-world scenarios as they tend to focus solely on either single-language or single-document tasks. To bridge this gap, we aim to unify Multi-lingual, Cross-lingual and Multi-document Summarization into a novel…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024 main conference, long paper

  18. arXiv:2410.03328  [pdf]

    physics.bio-ph q-bio.BM

    Double-Strand Break Clustering: An Economical and Effective Strategy for DNA Repair

    Authors: Junyi Chen, Wenzong Ma, Yuqi Ma, Gen Yang

    Abstract: In mammalian cells, repair centers for DNA double-strand breaks (DSBs) have been identified. However, previous research predominantly relies on methods that induce specific DSBs by cutting particular DNA sequences. The clustering of non-specific DSBs and its spatiotemporal properties, especially for those induced by environmental stresses such as irradiation, remain unclear. In this study, we use…

    Submitted 4 October, 2024; originally announced October 2024.

  19. arXiv:2409.20031  [pdf, other]

    cs.SD eess.AS

    Adaptive high-precision sound source localization at low frequencies based on convolutional neural network

    Authors: Wenbo Ma, Yan Lu, Yijun Liu

    Abstract: Sound source localization (SSL) technology plays a crucial role in various application areas such as fault diagnosis, speech separation, and vibration noise reduction. Although beamforming algorithms are widely used in SSL, their resolution at low frequencies is limited. In recent years, deep learning-based SSL methods have significantly improved their accuracy by employing large microphone arrays…

    Submitted 30 September, 2024; originally announced September 2024.

  20. arXiv:2409.19633  [pdf, other]

    hep-ex

    Search for proton decay via $p\rightarrow{e^+η}$ and $p\rightarrow{μ^+η}$ with a 0.37 Mton-year exposure of Super-Kamiokande

    Authors: Super-Kamiokande Collaboration: N. Taniuchi, K. Abe, S. Abe, Y. Asaoka, C. Bronner, M. Harada, Y. Hayato, K. Hiraide, K. Hosokawa, K. Ieki, M. Ikeda, J. Kameda, Y. Kanemura, R. Kaneshima, Y. Kashiwagi, Y. Kataoka, S. Miki, S. Mine, M. Miura, S. Moriyama, M. Nakahata, S. Nakayama, Y. Noguchi, et al. (267 additional authors not shown)

    Abstract: A search for proton decay into $e^+/μ^+$ and an $η$ meson has been performed using data from a 0.373 Mton$\cdot$year exposure (6050.3 live days) of Super-Kamiokande. Compared to previous searches this work introduces an improved model of the intranuclear $η$ interaction cross section, resulting in a factor of two reduction in uncertainties from this source and $\sim$10\% increase in signal efficien…

    Submitted 29 September, 2024; originally announced September 2024.

  21. arXiv:2409.19590  [pdf, other]

    cs.RO

    RoboNurse-VLA: Robotic Scrub Nurse System based on Vision-Language-Action Model

    Authors: Shunlei Li, Jin Wang, Rui Dai, Wanyu Ma, Wing Yin Ng, Yingbai Hu, Zheng Li

    Abstract: In modern healthcare, the demand for autonomous robotic assistants has grown significantly, particularly in the operating room, where surgical tasks require precision and reliability. Robotic scrub nurses have emerged as a promising solution to improve efficiency and reduce human error during surgery. However, challenges remain in terms of accurately grasping and handing over surgical instruments,…

    Submitted 29 September, 2024; originally announced September 2024.

  22. arXiv:2409.19316  [pdf, ps, other]

    cs.IT eess.SP

    Movable Antenna Enabled Near-Field Communications: Channel Modeling and Performance Optimization

    Authors: Lipeng Zhu, Wenyan Ma, Zhenyu Xiao, Rui Zhang

    Abstract: Movable antenna (MA) technology offers promising potential to enhance wireless communication by allowing flexible antenna movement. To maximize spatial degrees of freedom (DoFs), larger movable regions are required, which may render the conventional far-field assumption for channels between transceivers invalid. In light of it, we investigate in this paper MA-enabled near-field communications, whe…

    Submitted 28 September, 2024; originally announced September 2024.

  23. arXiv:2409.19260  [pdf, other]

    nucl-th nucl-ex

    Exploring the Diversity of Nuclear Density through Information Entropy

    Authors: Wei-Hu Ma, Yu-Gang Ma

    Abstract: This study explores the role of information entropy in understanding nuclear density distributions, including both stable configurations and non-traditional structures such as neutron halos and $α$-clustering. By quantifying the uncertainty and disorder inherent in nucleon distributions in nuclear many-body systems, information entropy provides a macroscopic measure of the physical properties of t…

    Submitted 28 September, 2024; originally announced September 2024.

    Comments: 10 pages, 4 figures

    Journal ref: Entropy 26, 763 (2024)

  24. arXiv:2409.19254  [pdf, other]

    gr-qc hep-ph

    A Novel Proposal for Exploring Spacetime Microstructure with Scaling

    Authors: Weihu Ma, Yu-Gang Ma

    Abstract: The study of physics at the Planck scale has garnered significant attention due to its implications for understanding the fundamental nature of the universe. At this scale, quantum fluctuations in spacetime become apparent, as suggested by the Heisenberg uncertainty principle. These fluctuations indicate that spacetime is not a smooth manifold but rather has a more complex structure that might be…

    Submitted 28 September, 2024; originally announced September 2024.

    Comments: 14 pages, 1 figure

  25. arXiv:2409.18153  [pdf, other]

    cs.LG stat.ML

    Most Influential Subset Selection: Challenges, Promises, and Beyond

    Authors: Yuzheng Hu, Pingbang Hu, Han Zhao, Jiaqi W. Ma

    Abstract: How can we attribute the behaviors of machine learning models to their training data? While the classic influence function sheds light on the impact of individual samples, it often fails to capture the more complex and pronounced collective influence of a set of samples. To tackle this challenge, we study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of trai…

    Submitted 25 September, 2024; originally announced September 2024.

    Comments: Accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

  26. arXiv:2409.17547  [pdf, other]

    cs.CV cs.AI

    Triple Point Masking

    Authors: Jiaming Liu, Linghe Kong, Yue Wu, Maoguo Gong, Hao Li, Qiguang Miao, Wenping Ma, Can Qin

    Abstract: Existing 3D mask learning methods encounter performance bottlenecks under limited data, and our objective is to overcome this limitation. In this paper, we introduce a triple point masking scheme, named TPM, which serves as a scalable framework for pre-training of masked autoencoders to achieve multi-mask learning for 3D point clouds. Specifically, we augment the baselines with two additional mask…

    Submitted 15 October, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

  27. arXiv:2409.15395  [pdf, other]

    cs.CL cs.AI

    Parse Trees Guided LLM Prompt Compression

    Authors: Wenhao Mao, Chengbin Hou, Tianyu Zhang, Xinyu Lin, Ke Tang, Hairong Lv

    Abstract: Offering rich contexts to Large Language Models (LLMs) has been shown to boost performance in various tasks, but the resulting longer prompt would increase the computational cost and might exceed the input limit of LLMs. Recently, some prompt compression methods have been suggested to shorten the length of prompts by using language models to generate shorter prompts or by developing computational m…

    Submitted 23 September, 2024; originally announced September 2024.

  28. arXiv:2409.14177  [pdf, other]

    cs.CR cs.AI

    PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach

    Authors: Zhihao Lin, Wei Ma, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Yang Liu, Jun Wang, Li Li

    Abstract: In recent years, Large Language Models (LLMs) have gained widespread use, raising concerns about their security. Traditional jailbreak attacks often rely on the model's internal information or have limitations when exploring the unsafe behavior of the victim model, reducing their general applicability. In this paper, we introduce PathSeeker, a novel black-box jailbreak method,…

    Submitted 3 October, 2024; v1 submitted 21 September, 2024; originally announced September 2024.

    Comments: update the abstract and cite a new related work

  29. arXiv:2409.13278  [pdf, other]

    eess.SP

    6D Movable Antenna Enhanced Interference Mitigation for Cellular-Connected UAV Communications

    Authors: Tianshi Ren, Xianchao Zhang, Lipeng Zhu, Wenyan Ma, Xiaozheng Gao, Rui Zhang

    Abstract: Cellular-connected unmanned aerial vehicle (UAV) communications is an enabling technology to transmit control signaling or payload data for UAVs through cellular networks. Due to the line-of-sight (LoS) dominant air-to-ground channels, efficient interference mitigation is crucial to UAV communications, while the conventional fixed-position antenna (FPA) arrays have limited degrees of freedom (DoFs…

    Submitted 20 September, 2024; originally announced September 2024.

  30. arXiv:2409.09367  [pdf, other]

    nucl-th nucl-ex

    Multiple-models prediction for light neutron-rich isotopes cross section by $Q_g$ systematics in $^{40}$Ar projectile fragmentation reactions

    Authors: X. B. Wei, H. L. Wei, C. W. Ma, C. Y. Qiao, Y. F. Guo, J. Pu, K. X. Cheng, Y. T. Wang, Z. X. Wang, T. R. Zhou, D. Peng, S. T. Wang, S. W. Tang, Y. H. Yu, X. H. Zhang, Y. Z. Sun, S. Y. Jin, G. L. Zhang, X. Jiang, Z. Y. Li, Y. F. Xu, F. H. Lu, T. Q. Liu

    Abstract: Precise predictions for nuclei near drip lines are crucial for experiments in the new generation of rare isotope facilities. A multi-model investigation of the $Q_g$ systematics for fragment production cross sections, with $Q_g$ defined as the difference of mass excess (ME) between the projectile ($Z_{p}, A_{p}$) and the fragment ($Z_{f}, A_{f}$) nuclei $Q_{g}=ME(Z_{p}, A_{p})-ME(Z_{f}, A_{f})$, has…

    Submitted 14 September, 2024; originally announced September 2024.

  31. arXiv:2409.08539  [pdf, other]

    astro-ph.GA astro-ph.CO

    NeutralUniverseMachine: How Filaments and Dark Matter Halo Influence the Galaxy Cold Gas Content

    Authors: Wenlin Ma, Hong Guo, Michael G. Jones

    Abstract: Aims. To investigate the influence of distance to filaments and dark matter halos on galaxy cold gas content in the empirical model NeutralUniverseMachine (NUM) and the hydrodynamical simulation IllustrisTNG. Methods. We use DisPerSE to identify cosmic web structures and calculate the distance of galaxies to filaments for both observations and models. We show the results of the HI and H2 mass func…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: Submitted to A&A, 10 pages, 7 figures

  32. arXiv:2409.05770  [pdf, other]

    quant-ph cs.CV cs.DC cs.LG

    Consensus-based Distributed Quantum Kernel Learning for Speech Recognition

    Authors: Kuan-Cheng Chen, Wenxuan Ma, Xiaotian Xu

    Abstract: This paper presents a Consensus-based Distributed Quantum Kernel Learning (CDQKL) framework aimed at improving speech recognition through distributed quantum computing. CDQKL addresses the challenges of scalability and data privacy in centralized quantum kernel learning. It does this by distributing computational tasks across quantum terminals, which are connected through classical channels. This a…

    Submitted 9 September, 2024; originally announced September 2024.

  33. arXiv:2409.05657  [pdf, other]

    cs.LG

    Adversarial Attacks on Data Attribution

    Authors: Xinhe Wang, Pingbang Hu, Junwei Deng, Jiaqi W. Ma

    Abstract: Data attribution aims to quantify the contribution of individual training data points to the outputs of an AI model, which has been used to measure the value of training data and compensate data providers. Given the impact on financial decisions and compensation mechanisms, a critical question arises concerning the adversarial robustness of data attribution methods. However, there has been little…

    Submitted 4 October, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

  34. arXiv:2409.04126  [pdf, other]

    stat.ME

    Incorporating external data for analyzing randomized clinical trials: A transfer learning approach

    Authors: Yujia Gu, Hanzhong Liu, Wei Ma

    Abstract: Randomized clinical trials are the gold standard for analyzing treatment effects, but high costs and ethical concerns can limit recruitment, potentially leading to invalid inferences. Incorporating external trial data with similar characteristics into the analysis using transfer learning appears promising for addressing these issues. In this paper, we present a formal framework for applying transf…

    Submitted 6 September, 2024; originally announced September 2024.

  35. arXiv:2409.03505  [pdf, ps, other]

    stat.ML cs.LG

    Survey of Data-driven Newsvendor: Unified Analysis and Spectrum of Achievable Regrets

    Authors: Zhuoxin Chen, Will Ma

    Abstract: In the Newsvendor problem, the goal is to guess the number that will be drawn from some distribution, with asymmetric consequences for guessing too high vs. too low. In the data-driven version, the distribution is unknown, and one must work with samples from the distribution. Data-driven Newsvendor has been studied under many variants: additive vs. multiplicative regret, high probability vs. expec…

    Submitted 17 September, 2024; v1 submitted 5 September, 2024; originally announced September 2024.

  36. arXiv:2409.03346  [pdf, other]

    cs.CL cs.AI

    Sketch: A Toolkit for Streamlining LLM Operations

    Authors: Xin Jiang, Xiang Li, Wenjia Ma, Xuezhi Fang, Yiqun Yao, Naitong Yu, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang

    Abstract: Large language models (LLMs) represented by the GPT family have achieved remarkable success. The characteristics of LLMs lie in their ability to accommodate a wide range of tasks through a generative approach. However, the flexibility of their output format poses challenges in controlling and harnessing the model's outputs, thereby constraining the application of LLMs in various domains. In this work,…

    Submitted 5 September, 2024; originally announced September 2024.

  37. arXiv:2409.00773  [pdf, other]

    hep-ex

    Searching for MeV-scale Axion-like Particles and Dark Photons with PandaX-4T

    Authors: PandaX Collaboration, Tao Li, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji, et al. (76 additional authors not shown)

    Abstract: Axion-like particles (ALPs) and dark photons (DPs) are viable dark matter particle candidates. We have searched for possible ALP/DP signals in the PandaX-4T liquid xenon detector using 94.8 days of data. A binned likelihood fit is constructed to search for possible mono-energetic peaks induced by the absorption processes between ALPs/DPs and atomic electrons of xenon. A detailed temporal model of…

    Submitted 1 September, 2024; originally announced September 2024.

  38. arXiv:2408.17258  [pdf, other]

    cs.LG

    Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach

    Authors: Tong Nie, Junlin He, Yuewen Mei, Guoyang Qin, Guilong Li, Jian Sun, Wei Ma

    Abstract: The proliferation of e-commerce and urbanization has significantly intensified delivery operations in urban areas, boosting the volume and complexity of delivery demand. Data-driven predictive methods, especially those utilizing machine learning techniques, have emerged to handle these complexities in urban delivery demand management problems. One particularly pressing problem that has not yet bee…

    Submitted 30 August, 2024; originally announced August 2024.

  39. arXiv:2408.15278  [pdf, ps, other]

    math.DG

    Harmonic metrics of $\mathrm{SO}_{0}(n,n)$-Higgs bundles in the Hitchin section on non-compact hyperbolic surfaces

    Authors: Weihan Ma

    Abstract: Let $X$ be a Riemann surface. Using the canonical line bundle $K$ and some holomorphic differentials $\boldsymbol{q}$, Hitchin constructed the $G$-Higgs bundles in the Hitchin section for a split real form $G$ of a complex simple Lie group. We study the ${\mathrm{SO}_0(n,n)}$ case. In our work, we establish the existence of harmonic metrics for these Higgs bundles, which are compatible with the…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 34 pages. arXiv admin note: text overlap with arXiv:2307.03365 by other authors

    MSC Class: 53C07; 58E15; 14D21; 81T13

  40. arXiv:2408.15091  [pdf, other]

    cs.CL

    Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models

    Authors: Xiyu Liu, Zhengxiao Liu, Naibin Gu, Zheng Lin, Wanli Ma, Ji Xiang, Weiping Wang

    Abstract: The storage and recall of factual associations in auto-regressive transformer language models (LMs) have drawn a great deal of attention, inspiring knowledge editing by directly modifying the located model weights. Most editing works achieve knowledge editing under the guidance of existing interpretations of knowledge recall that mainly focus on subject knowledge. However, these interpretations ar…

    Submitted 27 August, 2024; originally announced August 2024.

  41. arXiv:2408.14873  [pdf, other]

    cs.RO math.NA math.OC

    Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation

    Authors: Haozhe Lou, Yurong Liu, Yike Pan, Yiran Geng, Jianteng Chen, Wenlong Ma, Chenglong Li, Lin Wang, Hengzhen Feng, Lu Shi, Liyi Luo, Yongliang Shi

    Abstract: Real2Sim2Real plays a critical role in robotic arm control and reinforcement learning, yet bridging this gap remains a significant challenge due to the complex physical properties of robots and the objects they manipulate. Existing methods lack a comprehensive solution to accurately reconstruct real-world objects with spatial representations and their associated physics attributes. We propose a…

    Submitted 17 September, 2024; v1 submitted 27 August, 2024; originally announced August 2024.

  42. arXiv:2408.14818  [pdf]

    cond-mat.supr-con

    Irrelevance of 1H composition to the superconductivity in the infinite-layer nickelates: judging from the MeV energy scale

    Authors: Jia-Cai Nie, Xing-Yu Chen, Yi Bian, Xue-Yan Wang, Ting-Na Shao, Jing-Xin Gao, Wei Mao, Bing-Hui Ge, Arnold Muller, Jikun Chen

    Abstract: The discovery of superconductivity in the infinite-layer nickelates, as topotactically reduced from their respective perovskite precursors via co-annealing with CaH2, extends the understanding of superconductivity. Nevertheless, whether the incorporated 1H composition is critical to the infinite-layer superconductivity has recently aroused considerable debate, while the central challenge lies in…

    Submitted 27 August, 2024; originally announced August 2024.

  43. arXiv:2408.12859  [pdf]

    physics.flu-dyn

    Programmable Jumping-Droplet Condensation

    Authors: Shan Gao, Jian Qu, Dehui Wang, Zhichun Liu, Weigang Ma

    Abstract: Self-propelled droplet jumping during condensation has attractive prospects for energy harvesting, water collection and thermal management, but its real-life applications are greatly limited by the challenge of enabling sustainable control over the entire droplet lifecycle. Herein, we propose a programmable jumping-droplet condensation that evolves along an artificially designed pathway without ex…

    Submitted 23 August, 2024; originally announced August 2024.

  44. arXiv:2408.12528  [pdf, other]

    cs.CV

    Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

    Authors: Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou

    Abstract: We present a unified transformer, i.e., Show-o, that unifies multimodal understanding and generation. Unlike fully autoregressive models, Show-o unifies autoregressive and (discrete) diffusion modeling to adaptively handle inputs and outputs of various and mixed modalities. The unified model flexibly supports a wide range of vision-language tasks including visual question-answering, text-to-image…

    Submitted 20 October, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: Technical Report

  45. arXiv:2408.12116  [pdf, other]

    cs.AI

    Geolocation Representation from Large Language Models are Generic Enhancers for Spatio-Temporal Learning

    Authors: Junlin He, Tong Nie, Wei Ma

    Abstract: In the geospatial domain, universal representation models are significantly less prevalent than their extensive use in natural language processing and computer vision. This discrepancy arises primarily from the high costs associated with the input of existing representation models, which often require street views and mobility data. To address this, we develop a novel, training-free method that le…

    Submitted 22 August, 2024; originally announced August 2024.

  46. arXiv:2408.12049  [pdf, ps, other]

    cs.IT

    Research on the Construction of Maximum Distance Separable Codes via Arbitrary twisted Generalized Reed-Solomon Codes

    Authors: Chun'e Zhao, Wenping Ma, Tongjiang Yan, Yuhua Sun

    Abstract: Maximum distance separable (MDS) codes have significant combinatorial and cryptographic applications due to their certain optimality. Generalized Reed-Solomon (GRS) codes are the most prominent MDS codes. Twisted generalized Reed-Solomon (TGRS) codes may not necessarily be MDS. It is meaningful to study the conditions under which TGRS codes are MDS. In this paper, we study a general class of TGRS…

    Submitted 21 August, 2024; originally announced August 2024.

  47. arXiv:2408.12034  [pdf]

    q-bio.PE

    Sex chromosome evolution: The classical paradigm and so much beyond

    Authors: Paris Veltsos, Sagar Shinde, Wen-Juan Ma

    Abstract: Sex chromosomes have independently evolved in species with separate sexes in most lineages across the tree of life. However, the well-accepted canonical model of sex chromosome evolution is not universally supported. There is no single trajectory for sex chromosome formation and evolution across the tree of life, suggesting the underlying mechanisms and evolutionary forces are diverse and lineage…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: This has been accepted as a Book chapter for Encyclopedia of Evolutionary Biology book 2nd Edition

  48. arXiv:2408.08533  [pdf, ps, other]

    stat.ML cs.LG

    Unsupervised Transfer Learning via Adversarial Contrastive Training

    Authors: Chenguang Duan, Yuling Jiao, Huazhen Lin, Wensen Ma, Jerry Zhijian Yang

    Abstract: Learning a data representation for downstream supervised learning tasks in an unlabeled scenario is both critical and challenging. In this paper, we propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT). Our experimental results demonstrate outstanding classification accuracy with both fine-tuned linear probe and K-NN protocol across various datasets,…

    Submitted 16 August, 2024; originally announced August 2024.

  49. arXiv:2408.07926  [pdf, other]

    eess.SY

    Enhanced Equivalent Circuit Model for High Current Discharge of Lithium-Ion Batteries with Application to Electric Vertical Takeoff and Landing Aircraft

    Authors: Alireza Goshtasbi, Ruxiu Zhao, Ruiting Wang, Sangwoo Han, Wenting Ma, Jeremy Neubauer

    Abstract: Conventional battery equivalent circuit models (ECMs) have limited capability to predict performance at high discharge rates, where lithium depleted regions may develop and cause a sudden exponential drop in the cell's terminal voltage. Having accurate predictions of performance under such conditions is necessary for electric vertical takeoff and landing (eVTOL) aircraft applications, where high d…

    Submitted 15 August, 2024; originally announced August 2024.

  50. arXiv:2408.07641  [pdf, other]

    hep-ex

    Exploring New Physics with PandaX-4T Low Energy Electronic Recoil Data

    Authors: PandaX Collaboration, Xinning Zeng, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji, et al. (76 additional authors not shown)

    Abstract: New particles beyond the Standard Model of particle physics, such as axions, can be effectively searched through their interactions with electrons. We use the large liquid xenon detector PandaX-4T to search for novel electronic recoil signals induced by solar axions, neutrinos with anomalous magnetic moment, axion-like particles, dark photons, and light fermionic dark matter. A detailed background…

    Submitted 14 August, 2024; originally announced August 2024.