Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework

Authors: Boyuan Zhang, Bo Fang, Fanjiang Ye, Yida Gu, Nathan Tallent, Guangming Tan, Dingwen Tao

Abstract: Full-state quantum circuit simulation requires exponentially increased memory size to store the state vector as the number of qubits scales, presenting significant limitations in classical computing systems. Our paper introduces BMQSim, a novel state vector quantum simulation framework that employs lossy compression to address the memory constraints on graphics processing unit (GPU) machines. BMQSim effectively tackles four major challenges for state-vector simulation with compression: frequent compression/decompression, high memory movement overhead, lack of dedicated error control, and unpredictable memory space requirements. Our work proposes an innovative strategy of circuit partitioning to significantly reduce the frequency of compression occurrences. We introduce a pipeline that seamlessly integrates compression with data movement while concealing its overhead. Additionally, BMQSim incorporates the first GPU-based lossy compression technique with point-wise error control. Furthermore, BMQSim features a two-level memory management system, ensuring efficient and stable execution. Our evaluations demonstrate that BMQSim can simulate the same circuit with over 10 times less memory usage on average, achieving fidelity over 0.99 and maintaining comparable simulation time to other state-of-the-art simulators.
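Note: the exponential memory growth mentioned in this abstract can be illustrated with a back-of-the-envelope sketch. It assumes each amplitude is stored as a double-precision complex number (16 bytes); that storage format and the function name `state_vector_bytes` are assumptions made for illustration and are not taken from the abstract.

```python
def state_vector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to hold all 2**num_qubits amplitudes of a full state vector.

    bytes_per_amplitude=16 assumes double-precision complex values
    (an illustrative assumption, not a detail stated in the abstract).
    """
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    return (2 ** num_qubits) * bytes_per_amplitude


if __name__ == "__main__":
    # Each extra qubit doubles the footprint of the uncompressed state vector.
    for n in (30, 35, 40):
        gib = state_vector_bytes(n) / 2**30
        print(f"{n} qubits: {gib:,.0f} GiB uncompressed")
```

Under this assumption, 30 qubits already need 16 GiB, which is why an average memory reduction of about 10x matters on GPU machines.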

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.13994

Vacancy-induced suppression of CDW order and its impact on magnetic order in kagome antiferromagnet FeGe

Authors: Mason L. Klemm, Saif Siddique, Yuan-Chun Chang, Sijie Xu, Yaofeng Xie, Tanner Legvold, Mehrdad T. Kiani, Feng Ye, Huibo Cao, Yiqing Hao, Wei Tian, Hubertus Luetkens, Masaaki Matsuda, Douglas Natelson, Zurab Guguchia, Chien-Lung Huang, Ming Yi, Judy J. Cha, Pengcheng Dai

Abstract: Two-dimensional (2D) kagome lattice metals are interesting because they display flat electronic bands, Dirac points, Van Hove singularities, and can have interplay between charge density wave (CDW), magnetic order, and superconductivity. In kagome lattice antiferromagnet FeGe, a short-range CDW order was found deep within an antiferromagnetically ordered state, interacting with the magnetic order. Surprisingly, post-growth annealing of FeGe at 560$^{\circ}$C can suppress the CDW order while annealing at 320$^{\circ}$C induces a long-range CDW order, with the ability to cycle between the states repeatedly by annealing. Here we perform transport, neutron scattering, scanning transmission electron microscopy (STEM), and muon spin rotation ($μ$SR) experiments to unveil the microscopic mechanism of the annealing process and its impact on magneto-transport, CDW, and magnetic properties of FeGe. We find that 560$^{\circ}$C annealing creates germanium vacancies uniformly distributed throughout the FeGe kagome lattice, which prevent the formation of Ge-Ge dimers necessary for the CDW order. Upon annealing at 320$^{\circ}$C, the system segregates into stoichiometric FeGe regions with long-range CDW order and regions with stacking faults that act as nucleation sites for the CDW. The presence or absence of CDW order greatly affects the anomalous Hall effect, incommensurate magnetic order, and spin-lattice coupling in FeGe, thus placing FeGe as the only known kagome lattice material with a tunable CDW and magnetic order, potentially useful for sensing and information transmission.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.13782

DPLM-2: A Multimodal Diffusion Protein Language Model

Authors: Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, Quanquan Gu

Abstract: Proteins are essential macromolecules defined by their amino acid sequences, which determine their three-dimensional structures and, consequently, their functions in all living organisms. Therefore, generative protein modeling necessitates a multimodal approach to simultaneously model, understand, and generate both sequences and structures. However, existing methods typically use separate models for each modality, limiting their ability to capture the intricate relationships between sequence and structure. This results in suboptimal performance in tasks that require joint understanding and generation of both modalities. In this paper, we introduce DPLM-2, a multimodal protein foundation model that extends discrete diffusion protein language model (DPLM) to accommodate both sequences and structures. To enable structural learning with the language model, 3D coordinates are converted to discrete tokens using a lookup-free quantization-based tokenizer. By training on both experimental and high-quality synthetic structures, DPLM-2 learns the joint distribution of sequence and structure, as well as their marginals and conditionals. We also implement an efficient warm-up strategy to exploit the connection between large-scale evolutionary data and structural inductive biases from pre-trained sequence-based protein language models. Empirical evaluation shows that DPLM-2 can simultaneously generate highly compatible amino acid sequences and their corresponding 3D structures, eliminating the need for a two-stage generation approach. Moreover, DPLM-2 demonstrates competitive performance in various conditional generation tasks, including folding, inverse folding, and scaffolding with multimodal motif inputs, as well as providing structure-aware representations for predictive tasks.

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.12457

Sharpness-Aware Black-Box Optimization

Authors: Feiyang Ye, Yueming Lyu, Xuehao Wang, Masashi Sugiyama, Yu Zhang, Ivor Tsang

Abstract: Black-box optimization algorithms have been widely used in various machine learning problems, including reinforcement learning and prompt fine-tuning. However, directly optimizing the training loss value, as commonly done in existing black-box optimization methods, could lead to suboptimal model quality and generalization performance. To address those problems in black-box optimization, we propose a novel Sharpness-Aware Black-box Optimization (SABO) algorithm, which applies a sharpness-aware minimization strategy to improve the model generalization. Specifically, the proposed SABO method first reparameterizes the objective function by its expectation over a Gaussian distribution. Then it iteratively updates the parameterized distribution by approximated stochastic gradients of the maximum objective value within a small neighborhood around the current solution in the Gaussian distribution space. Theoretically, we prove the convergence rate and generalization bound of the proposed SABO algorithm. Empirically, extensive experiments on the black-box prompt fine-tuning tasks demonstrate the effectiveness of the proposed SABO method in improving model generalization performance.
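Note: the two ingredients named in this abstract (a Gaussian-smoothed objective and a gradient taken at a perturbed, "worst-case" nearby point) can be sketched in a heavily simplified form. The sketch below is not the SABO update rule: it keeps an isotropic Gaussian with fixed `sigma`, perturbs and updates only the mean, and uses an antithetic Monte Carlo gradient estimate. All names (`smoothed_grad`, `sharpness_aware_step`, `sphere`) and all hyperparameter values are hypothetical.

```python
import math
import random


def smoothed_grad(f, mu, sigma, num_pairs, rng):
    """Antithetic Monte Carlo estimate of the gradient, with respect to mu,
    of E[f(mu + sigma * eps)] for eps ~ N(0, I). Only function values are used."""
    dim = len(mu)
    grad = [0.0] * dim
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([m + sigma * e for m, e in zip(mu, eps)])
        minus = f([m - sigma * e for m, e in zip(mu, eps)])
        scale = (plus - minus) / (2.0 * sigma * num_pairs)
        for i in range(dim):
            grad[i] += scale * eps[i]
    return grad


def sharpness_aware_step(f, mu, sigma, rho, lr, num_pairs, rng):
    """One simplified step: move the mean to an approximate worst point within
    radius rho, estimate the smoothed gradient there, then descend from mu."""
    g = smoothed_grad(f, mu, sigma, num_pairs, rng)
    norm = math.sqrt(sum(x * x for x in g)) or 1.0  # avoid division by zero
    worst = [m + rho * x / norm for m, x in zip(mu, g)]
    g_worst = smoothed_grad(f, worst, sigma, num_pairs, rng)
    return [m - lr * x for m, x in zip(mu, g_worst)]


def sphere(x):
    """Toy black-box objective used only for this illustration."""
    return sum(v * v for v in x)


rng = random.Random(0)
mu = [3.0, -2.0]
for _ in range(100):
    mu = sharpness_aware_step(sphere, mu, sigma=0.1, rho=0.05, lr=0.1,
                              num_pairs=50, rng=rng)
print("final objective:", sphere(mu))
```

On this toy quadratic the mean moves from an objective value of 13 to a small neighborhood of the minimum; the paper's method additionally updates the distribution itself, which this sketch leaves out.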

Submitted 16 October, 2024; originally announced October 2024.

Comments: 27 pages, 5 figures

arXiv:2410.04185

doi 10.1103/PhysRevA.110.043513

Spontaneous Symmetry Breaking In Nonlinear Binary Periodic Systems

Authors: Ruihan Peng, Qidong Fu, Yejia Chen, Weidong Luo, Changming Huang, Fangwei Ye

Abstract: Spontaneous symmetry breaking (SSB) occurs when modes of asymmetric profile appear in a symmetric, double-well potential, due to the nonlinearity of the potential exceeding a critical value. In this study, we examine SSB in a periodic potential where the unit cell itself is a symmetric double-well, in both one-dimensional and two-dimensional periodic systems. Using the tight-binding model, we derive the analytical form that predicts the critical power at which SSB occurs for both 1D and 2D systems. The results show that the critical power depends significantly on the quasi-momentum of the Bloch mode, and as the modulus of momentum increases, the SSB threshold decreases rapidly, potentially dropping to zero. These analytical findings are supported by numerical nonlinear eigenmode analysis and direct propagation simulations of Bloch modes.

Submitted 5 October, 2024; originally announced October 2024.

Journal ref: Phys. Rev. A 110, 043513 (2024)

arXiv:2410.03090

UNComp: Uncertainty-Aware Long-Context Compressor for Efficient Large Language Model Inference

Authors: Jing Xiong, Jianghan Shen, Fanghua Ye, Chaofan Tao, Zhongwei Wan, Jianqiao Lu, Xun Wu, Chuanyang Zheng, Zhijiang Guo, Lingpeng Kong, Ngai Wong

Abstract: Deploying large language models (LLMs) is challenging due to their high memory and computational demands, especially during long-context inference. While key-value (KV) caching accelerates inference by reusing previously computed keys and values, it also introduces significant memory overhead. Existing KV cache compression methods such as eviction and merging typically compress the KV cache after it is generated and overlook the eviction of hidden states, failing to improve the speed of the prefilling stage. Additionally, applying a uniform compression rate across different attention heads can harm crucial retrieval heads in needle-in-a-haystack tasks due to excessive compression. In this paper, we propose UNComp, an uncertainty-aware compression scheme that leverages matrix entropy to estimate model uncertainty across layers and heads at the token sequence level. By grouping layers and heads based on their uncertainty, UNComp adaptively compresses both the hidden states and the KV cache. Our method achieves a 1.6x speedup in the prefilling stage and reduces the KV cache to 4.74% of its original size, resulting in a 6.4x increase in throughput and a 1.4x speedup in inference with only a 1.41% performance loss. Remarkably, in needle-in-a-haystack tasks, UNComp outperforms the full-size KV cache even when compressed to 9.38% of its original size. Our approach offers an efficient, training-free Grouped-Query Attention paradigm that can be seamlessly integrated into existing KV cache schemes.

Submitted 3 October, 2024; originally announced October 2024.

arXiv:2410.02719

UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation

Authors: Zixuan Li, Jing Xiong, Fanghua Ye, Chuanyang Zheng, Xun Wu, Jianqiao Lu, Zhongwei Wan, Xiaodan Liang, Chengming Li, Zhenan Sun, Lingpeng Kong, Ngai Wong

Abstract: We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG) that utilizes Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks. This span uncertainty enhances model calibration, improving robustness and mitigating semantic inconsistencies introduced by random chunking. Leveraging this insight, we propose an efficient unsupervised learning technique to train the retrieval model, alongside an effective data sampling and scaling strategy. UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results while using only 4% of the training data compared to other advanced open-source retrieval models under distribution shift settings. Our method demonstrates strong calibration through span uncertainty, leading to improved generalization and robustness in long-context RAG tasks. Additionally, UncertaintyRAG provides a lightweight retrieval model that can be integrated into any large language model with varying context window lengths, without the need for fine-tuning, showcasing the flexibility of our approach.

Submitted 3 October, 2024; originally announced October 2024.

arXiv:2409.15119

Log-normal Mutations and their Use in Detecting Surreptitious Fake Images

Authors: Ismail Labiad, Thomas Bäck, Pierre Fernandez, Laurent Najman, Tom Sander, Furong Ye, Mariia Zameshina, Olivier Teytaud

Abstract: In many cases, adversarial attacks are based on specialized algorithms specifically dedicated to attacking automatic image classifiers. These algorithms perform well, thanks to an excellent ad hoc distribution of initial attacks. However, these attacks are easily detected due to their specific initial distribution. We therefore consider other black-box attacks, inspired by generic black-box optimization tools, and in particular the log-normal algorithm. We apply the log-normal method to the attack of fake detectors, and get successful attacks: importantly, these attacks are not detected by detectors specialized on classical adversarial attacks. Then, combining these attacks and deep detection, we create improved fake detectors.

Submitted 25 September, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

Comments: log-normal mutations and their use in detecting surreptitious fake images

arXiv:2409.08223

doi 10.1016/j.mtphys.2024.101546

Structural and electronic transformations in TiO2 induced by electric current

Authors: Tyler C. Sterling, Feng Ye, Seohyeon Jo, Anish Parulekar, Yu Zhang, Gang Cao, Rishi Raj, Dmitry Reznik

Abstract: In-situ diffuse neutron scattering experiments revealed that when electric current is passed through single crystals of rutile TiO2 under conditions conducive to flash sintering, it induces the formation of parallel planes of oxygen vacancies. Specifically, a current perpendicular to the c-axis generates planes normal to the (132) reciprocal lattice vector, whereas currents aligned with the c-axis form planes normal to the (132) and to the (225) vector. The concentration of defects increases with increasing current. The structural modifications are linked to the appearance of signatures of interacting Ti3+ moments in magnetic susceptibility, signifying a structural collapse around the vacancy planes. Electrical conductivity measurements of the modified material reveal several electronic transitions between semiconducting states (via a metal-like intermediate state) with the smallest gap being 27 meV. Pristine TiO2 can be restored by heating followed by slow cooling in air. Our work suggests a novel paradigm for achieving switching of electrical conductivity related to the flash phenomenon.

Submitted 21 October, 2024; v1 submitted 12 September, 2024; originally announced September 2024.

arXiv:2409.08070

All-optical Fourier neural network using partially coherent light

Authors: Jianwei Qin, Yanbing Liu, Yan Liu, Xun Liu, Wei Li, Fangwei Ye

Abstract: Optical neural networks present distinct advantages over traditional electrical counterparts, such as accelerated data processing and reduced energy consumption. While coherent light is conventionally employed in optical neural networks, our study proposes harnessing spatially incoherent light in all-optical Fourier neural networks. Contrary to numerical predictions of declining target recognition accuracy with increased incoherence, our experimental results demonstrate a surprising outcome: improved accuracy with incoherent light. We attribute this unexpected enhancement to spatially incoherent light's ability to alleviate experimental errors like diffraction rings, laser speckle, and edge effects. Our controlled experiments introduced spatial incoherence by passing monochromatic light through a spatial light modulator featuring a dynamically changing random phase array. These findings underscore partially coherent light's potential to optimize optical neural networks, delivering dependable and efficient solutions for applications demanding consistent accuracy and robustness across diverse conditions.

Submitted 20 September, 2024; v1 submitted 12 September, 2024; originally announced September 2024.

Comments: 19 pages, 5 figures

arXiv:2409.06744

ProteinBench: A Holistic Evaluation of Protein Foundation Models

Authors: Fei Ye, Zaixiang Zheng, Dongyu Xue, Yuning Shen, Lihao Wang, Yiming Ma, Yan Wang, Xinyou Wang, Xiangxin Zhou, Quanquan Gu

Abstract: Recent years have witnessed a surge in the development of protein foundation models, significantly improving performance in protein prediction and generative tasks ranging from 3D structure prediction and protein design to conformational dynamics. However, the capabilities and limitations associated with these models remain poorly understood due to the absence of a unified evaluation framework. To fill this gap, we introduce ProteinBench, a holistic evaluation framework designed to enhance the transparency of protein foundation models. Our approach consists of three key components: (i) A taxonomic classification of tasks that broadly encompass the main challenges in the protein domain, based on the relationships between different protein modalities; (ii) A multi-metric evaluation approach that assesses performance across four key dimensions: quality, novelty, diversity, and robustness; and (iii) In-depth analyses from various user objectives, providing a holistic view of model performance. Our comprehensive evaluation of protein foundation models reveals several key findings that shed light on their current capabilities and limitations. To promote transparency and facilitate further research, we publicly release the evaluation dataset, code, and a leaderboard for further analysis, along with a general modular toolkit. We intend for ProteinBench to be a living benchmark for establishing a standardized, in-depth evaluation framework for protein foundation models, driving their development and application while fostering collaboration within the field.

Submitted 7 October, 2024; v1 submitted 10 September, 2024; originally announced September 2024.

Comments: 30 pages, 2 figures and 15 tables

arXiv:2409.05324

FIF-UNet: An Efficient UNet Using Feature Interaction and Fusion for Medical Image Segmentation

Authors: Xiaolin Gou, Chuanlin Liao, Jizhe Zhou, Fengshuo Ye, Yi Lin

Abstract: Nowadays, pre-trained encoders are widely used in medical image segmentation because of their ability to capture complex feature representations. However, the existing models fail to effectively utilize the rich features obtained by the pre-trained encoder, resulting in suboptimal segmentation results. In this work, a novel U-shaped model called FIF-UNet, comprising three plug-and-play modules, is proposed to address this issue. A channel spatial interaction module (CSI) is proposed to obtain informative features by establishing the interaction between encoder stages and corresponding decoder stages. A cascaded conv-SE module (CoSE) is designed to enhance the representation of critical features by adaptively assigning importance weights on different feature channels. A multi-level fusion module (MLF) is proposed to fuse the multi-scale features from the decoder stages, ensuring accurate and robust final segmentation. Comprehensive experiments on the Synapse and ACDC datasets demonstrate that the proposed FIF-UNet outperforms existing state-of-the-art methods, achieving the highest average DICE of 86.05% and 92.58%, respectively.
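Note: the DICE figures quoted in this abstract refer to the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), computed between a predicted mask A and a reference mask B. The sketch below shows the standard definition on flat binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the abstract does not describe how the paper's evaluation code handles per-class averaging or empty masks.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    # Two overlapping foreground pixels, three foreground pixels in each mask.
    print(f"Dice = {dice_score(predicted, reference):.4f}")
```

A reported average such as 86.05% would correspond to this quantity, expressed as a percentage, averaged over the dataset's classes or cases.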

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2409.00578

Exact moments for a run and tumble particle in a harmonic trap with a finite tumble time

Authors: Aoran Sun, Fangfu Ye, Rudolf Podgornik

Abstract: We study the problem of a run and tumble particle in a harmonic trap, with a finite run and tumble time, by a direct integration of the equation of motion. An exact 1D steady state distribution, diagram laws and a programmable Volterra difference equation are derived to calculate any order of moments in any other dimension, both for steady state as well as the Laplace transform in time for the intermediate states. We also use the moments to infer the distribution by considering a Gaussian quadrature for the corresponding measure, and from the scaling law of high order moments.

Submitted 31 August, 2024; originally announced September 2024.

Comments: 12 pages, 5 figures

arXiv:2407.17011

Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism

Authors: Anhao Zhao, Fanghua Ye, Jinlan Fu, Xiaoyu Shen

Abstract: Large language models (LLMs) exhibit remarkable in-context learning (ICL) capabilities. However, the underlying working mechanism of ICL remains poorly understood. Recent research presents two conflicting views on ICL: One emphasizes the impact of similar examples in the demonstrations, stressing the need for label correctness and more shots. The other attributes it to LLMs' inherent ability of task recognition, deeming label correctness and shot numbers of demonstrations as not crucial. In this work, we provide a Two-Dimensional Coordinate System that unifies both views into a systematic framework. The framework explains the behavior of ICL through two orthogonal variables: whether similar examples are presented in the demonstrations (perception) and whether LLMs can recognize the task (cognition). We propose the peak inverse rank metric to detect the task recognition ability of LLMs and study LLMs' reactions to different definitions of similarity. Based on these, we conduct extensive experiments to elucidate how ICL functions across each quadrant on multiple representative classification tasks. Finally, we extend our analyses to generation tasks, showing that our coordinate system can also be used to interpret ICL for generation tasks effectively.

Submitted 9 October, 2024; v1 submitted 24 July, 2024; originally announced July 2024.

arXiv:2407.16531

1-Form Symmetric Projected Entangled-Pair States

Authors: Yi Tan, Ji-Yao Chen, Didier Poilblanc, Fei Ye, Jia-Wei Mei

Abstract: The 1-form symmetry, manifesting as loop-like symmetries, has gained prominence in the study of quantum phases, deepening our understanding of symmetry. However, the role of 1-form symmetries in Projected Entangled-Pair States (PEPS), two-dimensional tensor network states, remains largely underexplored. We present a novel framework for understanding 1-form symmetries within tensor networks, specifically focusing on the derivation of algebraic relations for symmetry matrices on the PEPS virtual legs. Our results reveal that 1-form symmetries impose stringent constraints on tensor network representations, leading to distinct anomalous braiding phases carried by symmetry matrices. We demonstrate how these symmetries influence the ground state and tangent space in PEPS, providing new insights into their physical implications for enhancing ground state optimization efficiency and characterizing the 1-form symmetry structure in excited states.

Submitted 1 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

Comments: 6 pages, 1 figure. In memory of T. M. Rice for his invaluable teaching on the physics of strongly correlated systems

arXiv:2407.15399

Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models

Authors: Xiao Liu, Liangzhi Li, Tong Xiang, Fuying Ye, Lu Wei, Wangyue Li, Noa Garcia

Abstract: With the development of large language models (LLMs) like ChatGPT, both their vast applications and potential vulnerabilities have come to the forefront. While developers have integrated multiple safety mechanisms to mitigate their misuse, a risk remains, particularly when models encounter adversarial inputs. This study unveils an attack mechanism that capitalizes on human conversation strategies to extract harmful information from LLMs. We delineate three pivotal strategies: (i) decomposing malicious questions into seemingly innocent sub-questions; (ii) rewriting overtly malicious questions into more covert, benign-sounding ones; (iii) enhancing the harmfulness of responses by prompting models for illustrative examples. Unlike conventional methods that target explicit malicious responses, our approach delves deeper into the nature of the information provided in responses. Through our experiments conducted on GPT-3.5-turbo, GPT-4, and Llama2, our method has demonstrated a marked efficacy compared to conventional attack methods. In summary, this work introduces a novel attack method that outperforms previous approaches, raising an important question: How to discern whether the ultimate intent in a dialogue is malicious?

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.13246

STS MICCAI 2023 Challenge: Grand challenge on 2D and 3D semi-supervised tooth segmentation

Authors: Yaqi Wang, Yifan Zhang, Xiaodiao Chen, Shuai Wang, Dahong Qian, Fan Ye, Feng Xu, Hongyuan Zhang, Qianni Zhang, Chengyu Wu, Yunxiang Li, Weiwei Cui, Shan Luo, Chengkai Wang, Tianhao Li, Yi Liu, Xiang Feng, Huiyu Zhou, Dongyun Liu, Qixuan Wang, Zhouhao Lin, Wei Song, Yuanlin Li, Bing Wang, Chunshi Wang, et al. (2 additional authors not shown)

Abstract: Computer-aided design (CAD) tools are increasingly popular in modern dental practice, particularly for treatment planning or comprehensive prognosis evaluation. In particular, the 2D panoramic X-ray image efficiently detects invisible caries, impacted teeth and supernumerary teeth in children, while the 3D dental cone beam computed tomography (CBCT) is widely used in orthodontics and endodontics due to its low radiation dose. However, there is no open-access 2D public dataset for children's teeth and no open 3D dental CBCT dataset, which limits the development of automatic algorithms for segmenting teeth and analyzing diseases. The Semi-supervised Teeth Segmentation (STS) Challenge, a pioneering event in tooth segmentation, was held as a part of the MICCAI 2023 ToothFairy Workshop on the Alibaba Tianchi platform. This challenge aims to investigate effective semi-supervised tooth segmentation algorithms to advance the field of dentistry. In this challenge, we provide two modalities including the 2D panoramic X-ray images and the 3D CBCT tooth volumes. In Task 1, the goal was to segment tooth regions in panoramic X-ray images of both adult and pediatric teeth. Task 2 involved segmenting tooth sections using CBCT volumes. A limited number of labelled images, together with mostly unlabelled ones, were provided in this challenge, prompting the use of semi-supervised algorithms for training. In the preliminary round, the challenge received registration and result submission by 434 teams, with 64 advancing to the final round. This paper summarizes the diverse methods employed by the top-ranking teams in the STS MICCAI 2023 Challenge.

Submitted 18 July, 2024; originally announced July 2024.

arXiv:2407.07640 [pdf, other]

Single Crystal Diffuse Neutron Scattering Study of the Dipole-Octupole Quantum Spin Ice Candidate Ce$_2$Zr$_2$O$_7$: No Apparent Octupolar Correlations Above $T = 0.05$ K

Authors: E. M. Smith, R. Schäfer, J. Dudemaine, B. Placke, B. Yuan, Z. Morgan, F. Ye, R. Moessner, O. Benton, A. D. Bianchi, B. D. Gaulin

Abstract: The insulating magnetic pyrochlore Ce$_2$Zr$_2$O$_7$ has attracted much attention as a quantum spin ice candidate with dipole-octupole character that permits spin ice phases based not only on magnetic dipole moments but also allows for even-more-exotic octupole-based spin ice phases. This work reports low-temperature neutron diffraction measurements on single crystal Ce$_2$Zr$_2$O$_7$ with $Q$-coverage both at low $Q$ where the magnetic form factor for dipoles is near maximal and at high $Q$ covering the region where the magnetic form factor for Ce$^{3+}$ octupoles is near maximal. This study was motivated by recent powder neutron diffraction studies of other Ce-based dipole-octupole pyrochlores, Ce$_2$Sn$_2$O$_7$ and Ce$_2$Hf$_2$O$_7$, which each showed temperature-dependent diffuse diffraction at high $Q$ that was interpreted as arising from octupolar correlations. Our measurements use an optimized single crystal diffuse scattering instrument that allows us to screen against strong single crystal Bragg scattering in Ce$_2$Zr$_2$O$_7$. The temperature-difference neutron diffraction reveals a low-$Q$ peak consistent with dipolar spin ice correlations. For larger $Q$, the temperature-difference neutron diffraction shows an alternation between positive and negative net intensity. These features are qualitatively consistent with the corresponding numerical-linked-cluster (NLC) calculations using pseudospin interaction parameters reported for Ce$_2$Zr$_2$O$_7$, Ce$_2$Sn$_2$O$_7$, or Ce$_2$Hf$_2$O$_7$. Importantly, neither the measured data nor any of the NLC calculations show increased scattering at high $Q$ resulting from octupolar correlations. We conclude that at the lowest attainable temperatures for our measurement ($T = 0.05$ K), octupolar correlations are not present in Ce$_2$Zr$_2$O$_7$ on the level of our observation threshold of $\sim$ 0.1$\%$ of the low-$Q$ dipole scattering.

Submitted 5 October, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.05439 [pdf, other]

Stabilizing an individual charge fluctuator in a Si/SiGe quantum dot

Authors: Feiyang Ye, Ammar Ellaboudy, John M. Nichol

Abstract: Charge noise is a major obstacle to improved gate fidelities in silicon spin qubits. Numerous methods exist to mitigate charge noise, including improving device fabrication, dynamical decoupling, and real-time parameter estimation. In this work, we demonstrate a new class of techniques to mitigate charge noise in semiconductor quantum dots by controlling the noise sources themselves. Using two different classical feedback methods, we stabilize an individual charged two-level fluctuator in a Si/SiGe quantum dot. These control methods reduce the low-frequency component of the noise power spectrum by an order of magnitude. These techniques also enable stabilizing the fluctuator in either of its states. In the future, such techniques may enable improved coherence times in quantum-dot spin qubits.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.04272 [pdf, other]

Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

Authors: Hao Feng, Boyuan Zhang, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, Dingwen Tao

Abstract: DLRM is a state-of-the-art recommendation system model that has gained widespread adoption across various industry applications. The large size of DLRM models, however, necessitates the use of multiple devices/GPUs for efficient training. A significant bottleneck in this process is the time-consuming all-to-all communication required to collect embedding data from all devices. To mitigate this, we introduce a method that employs error-bounded lossy compression to reduce the communication data size and accelerate DLRM training. We develop a novel error-bounded lossy compression algorithm, informed by an in-depth analysis of embedding data features, to achieve high compression ratios. Moreover, we introduce a dual-level adaptive strategy for error-bound adjustment, spanning both table-wise and iteration-wise aspects, to balance the compression benefits with the potential impacts on accuracy. We further optimize our compressor for PyTorch tensors on GPUs, minimizing compression overhead. Evaluation shows that our method achieves a 1.38$\times$ training speedup with a minimal accuracy impact.

Submitted 1 October, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

Comments: camera-ready version for SC '24

arXiv:2407.02015 [pdf, ps, other]

Robust First and Second-Order Differentiation for Regularized Optimal Transport

Authors: Xingjie Li, Fei Lu, Molei Tao, Felix X. -F. Ye

Abstract: Applications such as unbalanced and fully shuffled regression can be approached by optimizing regularized optimal transport (OT) distances, such as the entropic OT and Sinkhorn distances. A common approach for this optimization is to use a first-order optimizer, which requires the gradient of the OT distance. For faster convergence, one might also resort to a second-order optimizer, which additionally requires the Hessian. The computations of these derivatives are crucial for efficient and accurate optimization. However, they present significant challenges in terms of memory consumption and numerical instability, especially for large datasets and small regularization strengths. We circumvent these issues by analytically computing the gradients for OT distances and the Hessian for the entropic OT distance, which was not previously used due to intricate tensor-wise calculations and the complex dependency on parameters within the bi-level loss function. Through analytical derivation and spectral analysis, we identify and resolve the numerical instability caused by the singularity and ill-posedness of a key linear system. Consequently, we achieve scalable and stable computation of the Hessian, enabling the implementation of the stochastic gradient descent (SGD)-Newton methods. Tests on shuffled regression examples demonstrate that the second stage of the SGD-Newton method converges orders of magnitude faster than the gradient descent-only method while achieving significantly more accurate parameter estimations.

Submitted 20 October, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

MSC Class: 68Q25; 68R10; 68U05

arXiv:2407.01445 [pdf, other]

FastCLIP: A Suite of Optimization Techniques to Accelerate CLIP Training with Limited Resources

Authors: Xiyuan Wei, Fanjiang Ye, Ori Yonay, Xingyu Chen, Baixi Sun, Dingwen Tao, Tianbao Yang

Abstract: Existing studies of training state-of-the-art Contrastive Language-Image Pretraining (CLIP) models on large-scale data involve hundreds of or even thousands of GPUs due to the requirement of a large batch size. However, such a large amount of resources is not accessible to most people. While advanced compositional optimization techniques for optimizing global contrastive losses have been demonstrated effective for removing the requirement of large batch size, their performance on large-scale data remains underexplored and not optimized. To bridge the gap, this paper explores several aspects of CLIP training with limited resources (e.g., up to tens of GPUs). First, we introduce FastCLIP, a general CLIP training framework built on advanced compositional optimization techniques while designed and optimized for the distributed setting. Our framework is equipped with an efficient gradient reduction strategy to reduce communication overhead. Second, to further boost training efficiency, we investigate three components of the framework from an optimization perspective: the schedule of the inner learning rate, the update rules of the temperature parameter and the model parameters, respectively. Experiments on different strategies for each component shed light on how to conduct CLIP training more efficiently. Finally, we benchmark the performance of FastCLIP and the state-of-the-art training baseline (OpenCLIP) on different compute scales up to 32 GPUs on 8 nodes, and three data scales ranging from 2.7 million, 9.1 million to 315 million image-text pairs to demonstrate the significant improvement of FastCLIP in the resource-limited setting. We release the code of FastCLIP at https://github.com/Optimization-AI/fast_clip.

Submitted 2 October, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

Comments: 29 pages

arXiv:2406.13922 [pdf, ps, other]

Explicit Performance Bound of Finite Blocklength Coded MIMO: Time-Domain versus Spatiotemporal Channel Coding

Authors: Feng Ye, Xiaohu You, Jiamin Li, Chuan Zhang, Chen Ji

Abstract: In the sixth generation (6G), ultra-reliable low-latency communications (URLLC) will further develop to achieve TKu extreme connectivity, and multiple-input multiple-output (MIMO) is expected to be a key enabler for its realization. Since the latency constraint can be represented by the blocklength of a codeword, it is essential to analyze different coded MIMO schemes in the finite blocklength regime. In this paper, we analyze the statistical characteristics of information density of time-domain coding and spatiotemporal coding MIMO, compute the channel capacity and dispersion, and present new explicit performance bounds of finite blocklength coded MIMO for different coding modes via normal approximation. As revealed by the analysis and simulation, spatiotemporal coding can effectively mitigate the performance loss induced by short blocklength by increasing the spatial degree of freedom (DoF). However, for time-domain coding, each spatial link is encoded independently, and the performance loss will be more severe with short blocklength under any spatial DoF. These results indicate that spatiotemporal coding can optimally exploit the spatial dimension advantages of MIMO systems compared with time-domain coding, and it has the potential to support URLLC transmission, enabling very low error-rate communication under stringent blocklength constraints.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 9 pages, 5 figures

arXiv:2406.13249 [pdf, other]

R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation

Authors: Fuda Ye, Shuangyin Li, Yongqi Zhang, Lei Chen

Abstract: Retrieval augmented generation (RAG) has been applied in many scenarios to augment large language models (LLMs) with external documents provided by retrievers. However, a semantic gap exists between LLMs and retrievers due to differences in their training objectives and architectures. This misalignment forces LLMs to passively accept the documents provided by the retrievers, leading to incomprehension in the generation process, where the LLMs are burdened with the task of distinguishing these documents using their inherent knowledge. This paper proposes R$^2$AG, a novel enhanced RAG framework to fill this gap by incorporating Retrieval information into Retrieval Augmented Generation. Specifically, R$^2$AG utilizes the nuanced features from the retrievers and employs an R$^2$-Former to capture retrieval information. Then, a retrieval-aware prompting strategy is designed to integrate retrieval information into LLMs' generation. Notably, R$^2$AG suits low-source scenarios where LLMs and retrievers are frozen. Extensive experiments across five datasets validate the effectiveness, robustness, and efficiency of R$^2$AG. Our analysis reveals that retrieval information serves as an anchor to aid LLMs in the generation process, thereby filling the semantic gap.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.12844 [pdf, other]

Synergizing Foundation Models and Federated Learning: A Survey

Authors: Shenghui Li, Fanghua Ye, Meng Fang, Jiaxu Zhao, Yun-Hin Chan, Edith C. -H. Ngai, Thiemo Voigt

Abstract: The recent development of Foundation Models (FMs), represented by large language models, vision transformers, and multimodal models, has been making a significant impact on both academia and industry. Compared with small-scale models, FMs have a much stronger demand for high-volume data during the pre-training phase. Although general FMs can be pre-trained on data collected from open sources such as the Internet, domain-specific FMs need proprietary data, posing a practical challenge regarding the amount of data available due to privacy concerns. Federated Learning (FL) is a collaborative learning paradigm that breaks the barrier of data availability from different participants. Therefore, it provides a promising solution to customize and adapt FMs to a wide range of domain-specific tasks using distributed datasets whilst preserving privacy. This survey paper discusses the potentials and challenges of synergizing FL and FMs and summarizes core techniques, future directions, and applications. A periodically updated paper collection on FM-FL is available at https://github.com/lishenghui/awesome-fm-fl.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.00114 [pdf, other]

Dynamic Multi-Objective Lion Swarm Optimization with Multi-strategy Fusion: An application in 6R robot trajectory planning

Authors: Bao Liu, Tianbao Liu, Zhongshuo Hu, Fei Ye, Lei Gao

Abstract: The advancement of industrialization has spurred the development of innovative swarm intelligence algorithms, with Lion Swarm Optimization (LSO) notable for its robustness, parallelism, simplicity, and efficiency. While LSO excels in single-objective optimization, its multi-objective variants face challenges such as poor initialization and local optima entrapment. This study proposes Dynamic Multi-Objective Lion Swarm Optimization with Multi-strategy Fusion (MF-DMOLSO) to address these limitations. MF-DMOLSO comprises three key components: initialization, swarm position update, and external archive update. The initialization unit employs chaotic mapping for uniform population distribution. The position update unit enhances behavior patterns and step size formulas for cub lions, incorporating crowding degree sorting, Pareto non-dominated sorting, and Levy flight to improve convergence speed and global search capabilities. Reference points guide convergence in higher-dimensional spaces, maintaining population diversity. An adaptive cold-hot start strategy generates a population responsive to environmental changes. The external archive update unit re-evaluates solutions based on non-domination and diversity to form the new population. Evaluations on benchmark functions showed MF-DMOLSO surpassed multi-objective particle swarm optimization, non-dominated sorting genetic algorithm II, and multi-objective lion swarm optimization, exceeding 90% accuracy for two-objective and 97% for three-objective problems. Compared to non-dominated sorting genetic algorithm III, MF-DMOLSO showed a 60% improvement. Applied to 6R robot trajectory planning, MF-DMOLSO optimized running time and maximum acceleration to 8.3 s and $0.3\pi$ rad/s$^2$, achieving a set coverage rate of 70.97% compared to 2% by multi-objective particle swarm optimization, thus improving efficiency and reducing mechanical dither.

Submitted 7 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.18973 [pdf, other]

Codimension-Two Spiral Spin-Liquid in the Effective Honeycomb-Lattice Compound Cs$_3$Fe$_2$Cl$_9$

Authors: Shang Gao, Chris Pasco, Otkur Omar, Qiang Zhang, Daniel M. Pajerowski, Feng Ye, Matthias Frontzek, Andrew F. May, Matthew B. Stone, Andrew D. Christianson

Abstract: A codimension-two spiral spin-liquid is a correlated paramagnetic state with one-dimensional ground state degeneracy hosted within a three-dimensional lattice. Here, via neutron scattering experiments and numerical simulations, we establish the existence of a codimension-two spiral spin-liquid in the effective honeycomb-lattice compound Cs$_3$Fe$_2$Cl$_9$ and demonstrate the selective visibility of the spiral surface through phase tuning. In the long-range ordered regime, competing spiral and spin density wave orders emerge as a function of applied magnetic field, among which a possible order-by-disorder transition is identified.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 21 pages, 23 figures

arXiv:2405.16252 [pdf, other]

2-torsion in instanton Floer homology

Authors: Zhenkun Li, Fan Ye

Abstract: This paper studies the existence of $2$-torsion in instanton Floer homology with $\mathbb{Z}$ coefficients for closed $3$-manifolds and singular knots. First, we show that the non-existence of $2$-torsion in the framed instanton Floer homology $I^\sharp(S_n^3(K);\mathbb{Z})$ of any nonzero integral $n$-surgery along a knot $K$ in $S^3$ would imply that $K$ is fibered. Also, we show that $I^\sharp(S_{r}^3(K);\mathbb{Z})$ for any nontrivial $K$ with $r=1,1/2,1/4$ always has $2$-torsion. These two results indicate that the existence of $2$-torsion is expected to be a generic phenomenon for Dehn surgeries along knots. Second, we show that for genus-one knots with nontrivial Alexander polynomials and for unknotting-number-one knots, the unreduced singular instanton knot homology $I^\sharp(S^3,K;\mathbb{Z})$ always has $2$-torsion. Finally, some crucial lemmas that help us demonstrate the existence of $2$-torsion are motivated by analogous results in Heegaard Floer theory, which may be of independent interest. In particular, we show that, for a knot $K$ in $S^3$, if there is a nonzero rational number $r$ such that the dual knot $\widetilde{K}_r$ inside $S^3_r(K)$ is Floer simple, then $S^3_r(K)$ must be an L-space and $K$ must be an L-space knot.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 41 pages, 17 figures; comments are welcome

arXiv:2405.13249 [pdf]

Structural Properties of Plastically Deformed SrTiO3 and KTaO3

Authors: Issam Khayr, Sajna Hameed, Jakov Budić, Xing He, Richard Spieker, Ana Najev, Zinan Zhao, Li Yue, Matthew Krogstad, Feng Ye, Yaohua Liu, Raymond Osborn, Stephan Rosenkranz, Yuan Li, Damjan Pelc, Martin Greven

Abstract: Dislocation engineering has the potential to open new avenues toward the exploration and modification of the properties of quantum materials. Strontium titanate (SrTiO3, STO) and potassium tantalate (KTaO3, KTO) are incipient ferroelectrics that show metallization and superconductivity at extremely low charge carrier concentrations, and have been the subject of resurgent interest. These materials also exhibit remarkable ambient-temperature ductility, and thus represent exceptional platforms for studies of the effects of deformation-induced dislocation structures on electronic properties. Recent work on plastically deformed STO revealed an enhancement of the superconducting transition temperature and the emergence of local ferroelectricity and magnetism near self-organized dislocation walls. Here we present a comprehensive structural analysis of plastically deformed STO and KTO, employing specially designed strain cells, diffuse neutron and x-ray scattering, Raman scattering, and nuclear magnetic resonance (NMR). Diffuse scattering and NMR provide insight into the dislocation configurations and densities and their dependence on strain. As in the prior work on STO, Raman scattering reveals evidence for local ferroelectric order near dislocation walls in plastically deformed KTO. Our findings provide valuable information about the self-organized defect structures in both materials, and they position KTO as a second model system with which to explore the associated emergent physics.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.13215 [pdf, other]

Role of stacking defects on the magnetic behavior of CrCl$_3$

Authors: John A. Schneeloch, Adam A. Aczel, Feng Ye, Despina Louca

Abstract: In the study of van der Waals-layered magnetic materials, the properties of CrCl$_3$ continue to attract attention. This compound is reported to undergo antiferromagnetic (AFM) ordering below $\sim$14 K, with a ferromagneticlike region proposed to exist between 14 and 17 K. Ideally, the crystal structure is rhombohedral (R) below $\sim$235 K, separated from a higher-temperature monoclinic (M) phase by a layer-sliding structural phase transition. However, the structural transition is often inhibited even in bulk single crystals, allowing M-type layer stacking, reported to have a tenfold greater interlayer magnetic coupling than R-type stacking, to be present at low temperature. To clarify the effect of stacking defects on CrCl$_3$, we report magnetization measurements on samples of varying crystalline quality. At low applied magnetic field, some crystals predominantly show the $T_N=14$ K peak, but other crystals show hysteretic behavior and a magnetization enhancement at a slightly higher temperature ($14 < T \lesssim 17$ K). Samples with anomalous behavior exhibit a transition around $\sim$2 T in isothermal magnetization-field data, providing evidence that M-type stacking defects are the source of these anomalies. Ground powder samples are especially likely to show strongly anomalous behavior. We suggest that the anomalous behavior arises from few-layer magnetic domains that form just above $T_N$ in an environment of mixed interlayer magnetic coupling strength. We argue that the influence of M-type stacking boundaries on sublattice magnetization is already observable in reported neutron scattering data, and may be responsible for a certain feature in reported specific heat data.

Submitted 21 May, 2024; originally announced May 2024.

Comments: Supplement is part of pdf file

arXiv:2405.13034 [pdf, other]

Autonomous Workflow for Multimodal Fine-Grained Training Assistants Towards Mixed Reality

Authors: Jiahuan Pei, Irene Viola, Haochen Huang, Junxiao Wang, Moonisa Ahsan, Fanghua Ye, Jiang Yiming, Yao Sai, Di Wang, Zhumin Chen, Pengjie Ren, Pablo Cesar

Abstract: Autonomous artificial intelligence (AI) agents have emerged as promising protocols for automatically understanding the language-based environment, particularly with the exponential development of large language models (LLMs). However, a fine-grained, comprehensive understanding of multimodal environments remains under-explored. This work designs an autonomous workflow tailored for integrating AI agents seamlessly into extended reality (XR) applications for fine-grained training. We present a demonstration of a multimodal fine-grained training assistant for LEGO brick assembly in a pilot XR environment. Specifically, we design a cerebral language agent that integrates LLM with memory, planning, and interaction with XR tools and a vision-language agent, enabling agents to decide their actions based on past experiences. Furthermore, we introduce LEGO-MRTA, a multimodal fine-grained assembly dialogue dataset synthesized automatically in the workflow served by a commercial LLM. This dataset comprises multimodal instruction manuals, conversations, XR responses, and vision question answering. Last, we present several prevailing open-resource LLMs as benchmarks, assessing their performance with and without fine-tuning on the proposed dataset. We anticipate that the broader impact of this workflow will advance the development of smarter assistants for seamless user interaction in XR environments, fostering research in both AI and HCI communities.

Submitted 5 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: Accepted by ACL 2024

arXiv:2405.10411 [pdf, other]

Nanoscale structural correlations in a model cuprate superconductor

Authors: Zachary W. Anderson, Marin Spaić, Nikolaos Biniskos, Liam Thompson, Biqiong Yu, Jack Zwettler, Yaohua Liu, Feng Ye, Garrett E. Granroth, Matthew Krogstad, Raymond Osborn, Damjan Pelc, Martin Greven

Abstract: Understanding the extent and role of inhomogeneity is a pivotal challenge in the physics of cuprate superconductors. While it is known that structural and electronic inhomogeneity is prevalent in the cuprates, it has proven difficult to disentangle compound-specific features from universally relevant effects. Here we combine advanced neutron and x-ray diffuse scattering with numerical modeling to obtain insight into bulk structural correlations in HgBa$_2$CuO$_{4+\delta}$. This cuprate exhibits a high optimal transition temperature of nearly 100 K, pristine charge-transport behavior, and a simple average crystal structure without long-range structural instabilities, and is therefore uniquely suited for investigations of intrinsic inhomogeneity. We uncover diffuse reciprocal-space patterns that correspond to prominent nanoscale correlations of atomic displacements perpendicular to the CuO$_2$ planes. The real-space nature of the correlations is revealed through three-dimensional pair distribution function analysis and complementary numerical refinement. We find that relative displacements of ionic and CuO$_2$ layers play a crucial role, and that the structural inhomogeneity is not directly caused by the presence of conventional point defects. The observed correlations are therefore intrinsic to HgBa$_2$CuO$_{4+\delta}$, and thus likely important for the physics of cuprates more broadly. It is possible that the structural correlations are closely related to the unusual superconducting correlations and Mott-localization in these complex oxides. As advances in scattering techniques yield increasingly comprehensive data, the experimental and analysis tools developed here for large volumes of diffuse scattering data can be expected to aid future investigations of a wide range of materials.

Submitted 14 October, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: 14 pages, 5 figures. V2 changes: exchanged Figs. 4 and 5; added neutron panels to Fig. 4; fixed incorrect neutron panels in Fig. 5 and updated related text; added 3D-DeltaPDF resolution paragraph; clarified text about correlation lengths, temperature control, absence of occupational disorder and CuO2 plane buckling; removed appendix; typo fixes, small changes to phrasing, etc.

arXiv:2405.05542 [pdf, other]

Dynamic Deep Factor Graph for Multi-Agent Reinforcement Learning

Authors: Yuchen Shi, Shihong Duan, Cheng Xu, Ran Wang, Fangwen Ye, Chau Yuen

Abstract: This work introduces a novel value decomposition algorithm, termed Dynamic Deep Factor Graphs (DDFG). Unlike traditional coordination graphs, DDFG leverages factor graphs to articulate the decomposition of value functions, offering enhanced flexibility and adaptability to complex value function structures. Central to DDFG is a graph structure generation policy that innovatively generates factor graph structures on-the-fly, effectively addressing the dynamic collaboration requirements among agents. DDFG strikes an optimal balance between the computational overhead associated with aggregating value functions and the performance degradation inherent in their complete decomposition. Through the application of the max-sum algorithm, DDFG efficiently identifies optimal policies. We empirically validate DDFG's efficacy in complex scenarios, including higher-order predator-prey tasks and the StarCraft II Multi-agent Challenge (SMAC), thus underscoring its capability to surmount the limitations faced by existing value decomposition algorithms. DDFG emerges as a robust solution for MARL challenges that demand nuanced understanding and facilitation of dynamic agent collaboration. The implementation of DDFG is made publicly accessible, with the source code available at https://github.com/SICC-Group/DDFG.

Submitted 7 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: submitted to IEEE TPAMI

arXiv:2405.03692 [pdf, other]

Imitation Learning for Adaptive Video Streaming with Future Adversarial Information Bottleneck Principle

Authors: Shuoyao Wang, Jiawei Lin, Fangwei Ye

Abstract: Adaptive video streaming plays a crucial role in ensuring high-quality video streaming services. Despite extensive research efforts devoted to Adaptive BitRate (ABR) techniques, the current reinforcement learning (RL)-based ABR algorithms may benefit the average Quality of Experience (QoE) but suffer from fluctuating performance in individual video sessions. In this paper, we present a novel approach that combines imitation learning with the information bottleneck technique, to learn from the complex offline optimal scenario rather than inefficient exploration. In particular, we leverage the deterministic offline bitrate optimization problem with the future throughput realization as the expert and formulate it as a mixed-integer non-linear programming (MINLP) problem. To enable large-scale training for improved performance, we propose an alternative optimization algorithm that efficiently solves the MINLP problem. To address the issues of overfitting due to the future information leakage in MINLP, we incorporate an adversarial information bottleneck framework. By compressing the video streaming state into a latent space, we retain only action-relevant information. Additionally, we introduce a future adversarial term to mitigate the influence of future information leakage, where a Model Predictive Control (MPC) policy without any future information is employed as the adverse expert. Experimental results demonstrate the effectiveness of our proposed approach in significantly enhancing the quality of adaptive video streaming, providing a 7.30% average QoE improvement and a 30.01% average ranking reduction.

Submitted 12 March, 2024; originally announced May 2024.

Comments: submitted to IEEE Journal

arXiv:2404.13396 [pdf]

Angle-Resolved Magneto-Chiral Anisotropy in a Non-Centrosymmetric Atomic Layer Superlattice

Authors: Long Cheng, Mingrui Bao, Jingxian Zhang, Xue Zhang, Qun Yang, Qiang Li, Hui Cao, Dawei Qiu, Jia Liu, Fei Ye, Qing Wang, Genhao Liang, Hui Li, Guanglei Cheng, Hua Zhou, Jian-Min Zuo, Xiaodong Zhou, Jian Shen, Zhifeng Zhu, Sai Mu, Wenbo Wang, Xiaofang Zhai

Abstract: Chirality in solid-state materials has sparked significant interest due to potential applications of topologically-protected chiral states in next-generation information technology. The electrical magneto-chiral effect (eMChE), arising from relativistic spin-orbit interactions, shows great promise for developing chiral materials and devices for electronic integration. Here we demonstrate an angle-resolved eMChE in an A-B-C-C type atomic-layer superlattice lacking time and space inversion symmetry. We observe non-superimposable enantiomers of left-handed and right-handed tilted uniaxial magnetic anisotropy as the sample rotates under static fields, with the tilting angle reaching a striking 45 degrees. Magnetic force microscopy and atomistic simulations correlate the tilt to the emergence and evolution of chiral spin textures. The Dzyaloshinskii-Moriya interaction lock effect in competition with the Zeeman effect is demonstrated to be responsible for the angle-resolved eMChE. Our findings open up a new horizon for engineering angle-resolved magneto-chiral anisotropy, shedding light on the development of novel angle-resolved sensing or writing techniques in chiral spintronics.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2403.10971 [pdf, other]

Task-Aware Low-Rank Adaptation of Segment Anything Model

Authors: Xuehao Wang, Feiyang Ye, Yu Zhang

Abstract: The Segment Anything Model (SAM), with its remarkable zero-shot capability, has been proven to be a powerful foundation model for image segmentation tasks, which is an important task in computer vision. However, the transfer of its rich semantic information to multiple different downstream tasks remains unexplored. In this paper, we propose the Task-Aware Low-Rank Adaptation (TA-LoRA) method, which enables SAM to work as a foundation model for multi-task learning. Specifically, TA-LoRA injects an update parameter tensor into each layer of the encoder in SAM and leverages a low-rank tensor decomposition method to incorporate both task-shared and task-specific information. Furthermore, we introduce modified SAM (mSAM) for multi-task learning where we remove the prompt encoder of SAM and use task-specific no mask embeddings and mask decoder for each task. Extensive experiments conducted on benchmark datasets substantiate the efficacy of TA-LoRA in enhancing the performance of mSAM across multiple downstream tasks.

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.06568 [pdf, other]

Better Understandings and Configurations in MaxSAT Local Search Solvers via Anytime Performance Analysis

Authors: Furong Ye, Chuan Luo, Shaowei Cai

Abstract: Though numerous solvers have been proposed for the MaxSAT problem, and benchmark environments such as the MaxSAT Evaluations provide a platform for comparing state-of-the-art solvers, existing assessments are usually based on the quality, e.g., fitness, of the best-found solutions obtained within a given running time budget. However, considering solely the final solutions obtained within specific time budgets may restrict us from comprehending the behavior of the solvers along the convergence process. This paper demonstrates that Empirical Cumulative Distribution Functions can be used to compare MaxSAT local search solvers' anytime performance across multiple problem instances and various time budgets. The assessment reveals distinctions in solvers' performance and shows that the (dis)advantages of solvers change across different running times. This work also shows that the quantitative and high-variance assessment of anytime performance can guide machines, i.e., automatic configurators, to search for better parameter settings. Our experimental results show that the hyperparameter optimization tool, i.e., SMAC, generally achieves better parameter settings of local search when using the anytime performance as the cost function, compared to using the fitness of the best-found solutions.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.06144 [pdf, other]

Simulating Family Conversations using LLMs: Demonstration of Parenting Styles

Authors: Frank Tian-fang Ye, Xiaozi Gao

Abstract: This study presents a framework for conducting psychological and linguistic research through simulated conversations using large language models (LLMs). The proposed methodology offers significant advantages, particularly for simulating human interactions involving potentially unethical language or behaviors that would be impermissible in traditional experiments with human participants. As a demonstration, we employed LLMs to simulate family conversations across four parenting styles (authoritarian, authoritative, permissive, and uninvolved). In general, we observed that the characteristics of the four parenting styles were portrayed in the simulated conversations. Several strategies could be used to improve the simulation quality, such as including context awareness, employing a few-shot prompting approach, or fine-tuning models to cater to specific simulation requirements. Overall, this study introduces a promising methodology for conducting psychological and linguistic research through simulated conversations, while acknowledging the current limitations and proposing potential solutions for future refinement and improvement.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.03310 [pdf, other]

Graph Learning for Parameter Prediction of Quantum Approximate Optimization Algorithm

Authors: Zhiding Liang, Gang Liu, Zheyuan Liu, Jinglei Cheng, Tianyi Hao, Kecheng Liu, Hang Ren, Zhixin Song, Ji Liu, Fanny Ye, Yiyu Shi

Abstract: In recent years, quantum computing has emerged as a transformative force in the field of combinatorial optimization, offering novel approaches to tackling complex problems that have long challenged classical computational methods. Among these, the Quantum Approximate Optimization Algorithm (QAOA) stands out for its potential to efficiently solve the Max-Cut problem, a quintessential example of combinatorial optimization. However, practical application faces challenges due to current limitations on quantum computational resources. Our work optimizes QAOA initialization, using Graph Neural Networks (GNN) as a warm-start technique. This sacrifices affordable computational resources on a classical computer to reduce quantum computational resource overhead, enhancing QAOA's effectiveness. Experiments with various GNN architectures demonstrate the adaptability and stability of our framework, highlighting the synergy between quantum algorithms and machine learning. Our findings show GNN's potential in improving QAOA performance, opening new avenues for hybrid quantum-classical approaches in quantum computing and contributing to practical applications.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2402.18567 [pdf, other]

Diffusion Language Models Are Versatile Protein Learners

Authors: Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, Quanquan Gu

Abstract: This paper introduces diffusion protein language model (DPLM), a versatile protein language model that demonstrates strong generative and predictive capabilities for protein sequences. We first pre-train scalable DPLMs from evolutionary-scale protein sequences within a generative self-supervised discrete diffusion probabilistic framework, which generalizes language modeling for proteins in a principled way. After pre-training, DPLM exhibits the ability to generate structurally plausible, novel, and diverse protein sequences for unconditional generation. We further demonstrate the proposed diffusion generative pre-training makes DPLM possess a better understanding of proteins, making it a superior representation learner, which can be fine-tuned for various predictive tasks, comparing favorably to ESM2 (Lin et al., 2022). Moreover, DPLM can be tailored for various needs, which showcases its prowess of conditional generation in several ways: (1) conditioning on partial peptide sequences, e.g., generating scaffolds for functional motifs with high success rate; (2) incorporating other modalities as conditioner, e.g., structure-conditioned generation for inverse folding; and (3) steering sequence generation towards desired properties, e.g., satisfying specified secondary structures, through a plug-and-play classifier guidance. Code is released at https://github.com/bytedance/dplm.

Submitted 16 October, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: ICML 2024 camera-ready version

arXiv:2402.18070 [pdf, other]

A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

Authors: Limin Jiang, Yi Shi, Haiqin Hu, Qingyu Deng, Siyi Xu, Yintao Liu, Feng Yuan, Si Wang, Yihao Shen, Fangfang Ye, Shan Cao, Zhiyuan Jiang

Abstract: Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphics processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and consecutive character of WBP. Furthermore, the large amount of data in WBPs cannot be processed quickly in symmetric multiprocessors (SMPs) due to the unpredictability of memory latency. To address this issue, we propose a hierarchical dataflow-driven architecture to accelerate WBP. A pack-and-ship approach is presented under a non-uniform memory access (NUMA) architecture to allow the subordinate tiles to operate in a bundled access and execute manner. We also propose a multi-level dataflow model and the related scheduling scheme to manage and allocate the heterogeneous hardware resources. Experimental results demonstrate that our prototype achieves $2\times$ and $2.3\times$ speedup in terms of normalized throughput and single-tile clock cycles compared with GPU and DSP counterparts in several critical WBP benchmarks. Additionally, a link-level throughput of $288$ Mbps can be achieved with a $45$-core configuration.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 7 pages, 7 figures, conference

arXiv:2402.16983 [pdf, other]

Thermal evolution of spin excitations in honeycomb Ising antiferromagnetic FePSe3

Authors: Lebing Chen, Xiaokun Teng, Ding Hu, Feng Ye, Garrett E. Granroth, Ming Yi, Jae-Ho Chung, Robert J. Birgeneau, Pengcheng Dai

Abstract: We use elastic and inelastic neutron scattering (INS) to study the antiferromagnetic (AF) phase transitions and spin excitations in the two-dimensional (2D) zig-zag antiferromagnet FePSe$_3$. By determining the magnetic order parameter across the AF phase transition, we conclude that the AF phase transition in FePSe$_3$ is first-order in nature. In addition, our INS measurements reveal that the spin waves in the AF ordered state have a large easy-axis magnetic anisotropy gap, consistent with an Ising Hamiltonian, and possible biquadratic magnetic exchange interactions. On warming across $T_N$, we find that dispersive spin excitations associated with three-fold rotational symmetric AF fluctuations change into FM spin fluctuations above $T_N$. These results suggest that the first-order AF phase transition in FePSe$_3$ may arise from the competition between $C_3$ symmetric AF and $C_1$ symmetric FM spin fluctuations around $T_N$, in place of a conventional second-order AF phase transition.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.07654 [pdf, other]

Impact of spatial transformations on landscape features of CEC2022 basic benchmark problems

Authors: Haoran Yin, Diederick Vermetten, Furong Ye, Thomas H. W. Bäck, Anna V. Kononova

Abstract: When benchmarking optimization heuristics, we need to take care to avoid an algorithm exploiting biases in the construction of the used problems. One way in which this might be done is by providing different versions of each problem but with transformations applied to ensure the algorithms are equipped with mechanisms for successfully tackling a range of problems. In this paper, we investigate several of these problem transformations and show how they influence the low-level landscape features of a set of 5 problems from the CEC2022 benchmark suite. Our results highlight that even relatively small transformations can significantly alter the measured landscape features. This poses a wider question of what properties we want to preserve when creating problem transformations, and how to fairly measure them.

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.07616 [pdf, other]

Anchor-based Large Language Models

Authors: Jianhui Pang, Fanghua Ye, Derek Fai Wong, Xin He, Wanshun Chen, Longyue Wang

Abstract: Large language models (LLMs) predominantly employ decoder-only transformer architectures, necessitating the retention of keys/values information for historical tokens to provide contextual information and avoid redundant computation. However, the substantial size and parameter volume of these LLMs require massive GPU memory. This memory demand increases with the length of the input text, leading to an urgent need for more efficient methods of information storage and processing. This study introduces Anchor-based LLMs (AnLLMs), which utilize an innovative anchor-based self-attention network (AnSAN) and also an anchor-based inference strategy. This approach enables LLMs to compress sequence information into an anchor token, reducing the keys/values cache and enhancing inference efficiency. Experiments on question-answering benchmarks reveal that AnLLMs maintain similar accuracy levels while achieving up to 99% keys/values cache reduction and up to 3.5 times faster inference. Despite a minor compromise in accuracy, the substantial enhancements of AnLLMs employing the AnSAN technique in resource utilization and computational efficiency underscore their potential for practical LLM applications.

Submitted 1 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

Comments: The paper has been accepted by the ACL2024 conference. Work was done when Jianhui Pang and Fanghua Ye were interning at Tencent AI Lab

arXiv:2401.17669 [pdf, other]

Compression before Fusion: Broadcast Semantic Communication System for Heterogeneous Tasks

Authors: Mingze Gong, Shuoyao Wang, Fangwei Ye, Suzhi Bi

Abstract: Semantic communication has emerged as a new paradigm shift in 6G from the conventional syntax-oriented communications. Recently, the wireless broadcast technology has been introduced to support semantic communication systems toward higher communication efficiency. Nevertheless, existing broadcast semantic communication systems target general representation within one stage and fail to balance the inference accuracy among users. In this paper, the broadcast encoding process is decomposed into compression and fusion to improve communication efficiency with adaptation to tasks and channels. Particularly, we propose multiple task-channel-aware sub-encoders (TCE) and a channel-aware feature fusion sub-encoder (CFE) towards compression and fusion, respectively. In TCEs, multiple local-channel-aware attention blocks are employed to extract and compress task-relevant information for each user. In CFE, we introduce a global-channel-aware fine-tuning block to merge these compressed task-relevant signals into a compact broadcast signal. Notably, we retrieve the bottleneck in DeepBroadcast and leverage information bottleneck theory to further optimize the parameter tuning of TCEs and CFE. We substantiate our approach through experiments on a range of heterogeneous tasks across various channels, including the additive white Gaussian noise (AWGN) channel, Rayleigh fading channel, and Rician fading channel. Simulation results evidence that the proposed DeepBroadcast outperforms the state-of-the-art methods.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.17141 [pdf, other]

Incipient nematicity from electron flat bands in a kagome metal

Authors: Nathan Drucker, Thanh Nguyen, Manasi Mandal, Phum Siriviboon, Yujie Quan, Artittaya Boonkird, Ryotaro Okabe, Fankang Li, Kaleb Buragge, Fumiaki Funuma, Masaaki Matsuda, Douglas Abernathy, Travis Williams, Songxue Chi, Feng Ye, Christie Nelson, Bolin Liao, Pavel Volkov, Mingda Li

Abstract: Engineering new quantum phases requires fine tuning of the electronic, orbital, spin, and lattice degrees of freedom. To this end, the kagome lattice with flat bands has garnered great attention by hosting various topological and correlated phases, when the flat band is at the Fermi level. Here we discover unconventional nematicity in kagome metal CoSn, where flat bands are fully occupied below the Fermi level. Thermodynamic, dilatometry, resonant X-ray scattering, inelastic neutron scattering, Larmor diffraction, and thermoelectric measurements consistently hint at rotational symmetry-breaking and nematic order that is pronounced only near T=225 K. These observations, principally the nematic's finite temperature stability -- incipience -- can be explained by a phenomenological model which reveals that thermally excited flat bands promote symmetry breaking at a characteristic temperature. Our work shows that thermal fluctuations, which are typically detrimental for correlated electron phases, can induce new ordered states of matter, avoiding the requirements for fine tuning of electronic bands.

Submitted 30 January, 2024; originally announced January 2024.

Comments: 37 pages, 5 main figures, 14 supplementary figures

arXiv:2401.14541 [pdf, other]

Characterization of individual charge fluctuators in Si/SiGe quantum dots

Authors: Feiyang Ye, Ammar Ellaboudy, Dylan Albrecht, Rohith Vudatha, N. Tobias Jacobson, John M. Nichol

Abstract: Electron spins in silicon quantum dots are excellent qubits due to their long coherence times, scalability, and compatibility with advanced semiconductor technology. Although high gate fidelities can be achieved with spin qubits, charge noise in the semiconductor environment still hinders further improvements. Despite the importance of charge noise, key questions about the specific nature of the fluctuators that cause charge noise remain unanswered. Here, we probe individual two-level fluctuators (TLFs) in Si/SiGe quantum dots through simple quantum-dot transport measurement and analyses based on the Allan variance and factorial hidden Markov modeling. We find that the TLF switching times depend sensitively on gate voltages, decrease with temperature, and depend on the current through a nearby quantum dot. A model for the data of the primary TLF we study indicates that it may be a bistable charge dipole near the plunger gate electrode, heated by current through the sensor dot, and experiencing state transitions driven not by direct electron-phonon coupling but through some other mechanism such as coupling to electrons passing through the sensor dot.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.13246 [pdf, other]

doi 10.18653/v1/2024.acl-long.321

SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning

Authors: Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian

Abstract: Elucidating the reasoning process with structured explanations from question to answer is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations demand models to perform intricately structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning through supervised learning, ignoring logical dependencies between steps. Moreover, existing reinforcement learning (RL) based methods overlook the structured relationships, underutilizing the potential of RL in structured reasoning. In this paper, we propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation. Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning, effectively capturing the intricate relationships between different reasoning steps. In addition, we introduce a fine-grained reward function to meticulously delineate diverse reasoning steps. Extensive experiments show that SEER significantly outperforms state-of-the-art methods, achieving an absolute improvement of 6.9% over RL-based methods on EntailmentBank, a 4.4% average improvement on STREET benchmark, and exhibiting outstanding efficiency and cross-dataset generalization performance. Our code is available at https://github.com/Chen-GX/SEER.

Submitted 27 September, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: Camera ready version for ACL 2024 Main Conference

arXiv:2401.12794

Benchmarking LLMs via Uncertainty Quantification

Authors: Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu

Abstract: The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods. However, current evaluation platforms, such as the widely recognized HuggingFace open LLM leaderboard, neglect a crucial aspect -- uncertainty, which is vital for thoroughly assessing LLMs. To bridge this gap, we introduce a new benchmarking approach for LLMs that integrates uncertainty quantification. Our examination involves eight LLMs (LLM series) spanning five representative natural language processing tasks. Our findings reveal that: I) LLMs with higher accuracy may exhibit lower certainty; II) Larger-scale LLMs may display greater uncertainty compared to their smaller counterparts; and III) Instruction-finetuning tends to increase the uncertainty of LLMs. These results underscore the significance of incorporating uncertainty in the evaluation of LLMs.

Submitted 25 April, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: 25 pages, preprints

arXiv:2401.11929

Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting

Authors: Jinliang Deng, Feiyang Ye, Du Yin, Xuan Song, Ivor W. Tsang, Hui Xiong

Abstract: Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis, characterized by extensive input sequences, as opposed to the shorter spans typical of traditional approaches. While longer sequences inherently offer richer information for enhanced predictive precision, prevailing studies often respond by escalating model complexity. These intricate models can inflate into millions of parameters, resulting in prohibitive parameter scales. Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation while achieving uniformly superior and robust results across various datasets. Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks, using over 99% fewer parameters than the majority of competing methods. Through this work, we aim to unleash the power of a restricted set of parameters by capitalizing on domain characteristics--a timely reminder that in the realm of LTSF, bigger is not invariably better.

Submitted 16 October, 2024; v1 submitted 22 January, 2024; originally announced January 2024.
