Unambiguous identification of the indirect band nature of atomically thin hexagonal boron nitride

Authors: Lei Fu, Yuqing Hu, Ning Tang, Junxi Duan, Xionghui Jia, Huaiyuan Yang, Zhuoxian Li, Xiangyan Han, Guoping Li, Jianming Lu, Lun Dai, Weikun Ge, Bo Shen

Abstract: Atomically thin hexagonal boron nitride (h-BN), especially monolayer, has garnered increasing attention due to its intriguing optical and light-matter-interaction properties. However, its intrinsic optical properties and electronic band structure have long remained elusive. In this study, near-resonance excited deep-UV photoluminescence/Raman spectroscopy and deep-UV reflectance contrast spectroscopy are utilized to experimentally investigate the optical properties of atomically thin h-BN across various layer numbers. It is revealed that the absence of luminescence in 1-3 layer h-BN is indicative of their indirect band gap nature, correcting the previously adopted identification of a direct band gap in monolayer BN. Notably, band-edge luminescence signals and indirect bandgap absorption start to appear in 4-layer h-BN, and the luminescence intensity increases with the number of layers, suggesting that interlayer interactions and periodicity along the z-axis enhance phonon-assisted indirect bandgap transition, even in the 4-layer case, and furthermore indicating the formation process of flat bands at the K and M valleys as the periodicity along the z direction increases. Additionally, the prominent resonance Raman signals in atomically thin h-BN underscore strong electron-phonon coupling in this material.

Submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.12107 [pdf, other]

doi 10.1016/j.jss.2024.112253

Just-In-Time Software Defect Prediction via Bi-modal Change Representation Learning

Authors: Yuze Jiang, Beijun Shen, Xiaodong Gu

Abstract: For predicting software defects at an early stage, researchers have proposed just-in-time defect prediction (JIT-DP) to identify potential defects in code commits. The prevailing approaches train models to represent code changes in historical commits and utilize the learned representations to predict the presence of defects in the latest commit. However, existing models merely learn edits in source code, without considering the natural language intentions behind the changes. This limitation hinders their ability to capture deeper semantics. To address this, we introduce a novel bi-modal change pre-training model called BiCC-BERT. BiCC-BERT is pre-trained on a code change corpus to learn bi-modal semantic representations. To incorporate commit messages from the corpus, we design a novel pre-training objective called Replaced Message Identification (RMI), which learns the semantic association between commit messages and code changes. Subsequently, we integrate BiCC-BERT into JIT-DP and propose a new defect prediction approach -- JIT-BiCC. By leveraging the bi-modal representations from BiCC-BERT, JIT-BiCC captures more profound change semantics. We train JIT-BiCC using 27,391 code changes and compare its performance with 8 state-of-the-art JIT-DP approaches. The results demonstrate that JIT-BiCC outperforms all baselines, achieving a 10.8% improvement in F1-score. This highlights its effectiveness in learning the bi-modal semantics for JIT-DP.

Submitted 15 October, 2024; originally announced October 2024.

Comments: Accepted by JSS (The Journal of Systems & Software)

arXiv:2410.11320 [pdf, other]

Regularized Estimation of High-Dimensional Matrix-Variate Autoregressive Models

Authors: Hangjin Jiang, Baining Shen, Yuzhou Li, Zhaoxing Gao

Abstract: Matrix-variate time series data are increasingly popular in economics, statistics, and environmental studies, among other fields. This paper develops regularized estimation methods for analyzing high-dimensional matrix-variate time series using bilinear matrix-variate autoregressive models. The bilinear autoregressive structure is widely used for matrix-variate time series, as it reduces model complexity while capturing interactions between rows and columns. However, when dealing with large dimensions, the commonly used iterated least-squares method results in numerous estimated parameters, making interpretation difficult. To address this, we propose two regularized estimation methods to further reduce model dimensionality. The first assumes banded autoregressive coefficient matrices, where each data point interacts only with nearby points. A two-step estimation method is used: first, traditional iterated least-squares is applied for initial estimates, followed by a banded iterated least-squares approach. A Bayesian Information Criterion (BIC) is introduced to estimate the bandwidth of the coefficient matrices. The second method assumes sparse autoregressive matrices, applying the LASSO technique for regularization. We derive asymptotic properties for both methods as the dimensions diverge and the sample size $T\rightarrow\infty$. Simulations and real data examples demonstrate the effectiveness of our methods, comparing their forecasting performance against common autoregressive models in the literature.

Submitted 15 October, 2024; originally announced October 2024.

arXiv:2410.09812 [pdf, other]

Unraveling the Potential of Large Language Models in Code Translation: How Far Are We?

Authors: Qingxiao Tao, Tingrui Yu, Xiaodong Gu, Beijun Shen

Abstract: While large language models (LLMs) exhibit state-of-the-art performance in various tasks, recent studies have revealed that they struggle with code translation. This is because they haven't been extensively pre-trained with parallel multilingual code, which code translation heavily depends on. Moreover, existing benchmarks only cover a limited subset of common programming languages, and thus cannot reflect the full potential of LLMs in code translation. In this paper, we conduct a large-scale empirical study to exploit the capabilities and incapabilities of LLMs in code translation tasks. We first craft a novel benchmark called PolyHumanEval by extending HumanEval to a multilingual benchmark of 14 languages. With PolyHumanEval, we then perform over 110,000 translations with bleeding-edge code LLMs. The results show LLMs' suboptimal performance on Python to other languages and the negligible impact of widely adopted LLM optimization techniques such as conventional pre-training and instruction tuning on code translation. To further uncover the potential of LLMs in code translation, we propose two methods: (1) intermediary translation, which selects an intermediary language between the source and target ones; and (2) self-training, which fine-tunes LLMs on self-generated parallel data. Evaluated with CodeLlama-13B, our approach yields an average improvement of 11.7% computation accuracy on Python-to-other translations. Notably, we find that Go can serve as a lingua franca for translating between any two studied languages.

Submitted 13 October, 2024; originally announced October 2024.

Comments: Accepted to APSEC 2024

arXiv:2410.08554 [pdf, other]

Integrated adaptive coherent LiDAR for 4D bionic vision

Authors: Ruixuan Chen, Yichen Wu, Ke Zhang, Chuxin Liu, Yikun Chen, Wencan Li, Bitao Shen, Zhaoxi Chen, Hanke Feng, Zhangfeng Ge, Yan Zhou, Zihan Tao, Weihan Xu, Yimeng Wang, Pengfei Cai, Dong Pan, Haowen Shu, Linjie Zhou, Cheng Wang, Xingjun Wang

Abstract: Light detection and ranging (LiDAR) is a ubiquitous tool to provide precise spatial awareness in various perception environments. A bionic LiDAR that can mimic human-like vision by adaptively gazing at selected regions of interest within a broad field of view is crucial to achieve high-resolution imaging in an energy-saving and cost-effective manner. However, current LiDARs based on stacking fixed-wavelength laser arrays and inertial scanning have not been able to achieve the desired dynamic focusing patterns and agile scalability simultaneously. Moreover, the ability to synchronously acquire multi-dimensional physical parameters, including distance, direction, Doppler, and color, through seamless fusion between multiple sensors, still remains elusive in LiDAR. Here, we overcome these limitations and demonstrate a bio-inspired frequency-modulated continuous wave (FMCW) LiDAR system with dynamic and scalable gazing capability. Our chip-scale LiDAR system is built using hybrid integrated photonic solutions, where a frequency-chirped external cavity laser provides broad spectral tunability, while on-chip electro-optic combs with elastic channel spacing allow customizable imaging granularity. Using the dynamic zoom-in capability and the coherent FMCW scheme, we achieve a state-of-the-art resolution of 0.012 degrees, providing up to 15 times the resolution of conventional 3D LiDAR sensors, with 115 equivalent scanning lines and 4D parallel imaging. We further demonstrate cooperative sensing between our adaptive coherent LiDAR and a camera to enable high-resolution color-enhanced machine vision.

Submitted 11 October, 2024; originally announced October 2024.

arXiv:2410.00153 [pdf, other]

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

Authors: Haiyan Zhao, Heng Zhao, Bo Shen, Ali Payani, Fan Yang, Mengnan Du

Abstract: Probing learned concepts in large language models (LLMs) is crucial for understanding how semantic knowledge is encoded internally. Training linear classifiers on probing tasks is a principal approach to denote the vector of a certain concept in the representation space. However, the single vector identified for a concept varies with both data and training, making it less robust and weakening its effectiveness in real-world applications. To address this challenge, we propose an approach to approximate the subspace representing a specific concept. Built on linear probing classifiers, we extend the concept vectors into Gaussian Concept Subspace (GCS). We demonstrate GCS's effectiveness through measuring its faithfulness and plausibility across multiple LLMs with different sizes and architectures. Additionally, we use representation intervention tasks to showcase its efficacy in real-world applications such as emotion steering. Experimental results indicate that GCS concept vectors have the potential to balance steering performance with maintaining fluency in natural language generation tasks.

Submitted 30 September, 2024; originally announced October 2024.

Comments: 28 pages, 9 figures

arXiv:2409.19574 [pdf, other]

The Devil is in the Sources! Knowledge Enhanced Cross-Domain Recommendation in an Information Bottleneck Perspective

Authors: Binbin Hu, Weifan Wang, Hanshu Wang, Ziqi Liu, Bin Shen, Yong He, Jiawei Chen

Abstract: Cross-domain Recommendation (CDR) aims to alleviate the data sparsity and the cold-start problems in traditional recommender systems by leveraging knowledge from an informative source domain. However, previously proposed CDR models pursue an imprudent assumption that all information from the source domain contributes equally to the target domain, neglecting the evil part that is completely irrelevant to users' intrinsic interest. To address this concern, in this paper, we propose a novel knowledge enhanced cross-domain recommendation framework named CoTrans, which remolds the core procedures of CDR models with: Compression on the knowledge from the source domain and Transfer of the purity to the target domain. Specifically, following the theory of Graph Information Bottleneck, CoTrans first compresses the source behaviors with the perception of information from the target domain. Then to preserve all the important information for the CDR task, the feedback signals from both domains are utilized to promote the effectiveness of the transfer procedure. Additionally, a knowledge-enhanced encoder is employed to narrow gaps caused by the non-overlapped items across separate domains. Comprehensive experiments on three widely used cross-domain datasets demonstrate that CoTrans significantly outperforms both single-domain and state-of-the-art cross-domain recommendation approaches.

Submitted 29 September, 2024; originally announced September 2024.

Comments: Accepted by CIKM 2024

arXiv:2409.14077 [pdf, other]

Pressure-dependent magnetism of the Kitaev candidate Li$_2$RhO$_3$

Authors: Bin Shen, Efrain Insuasti, Ramesh Dhakal, Friedrich Freund, Philipp Gegenwart, Stephen M. Winter, Alexander A. Tsirlin

Abstract: We use magnetization measurements under pressure along with \textit{ab initio} and cluster many-body calculations to investigate magnetism of the Kitaev candidate Li$_2$RhO$_3$. Hydrostatic compression to 4.2~GPa leads to a decrease in the magnitude of the ferromagnetic Kitaev coupling $K$ and the corresponding increase in the off-diagonal anisotropy $Γ$, whereas the experimental Curie-Weiss temperature changes from negative to positive with the slope of +40~K/GPa. On the other hand, spin freezing persists up to at least 3.25~GPa with the almost constant freezing temperature of $5.5-6.0$~K that does not follow the large changes in the exchange couplings and indicates the likely extrinsic origin of spin freezing. Magnetic frustration in Li$_2$RhO$_3$ is mainly related to the interplay between ferromagnetic $K$ and antiferromagnetic $Γ$, along with the weakness of the third-neighbor coupling $J_3$ that would otherwise stabilize zigzag order. The small $J_3\simeq 0.1$\,meV distinguishes Li$_2$RhO$_3$ from other Kitaev candidates.

Submitted 21 September, 2024; originally announced September 2024.

Comments: Main text: 8 pages, 6 figures. Supplemental material: 3 pages, 2 figures

arXiv:2409.13558 [pdf]

doi 10.1002/advs.202406882

Tunable Anomalous Hall Effect in a Kagome Ferromagnetic Weyl Semimetal

Authors: Samuel E. Pate, Bin Wang, Yang Zhang, Bing Shen, Enke Liu, Ivar Martin, J. Samuel Jiang, Xiuquan Zhou, Duck Young Chung, Mercouri G. Kanatzidis, Ulrich Welp, Wai-Kwong Kwok, Zhi-Li Xiao

Abstract: Emerging from the intricate interplay of topology and magnetism, the giant anomalous Hall effect (AHE) is the best-known topological property of the recently discovered kagome ferromagnetic Weyl semimetal Co_3Sn_2S_2 with the magnetic Co atoms arranged on a kagome lattice. Here we report that the AHE in Co_3Sn_2S_2 can be fine-tuned by an applied magnetic field orientated within ~2 degrees of the kagome plane, while beyond this regime, it stays unchanged. Particularly, it can vanish in magnetic fields parallel to the kagome plane and even decrease in magnetic fields collinear with the spin direction. This tunable AHE can be attributed to local spin switching enabled by the geometrical frustration of the magnetic kagome lattice, revealing that spins in a kagome ferromagnet change their switching behavior as the magnetic field approaches the kagome plane. Our results also suggest a versatile way to tune the properties of a kagome magnet.

Submitted 20 September, 2024; originally announced September 2024.

Journal ref: Adv. Sci. 11, 2406882 (2024)

arXiv:2409.10628 [pdf]

Single-atom-resolved vibrational spectroscopy of a dislocation

Authors: Hailing Jiang, Tao Wang, Zhenyu Zhang, Ruochen Shi, Xifan Xu, Bowen Sheng, Fang Liu, Weikun Ge, Ping Wang, Bo Shen, Peng Gao, Lucas R Lindsay, Xinqiang Wang

Abstract: Phonon resistance from dislocation scattering is often divided into short-range core interactions and long-range strain field interactions. Using electron energy-loss spectroscopy on a GaN dislocation, we report observations of vibrational modes localized at specific core atoms (short-range) and strain-driven phonon energy shifts around the dislocation (long-range). Ab initio calculations support these findings and draw out additional details. This study reveals atomically resolved vibrational spectra of dislocations, thus offering insights for engineering improved material functionalities.

Submitted 16 September, 2024; originally announced September 2024.

arXiv:2409.09712 [pdf]

Topological Nodal Chains and Transverse Transports in Ferromagnetic Centrosymmetric Semimetal FeIn2S4

Authors: Junyan Liu, Yibo Wang, Xuebin Dong, Jinying Yang, Shen Zhang, Meng Lyu, Binbin Wang, Hongxiang Wei, Shouguo Wang, Enke Liu, Baogen Shen

Abstract: Nodal chain semimetals protected by nonsymmorphic symmetries are distinct from Dirac and Weyl semimetals, featuring unconventional topological surface states and resulting in anomalous magnetotransport properties. Here, we reveal that the ferromagnetic FeIn2S4 is a suitable nodal chain candidate in theory. Centrosymmetric FeIn2S4 with nonsymmorphic symmetries shows half-metallicity and clean band-crossings with hourglass-type dispersion tracing out nodal lines. Owing to glide mirror symmetries, the nontrivial nodal loops form a nodal chain, which is associated with the perpendicular glide mirror planes. These nodal chains are robust against spin-orbital interaction, giving rise to the coexistence of drumhead-type surface states and closed surface Fermi arcs. Moreover, the nodal loops protected by nonsymmorphic symmetry contribute to large anomalous Hall conductivity and anomalous Nernst conductivity. Our results provide a platform to explore the intriguing topological state and transverse transport properties in magnetic systems.

Submitted 15 September, 2024; originally announced September 2024.

Comments: 7 figs and 1 table

arXiv:2409.09709 [pdf]

Scaling the topological transport based on an effective Weyl model

Authors: Shen Zhang, Jinying Yang, Meng Lyu, Junyan Liu, Binbin Wang, Hongxiang Wei, Claudia Felser, Wenqing Zhang, Enke Liu, Baogen Shen

Abstract: Magnetic topological semimetals are increasingly fueling interest in exotic electronic-thermal physics including thermoelectrics and spintronics. Controlling the transport of topological carriers in such materials has become a central issue. However, the topological bands in real materials are normally intricate, leaving obstacles to understanding the transport in a physically clear way. Parallel to the renowned effective two-band model in magnetic field scale for semiconductors, here, an effective Weyl-band model in temperature scale was developed with pure Weyl state and a few meaningful parameters for topological semimetals. Based on the model, a universal scaling was established and subsequently verified by reported experimental transports. The essential sign regularity of anomalous Hall and Nernst transports was revealed with connection to chiralities of Weyl nodes and carrier types. Upon a double-Weyl model, a concept of Berry-curvature ferrimagnetic structure, as an analogy to the real-space magnetic structure, was further proposed and well described the emerging sign reversal of Nernst thermoelectric transports in temperature scale. Our study offers a convenient tool for scaling the Weyl-fermion-related transport physics, and promotes the modulations and applications of magnetic topological materials in future topological quantum devices.

Submitted 15 September, 2024; originally announced September 2024.

Comments: Five figs

arXiv:2409.04850 [pdf, ps, other]

Deep Computer Vision for Solar Physics Big Data: Opportunities and Challenges

Authors: Bo Shen, Marco Marena, Chenyang Li, Qin Li, Haodi Jiang, Mengnan Du, Jiajun Xu, Haimin Wang

Abstract: With recent missions such as advanced space-based observatories like the Solar Dynamics Observatory (SDO) and Parker Solar Probe, and ground-based telescopes like the Daniel K. Inouye Solar Telescope (DKIST), the volume, velocity, and variety of data have made solar physics enter a transformative era as solar physics big data (SPBD). With the recent advancement of deep computer vision, there are new opportunities in SPBD for tackling problems that were previously unsolvable. However, there are new challenges arising due to the inherent characteristics of SPBD and deep computer vision models. This vision paper presents an overview of the different types of SPBD, explores new opportunities in applying deep computer vision to SPBD, highlights the unique challenges, and outlines several potential future research directions.

Submitted 7 September, 2024; originally announced September 2024.

arXiv:2408.14354 [pdf, other]

SWE-bench-java: A GitHub Issue Resolving Benchmark for Java

Authors: Daoguang Zan, Zhirong Huang, Ailun Yu, Shaoxin Lin, Yifan Shi, Wei Liu, Dong Chen, Zongshuai Qi, Hao Yu, Lei Yu, Dezhi Ran, Muhan Zeng, Bo Shen, Pan Bian, Guangtai Liang, Bei Guan, Pengjie Huang, Tao Xie, Yongji Wang, Qianxiang Wang

Abstract: GitHub issue resolving is a critical task in software engineering, recently gaining significant attention in both industry and academia. Within this task, SWE-bench has been released to evaluate issue resolving capabilities of large language models (LLMs), but has so far focused only on Python. However, supporting more programming languages is also important, as there is a strong demand in industry. As a first step toward multilingual support, we have developed a Java version of SWE-bench, called SWE-bench-java. We have publicly released the dataset, along with the corresponding Docker-based evaluation environment and leaderboard, which will be continuously maintained and updated in the coming months. To verify the reliability of SWE-bench-java, we implement a classic method, SWE-agent, and test several powerful LLMs on it. As is well known, developing a high-quality multi-lingual benchmark is time-consuming and labor-intensive, so we welcome contributions through pull requests or collaboration to accelerate its iteration and refinement, paving the way for fully automated programming.

Submitted 26 August, 2024; originally announced August 2024.

Comments: This work is in progress

arXiv:2408.11341 [pdf, other]

EHL*: Memory-Budgeted Indexing for Ultrafast Optimal Euclidean Pathfinding

Authors: Jinchun Du, Bojie Shen, Muhammad Aamir Cheema

Abstract: The Euclidean Shortest Path Problem (ESPP), which involves finding the shortest path in a Euclidean plane with polygonal obstacles, is a classic problem with numerous real-world applications. The current state-of-the-art solution, Euclidean Hub Labeling (EHL), offers ultra-fast query performance, outperforming existing techniques by 1-2 orders of magnitude in runtime efficiency. However, this performance comes at the cost of significant memory overhead, requiring up to tens of gigabytes of storage on large maps, which can limit its applicability in memory-constrained environments like mobile phones or smaller devices. Additionally, EHL's memory usage can only be determined after index construction, and while it provides a memory-runtime tradeoff, it does not fully optimize memory utilization. In this work, we introduce an improved version of EHL, called EHL*, which overcomes these limitations. A key contribution of EHL* is its ability to create an index that adheres to a specified memory budget while optimizing query runtime performance. Moreover, EHL* can leverage preknown query distributions, a common scenario in many real-world applications, to further enhance runtime efficiency. Our results show that EHL* can reduce memory usage by up to 10-20 times without much impact on query runtime performance compared to EHL, making it a highly effective solution for optimal pathfinding in memory-constrained environments.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.02586 [pdf, other]

Massive MIMO-OTFS-Based Random Access for Cooperative LEO Satellite Constellations

Authors: Boxiao Shen, Yongpeng Wu, Shiqi Gong, Heng Liu, Björn Ottersten, Wenjun Zhang

Abstract: This paper investigates joint device identification, channel estimation, and symbol detection for cooperative multi-satellite-enhanced random access, where orthogonal time-frequency space modulation with a large antenna array is utilized to combat the dynamics of the terrestrial-satellite links (TSLs). We introduce the generalized complex exponential basis expansion model to parameterize TSLs, thereby reducing the pilot overhead. By exploiting the block sparsity of the TSLs in the angular domain, a message passing algorithm is designed for initial channel estimation. Subsequently, we examine two cooperative modes to leverage the spatial diversity within satellite constellations: the centralized mode, where computations are performed at a high-power central server, and the distributed mode, where computations are offloaded to edge satellites with minimal signaling overhead. Specifically, in the centralized mode, device identification is achieved by aggregating backhaul information from edge satellites, and channel estimation and symbol detection are jointly enhanced through a structured approximate expectation propagation (AEP) algorithm. In the distributed mode, edge satellites share channel information and exchange soft information about data symbols, leading to a distributed version of AEP. The introduced basis expansion model for TSLs enables the efficient implementation of both centralized and distributed algorithms via fast Fourier transform. Simulation results demonstrate that the proposed schemes significantly outperform conventional algorithms in terms of the activity error rate, the normalized mean squared error, and the symbol error rate. Notably, the distributed mode achieves performance comparable to the centralized mode with only two exchanges of soft information about data symbols within the constellation.

Submitted 5 August, 2024; originally announced August 2024.

Comments: This paper has been accepted by IEEE Journal on Selected Areas in Communications

arXiv:2408.01989 [pdf, other]

doi 10.1016/j.visinf.2024.07.001

JobViz: Skill-driven Visual Exploration of Job Advertisements

Authors: Ran Wang, Qianhe Chen, Yong Wang, Boyang Shen, Lewei Xiong

Abstract: Online job advertisements on various job portals or websites have become the most popular way for people to find potential career opportunities nowadays. However, the majority of these job sites are limited to offering fundamental filters such as job titles, keywords, and compensation ranges. This often poses a challenge for job seekers in efficiently identifying relevant job advertisements that align with their unique skill sets amidst a vast sea of listings. Thus, we propose well-coordinated visualizations to provide job seekers with three levels of detail of job information: a skill-job overview visualizes skill sets, employment posts as well as relationships between them with a hierarchical visualization design; a post exploration view leverages an augmented radar-chart glyph to represent job posts and further facilitates users' swift comprehension of the pertinent skills necessitated by respective positions; a post detail view lists the specifics of selected job posts for profound analysis and comparison. By using a real-world recruitment advertisement dataset collected from 51Job, one of the largest job websites in China, we conducted two case studies and user interviews to evaluate JobViz. The results demonstrated the usefulness and effectiveness of our approach.

Submitted 4 August, 2024; originally announced August 2024.

arXiv:2407.15438 [pdf, other]

Integrated Mode-Hop-Free Tunable Lasers at 780 nm for Chip-Scale Classical and Quantum Photonic Applications

Authors: Joshua E. Castro, Eber Nolasco-Martinez, Paolo Pintus, Zeyu Zhang, Boqiang Shen, Theodore Morin, Lillian Thiel, Trevor J. Steiner, Nicholas Lewis, Sahil D. Patel, John E. Bowers, David M. Weld, Galan Moody

Abstract: In the last decade, remarkable advances in integrated photonic technologies have enabled table-top experiments and instrumentation to be scaled down to compact chips with significant reduction in size, weight, power consumption, and cost. Here, we demonstrate an integrated continuously tunable laser in a heterogeneous gallium arsenide-on-silicon nitride (GaAs-on-SiN) platform that emits in the far-red radiation spectrum near 780 nm, with 20 nm tuning range, <6 kHz intrinsic linewidth, and a >40 dB side-mode suppression ratio. The GaAs optical gain regions are heterogeneously integrated with low-loss SiN waveguides. The narrow linewidth lasing is achieved with an extended cavity consisting of a resonator-based Vernier mirror and a phase shifter. Utilizing synchronous tuning of the integrated heaters, we show mode-hop-free wavelength tuning over a range larger than 100 GHz (200 pm). To demonstrate the potential of the device, we investigate two illustrative applications: (i) the linear characterization of a silicon nitride microresonator designed for entangled-photon pair generation, and (ii) the absorption spectroscopy and locking to the D1 and D2 transition lines of 87-Rb. The performance of the proposed integrated laser holds promise for a broader spectrum of both classical and quantum applications in the visible range, encompassing communication, control, sensing, and computing.

Submitted 22 July, 2024; originally announced July 2024.

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.08424 [pdf, other]

Semantic Feature Division Multiple Access for Multi-user Digital Interference Networks

Authors: Shuai Ma, Chuanhui Zhang, Bin Shen, Youlong Wu, Hang Li, Shiyin Li, Guangming Shi, Naofal Al-Dhahir

Abstract: With the ever-increasing user density and quality of service (QoS) demand, 5G networks with limited spectrum resources are facing massive access challenges. To address these challenges, in this paper, we propose a novel discrete semantic feature division multiple access (SFDMA) paradigm for multi-user digital interference networks. Specifically, by utilizing deep learning technology, SFDMA extracts multi-user semantic information into discrete representations in distinguishable semantic subspaces, which enables multiple users to transmit simultaneously over the same time-frequency resources. Furthermore, based on a robust information bottleneck, we design an SFDMA-based multi-user digital semantic interference network for inference tasks, which can achieve approximate orthogonal transmission. Moreover, we propose an SFDMA-based multi-user digital semantic interference network for image reconstruction tasks, where the discrete outputs of the semantic encoders of the users are approximately orthogonal, which significantly reduces multi-user interference. Furthermore, we propose an Alpha-Beta-Gamma (ABG) formula for semantic communications, which is the first theoretical relationship between inference accuracy and transmission power. Then, we derive adaptive power control methods with closed-form expressions for inference tasks. Extensive simulations verify the effectiveness and superiority of the proposed SFDMA.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.05690 [pdf, other]

Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations

Authors: Bowen Shen, Zheng Lin, Daren Zha, Wei Liu, Jian Luan, Bin Wang, Weiping Wang

Abstract: Structured pruning fundamentally reduces computational and memory overheads of large language models (LLMs) and offers a feasible solution for end-side LLM deployment. Structurally pruned models remain dense and high-precision, highly compatible with further tuning and compression. However, as the coarse-grained structured pruning poses large damage to the highly interconnected model, achieving a high compression ratio for scaled-up LLMs remains a challenge. In this paper, we introduce a task-agnostic structured pruning approach coupled with a compact Transformer architecture design. The proposed approach, named TransAct, reduces transitional activations inside multi-head attention (MHA) and multi-layer perceptron (MLP) modules, while preserving the inter-module activations that are sensitive to perturbations. Hence, the LLM is pruned into an intra-module low-rank architecture, significantly reducing weights, KV Cache and attention computation. TransAct is implemented on the LLaMA model and evaluated on downstream benchmarks. Results verify the optimality of our approach at high compression with respect to both efficiency and performance. Further, ablation studies reveal the strength of activation-guided iterative pruning and provide experimental analysis on the redundancy of MHA and MLP modules.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Findings of ACL 2024

arXiv:2407.00814 [pdf, other]

Privacy-Aware Spectrum Pricing and Power Control Optimization for LEO Satellite Internet-of-Things

Authors: Bowen Shen, Kwok-Yan Lam, Feng Li

Abstract: Low earth orbit (LEO) satellite systems play an important role in next generation communication networks due to their ability to provide extensive global coverage with guaranteed communications in remote areas and isolated areas where base stations cannot be cost-efficiently deployed. With the pervasive adoption of LEO satellite systems, especially in the LEO Internet-of-Things (IoT) scenarios, their spectrum resource management requirements have become more complex as a result of massive service requests and high bandwidth demand from terrestrial terminals. For instance, when leasing the spectrum to terrestrial users and controlling the uplink transmit power, satellites collect user data for machine learning purposes, which usually are sensitive information such as location, budget and quality of service (QoS) requirement. To facilitate model training in LEO IoT while preserving the privacy of data, blockchain-driven federated learning (FL) is widely used by leveraging on a fully decentralized architecture. In this paper, we propose a hybrid spectrum pricing and power control framework for LEO IoT by combining blockchain technology and FL. We first design a local deep reinforcement learning algorithm for LEO satellite systems to learn a revenue-maximizing pricing and power control scheme. Then the agents collaborate to form a FL system. We also propose a reputation-based blockchain which is used in the global model aggregation phase of FL. Based on the reputation mechanism, a node is selected for each global training round to perform model aggregation and block generation, which can further enhance the decentralization of the network and guarantee the trust. Simulation tests are conducted to evaluate the performances of the proposed scheme. Our results show the efficiency of finding the maximum revenue scheme for LEO satellite systems while preserving the privacy of each agent.

Submitted 1 April, 2024; originally announced July 2024.

arXiv:2406.17349 [pdf, other]

Semantic Deep Hiding for Robust Unlearnable Examples

Authors: Ruohan Meng, Chenyu Yi, Yi Yu, Siyuan Yang, Bingquan Shen, Alex C. Kot

Abstract: Ensuring data privacy and protection has become paramount in the era of deep learning. Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration by adding small perturbations to data. However, such perturbations (e.g., noise, texture, color change) predominantly impact low-level features, making them vulnerable to common countermeasures. In contrast, semantic images with intricate shapes have a wealth of high-level features, making them more resilient to countermeasures and potential for producing robust unlearnable examples. In this paper, we propose a Deep Hiding (DH) scheme that adaptively hides semantic images enriched with high-level features. We employ an Invertible Neural Network (INN) to invisibly integrate predefined images, inherently hiding them with deceptive perturbations. To enhance data unlearnability, we introduce a Latent Feature Concentration module, designed to work with the INN, regularizing the intra-class variance of these perturbations. To further boost the robustness of unlearnable examples, we design a Semantic Images Generation module that produces hidden semantic images. By utilizing similar semantic information, this module generates similar semantic images for samples within the same classes, thereby enlarging the inter-class distance and narrowing the intra-class distance. Extensive experiments on CIFAR-10, CIFAR-100, and an ImageNet subset, against 18 countermeasures, reveal that our proposed method exhibits outstanding robustness for unlearnable examples, demonstrating its efficacy in preventing unauthorized data exploitation.

Submitted 25 June, 2024; originally announced June 2024.

Comments: Accepted by TIFS 2024

arXiv:2406.11466 [pdf, other]

Unbounded sequential multipartite nonlocality via violation of Mermin inequality

Authors: Bang-Zhu Shen, Mao-Sheng Li

Abstract: Quantum nonlocality is a significant feature in quantum information theory, prompting recent investigations into the potential reuse of post-measurement states to uncover nonlocality among sequentially measuring observers. While prior studies primarily focused on bipartite or tripartite systems and observers with one chain, such as multiple Bobs with a single Alice or multiple Charlies with a single Alice and Bob, our work extends beyond this framework. We explore sequential nonlocality in systems comprising more parties and observer chains. Our findings reveal that in $n$-partite systems, regardless of whether it is a single-chain or double-chain scenario, there exist unbounded sequential observers capable of detecting nonlocality through violations of the Mermin inequality. In contrast to the conjecture that sequential Bell nonlocality cannot manifest with multiple Alices and Bobs in bipartite systems (i.e., the double-chain setting) [Phys. Rev. A 104, L060201 (2021)], our results suggest that increasing the number of subsystems may enable more observer chains to detect nonlocality alongside single observers. Our study advances research on sequential nonlocality, providing valuable insights into its detection across diverse scenarios.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 17 pages, 2 figures

arXiv:2406.07003 [pdf, other]

GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model

Authors: Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

Abstract: The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit less satisfactory performance on repository-level completion due to the lack of repository-specific knowledge in these LLMs. To address this problem, we propose GraphCoder, a retrieval-augmented code completion framework that leverages LLMs' general code knowledge and the repository-specific knowledge via a graph-based retrieval-generation process. In particular, GraphCoder captures the context of completion target more accurately through code context graph (CCG) that consists of control-flow, data- and control-dependence between code statements, a more structured way to capture the completion target context than the sequence-based context used in existing retrieval-augmented approaches; based on CCG, GraphCoder further employs a coarse-to-fine retrieval process to locate context-similar code snippets with the completion target from the current repository. Experimental results demonstrate both the effectiveness and efficiency of GraphCoder: Compared to baseline retrieval-augmented methods, GraphCoder achieves higher exact match (EM) on average, with increases of +6.06 in code match and +6.23 in identifier match, while using less time and space.

Submitted 13 September, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.05792 [pdf]

doi 10.1063/5.0214167

Above room-temperature two-dimensional ferromagnetic half-metals in Mn-based Janus magnets

Authors: Xiang-Fan Huang, Kang-Jie Li, Zequan Wang, Shi-Bo Zhao, Bing Shen, Zu-Xin Chen, Yusheng Hou

Abstract: Two-dimensional (2D) ferromagnets and their heterostructures offer fertile grounds for designing fascinating functionalities in ultra-thin spintronic devices. Here, by first-principles calculations, we report the discovery of energetically and thermodynamically stable 2D ferromagnets with very strong in-plane magnetic anisotropy in MnXY (X = S and Se; Y = Cl, Br, and I) monolayers. Remarkably, we find that the Curie temperatures of the ferromagnetic MnSBr, MnSI, MnSeCl, and MnSeI monolayers are as high as 271, 273, 231 and 418 K, respectively. In addition, we demonstrate that these ferromagnetic monolayers are intrinsic half-metals with large spin band gaps ranging from 2.5 eV to 3.2 eV. When spin-orbit coupling is considered in these ferromagnetic monolayers, the nature of their half-metal is almost unaffected. Finally, the strong in-plane magnetic anisotropy of MnSY (Y = Br, I) and MnSeY (Y = Cl, I) monolayers originates mainly from halogen and chalcogen atoms, respectively. Our work shows 2D Janus Mn-based ferromagnetic half-metals may have appealing functionalities in high-performance spintronic applications.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 16 pages, 4 figures, accepted by Applied Physics Letters

Journal ref: Appl. Phys. Lett. 124, 252402 (2024)

arXiv:2406.05555 [pdf, ps, other]

doi 10.1109/MCOM.001.2100704

OAM-SWIPT for IoE-Driven 6G

Authors: Runyu Lyu, Wenchi Cheng, Bazhong Shen, Zhiyuan Ren, Hailin Zhang

Abstract: Simultaneous wireless information and power transfer (SWIPT), which achieves both wireless energy transfer (WET) and information transfer, is an attractive technique for future Internet of Everything (IoE) in the sixth-generation (6G) mobile communications. With SWIPT, battery-less IoE devices can be powered while communicating with other devices. Line-of-sight (LOS) RF transmission and near-field inductive coupling based transmission are typical SWIPT scenarios, which are both LOS channels and without enough degree of freedom for high spectrum efficiency as well as high energy efficiency. Due to the orthogonal wavefronts, orbital angular momentum (OAM) can facilitate the SWIPT in LOS channels. In this article, we introduce the OAM-based SWIPT as well as discuss some basic advantages and challenges for it. After introducing the OAM-based SWIPT for IoE, we first propose an OAM-based SWIPT system model with the OAM-modes assisted dynamic power splitting (DPS). Then, four basic advantages regarding the OAM-based SWIPT are reviewed with some numerical analyses for further demonstrating the advantages. Next, four challenges regarding integrating OAM into SWIPT and possible solutions are discussed. OAM technology provides multiple orthogonal streams to increase both spectrum and energy efficiencies for SWIPT, thus creating many opportunities for future WET and SWIPT research.

Submitted 8 June, 2024; originally announced June 2024.

Comments: 7 pages, 6 figures

Journal ref: in IEEE Communications Magazine, vol. 60, no. 3, pp. 19-25, March 2022

arXiv:2406.04902 [pdf, other]

Beyond Data, Towards Sustainability: A Sydney Case Study on Urban Digital Twins

Authors: Ammar Sohail, Bojie Shen, Muhammad Aamir Cheema, Mohammed Eunus Ali, Anwaar Ulhaq, Muhammad Ali Babar, Asama Qureshi

Abstract: As urban areas grapple with unprecedented challenges stemming from population growth and climate change, the emergence of urban digital twins offers a promising solution. This paper presents a case study focusing on Sydney's urban digital twin, a virtual replica integrating diverse real-time and historical data, including weather, crime, emissions, and traffic. Through advanced visualization and data analysis techniques, the study explores some applications of this digital twin in urban sustainability, such as spatial ranking of suburbs and automatic identification of correlations between variables. Additionally, the research delves into predictive modeling, employing machine learning to forecast traffic crash risks using environmental data, showcasing the potential for proactive interventions. The contributions of this work lie in the comprehensive exploration of a city-scale digital twin for sustainable urban planning, offering a multifaceted approach to data-driven decision-making.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.03792 [pdf, other]

Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning

Authors: Naibin Gu, Peng Fu, Xiyu Liu, Bowen Shen, Zheng Lin, Weiping Wang

Abstract: Parameter-efficient fine-tuning (PEFT) has emerged as the predominant technique for fine-tuning in the era of large language models. However, existing PEFT methods still have inadequate training efficiency. Firstly, the utilization of large-scale foundation models during the training process is excessively redundant for certain fine-tuning tasks. Secondly, as the model size increases, the growth in trainable parameters of empirically added PEFT modules becomes non-negligible and redundant, leading to inefficiency. To achieve task-specific efficient fine-tuning, we propose the Light-PEFT framework, which includes two methods: Masked Early Pruning of the Foundation Model and Multi-Granularity Early Pruning of PEFT. The Light-PEFT framework allows for the simultaneous estimation of redundant parameters in both the foundation model and PEFT modules during the early stage of training. These parameters can then be pruned for more efficient fine-tuning. We validate our approach on GLUE, SuperGLUE, QA tasks, and various models. With Light-PEFT, parameters of the foundation model can be pruned by up to over 40%, while still controlling trainable parameters to be only 25% of the original PEFT method. Compared to utilizing the PEFT method directly, Light-PEFT achieves training and inference speedup, reduces memory usage, and maintains comparable performance and the plug-and-play feature of PEFT.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Findings of ACL 2024

arXiv:2405.17852 [pdf, other]

doi 10.1007/s11433-024-2422-2

Advances in laser-plasma interactions using intense vortex laser beams

Authors: Yin Shi, Xiaomei Zhang, Alexey Arefiev, Baifei Shen

Abstract: Low-intensity light beams carrying Orbital Angular Momentum (OAM), commonly known as vortex beams, have garnered significant attention due to promising applications in areas ranging from optical trapping to communication. In recent years, there has been a surge in global research exploring the potential of high-intensity vortex laser beams and specifically their interactions with plasmas. This paper provides a comprehensive review of recent advances in this area. Compared to conventional laser beams, intense vortex beams exhibit unique properties such as twisted phase fronts, OAM delivery, hollow intensity distribution, and spatially isolated longitudinal fields. These distinct characteristics give rise to a multitude of rich phenomena, profoundly influencing laser-plasma interactions and offering diverse applications. The paper also discusses future prospects and identifies promising general research areas involving vortex beams. These areas include low-divergence particle acceleration, instability suppression, high-energy photon delivery with OAM, and the generation of strong magnetic fields. With growing scientific interest and application potential, the study of intense vortex lasers is poised for rapid development in the coming years.

Submitted 28 May, 2024; originally announced May 2024.

Journal ref: SCIENCE CHINA Physics, Mechanics & Astronomy (2024)

arXiv:2405.12754 [pdf, other]

Global-local Fourier Neural Operator for Accelerating Coronal Magnetic Field Model

Authors: Yutao Du, Qin Li, Raghav Gnanasambandam, Mengnan Du, Haimin Wang, Bo Shen

Abstract: Exploring the outer atmosphere of the sun has remained a significant bottleneck in astrophysics, given the intricate magnetic formations that significantly influence diverse solar events. Magnetohydrodynamics (MHD) simulations allow us to model the complex interactions between the sun's plasma, magnetic fields, and the surrounding environment. However, MHD simulation is extremely time-consuming, taking days or weeks for simulation. The goal of this study is to accelerate coronal magnetic field simulation using deep learning, specifically, the Fourier Neural Operator (FNO). FNO has been proven to be an ideal tool for scientific computing and discovery in the literature. In this paper, we propose a global-local Fourier Neural Operator (GL-FNO) that contains two branches of FNOs: the global FNO branch takes downsampled input to reconstruct global features while the local FNO branch takes original resolution input to capture fine details. The performance of the GL-FNO is compared with state-of-the-art deep learning methods, including FNO, U-NO, U-FNO, Vision Transformer, CNN-RNN, and CNN-LSTM, to demonstrate its accuracy, computational efficiency, and scalability. Furthermore, physics analysis from domain experts is also performed to demonstrate the reliability of GL-FNO. The results demonstrate that GL-FNO not only accelerates the MHD simulation (a few seconds for prediction, more than 20,000$\times$ speedup) but also provides reliable prediction capabilities, thus greatly contributing to the understanding of space weather dynamics. Our code implementation is available at https://github.com/Yutao-0718/GL-FNO

Submitted 8 September, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: 10 pages

arXiv:2405.10344 [pdf, ps, other]

Feasibility of Nash-Moser iteration for Cheng-Yau-type gradient estimates of nonlinear equations on complete Riemannian manifolds

Authors: Bin Shen, Yuhan Zhu

Abstract: In this manuscript, we employ the Nash-Moser iteration technique to determine a condition under which the positive solution $u$ of the generalized nonlinear Poisson equation $$\operatorname{div} (\varphi(|\nabla u|^2)\nabla u) + ψ(u^2)u = 0,$$ on a complete Riemannian manifold with Ricci curvature bounded from below can be shown to satisfy a Cheng-Yau-type gradient estimate. We define a class of $\varphi$-Laplacian operators by $Δ_{\varphi}(u):=\operatorname{div} (\varphi(|\nabla u|^2)\nabla u)$, where $\varphi$ is a $C^2$ function under certain growth conditions. This can be regarded as a natural generalization of the $p$-Laplacian, the $(p,q)$-Laplacian and the exponential Laplacian, as well as having a close connection to the prescribed mean curvature problem. We illustrate the feasibility of applying the Nash-Moser iteration to such Poisson equations to get the Cheng-Yau-type gradient estimates in different cases with various $\varphi$ and $ψ$. Utilizing these estimates, we prove the related Harnack inequalities and a series of Liouville theorems. Our results cover a wide range of quasilinear Laplace operators (e.g. the $p$-Laplacian for $\varphi(t)=t^{p/2-1}$) and Lichnerowicz-type nonlinear equations (i.e. $ψ(t) = At^{p} + Bt^{q} + Ct\log t + D$).

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.10216 [pdf, other]

Low-Rank Adaptation of Time Series Foundational Models for Out-of-Domain Modality Forecasting

Authors: Divij Gupta, Anubhav Bhatti, Suraj Parmar, Chen Dan, Yuwei Liu, Bingjie Shen, San Lee

Abstract: Low-Rank Adaptation (LoRA) is a widely used technique for fine-tuning large pre-trained or foundational models across different modalities and tasks. However, its application to time series data, particularly within foundational models, remains underexplored. This paper examines the impact of LoRA on contemporary time series foundational models: Lag-Llama, MOIRAI, and Chronos. We demonstrate LoRA's fine-tuning potential for forecasting the vital signs of sepsis patients in intensive care units (ICUs), emphasizing the models' adaptability to previously unseen, out-of-domain modalities. Integrating LoRA aims to enhance forecasting performance while reducing inefficiencies associated with fine-tuning large models on limited domain-specific data. Our experiments show that LoRA fine-tuning of time series foundational models significantly improves forecasting, achieving results comparable to state-of-the-art models trained from scratch on similar modalities. We conduct comprehensive ablation studies to demonstrate the trade-offs between the number of tunable parameters and forecasting performance and assess the impact of varying LoRA matrix ranks on model performance.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 5 pages, 3 figures. This work has been submitted to the ACM for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2405.07303 [pdf, other]

Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar , et al. (61 additional authors not shown)

Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China Jinping Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axions with mass up to 100 eV/$c^2$. Within the hadronic model of KSVZ, our results exclude axion mass $>5.3~\rm{eV}/c^2$ at 95\% C.L.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 7 pages, 5 figures

arXiv:2405.06995 [pdf, other]

Benchmarking Cross-Domain Audio-Visual Deception Detection

Authors: Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. Kot

Abstract: Automated deception detection is crucial for assisting humans in accurately assessing truthfulness and identifying deceptive behavior. Conventional contact-based techniques, like polygraph devices, rely on physiological signals to determine the authenticity of an individual's statements. Nevertheless, recent developments in automated deception detection have demonstrated that multimodal features derived from both audio and video modalities may outperform human observers on publicly available datasets. Despite these positive findings, the generalizability of existing audio-visual deception detection approaches across different scenarios remains largely unexplored. To close this gap, we present the first cross-domain audio-visual deception detection benchmark, which enables us to assess how well these methods generalize for use in real-world scenarios. We use widely adopted audio and visual features and different architectures for benchmarking, comparing single-to-single and multi-to-single domain generalization performance. To further explore the impact of using data from multiple source domains for training, we investigate three types of domain sampling strategies, including domain-simultaneous, domain-alternating, and domain-by-domain, for multi-to-single domain generalization evaluation. We also propose an algorithm, named "MM-IDGM", to enhance the generalization performance by maximizing the gradient inner products between modality encoders.
Furthermore, we propose the Attention-Mixer fusion method to improve performance, and we believe that this new cross-domain benchmark will facilitate future research in audio-visual deception detection.

Submitted 5 October, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

Comments: 12 pages

arXiv:2405.01714 [pdf, other]

Interpretable Vital Sign Forecasting with Model Agnostic Attention Maps

Authors: Yuwei Liu, Chen Dan, Anubhav Bhatti, Bingjie Shen, Divij Gupta, Suraj Parmar, San Lee

Abstract: Sepsis is a leading cause of mortality in intensive care units (ICUs), representing a substantial medical challenge. The complexity of analyzing diverse vital signs to predict sepsis further aggravates this issue. While deep learning techniques have been advanced for early sepsis prediction, their 'black-box' nature obscures the internal logic, impairing interpretability in critical settings like ICUs. This paper introduces a framework that combines a deep learning model with an attention mechanism that highlights the critical time steps in the forecasting process, thus improving model interpretability and supporting clinical decision-making. We show that the attention mechanism could be adapted to various black box time series forecasting models such as N-HiTS and N-BEATS. Our method preserves the accuracy of conventional deep learning models while enhancing interpretability through attention-weight-generated heatmaps. We evaluated our model on the eICU-CRD dataset, focusing on forecasting vital signs for sepsis patients. We assessed its performance using mean squared error (MSE) and dynamic time warping (DTW) metrics. We explored the attention maps of N-HiTS and N-BEATS, examining the differences in their performance and identifying crucial factors influencing vital sign forecasting.

Submitted 21 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: 8 pages, 4 figures

arXiv:2404.19320 [pdf, ps, other]

Strong enhancement of magnetic coercivity induced by uniaxial stress

Authors: Bin Shen, Franziska Breitner, Philipp Gegenwart, Anton Jesche

Abstract: The performance of permanent magnets is intricately tied to their magnetic hysteresis loop. In this study, we investigate the heavy-fermion ferromagnet CeAgSb$_2$ through magnetization measurements under uniaxial stress. We observe a 2400% increase in magnetic coercivity with just a modest stress of approximately 1 kbar. This effect persists even after pressure release, attributable to stress-induced defects that efficiently pin domain walls. Other magnetic properties such as ordering temperature and saturation moment exhibit only weak pressure dependencies and display full reversibility. Our findings offer a promising route for increasing coercive field strength and enhancing the energy product in ferromagnetic materials and are potentially applicable to a broad spectrum of commercial or emerging magnetic applications.

Submitted 30 April, 2024; originally announced April 2024.

Comments: Main text: 6 pages, 3 figures. Supplemental material: 3 pages, 3 figures

arXiv:2404.18439 [pdf, other]

$ν$-DBA: Neural Implicit Dense Bundle Adjustment Enables Image-Only Driving Scene Reconstruction

Authors: Yunxuan Mao, Bingqi Shen, Yifei Yang, Kai Wang, Rong Xiong, Yiyi Liao, Yue Wang

Abstract: The joint optimization of the sensor trajectory and 3D map is a crucial characteristic of bundle adjustment (BA), essential for autonomous driving. This paper presents $ν$-DBA, a novel framework implementing geometric dense bundle adjustment (DBA) using 3D neural implicit surfaces for map parametrization, which optimizes both the map surface and trajectory poses using geometric error guided by dense optical flow prediction. Additionally, we fine-tune the optical flow model with per-scene self-supervision to further improve the quality of the dense mapping. Our experimental results on multiple driving scene datasets demonstrate that our method achieves superior trajectory optimization and dense reconstruction accuracy. We also investigate the influences of photometric error and different neural geometric priors on the performance of surface reconstruction and novel view synthesis. Our method stands as a significant step towards leveraging neural implicit representations in dense bundle adjustment for more accurate trajectories and detailed environmental mapping.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.11987 [pdf, other]

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Authors: Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas

Abstract: We introduce MultiPhys, a method designed for recovering multi-person motion from monocular videos. Our focus lies in capturing coherent spatial placement between pairs of individuals across varying degrees of engagement. MultiPhys, being physically aware, exhibits robustness to jittering and occlusions, and effectively eliminates penetration issues between the two individuals. We devise a pipeline in which the motion estimated by a kinematic-based method is fed into a physics simulator in an autoregressive manner. We introduce distinct components that enable our model to harness the simulator's properties without compromising the accuracy of the kinematic estimates. This results in final motion estimates that are both kinematically coherent and physically compliant. Extensive evaluations on three challenging datasets characterized by substantial inter-person interaction show that our method significantly reduces errors associated with penetration and foot skating, while performing competitively with the state-of-the-art on motion accuracy and smoothness. Results and code can be found on our project page (http://www.iri.upc.edu/people/nugrinovic/multiphys/).

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.09793 [pdf, other]

First Search for Light Fermionic Dark Matter Absorption on Electrons Using Germanium Detector in CDEX-10 Experiment

Authors: J. X. Liu, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar , et al. (61 additional authors not shown)

Abstract: We present the first results of the search for sub-MeV fermionic dark matter absorbed by electron targets of germanium using the 205.4~kg$\cdot$day data collected by the CDEX-10 experiment, with the analysis threshold of 160~eVee. No significant dark matter (DM) signals over the background are observed. Results are presented as limits on the cross section of DM--electron interaction. We present new constraints of cross section in the DM range of 0.1--10 keV/$c^2$ for vector and axial-vector interaction. The upper limit on the cross section is set to be $\rm 5.5\times10^{-46}~cm^2$ for vector interaction, and $\rm 1.8\times10^{-46}~cm^2$ for axial-vector interaction at DM mass of 5 keV/$c^2$.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 6 pages, 4 figures

arXiv:2404.08947 [pdf, other]

Zero-Shot Code Representation Learning via Prompt Tuning

Authors: Nan Cui, Xiaodong Gu, Beijun Shen

Abstract: Learning code representations has been the core prerequisite of many software engineering tasks such as code clone detection and code generation. State-of-the-art program representation techniques mainly utilize pre-trained language models (PLMs) such as CodeBERT. A Transformer encoder is first pre-trained on a large-scale code corpus to acquire general knowledge about source code. The pre-trained model is then fine-tuned on specific tasks using an amount of labeled data. However, gathering training samples for the downstream tasks can be prohibitively expensive and impractical for domain-specific languages or project-specific tasks. Besides, pre-training and downstream tasks are usually heterogeneous, which makes it difficult to fully explore the knowledge learned during pre-training. In this paper, we propose Zecoler, a zero-shot approach for learning code representations. Zecoler is built upon a pre-trained programming language model. In order to elicit knowledge from the PLMs efficiently, Zecoler casts the downstream tasks to the same form of pre-training objectives by inserting trainable prompts into the original input. These prompts can guide PLMs on how to generate better results. Subsequently, we employ the prompt tuning technique to search for the optimal prompts for PLMs automatically. This enables the representation model to efficiently fit the downstream tasks through fine-tuning on the dataset in the source language domain and then reuse the pre-trained knowledge for the target domain in a zero-shot style.
We evaluate Zecoler in five code intelligence tasks including code clone detection, code search, method name prediction, code summarization, and code generation. The results show that our approach significantly outperforms baseline models under the zero-shot setting.

Submitted 13 April, 2024; originally announced April 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2204.08360

arXiv:2404.08241 [pdf]

Adaptive Anomaly Detection Disruption Prediction Starting from First Discharge on Tokamak

Authors: Xinkun Ai, Wei Zheng, Ming Zhang, Yonghua Ding, Dalong Chen, Zhongyong Chen, Bihao Guo, Chengshuo Shen, Nengchao Wang, Zhoujun Yang, Zhipeng Chen, Yuan Pan, Biao Shen, Binjia Xiao

Abstract: Plasma disruption presents a significant challenge in tokamak fusion, where it can cause severe damage and economic losses. Current disruption predictors mainly rely on data-driven methods, requiring extensive discharge data for training. However, future tokamaks require disruption prediction from the first shot, posing challenges of data scarcity during the early operation period. In this period, disruption prediction aims to support safe exploration of the operation range and accumulate the data necessary to develop advanced prediction models. Thus, predictors must adapt to evolving plasma environments during this exploration phase. To address these issues, this study proposes a cross-tokamak adaptive deployment method using the Enhanced Convolutional Autoencoder Anomaly Detection (E-CAAD) predictor, enabling disruption prediction from the first shot of new devices. Experimental results indicate the ability of the E-CAAD model trained on existing devices to effectively differentiate between disruption precursors and non-disruption samples on new devices, proving the feasibility of model cross-device transfer. Building upon this, adaptive learning from scratch and threshold adaptive adjustment strategies are proposed to achieve model cross-device transfer. The adaptive learning from scratch strategy enables the predictor to use scarce data during the early operation of the new device while rapidly adapting to changes in the operation environment.
The threshold adaptive adjustment strategy addresses the challenge of selecting warning thresholds on new devices where a validation set is lacking, ensuring that the warning thresholds adapt to changes in the operation environment. Finally, experiments transferring the model from J-TEXT to EAST exhibit comparable performance to EAST models trained with ample data, achieving a TPR of 85.88% and an FPR of 6.15%, with a 20 ms reserved MGI system reaction time.

Submitted 26 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: 18 pages, 7 figures

arXiv:2404.06819 [pdf, other]

Enc2DB: A Hybrid and Adaptive Encrypted Query Processing Framework

Authors: Hui Li, Jingwen Shi, Qi Tian, Zheng Li, Yan Fu, Bingqing Shen, Yaofeng Tu

Abstract: As cloud computing gains traction, data owners are outsourcing their data to cloud service providers (CSPs) for Database as a Service (DBaaS), bringing in a deviation of data ownership and usage, and intensifying privacy concerns, especially with potential breaches by hackers or CSP insiders. To address that, encrypted database services propose encrypting every tuple and query statement before submitting to the CSP, ensuring data confidentiality when the CSP is honest-but-curious, or even compromised. Existing solutions either employ property-preserving cryptography schemes, which can perform certain operations over ciphertext without decrypting the data at the CSP, or utilize a trusted execution environment (TEE) to safeguard data and computations from the CSP. Based on these efforts, we introduce Enc2DB, a novel secure database system, following a hybrid strategy on PostgreSQL and openGauss. We present a micro-benchmarking test and a self-adaptive mode switch strategy that can dynamically choose the best execution path (cryptography or TEE) to answer a given query. Besides, we also design and implement a ciphertext index compatible with the native cost model and query optimizers to accelerate query processing. An empirical study on the TPC-C test shows that Enc2DB outperforms pure TEE and cryptography solutions, and our ciphertext index implementation also outperforms the state-of-the-art cryptography-based system.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 33 pages, 33 figures, DASAFAA24

arXiv:2404.00659 [pdf, other]

Sign-reversal Anomalous Hall effect driven by a magnetic transition in Cr$_{7-δ}$Te$_8$

Authors: Bowen Chen, Xiaokai Wu, Zhiyu Liao, Zhendong Fu, Bing Xu, Meng Wang, Bing Shen

Abstract: The search for exotic spin configurations and related novel transport properties continues to be fueled by the promise of new electronic states and outstanding candidate components for spintronic applications. In layered Cr$_{7-δ}$Te$_8$, the applied field drives a previously unreported magnetic transition revealed by alternating current magnetic susceptibility measurements around room temperature. This observed magnetic transition results in a sign change for the anomalous Hall effect, which exhibits non-monotonic temperature dependence. A prominent topological Hall effect (THE) with a large value of 1 $μΩ\cdot$cm has been observed without breaking the inversion symmetry for Cr$_{7-δ}$Te$_8$. This robust THE can persist up to room temperature, attributed to the nonzero fluctuation-driven scalar spin chirality. The complicated interactions of long-range and short-range magnetic orders lead to rich exotic magnetic states with related novel transport properties in Cr$_{7-δ}$Te$_8$.

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.17085 [pdf, other]

doi 10.1103/PhysRevB.109.224402

Magnetic versus nonmagnetic polymorphs of RuBr$_3$ under pressure

Authors: Bin Shen, Victoria A. Ginga, Angel M. Arévalo-López, Gaston Garbarino, Ece Uykur, Marcos Goncalves-Faria, Prashanta K. Mukharjee, Philipp Gegenwart, Alexander A. Tsirlin

Abstract: Pressure evolution of the crystal structure and magnetism of the honeycomb $α$-RuBr$_3$ is studied using high-pressure x-ray diffraction, magnetometry, and density-functional band-structure calculations. Hydrostatic compression transforms antiferromagnetic $α$-RuBr$_3$ ($R\bar 3$) into paramagnetic $α'$-RuBr$_3$ ($P\bar 1$) where short Ru-Ru bonds cause magnetism collapse above 1.3 GPa at 0 K and 2.5 GPa at 295 K. Below this critical pressure, the Néel temperature of $α$-RuBr$_3$ increases with the slope of 1.8 K/GPa. Pressure tunes $α$-RuBr$_3$ away from the Kitaev limit, whereas increased third-neighbor in-plane coupling and interlayer coupling lead to a further stabilization of the collinear zigzag state. Both $α$- and $α'$-RuBr$_3$ are metastable at ambient pressure, but their transformation into the thermodynamically stable $β$-polymorph is kinetically hindered at room temperature.

Submitted 5 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 8 pages + Supplemental Material: published version

Journal ref: Phys. Rev. B 109, 224402 (2024)

arXiv:2403.16443 [pdf, other]

CodeS: Natural Language to Code Repository via Multi-Layer Sketch

Authors: Daoguang Zan, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei Guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, Lizhen Cui

Abstract: The impressive performance of large language models (LLMs) on code-related tasks has shown the potential of fully automated software development. In light of this, we introduce a new software engineering task, namely Natural Language to code Repository (NL2Repo). This task aims to generate an entire code repository from its natural language requirements. To address this task, we propose a simple yet effective framework CodeS, which decomposes NL2Repo into multiple sub-tasks by a multi-layer sketch. Specifically, CodeS includes three modules: RepoSketcher, FileSketcher, and SketchFiller. RepoSketcher first generates a repository's directory structure for given requirements; FileSketcher then generates a file sketch for each file in the generated structure; SketchFiller finally fills in the details for each function in the generated file sketch. To rigorously assess CodeS on the NL2Repo task, we carry out evaluations through both automated benchmarking and manual feedback analysis. For benchmark-based evaluation, we craft a repository-oriented benchmark, SketchEval, and design an evaluation metric, SketchBLEU. For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies. Extensive experiments prove the effectiveness and practicality of CodeS on the NL2Repo task.

Submitted 25 March, 2024; originally announced March 2024.

Comments: https://github.com/NL2Code/CodeS

arXiv:2403.12032 [pdf, other]

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

Authors: Hansheng Chen, Ruoxi Shi, Yulin Liu, Bokui Shen, Jiayuan Gu, Gordon Wetzstein, Hao Su, Leonidas Guibas

Abstract: Open-domain 3D object synthesis has been lagging behind image synthesis due to limited data and higher computational complexity. To bridge this gap, recent works have investigated multi-view diffusion but often fall short in either 3D consistency, visual quality, or efficiency. This paper proposes MVEdit, which functions as a 3D counterpart of SDEdit, employing ancestral sampling to jointly denoise multi-view images and output high-quality textured meshes. Built on off-the-shelf 2D diffusion models, MVEdit achieves 3D consistency through a training-free 3D Adapter, which lifts the 2D views of the last timestep into a coherent 3D representation, then conditions the 2D views of the next timestep using rendered views, without compromising visual quality. With an inference time of only 2-5 minutes, this framework achieves a better trade-off between quality and speed than score distillation. MVEdit is highly versatile and extendable, with a wide range of applications including text/image-to-3D generation, 3D-to-3D editing, and high-quality texture synthesis. In particular, evaluations demonstrate state-of-the-art performance in both image-to-3D and text-guided texture generation tasks. Additionally, we introduce a method for fine-tuning 2D latent diffusion models on small 3D datasets with limited resources, enabling fast low-resolution text-to-3D initialization.

Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: V2 note: Fix missing acknowledgements. Project page: https://lakonik.github.io/mvedit

arXiv:2403.11098 [pdf]

Single-Shot Single-Beam Coherent Raman Scattering Thermometry Based on Air Lasing

Authors: Xu Lu, Yewei Chen, Francesco Mazza, Siyi He, Zihan Li, Shunlin Huang, Quanjun Wang, Ning Zhang, Bo Shen, Yuzhu Wu, Jinping Yao, Ya Cheng

Abstract: Thermometric techniques with high accuracy, fast response speed and ease of implementation are desirable for the study of dynamic combustion environments, transient reacting flows, and non-equilibrium plasmas. Herein, single-shot single-beam coherent Raman scattering (SS-CRS) thermometry is developed, for the first time to our knowledge, by using air lasing as a probe. It is shown that the air-lasing-assisted CRS signal has a high signal-to-noise ratio, enabling single-shot measurements at a 1 kHz repetition rate. The SS-CRS thermometry consistently exhibits precision better than 2% at different temperatures, but the inaccuracy grows with the increase in temperature. The high detection precision, 1 kHz acquisition rate and easy-to-implement single-beam scheme are achieved thanks to the unique temporal, spectral and spatial characteristics of air lasing. This work opens a novel avenue for high-speed CRS thermometry, holding tremendous potential for fast diagnostics of transient reacting flows and plasmas.

Submitted 17 March, 2024; originally announced March 2024.

Comments: 15 pages, 4 figures

arXiv:2403.10717 [pdf, other]

Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency

Authors: Soumyadeep Pal, Yuguang Yao, Ren Wang, Bingquan Shen, Sijia Liu

Abstract: Modern machine learning (ML) systems demand substantial training data, often resorting to external sources. Nevertheless, this practice renders them vulnerable to backdoor poisoning attacks. Prior backdoor defense strategies have primarily focused on the identification of backdoored models or poisoned data characteristics, typically operating under the assumption of access to clean data. In this work, we delve into a relatively underexplored challenge: the automatic identification of backdoor data within a poisoned dataset, all under realistic conditions, i.e., without the need for additional clean data or without manually defining a threshold for backdoor detection. We draw inspiration from the scaled prediction consistency (SPC) technique, which exploits the prediction invariance of poisoned data to an input scaling factor. Based on this, we pose the backdoor data identification problem as a hierarchical data splitting optimization problem, leveraging a novel SPC-based loss function as the primary optimization objective. Our innovation unfolds in several key aspects. First, we revisit the vanilla SPC method, unveiling its limitations in addressing the proposed backdoor identification problem. Subsequently, we develop a bi-level optimization-based approach to precisely identify backdoor data by minimizing the advanced SPC loss. Finally, we demonstrate the efficacy of our proposal against a spectrum of backdoor attacks, encompassing basic label-corrupted attacks as well as more sophisticated clean-label attacks, evaluated across various benchmark datasets. Experimental results show that our approach often surpasses the performance of current baselines in identifying backdoor data points, resulting in about 4%-36% improvement in average AUROC. Codes are available at https://github.com/OPTML-Group/BackdoorMSPC.

Submitted 15 March, 2024; originally announced March 2024.

Comments: The Twelfth International Conference on Learning Representations (ICLR 2024)

arXiv:2403.06215 [pdf]

doi 10.1021/acs.nanolett.4c01848

Observation of in-gap states in a two-dimensional CrI2/NbSe2 heterostructure

Authors: Peigen Li, Jihai Zhang, Di Zhu, Cui-Qun Chen, Enkui Yi, Bing Shen, Yusheng Hou, Zhongbo Yan, Dao-Xin Yao, Donghui Guo, Dingyong Zhong

Abstract: Low-dimensional magnetic structures coupled with superconductors are promising platforms for realizing Majorana zero modes, which have potential applications in topological quantum computing. Here, we report a two-dimensional (2D) magnetic-superconducting heterostructure consisting of single-layer chromium diiodide (CrI2) on a niobium diselenide (NbSe2) superconductor. Single-layer CrI2 nanosheets, which hold antiferromagnetic (AFM) ground states by our first-principles calculations, were epitaxially grown on the layered NbSe2 substrate. Using scanning tunneling microscopy/spectroscopy, we observed robust in-gap states spatially located at the edge of the nanosheets and defect-induced zero-energy peaks inside the CrI2 nanosheets. Magnetic-flux vortices induced by an external field exhibit broken threefold rotational symmetry of pristine NbSe2 superconductor, implying the efficient modulation of the interfacial superconducting states by the epitaxial CrI2 layer. A phenomenological model suggests the existence of chiral edge states in a 2D AFM-superconducting hybrid system with an even Chern number, providing a qualitatively plausible understanding for our experimental observation.

Submitted 25 July, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

Comments: 21 pages, 5 figures

Journal ref: Nano Letters 2024

Showing 1–50 of 477 results for author: Shen, B