-
Possible way to achieve anomalous valley Hall effect by tunable intrinsic piezoelectric polarization in FeO$_2$SiGeN$_2$ monolayer
Authors:
Jianke Tian,
Jia Li,
Hengbo Liu,
Yan Li,
Ze Liu,
Linyang Li,
Jun Li,
Guodong Liu,
Junjie Shi
Abstract:
Valley-related multiple Hall effects and piezoelectric response are novel transport characteristics in low-dimensional systems; however, few studies have reported their coexistence in a single system or their coupling relationships. Using first-principles calculations, we propose a multifunctional Janus semiconductor, i.e. FeO$_2$SiGeN$_2$ monolayer, with a large valley polarization of about 120 meV and in-plane piezoelectric polarization with d11 of -0.714.03 pm/V. The magnetic anisotropy energy can be significantly regulated by electronic correlation strength and strain, which can be attributed to the change in the competition among the Fe-3d-resolved magnetic anisotropy energy contributions brought about by these external regulatory means. Electronic correlation strength can induce phase transitions in the Janus FeO$_2$SiGeN$_2$ monolayer from the ferrovalley to the quantum anomalous Hall phase, while the half-valley metallic state at the boundary of the phase transition can generate 100% spin and valley polarization. The related phase transition mechanism is analyzed based on the two-band strained kp model. The presence of the piezoelectric strain coefficient d11 in a valleytronic material makes coupling between the charge and valley degrees of freedom possible, and the intrinsic electric field caused by the in-plane piezoelectric response provides a way to realize the piezoelectric anomalous valley Hall effect. This work may pave the way to finding new materials with valley-related multiple Hall effects and stimulate further experimental work related to valleytronics and piezotronics.
Submitted 21 October, 2024;
originally announced October 2024.
-
Piezoelectric Manipulation and Engineering for Layertronics in Two-Dimensional Materials
Authors:
Jianke Tian,
Jia Li,
Hengbo Liu,
Yan Li,
Ze Liu,
Linyang Li,
Jun Li,
Guodong Liu,
Junjie Shi
Abstract:
The electronic transport characteristics of two-dimensional (2D) systems have widespread application prospects in the fabrication of multifunctional nanodevices. However, current research on basic transport phenomena, such as the anomalous valley Hall effect (AVHE) and piezoelectric response, is limited to treating them separately. Here, we theoretically propose a valley-piezoelectricity coupling strategy beyond the existing paradigm to realize the AVHE and the layer Hall effect (LHE) in ferrovalley (FV) systems, and its essential principle can be extended to general valleytronic materials. Through first-principles calculations, we demonstrate that a large polarized electric field of $2.8\times10^{6}$ ($1.67\times10^{7}$) V/m can be induced by 0.1% uniaxial strain in FV 2H-LaHF (1T-LaHF) monolayers. In addition, the microscopic mechanism of the interlayer antiferromagnetic (AFM) state of the 2H-LaHF bilayer is uncovered by the spin Hamiltonian and the super-superexchange (SSE) interaction. Our findings pave the way for new explorations of valley Hall-related effects involving piezoelectricity.
Submitted 21 October, 2024;
originally announced October 2024.
-
Spin-layer coupling in altermagnets multilayer: a design principle for spintronics
Authors:
Jianke Tian,
Jia Li,
Hengbo Liu,
Yan Li,
Ze Liu,
Linyang Li,
Jun Li,
Guodong Liu,
Junjie Shi
Abstract:
The discovery of collinear symmetric-compensated altermagnets (AM) with intrinsic spin splitting provides a route towards energy-efficient and ultrafast device applications. Here, using first-principles calculations and symmetry analysis, we propose a series of AM Cr2SX (X=O, S, Se) monolayers and explore the spin splitting in Cr2SX multilayers. A general design principle for realizing spin-layer coupling in odd/even-layer systems is mapped out based on a comprehensive analysis of spin group symmetry. The spin splitting behavior related to the MzUt, Mz and ML symmetries in AM multilayers can be significantly modulated by magnetic order, crystal symmetry and an external perpendicular gate field (Ez). Due to the spin-compensated bands of sublayers linked by the overall Mz and interlayer ML symmetries, the Cr2S2 odd-layer exhibits the unique coexistence of spin splitting and spin degeneracy at high-symmetry paths and the X/Y valleys, respectively. Furthermore, owing to the higher priority of the overall ML symmetry compared to the interlayer ML symmetry in the AM even-layer, the spin-layer coupling of AM multilayers shows strong odd/even-layer dependence. Our work not only offers a new direction for manipulating spin splitting, but also greatly enriches the research on AM monolayers and multilayers.
Submitted 21 October, 2024;
originally announced October 2024.
-
Resolving turbulence drivers in luminous obscured quasars with JWST/NIRSpec IFU
Authors:
Mandy C. Chen,
Hsiao-Wen Chen,
Michael Rauch,
Andrey Vayner,
Weizhe Liu,
David S. N. Rupke,
Jenny E. Greene,
Nadia L. Zakamska,
Dominika Wylezalek,
Guilin Liu,
Sylvain Veilleux,
Nicole P. H. Nesvadba,
Caroline Bertemes
Abstract:
In this Letter, we investigate the turbulence and energy injection in the extended nebulae surrounding two luminous obscured quasars, WISEA J100211.29$+$013706.7 ($z=1.5933$) and SDSS J165202.64$+$172852.3 ($z=2.9489$). Utilizing high-resolution data from the NIRSpec IFU onboard the James Webb Space Telescope, we analyze the velocity fields of line-emitting gas in and around these quasars and construct the second-order velocity structure functions (VSFs) to quantify turbulent motions across different spatial scales. Our findings reveal a notable flattening in the VSFs from $\approx\!3$ kpc up to a scale of 10--20 kpc, suggesting that energy injection predominantly occurs at a scale $\lesssim$10 kpc, likely powered by quasar outflows and jet-driven bubbles. The extended spatial range of flat VSFs may also indicate the presence of multiple energy injection sources at these scales. For J1652, the turbulent energy in the host interstellar medium (ISM) is significantly higher than in tidally stripped gas, consistent with the expectation of active galactic nucleus (AGN) activities stirring up the host ISM. Compared to the VSFs observed on spatial scales of 10--50 kpc around lower-redshift UV-bright quasars, these obscured quasars exhibit higher turbulent energies in their immediate surroundings, implying different turbulence drivers between the ISM and halo-scale gas. Future studies with an expanded sample are essential to elucidate further the extent and the pivotal role of AGNs in shaping the gas kinematics of host galaxies and beyond.
Submitted 18 October, 2024;
originally announced October 2024.
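The second-order velocity structure function mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition, the mean squared velocity difference between pairs of points as a function of their separation; the function name `second_order_vsf`, the binning scheme, and the toy data are hypothetical and are not taken from the paper's analysis.

```python
import math
from itertools import combinations


def second_order_vsf(positions, velocities, bin_edges):
    """Mean squared velocity difference per separation bin.

    positions  : list of (x, y) coordinates, e.g. in kpc (assumed units)
    velocities : list of line-of-sight velocities, one per position
    bin_edges  : increasing list of separation bin edges
    Returns one value per bin, or None for a bin with no pairs.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    # Visit every unordered pair of measurements once.
    for (p, v), (q, w) in combinations(zip(positions, velocities), 2):
        sep = math.dist(p, q)
        for i in range(n_bins):
            if bin_edges[i] <= sep < bin_edges[i + 1]:
                sums[i] += (v - w) ** 2
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy data: a pure linear velocity gradient along x.
toy_positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
toy_velocities = [0.0, 1.0, 2.0]
print(second_order_vsf(toy_positions, toy_velocities, [0.5, 1.5, 2.5]))
```

For a pure gradient like the toy data, the function grows as the square of the separation; a flattening of the kind the abstract describes would instead show up as values that stop increasing with separation.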
-
First results from the JWST Early Release Science Program Q3D: The Fast Outflow in a Red Quasar at z=0.44
Authors:
Weizhe Liu,
Sylvain Veilleux,
Swetha Sankar,
David S. N. Rupke,
Nadia L. Zakamska,
Dominika Wylezalek,
Andrey Vayner,
Caroline Bertemes,
Yu-Ching Chen,
Yuzo Ishikawa,
Jenny E. Greene,
Timothy Heckman,
Guilin Liu,
Hsiao-Wen Chen,
Dieter Lutz,
Sean D. Johnson,
Nicole P. H. Nesvadba,
Patrick Ogle,
Nadiia Diachenko,
Andy D. Goulding,
Kevin N. Hainline,
Fred Hamann,
Hui Xian Grace Lim,
Nora Lützgendorf,
Vincenzo Mainieri
, et al. (4 additional authors not shown)
Abstract:
Quasar feedback may play a key role in the evolution of massive galaxies. The dust-reddened quasar F2M110648.35$+$480712 at $z = 0.4352$ is one of the few cases at its redshift that exhibits powerful quasar feedback through bipolar outflows. Our new observation with the integral field unit mode of the Near-Infrared Spectrograph onboard JWST opens a new window to examine this spectacular outflow through the Pa$α$ emission line with $\sim$3$\times$ better spatial resolution than previous work. The morphology and kinematics of the Pa$α$ nebula confirm the existence of a bipolar outflow extending on a scale of $\sim$17$\times$14 kpc and with a velocity reaching $\sim$1100 km s$^{-1}$. The higher spatial resolution of our new observation leads to more reliable measurements of outflow kinematics. Considering only the spatially resolved outflow and assuming an electron density of 100 cm$^{-3}$, the mass, momentum and kinetic energy outflow rates are $\sim$50-210 M$_{\odot}$ yr$^{-1}$, $\sim$0.3-1.7$\times$10$^{36}$ dynes ($\sim$14-78\% of the quasar photon momentum flux) and $\sim$0.16-1.27$\times$10$^{44}$ erg s$^{-1}$ ($\sim$0.02-0.20\% of the quasar bolometric luminosity), respectively. The local instantaneous outflow rates generally decrease radially. We infer that the quasar is powerful enough to drive the outflow, while stellar processes cannot be overlooked as a contributing energy source. The mass outflow rate is $\sim$0.4-1.5 times the star formation rate, and the ratio of kinetic energy outflow rate to the quasar bolometric luminosity is comparable to the minimum value required for negative quasar feedback in simulations. This outflow may help regulate the star formation activity within the system to some extent.
Submitted 18 October, 2024;
originally announced October 2024.
-
Feedback Schrödinger Bridge Matching
Authors:
Panagiotis Theodoropoulos,
Nikolaos Komianos,
Vincent Pacelli,
Guan-Horng Liu,
Evangelos A. Theodorou
Abstract:
Recent advancements in diffusion bridges for distribution transport problems have heavily relied on matching frameworks, yet existing methods often face a trade-off between scalability and access to optimal pairings during training. Fully unsupervised methods make minimal assumptions but incur high computational costs, limiting their practicality. On the other hand, imposing full supervision of the matching process with optimal pairings improves scalability; however, it can be infeasible in many applications. To strike a balance between scalability and minimal supervision, we introduce Feedback Schrödinger Bridge Matching (FSBM), a novel semi-supervised matching framework that incorporates a small portion (less than 8% of the entire dataset) of pre-aligned pairs as state feedback to guide the transport map of non-coupled samples, thereby significantly improving efficiency. This is achieved by formulating a static Entropic Optimal Transport (EOT) problem with an additional term capturing the semi-supervised guidance. The generalized EOT objective is then recast into a dynamic formulation to leverage the scalability of matching frameworks. Extensive experiments demonstrate that FSBM accelerates training and enhances generalization by leveraging coupled-pair guidance, opening new avenues for training matching frameworks with partially aligned datasets.
Submitted 17 October, 2024;
originally announced October 2024.
-
NSmark: Null Space Based Black-box Watermarking Defense Framework for Pre-trained Language Models
Authors:
Haodong Zhao,
Jinming Hu,
Peixuan Li,
Fangqi Li,
Jinrui Sha,
Peixuan Chen,
Zhuosheng Zhang,
Gongshen Liu
Abstract:
Pre-trained language models (PLMs) have emerged as critical intellectual property (IP) assets that necessitate protection. Although various watermarking strategies have been proposed, they remain vulnerable to Linear Functionality Equivalence Attacks (LFEA), which can invalidate most existing white-box watermarks without prior knowledge of the watermarking scheme or training data. This paper further analyzes and extends the attack scenarios of LFEA to the commonly employed black-box settings for PLMs by considering Last-Layer outputs (dubbed LL-LFEA). We discover that the null space of the output matrix remains invariant against LL-LFEA attacks. Based on this finding, we propose NSmark, a task-agnostic, black-box watermarking scheme capable of resisting LL-LFEA attacks. NSmark consists of three phases: (i) watermark generation using the digital signature of the owner, enhanced by spread spectrum modulation for increased robustness; (ii) watermark embedding through an output mapping extractor that preserves PLM performance while maximizing watermark capacity; (iii) watermark verification, assessed by extraction rate and null space conformity. Extensive experiments on both pre-training and downstream tasks confirm the effectiveness, reliability, fidelity, and robustness of our approach. Code is available at https://github.com/dongdongzhaoUP/NSmark.
Submitted 16 October, 2024;
originally announced October 2024.
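The null-space invariance that the abstract above relies on can be checked on a toy example. The sketch below assumes the attack acts as left-multiplication of an output matrix `A` by an invertible matrix `Q`; that orientation, the matrices, and the helper names are illustrative assumptions and do not reproduce the NSmark implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(r * c for r, c in zip(row, x)) for row in m]


def in_null_space(m, x):
    """True if m @ x is exactly the zero vector (integer arithmetic)."""
    return all(v == 0 for v in matvec(m, x))


# Hypothetical rank-1 "output matrix": its null space is two-dimensional.
A = [[1, 2, 3],
     [2, 4, 6]]

# Hypothetical invertible transformation (determinant 2*1 - 1*1 = 1).
Q = [[2, 1],
     [1, 1]]

QA = matmul(Q, A)

null_vectors = [[2, -1, 0], [3, 0, -1]]  # satisfy A @ x = 0
outside = [1, 0, 0]                      # A @ x != 0

for x in null_vectors:
    print(x, in_null_space(A, x), in_null_space(QA, x))
print(outside, in_null_space(A, outside), in_null_space(QA, outside))
```

Because `Q` is invertible, `Q A x = 0` holds exactly when `A x = 0`, so vectors inside the null space stay inside and vectors outside stay outside, which is the property a null-space-based check would exploit.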
-
Test of lepton flavour universality with $B_s^0 \rightarrow φ\ell^+\ell^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1124 additional authors not shown)
Abstract:
Lepton flavour universality in rare $b\rightarrow s$ transitions is tested for the first time using $B_s^0$ meson decays. The measurements are performed using $pp$ collision data collected by the LHCb experiment between 2011 and 2018, corresponding to a total integrated luminosity of 9$\,{\rm fb}^{-1}$. Branching fraction ratios between the $B_s^0 \rightarrow φe^+e^-$ and $B_s^0 \rightarrow φμ^+μ^-$ decays are measured in three regions of dilepton mass squared, $q^2$, with $0.1 < q^2 < 1.1$, $1.1 < q^2 < 6.0$, and $15 < q^2 < 19\,{\rm GeV}^2/c^4$. The results agree with the Standard Model expectation of lepton flavour universality.
Submitted 17 October, 2024;
originally announced October 2024.
-
Observation of a rare beta decay of the charmed baryon with a Graph Neural Network
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
The study of beta decay of the charmed baryon provides unique insights into the fundamental mechanism of the strong and electro-weak interactions. The $Λ_c^+$, being the lightest charmed baryon, undergoes disintegration solely through the charm quark weak decay. Its beta decay provides an ideal laboratory for investigating non-perturbative effects in quantum chromodynamics and for constraining the fundamental parameters of the Cabibbo-Kobayashi-Maskawa matrix in weak interaction theory. This article presents the first observation of the Cabibbo-suppressed $Λ_c^+$ beta decay into a neutron $Λ_c^+ \rightarrow n e^+ ν_{e}$, based on $4.5~\mathrm{fb}^{-1}$ of electron-positron annihilation data collected with the BESIII detector in the energy region above the $Λ^+_c\barΛ^-_c$ threshold. A novel machine learning technique, leveraging Graph Neural Networks, has been utilized to effectively separate signals from dominant backgrounds, particularly $Λ_c^+ \rightarrow Λe^+ ν_{e}$. This approach has yielded a statistical significance of more than $10σ$. The absolute branching fraction of $Λ_c^+ \rightarrow n e^+ ν_{e}$ is measured to be $(3.57\pm0.34_{\mathrm{stat}}\pm0.14_{\mathrm{syst}})\times 10^{-3}$. For the first time, the CKM matrix element $\left|V_{cd}\right|$ is extracted via a charmed baryon decay to be $0.208\pm0.011_{\rm exp.}\pm0.007_{\rm LQCD}\pm0.001_{τ_{Λ_c^+}}$. This study provides a new probe to further understand fundamental interactions in the charmed baryon sector, and demonstrates the power of modern machine learning techniques in enhancing experimental capability in high energy physics research.
Submitted 17 October, 2024;
originally announced October 2024.
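A reader who wants a single total uncertainty for the branching fraction quoted above can combine the statistical and systematic parts in quadrature. This is a minimal sketch that assumes the two uncertainties are independent; the abstract itself quotes them separately, and the function name `total_uncertainty` is hypothetical.

```python
import math


def total_uncertainty(stat, syst):
    """Combine two independent uncertainties in quadrature."""
    return math.hypot(stat, syst)


# Values from the abstract, in units of 1e-3: 3.57 +/- 0.34 (stat) +/- 0.14 (syst).
central = 3.57
sigma = total_uncertainty(0.34, 0.14)
print(f"B = ({central:.2f} +/- {sigma:.2f}) x 10^-3")
```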
-
Observation of $χ_{c0}\toΣ^{+}\barΣ^{-}η$ and evidence for $χ_{c1,2}\toΣ^{+}\barΣ^{-}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, the decay $χ_{c0}\toΣ^{+}\barΣ^{-}η$ is observed for the first time with a statistical significance of $7.0σ$, and evidence for $χ_{c1}\toΣ^{+}\barΣ^{-}η$ and $χ_{c2}\toΣ^{+}\barΣ^{-}η$ is found with statistical significances of $4.3σ$ and $4.6σ$, respectively. The branching fractions are determined to be $\mathcal{B}(χ_{c0}\toΣ^{+}\barΣ^{-}η)=({1.26 \pm 0.20 \pm 0.13}) \times 10^{-4}, ~\mathcal{B}(χ_{c1}\toΣ^{+}\barΣ^{-}η)=({5.10 \pm 1.21 \pm 0.67}) \times 10^{-5}$, and $\mathcal{B}(χ_{c2}\toΣ^{+}\barΣ^{-}η)=({5.46 \pm 1.18 \pm 0.50}) \times 10^{-5}$, where the first uncertainties are statistical, and the second ones are systematic.
Submitted 17 October, 2024;
originally announced October 2024.
-
ChannelGPT: A Large Model to Generate Digital Twin Channel for 6G Environment Intelligence
Authors:
Li Yu,
Lianzheng Shi,
Jianhua Zhang,
Jialin Wang,
Zhen Zhang,
Yuxiang Zhang,
Guangyi Liu
Abstract:
6G is envisaged to provide multimodal sensing, pervasive intelligence, global coverage, etc., which introduces extreme intricacy and new challenges for network design and optimization. As the core part of 6G, the wireless channel is the carrier and enabler for flourishing technologies and novel services, and it intrinsically determines the ultimate system performance. However, how to describe and utilize the complicated and highly dynamic characteristics of the wireless channel accurately and effectively remains a great challenge. To tackle this, the digital twin is envisioned as a powerful technology to migrate physical entities to the virtual and computational world. In this article, we propose a large-model-driven digital twin channel generator (ChannelGPT) embedded with environment intelligence (EI) to enable a pervasive intelligence paradigm for 6G networks. EI is an iterative and interactive procedure to boost system performance with online environment adaptivity. Firstly, ChannelGPT is capable of utilizing multimodal data from the wireless channel and the corresponding physical environment with its equipped sensing ability. Then, based on the fine-tuned large model, ChannelGPT can generate multi-scenario channel parameters, associated map information and wireless knowledge simultaneously, according to each task requirement. Furthermore, with the support of online multidimensional channel and environment information, the network entity can make accurate and immediate decisions for each 6G system layer. In practice, we also establish a ChannelGPT prototype to generate high-fidelity channel data for varied scenarios to validate the accuracy and generalization ability based on environment intelligence.
Submitted 17 October, 2024;
originally announced October 2024.
-
Observation of the Singly Cabibbo-Suppressed Decay $Λ_c^{+}\to pπ^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Utilizing 4.5${~\rm{fb}}^{-1}$ of $e^+e^-$ annihilation data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 4.600 and 4.699 GeV, the first observation of the singly Cabibbo-suppressed decay $Λ_c^{+}\to pπ^0$ is presented, with a statistical significance of $5.4σ$. The ratio of the branching fractions of $Λ_c^{+}\to pπ^0$ and $Λ_c^{+}\to pη$ is measured as $\mathcal{B}(Λ_c^{+}\to pπ^0)/\mathcal{B}(Λ_c^{+}\to pη)=(0.120\pm0.026_{\rm stat.}\pm0.007_{\rm syst.})$. This result resolves the longstanding discrepancy between earlier experimental searches, providing both a decisive conclusion and valuable input for QCD-inspired theoretical models. A sophisticated deep learning approach using a Transformer-based architecture is employed to distinguish the signal from the prevalent hadronic backgrounds, complemented by thorough validation and systematic uncertainty quantification.
Submitted 17 October, 2024;
originally announced October 2024.
-
Search for $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ at center-of-mass energies from 4.47 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Utilizing a data set of $6.7$ fb$^{-1}$ from electron-positron collisions recorded by the BESIII detector at the BEPCII storage ring, a search is conducted for the processes $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ across center-of-mass energies from 4.47 to 4.95 GeV. In the absence of any significant signals, upper limits are set. These include limits on the Born cross sections for $e^{+}e^{-} \to φχ_{c0}$, as well as the product of the Born cross section for $e^{+}e^{-} \to φη_{c2}(1D)$ and a sum of five branching fractions. Furthermore, the product of the electronic width of $Y(4660)$ and the branching fraction of the $Y(4660) \to φχ_{c0}$, denoted as $Γ^{Y(4660)}_{e^{+}e^{-}} \mathcal{B}_{Y(4660) \to φχ_{c0}}$, is determined to be $< 0.40$ eV at the 90\% confidence level.
Submitted 16 October, 2024;
originally announced October 2024.
-
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices
Authors:
Zhiyuan Ma,
Yuzhu Zhang,
Guoli Jia,
Liangliang Zhao,
Yichao Ma,
Mingjie Ma,
Gaofeng Liu,
Kaiyan Zhang,
Jianjun Li,
Bowen Zhou
Abstract:
As one of the most popular and sought-after generative models in recent years, diffusion models have sparked the interest of many researchers and have steadily shown excellent advantages in various generative tasks such as image synthesis, video generation, molecule design, 3D scene rendering and multimodal generation, relying on their dense theoretical principles and reliable application practices. The remarkable success of these recent efforts on diffusion models comes largely from progressive design principles and efficient architecture, training, inference, and deployment methodologies. However, there has not been a comprehensive and in-depth review that summarizes these principles and practices to help the rapid understanding and application of diffusion models. In this survey, we provide a new efficiency-oriented perspective on these existing efforts, which mainly focuses on the profound principles and efficient practices in architecture design, model training, fast inference and reliable deployment, to guide further theoretical research, algorithm migration and model application for new scenarios in a reader-friendly way. \url{https://github.com/ponyzym/Efficient-DMs-Survey}
Submitted 16 October, 2024; v1 submitted 15 October, 2024;
originally announced October 2024.
-
Observation of $χ_{cJ}\to p \bar p K^0_S K^- π^+ + c.c.$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (648 additional authors not shown)
Abstract:
By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decays of $χ_{cJ} \to p \bar{p} K^0_S K^- π^+ +c.c.(J=0, 1, 2)$ are observed for the first time with statistical significances greater than $10σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(2.61\pm0.27\pm0.32)\times10^{-5},$ $\mathcal{B}(χ_{c1}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(4.16\pm0.24\pm0.46)\times10^{-5},$ and $\mathcal{B}(χ_{c2}\to p \bar p K^{0}_{S} K^- π^+ + c.c.)=(5.63\pm0.28\pm0.46)\times10^{-5}$, respectively. The processes $χ_{c1,2} \to \bar{p} Λ(1520) K^0_S π^{+} + c.c.$ are also observed, with statistical significances of 5.7$σ$ and 7.0$σ$, respectively. Evidence for $χ_{c0} \to\bar{p} Λ(1520) K^0_S π^{+} + c.c.$ is found with a statistical significance of 3.3$σ$. The corresponding branching fractions are determined to be $\mathcal{B}(χ_{c0}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.) =(1.61^{+0.68}_{-0.64}\pm0.23)\times10^{-5}$, $\mathcal{B}(χ_{c1}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.)=(4.06^{+0.80}_{-0.76}\pm0.52)\times10^{-5}$, and $\mathcal{B}(χ_{c2}\to \bar{p} Λ(1520) K^0_S π^{+} + c.c.)=(4.09^{+0.87}_{-0.84}\pm0.42)\times10^{-5}$. Here, the first uncertainties are statistical and the second ones are systematic.
Submitted 15 October, 2024;
originally announced October 2024.
-
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
Authors:
Han Wang,
Yilin Zhao,
Dian Li,
Xiaohan Wang,
Gang Liu,
Xuguang Lan,
Hui Wang
Abstract:
Humor is a culturally nuanced aspect of human language that presents challenges for understanding and generation, requiring participants to possess good creativity and strong associative thinking. Similar to reasoning tasks like solving math problems, humor generation requires continuous reflection and revision to foster creative thinking, rather than relying on a sudden flash of inspiration as in the Creative Leap-of-Thought (CLoT) paradigm. Although CLoT can realize remote association generation, this paradigm fails to generate humorous content. Therefore, in this paper, we propose a systematic way of thinking about generating humor and, based on it, build the Creative Leap of Structured Thought (CLoST) framework. First, a reward model is necessary to make error correction possible, since there is currently no expert model of humor and no usable rule to determine whether a piece of content is humorous. Judgement-oriented instructions are designed to improve the capability of a model, and we also propose an open-domain instruction evolutionary method to fully unleash its potential. Then, through reinforcement learning, the model learns to hone the rationales of its thought chain and refine the strategies it uses. Thus, it learns to recognize and correct its mistakes, and finally generates the most humorous and creative answer. These findings deepen our understanding of the creative capabilities of LLMs and provide ways to enhance LLMs' creative abilities for cross-domain innovative applications.
Submitted 14 October, 2024;
originally announced October 2024.
-
Probing the Meissner effect in pressurized bilayer nickelate superconductors using diamond quantum sensors
Authors:
Junyan Wen,
Yue Xu,
Gang Wang,
Ze-Xu He,
Yang Chen,
Ningning Wang,
Tenglong Lu,
Xiaoli Ma,
Feng Jin,
Liucheng Chen,
Miao Liu,
Jing-Wei Fan,
Xiaobing Liu,
Xin-Yu Pan,
Gang-Qin Liu,
Jinguang Cheng,
Xiaohui Yu
Abstract:
Recent reports on the signatures of high-temperature superconductivity with a critical temperature Tc close to 80 K have triggered great research interest and extensive follow-up studies. Although a zero-resistance state has been successfully achieved under improved hydrostatic pressure conditions, there is no clear evidence of superconducting diamagnetism in pressurized $\mathrm{La_{3}Ni_{2}O_{7-δ}}$ due to the low superconducting volume fraction and limited magnetic measurement techniques under high pressure conditions. Here, using shallow nitrogen-vacancy centers implanted on the culet of diamond anvils as in-situ quantum sensors, we observe convincing evidence for the Meissner effect in polycrystalline samples of $\mathrm{La_{3}Ni_{2}O_{7-δ}}$ and $\mathrm{La_{2}PrNi_{2}O_{7}}$: the magnetic field expulsion during both field cooling and field warming processes. The correlated measurements of Raman spectra and NV-based magnetic imaging indicate an incomplete structural transformation related to the displacement of oxygen ions emerging in the non-superconducting region. Furthermore, comparative experiments on different pressure transmitting media (silicone oil and KBr) and nickelates ($\mathrm{La_{3}Ni_{2}O_{7-δ}}$ and $\mathrm{La_{2}PrNi_{2}O_{7}}$) reveal that improved hydrostatic pressure conditions and the substitution of La by Pr in $\mathrm{La_{3}Ni_{2}O_{7-δ}}$ can dramatically increase the superconductivity. Our work clarifies the controversy about the Meissner effect of bilayer nickelate and contributes to a deeper understanding of the mechanism of nickelate high-temperature superconductors.
Submitted 14 October, 2024;
originally announced October 2024.
-
Could the inter-band lag of active galactic nucleus vary randomly?
Authors:
Zhen-Bo Su,
Zhen-Yi Cai,
Jun-Xian Wang,
Tinggui Wang,
Yongquan Xue,
Min-Xuan Cai,
Lulu Fan,
Hengxiao Guo,
Zhicheng He,
Zizhao He,
Xu-Fan Hu,
Ji-an Jiang,
Ning Jiang,
Wen-Yong Kang,
Lei Lei,
Guilin Liu,
Teng Liu,
Zhengyan Liu,
Zhenfeng Sheng,
Mouyuan Sun,
Wen Zhao
Abstract:
The inter-band lags among the optical broad-band continua of active galactic nuclei (AGNs) have been intensively explored over the past decade. However, the nature of the lags remains under debate. Here utilizing two distinct scenarios for AGN variability, i.e., the thermal fluctuation of the accretion disk and the reprocessing of both the accretion disk and clouds in the broad line region, we show that, owing to the random nature of AGN variability, the inter-band lags of an individual AGN would vary from one campaign with a finite baseline to another. Specifically, the thermal fluctuation scenario implies larger variations in the lags than the reprocessing scenario. Moreover, the former predicts a positive correlation between the lag and variation amplitude, while the latter does not result in such a correlation. For both scenarios, averaging the lags of an individual AGN measured with repeated and non-overlapping campaigns would give rise to a stable lag, which is larger for a longer baseline and saturates for a sufficiently long baseline. However, obtaining the stable lag for an individual AGN is very time-consuming. Alternatively, it can be equivalently inferred by averaging the lags of a sample of AGNs with similar physical properties, and thus can be properly compared with predictions of AGN models. In addition, we discuss several new observational tests suggested by our simulations as well as the role of the deep high-cadence surveys of the Wide Field Survey Telescope in enriching our knowledge of the lags.
Submitted 13 October, 2024;
originally announced October 2024.
-
LibEER: A Comprehensive Benchmark and Algorithm Library for EEG-based Emotion Recognition
Authors:
Huan Liu,
Shusen Yang,
Yuzhe Zhang,
Mengze Wang,
Fanyu Gong,
Chengxi Xie,
Guanjian Liu,
Dalin Zhang
Abstract:
EEG-based emotion recognition (EER) is garnering increasing attention due to its potential in understanding and analyzing human emotions. Recently, significant advancements have been achieved using various deep learning-based techniques to address the EER problem. However, the absence of a convincing benchmark and open-source codebase complicates fair comparisons between different models and poses reproducibility challenges for practitioners. These issues considerably impede progress in this field. In response to these challenges, we propose LibEER, a comprehensive benchmark and algorithm library for fair comparisons in EER, by ensuring consistency in the implementation details of various methods and utilizing a single codebase in PyTorch. LibEER establishes a unified evaluation framework with standardized experimental settings, enabling unbiased evaluations of over ten representative deep learning-based EER models across the four most commonly used datasets. Additionally, we conduct an exhaustive and reproducible comparison of the performance and efficiency of popular models, providing valuable insights for researchers in selecting and designing EER models. We aspire for our work to not only lower the barriers for beginners entering the field of EEG-based emotion recognition but also promote the standardization of research in this domain, thereby fostering steady development. The source code is available at https://github.com/ButterSen/LibEER.
Submitted 13 October, 2024;
originally announced October 2024.
-
Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation
Authors:
Guozhi Liu,
Weiwei Lin,
Tiansheng Huang,
Ruichao Mo,
Qi Mu,
Li Shen
Abstract:
Harmful fine-tuning attacks pose a serious threat to online fine-tuning services. Vaccine, a recent alignment-stage defense, applies uniform perturbation to all layers of embedding to make the model robust to the simulated embedding drift. However, applying layer-wise uniform perturbation may lead to excess perturbations for some particular safety-irrelevant layers, resulting in defense performance degradation and unnecessary memory consumption. To address this limitation, we propose Targeted Vaccine (T-Vaccine), a memory-efficient safety alignment method that applies perturbation to only selected layers of the model. T-Vaccine follows two core steps: First, it uses gradient norm as a statistical metric to identify the safety-critical layers. Second, instead of applying uniform perturbation across all layers, T-Vaccine only applies perturbation to the safety-critical layers while keeping other layers frozen during training. Results show that T-Vaccine outperforms Vaccine in terms of both defense effectiveness and resource efficiency. Comparisons with other defense baselines, e.g., RepNoise and TAR, also demonstrate the superiority of T-Vaccine. Notably, T-Vaccine is the first defense that can address harmful fine-tuning issues for 7B pre-trained models trained on consumer GPUs with limited memory (e.g., RTX 4090). Our code is available at https://github.com/Lslland/T-Vaccine.
Submitted 17 October, 2024; v1 submitted 13 October, 2024;
originally announced October 2024.
-
Cohomology of Pointed Finite Tensor Categories
Authors:
Bowen Li,
Gongxiang Liu
Abstract:
We consider the finite generation property for cohomology algebra of pointed finite tensor categories via de-equivariantization and exact sequence of finite tensor categories. As a result, we prove that all coradically graded pointed finite tensor categories over abelian groups have finitely generated cohomology.
Submitted 12 October, 2024;
originally announced October 2024.
-
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Authors:
Guanlin Liu,
Kaixuan Ji,
Renjie Zheng,
Zheng Wu,
Chen Dun,
Quanquan Gu,
Lin Yan
Abstract:
Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs) with human preferences and improving their ability to perform complex tasks. However, current approaches either require significant computational resources due to the use of multiple models and extensive online sampling for training (e.g., PPO) or are framed as bandit problems (e.g., DPO, DRO), which often struggle with multi-step reasoning tasks, such as math problem-solving and complex reasoning that involve long chains of thought. To overcome these limitations, we introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. The MDP formulation of DQO offers structural advantages over bandit-based methods, enabling more effective process supervision. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
Submitted 11 October, 2024;
originally announced October 2024.
-
A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
D. Agarwal,
M. Agathos,
M. Aghaei Abchouyeh,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi,
R. A. Alfaidi,
A. Al-Jodah,
C. Alléné
, et al. (1758 additional authors not shown)
Abstract:
The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50\% (90\%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.
Submitted 11 October, 2024;
originally announced October 2024.
-
Observation of $D^+\toη^\primeμ^+ν_μ$ and First Study of $D^+\to η^\prime \ell^+ν_\ell$ Decay Dynamics
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $20.3\,\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy 3.773\,GeV with the BESIII detector, we report the first observation of the semileptonic decay $D^+\to η^\prime μ^+ν_μ$ with significance of $8.6σ$ including systematic uncertainties, and an improved measurement of $D^+\to η^\prime e^+ν_e$. The branching fractions of $D^+\to η^\prime μ^+ν_μ$ and $D^+\to η^\prime e^+ν_e$ are determined to be $(1.92\pm0.28_{\rm stat}\pm 0.08_{\rm syst})\times 10^{-4}$ and $(1.79\pm0.19_{\rm stat}\pm 0.07_{\rm syst})\times 10^{-4}$, respectively. From an analysis of the $D^+\to η^\prime \ell^+ν_\ell$ decay dynamics, the product of the hadronic form factor $f_+^{η^{\prime}}(0)$ and the CKM matrix element $|V_{cd}|$ is measured for the first time, giving $f^{η^\prime}_+(0)|V_{cd}| = (5.92\pm0.56_{\rm stat}\pm0.13_{\rm syst})\times 10^{-2}$. No evidence for violation of $μ-e$ lepton-flavor universality is found in both the full range and several bins of $\ell^+ν_\ell$ four-momentum transfer. The $η-η^\prime$ mixing angle in the quark flavor basis is determined to be $φ_{\rm P} =(39.8\pm0.8_{\rm stat}\pm0.3_{\rm syst})^\circ$.
Submitted 11 October, 2024;
originally announced October 2024.
-
Balancing Innovation and Privacy: Data Security Strategies in Natural Language Processing Applications
Authors:
Shaobo Liu,
Guiran Liu,
Binrong Zhu,
Yuanshuai Luo,
Linxiao Wu,
Rui Wang
Abstract:
This research addresses privacy protection in Natural Language Processing (NLP) by introducing a novel algorithm based on differential privacy, aimed at safeguarding user data in common applications such as chatbots, sentiment analysis, and machine translation. With the widespread application of NLP technology, the security and privacy protection of user data have become important issues that need to be solved urgently. This paper proposes a new privacy protection algorithm designed to effectively prevent the leakage of user sensitive information. By introducing a differential privacy mechanism, our model ensures the accuracy and reliability of data analysis results while adding random noise. This method not only reduces the risk caused by data leakage but also achieves effective processing of data while protecting user privacy. Compared to traditional privacy methods like data anonymization and homomorphic encryption, our approach offers significant advantages in terms of computational efficiency and scalability while maintaining high accuracy in data analysis. The proposed algorithm's efficacy is demonstrated through performance metrics such as accuracy (0.89), precision (0.85), and recall (0.88), outperforming other methods in balancing privacy and utility. As privacy protection regulations become increasingly stringent, enterprises and developers must take effective measures to deal with privacy risks. Our research provides an important reference for the application of privacy protection technology in the field of NLP, emphasizing the need to achieve a balance between technological innovation and user privacy. In the future, with the continuous advancement of technology, privacy protection will become a core element of data-driven applications and promote the healthy development of the entire industry.
Submitted 11 October, 2024;
originally announced October 2024.
-
Precision Measurement of the Branching Fraction of $D^{+}\to μ^{+}ν_μ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $20.3~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at a center-of-mass energy of $E_{\rm cm}=3.773$ GeV with the BESIII detector operating at the BEPCII collider, we determine the branching fraction of the leptonic decay $D^+\toμ^+ν_μ$ to be $(3.981\pm0.079_{\rm stat}\pm0.040_{\rm syst})\times10^{-4}$. Interpreting our measurement with knowledge of the Fermi coupling constant $G_F$, the masses of the $D^+$ and $μ^+$ as well as the lifetime of the $D^+$, we determine $f_{D^+}|V_{cd}|=(47.53\pm0.48_{\rm stat}\pm0.24_{\rm syst}\pm0.12_{\rm input})~\mathrm{MeV}$. This result is a factor of 2.3 more precise than the previous best measurement. Using the value of the magnitude of the Cabibbo-Kobayashi-Maskawa matrix element $|V_{cd}|$ given by the global standard model fit, we obtain the $D^+$ decay constant $f_{D^+}=(211.5\pm2.3_{\rm stat}\pm1.1_{\rm syst}\pm0.8_{\rm input})$ MeV. Alternatively, using the value of $f_{D^+}$ from a precise lattice quantum chromodynamics calculation, we extract $|V_{cd}|=0.2242\pm0.0023_{\rm stat}\pm0.0011_{\rm syst}\pm0.0009_{\rm input}$.
Submitted 10 October, 2024;
originally announced October 2024.
-
Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare
Authors:
Nan Fang,
Guiliang Liu,
Wei Gong
Abstract:
Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatment, such as excessive dosages or abrupt changes, often due to agents overlooking common-sense constraints. Consequently, Constrained Reinforcement Learning (CRL) is a natural choice for safe decisions. However, specifying the exact cost function is inherently difficult in healthcare. Recent Inverse Constrained Reinforcement Learning (ICRL) is a promising approach that infers constraints from expert demonstrations. ICRL algorithms model Markovian decisions in an interactive environment. These settings do not align with the practical requirement of a decision-making system in healthcare, where decisions rely on historical treatment recorded in an offline dataset. To tackle these issues, we propose the Constraint Transformer (CT). Specifically, 1) we utilize a causal attention mechanism to incorporate historical decisions and observations into the constraint modeling, while employing a Non-Markovian layer for weighted constraints to capture critical states. 2) A generative world model is used to perform exploratory data augmentation, enabling offline RL methods to simulate unsafe decision sequences. In multiple medical scenarios, empirical results demonstrate that CT can capture unsafe states and achieve strategies that approximate lower mortality rates, reducing the occurrence probability of unsafe behaviors.
Submitted 14 October, 2024; v1 submitted 9 October, 2024;
originally announced October 2024.
-
Chip-Tuning: Classify Before Language Models Say
Authors:
Fangwei Zhu,
Dian Li,
Jiajun Huang,
Gang Liu,
Hui Wang,
Zhifang Sui
Abstract:
The rapid development in the performance of large language models (LLMs) is accompanied by the escalation of model size, leading to the increasing cost of model training and inference. Previous research has discovered that certain layers in LLMs exhibit redundancy, and removing these layers brings only marginal loss in model performance. In this paper, we adopt the probing technique to explain the layer redundancy in LLMs and demonstrate that language models can be effectively pruned with probing classifiers. We propose chip-tuning, a simple and effective structured pruning framework specialized for classification problems. Chip-tuning attaches tiny probing classifiers named chips to different layers of LLMs, and trains chips with the backbone model frozen. After selecting a chip for classification, all layers subsequent to the attached layer could be removed with marginal performance loss. Experimental results on various LLMs and datasets demonstrate that chip-tuning significantly outperforms previous state-of-the-art baselines in both accuracy and pruning ratio, achieving a pruning ratio of up to 50%. We also find that chip-tuning could be applied on multimodal models, and could be combined with model finetuning, proving its excellent compatibility.
Submitted 11 October, 2024; v1 submitted 9 October, 2024;
originally announced October 2024.
-
Search for the radiative decays $D^+\toγρ^+$ and $D^+\toγK^{*+}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (648 additional authors not shown)
Abstract:
We search for the radiative decays $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ using 20.3~fb$^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$ GeV by the BESIII detector operating at the BEPCII collider. No significant signals are observed, and the upper limits on the branching fractions of $D^{+} \to γρ^+$ and $D^{+} \to γK^{*+}$ at 90\% confidence level are set to be $1.3\times10^{-5}$ and $1.8\times10^{-5}$, respectively.
Submitted 8 October, 2024;
originally announced October 2024.
-
Observation of an axial-vector state in the study of $ψ(3686) \to φηη'$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (625 additional authors not shown)
Abstract:
Using (2712.4 $\pm$ 14.3)$\times 10^{6}$ $ψ(3686)$ events collected with the BESIII detector at BEPCII, a partial wave analysis of the decay $ψ(3686) \to φηη' $ is performed with the covariant tensor approach. An axial-vector state with a mass near 2.3 $\rm GeV/c^2$ is observed for the first time. Its mass and width are measured to be 2316 $\pm 9_{\mathrm{stat}} \pm 30_{\mathrm{syst}}\,\rm MeV/c^2$ and 89 $\pm 15_{\mathrm{stat}} \pm 26_{\mathrm{syst}}\,\rm MeV$, respectively. The product branching fractions of $\mathcal{B}(ψ(3686) \to X(2300) η') \mathcal{B}(X(2300)\to φη)$ and $\mathcal{B}(ψ(3686) \to X(2300) η)\mathcal{B}(X(2300)\to φη')$ are determined to be (4.8 $\pm 1.3_{\mathrm{stat}} \pm 0.7_{\mathrm{syst}})\times 10^{-6}$ and (2.2 $\pm 0.7_{\mathrm{stat}} \pm 0.7_{\mathrm{syst}})\times 10^{-6}$, respectively. The branching fraction $\mathcal{B}(ψ(3686) \to φηη')$ is measured for the first time to be (3.14$\pm0.17_{\mathrm{stat}}\pm0.24_{\mathrm{syst}})\times10^{-5}$.
The first uncertainties are statistical and the second are systematic.
Submitted 8 October, 2024;
originally announced October 2024.
-
UniMuMo: Unified Text, Music and Motion Generation
Authors:
Han Yang,
Kun Su,
Yutong Zhang,
Jiaben Chen,
Kaizhi Qian,
Gaowen Liu,
Chuang Gan
Abstract:
We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To address the lack of time-synchronized data, we align unpaired music and motion data based on rhythmic patterns to leverage existing large-scale music-only and motion-only datasets. By converting music, motion, and text into token-based representation, our model bridges these modalities through a unified encoder-decoder transformer architecture. To support multiple generation tasks within a single framework, we introduce several architectural improvements. We propose encoding motion with a music codebook, mapping motion into the same feature space as music. We introduce a music-motion parallel generation scheme that unifies all music and motion generation tasks into a single transformer decoder architecture with a single training task of music-motion joint generation. Moreover, the model is designed by fine-tuning existing pre-trained single-modality models, significantly reducing computational demands. Extensive experiments demonstrate that UniMuMo achieves competitive results on all unidirectional generation benchmarks across music, motion, and text modalities. Quantitative results are available in the project page (https://hanyangclarence.github.io/unimumo_demo/).
Submitted 6 October, 2024;
originally announced October 2024.
-
ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model
Authors:
Shuhao Gu,
Mengdi Zhao,
Bowen Zhang,
Liangdong Wang,
Jijie Li,
Guang Liu
Abstract:
The tokenizer is an essential component of large language models (LLMs), and a tokenizer with a high compression rate can improve the model's representation and processing efficiency. However, the tokenizer cannot ensure a high compression rate in all scenarios, and an increase in the average input and output lengths will increase the training and inference costs of the model. Therefore, it is crucial to find ways to improve the model's efficiency with minimal cost while maintaining the model's performance. In this work, we propose a method to improve model representation and processing efficiency by replacing the tokenizers of LLMs. We propose replacing and reinitializing the parameters of the model's input and output layers with the parameters of the original model, and training these parameters while keeping other parameters fixed. We conducted experiments on different LLMs, and the results show that our method can maintain the performance of the model after replacing the tokenizer, while significantly improving the decoding speed for long texts.
Submitted 5 October, 2024;
originally announced October 2024.
-
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
Authors:
Gang Liu,
Michael Sun,
Wojciech Matusik,
Meng Jiang,
Jie Chen
Abstract:
While large language models (LLMs) have integrated images, adapting them to graphs remains challenging, limiting their applications in materials and drug design. This difficulty stems from the need for coherent autoregressive generation across texts and graphs. To address this, we introduce Llamole, the first multimodal LLM capable of interleaved text and graph generation, enabling molecular inverse design with retrosynthetic planning. Llamole integrates a base LLM with the Graph Diffusion Transformer and Graph Neural Networks for multi-conditional molecular generation and reaction inference within texts, while the LLM, with enhanced molecular understanding, flexibly controls activation among the different graph modules. Additionally, Llamole integrates A* search with LLM-based cost functions for efficient retrosynthetic planning. We create benchmarking datasets and conduct extensive experiments to evaluate Llamole against in-context learning and supervised fine-tuning. Llamole significantly outperforms 14 adapted LLMs across 12 metrics for controllable molecular design and retrosynthetic planning.
Submitted 5 October, 2024;
originally announced October 2024.
-
Quasi-triangular, factorizable Leibniz bialgebras and relative Rota-Baxter operators
Authors:
Chengming Bai,
Guilai Liu,
Yunhe Sheng,
Rong Tang
Abstract:
We introduce the notion of quasi-triangular Leibniz bialgebras, which can be constructed from solutions of the classical Leibniz Yang-Baxter equation (CLYBE) whose skew-symmetric parts are invariant. In addition to triangular Leibniz bialgebras, quasi-triangular Leibniz bialgebras contain factorizable Leibniz bialgebras as another subclass, which lead to a factorization of the underlying Leibniz algebras. Relative Rota-Baxter operators with weights on Leibniz algebras are used to characterize solutions of the CLYBE whose skew-symmetric parts are invariant. On skew-symmetric quadratic Leibniz algebras, such operators correspond to Rota-Baxter type operators. Consequently, we introduce the notion of skew-symmetric quadratic Rota-Baxter Leibniz algebras, such that they give rise to triangular Leibniz bialgebras in the case of weight $0$, while they are in one-to-one correspondence with factorizable Leibniz bialgebras in the case of nonzero weights.
Submitted 3 October, 2024;
originally announced October 2024.
-
Measurement of the effective leptonic weak mixing angle
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1117 additional authors not shown)
Abstract:
Using $pp$ collision data at $\sqrt{s}=13$ TeV, recorded by the LHCb experiment between 2016 and 2018 and corresponding to an integrated luminosity of $5.4$ fb$^{-1}$, the forward-backward asymmetry in the $pp \to Z/γ^{*} \to μ^+μ^-$ process is measured. The measurement is carried out in ten intervals of the difference between the muon pseudorapidities, within a fiducial region covering dimuon masses between $66$ and $116$ GeV, muon pseudorapidities between $2.0$ and $4.5$ and muon transverse momenta above $20$ GeV. These forward-backward asymmetries are compared with predictions, at next-to-leading order in the strong and electroweak couplings. The measured effective leptonic weak mixing angle is $\sin^2θ_{\rm eff}^\ell = 0.23147 \pm 0.00044 \pm 0.00005 \pm 0.00023$, where the first uncertainty is statistical, the second arises from systematic uncertainties associated with the asymmetry measurement, and the third arises from uncertainties in the fit model used to extract $\sin^2θ_{\rm eff}^\ell$ from the asymmetry measurement. This result is based on an arithmetic average of results using the CT18, MSHT20, and NNPDF31 parameterisations of the proton internal structure, and is consistent with previous measurements and with predictions from the global electroweak fit.
Submitted 3 October, 2024;
originally announced October 2024.
-
Search for lepton number violating decays of $D_s^+\to h^-h^0e^+e^+$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (650 additional authors not shown)
Abstract:
Based on 7.33 fb$^{-1}$ of $e^+e^-$ collision data collected by the BESIII detector operating at the BEPCII collider at center-of-mass energies from 4.128 to 4.226 GeV, a search for the Majorana neutrino $ν_m$ is conducted in the lepton-number-violating decays of $D_s^+\to h^-h^0e^+e^+$. Here, $h^-$ represents a $K^-$ or $π^-$, and $h^0$ represents a $π^0$, $K_S^0$ or $φ$. No significant signal is observed, and the upper limits of their branching fractions at the 90\% confidence level are determined to be $\mathcal{B}(D_s^+\to φπ^-e^+e^+) < 6.9 \times 10^{-5}$, $\mathcal{B}(D_s^+\to φK^-e^+e^+) < 9.9 \times 10^{-5}$, $\mathcal{B}(D_s^+\to K_S^0π^-e^+e^+) < 1.3 \times 10^{-5}$, $\mathcal{B}(D_s^+\to K_S^0K^-e^+e^+) < 2.9 \times 10^{-5}$, $\mathcal{B}(D_s^+\to π^-π^0e^+e^+) < 2.9 \times 10^{-5}$ and $\mathcal{B}(D_s^+\to K^-π^0e^+e^+) < 3.4 \times 10^{-5}$. The Majorana neutrino is searched for with different mass assumptions within the range [0.20, 0.80] GeV$/c^2$ in the decay of $D_s^+\toφe^+ν_m$ with $ν_m\toπ^-e^+$, and the upper limits of the branching fractions at the 90\% confidence level are at the level of $10^{-5}-10^{-2}$, depending on the mass of the Majorana neutrino.
Submitted 3 October, 2024;
originally announced October 2024.
-
FARM: Functional Group-Aware Representations for Small Molecules
Authors:
Thao Nguyen,
Kuan-Hao Huang,
Ge Liu,
Martin D. Burke,
Ying Diao,
Heng Ji
Abstract:
We introduce Functional Group-Aware Representations for Small Molecules (FARM), a novel foundation model designed to bridge the gap between SMILES, natural language, and molecular graphs. The key innovation of FARM lies in its functional group-aware tokenization, which directly incorporates functional group information into the representations. This strategic reduction in tokenization granularity is intentionally aligned with key drivers of functional properties (i.e., functional groups), enhancing the model's understanding of chemical language. By expanding the chemical lexicon, FARM more effectively bridges SMILES and natural language, ultimately advancing the model's capacity to predict molecular properties. FARM also represents molecules from two perspectives: by using masked language modeling to capture atom-level features and by employing graph neural networks to encode the whole molecule topology. By leveraging contrastive learning, FARM aligns these two views of representations into a unified molecular embedding. We rigorously evaluate FARM on the MoleculeNet dataset, where it achieves state-of-the-art performance on 10 out of 12 tasks. These results highlight FARM's potential to improve molecular representation learning, with promising applications in drug discovery and pharmaceutical research.
Submitted 6 October, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Pre-Chirp-Domain Index Modulation for Full-Diversity Affine Frequency Division Multiplexing towards 6G
Authors:
Guangyao Liu,
Tianqi Mao,
Zhenyu Xiao,
Ruiqi Liu,
Miaowen Wen
Abstract:
Affine frequency division multiplexing (AFDM), tailored as a superior multicarrier technique utilizing chirp signals for high-mobility communications, is envisioned as a promising candidate for the sixth-generation (6G) wireless network. AFDM is based on the discrete affine Fourier transform (DAFT) with two adjustable parameters of the chirp signals, termed the pre-chirp and post-chirp parameters, respectively. We show that the pre-chirp parameter can be flexibly manipulated to provide an additional degree of freedom (DoF). Therefore, this paper proposes a novel AFDM scheme with the pre-chirp index modulation (PIM) philosophy (AFDM-PIM), which can implicitly convey extra information bits through dynamic pre-chirp parameter assignment, thus enhancing both spectral and energy efficiency. Specifically, we first demonstrate that the subcarrier orthogonality is still maintained when distinct pre-chirp parameters are applied to different subcarriers in the AFDM modulation process. Inspired by this property, each AFDM subcarrier is assigned a unique pre-chirp signal according to the incoming bits. With this arrangement, extra binary bits can be embedded into the index patterns of pre-chirp parameter assignment without additional energy consumption. For performance analysis, we derive asymptotically tight upper bounds on the average bit error rates (BERs) of the proposed schemes with maximum-likelihood (ML) detection, and validate that the proposed AFDM-PIM can achieve the optimal diversity order under doubly dispersive channels. Based on the derivations, we further propose an optimal pre-chirp alphabet design to enhance the BER performance via intelligent optimization algorithms. Simulations demonstrate that the proposed AFDM-PIM outperforms the classical benchmarks under doubly dispersive channels.
Submitted 17 October, 2024; v1 submitted 30 September, 2024;
originally announced October 2024.
-
Hopf algebras with the dual Chevalley property of discrete corepresentation type
Authors:
Jing Yu,
Gongxiang Liu
Abstract:
We try to classify Hopf algebras with the dual Chevalley property of discrete corepresentation type over an algebraically closed field $\Bbb{k}$ of characteristic 0. For such a Hopf algebra $H$, we characterize the link quiver of $H$ and determine the structure of the link-indecomposable component $H_{(1)}$ containing $\Bbb{k}1$. In addition, we construct an infinite-dimensional non-pointed non-cosemisimple link-indecomposable Hopf algebra $H(e_{\pm 1}, f_{\pm 1}, u, v)$ with the dual Chevalley property of discrete corepresentation type.
Submitted 30 September, 2024;
originally announced September 2024.
-
Perspective: imaging atomic step geometry to determine surface terminations of kagome materials and beyond
Authors:
Guowei Liu,
Tianyu Yang,
Yu-Xiao Jiang,
Shafayat Hossain,
Hanbin Deng,
M. Zahid Hasan,
Jia-Xin Yin
Abstract:
Here we review scanning tunneling microscopy research on the surface determination for various types of kagome materials, including 11-type (CoSn, FeSn, FeGe), 32-type (Fe3Sn2), 13-type (Mn3Sn), 135-type (AV3Sb5, A = K, Rb, Cs), 166-type (TbMn6Sn6, YMn6Sn6 and ScV6Sn6), and 322-type (Co3Sn2S2 and Ni3In2Se2). We first demonstrate that the measured step height between different surfaces typically deviates from the expected value by ±0.4 to 0.8 Å, owing to the tunneling convolution effect with electronic states. This becomes a serious issue for Co3Sn2S2, where the expected Sn-S interlayer distance is 0.6 Å. Hence, we put forward atomic step geometry imaging as a general methodology for surface determination, which is fundamental but experimentally challenging, since the step must be located and imaged with atomic precision. We discuss how this method can be used to resolve the surface termination puzzle in Co3Sn2S2. This method provides a natural explanation for the existence of adatoms and vacancies, and, going beyond the use of unknown impurity states, we propose and use designer layer-selective substitutional chemical markers to confirm the validity of this method. Finally, we apply this method to determine the surface of a new kagome material, Ni3In2Se2, a cousin of Co3Sn2S2, and we image the underlying kagome geometry on the determined Se surface above the kagome layer, which directly visualizes the p-d hybridization physics. We emphasize that this general method does not rely on theory, but the determined surface identity can provide guidelines for first-principles calculations with adjustable parameters on the surface-dependent local density of states and quasi-particle interference patterns.
Submitted 29 September, 2024;
originally announced September 2024.
-
Boosting SISSO Performance on Small Sample Datasets by Using Random Forests Prescreening for Complex Feature Selection
Authors:
Xiaolin Jiang,
Guanqi Liu,
Jiaying Xie,
Zhenpeng Hu
Abstract:
In materials science, data-driven methods accelerate material discovery and optimization while reducing costs and improving success rates. Symbolic regression, in particular the Sure Independence Screening and Sparsifying Operator (SISSO) method, is key to extracting material descriptors from large datasets. However, SISSO needs to store the entire expression space, which imposes heavy memory demands and limits its performance on complex problems. To address this issue, we propose an RF-SISSO algorithm that combines Random Forests (RF) with SISSO. In this algorithm, the Random Forest algorithm is used for prescreening, capturing non-linear relationships and improving feature selection, which may enhance the quality of the input data and boost the accuracy and efficiency on regression and classification tasks. In a test on SISSO's verification problem for 299 materials, RF-SISSO demonstrates robust performance and high accuracy. RF-SISSO maintains a testing accuracy above 0.9 across all four training sample sizes while significantly enhancing regression efficiency, especially in training subsets with smaller sample sizes. For the training subset with 45 samples, the efficiency of RF-SISSO was 265 times higher than that of the original SISSO. As collecting large datasets is both costly and time-consuming in practical experiments, we believe that RF-SISSO may benefit scientific research by efficiently offering high prediction accuracy with limited data.
Submitted 27 September, 2024;
originally announced September 2024.
-
Emu3: Next-Token Prediction is All You Need
Authors:
Xinlong Wang,
Xiaosong Zhang,
Zhengxiong Luo,
Quan Sun,
Yufeng Cui,
Jinsheng Wang,
Fan Zhang,
Yueze Wang,
Zhen Li,
Qiying Yu,
Yingli Zhao,
Yulong Ao,
Xuebin Min,
Tao Li,
Boya Wu,
Bo Zhao,
Bowen Zhang,
Liangdong Wang,
Guang Liu,
Zheqi He,
Xi Yang,
Jingjing Liu,
Yonghua Lin,
Tiejun Huang,
Zhongyuan Wang
Abstract:
While next-token prediction is considered a promising path towards artificial general intelligence, it has struggled to excel in multimodal tasks, which are still dominated by diffusion models (e.g., Stable Diffusion) and compositional approaches (e.g., CLIP combined with LLMs). In this paper, we introduce Emu3, a new suite of state-of-the-art multimodal models trained solely with next-token prediction. By tokenizing images, text, and videos into a discrete space, we train a single transformer from scratch on a mixture of multimodal sequences. Emu3 outperforms several well-established task-specific models in both generation and perception tasks, surpassing flagship models such as SDXL and LLaVA-1.6, while eliminating the need for diffusion or compositional architectures. Emu3 is also capable of generating high-fidelity video via predicting the next token in a video sequence. We simplify complex multimodal model designs by converging on a singular focus: tokens, unlocking great potential for scaling both during training and inference. Our results demonstrate that next-token prediction is a promising path towards building general multimodal intelligence beyond language. We open-source key techniques and models to support further research in this direction.
Submitted 27 September, 2024;
originally announced September 2024.
-
Stripes, pair density wave, and holon Wigner crystal in single-band Hubbard model on diagonal square lattice
Authors:
Zhi Xu,
Gui-Xin Liu,
Yi-Fan Jiang
Abstract:
We investigate the ground-state properties of the Hubbard model on wide diagonal square cylinders, rotated by $π/4$ relative to the regular lattice orientation. Using state-of-the-art density matrix renormalization group calculations with a large number of states, we convincingly demonstrate the development of a unidirectional charge density wave (CDW) characterized by infinite-length stripes along the primitive vector of square lattice in models with next-nearest-neighbor hopping $t'=-0.1\sim -0.3$ and doping $δ\sim 14\%$. Intriguingly, analysis of pair-pair correlation functions along these stripes reveals incommensurate pair density wave (PDW) superconductivity with diverged susceptibility. To the best of our knowledge, this is probably the first controlled numerical evidence of dominant PDW in the single-band Hubbard model on square lattices. At lower doping $δ\sim 10\%$, we observed the formation of an additional CDW order within each stripe, which aligns across different stripes, forming a holon Wigner crystal phase. The spin pattern retains antiferromagnetic stripes with anti-phase domain walls. The ordering momentum of this emerged CDW order is remarkably close to the center-of-mass momentum of Cooper pairs in the PDW phase, suggesting a multifaceted relationship between CDW and PDW ordering.
Submitted 27 September, 2024;
originally announced September 2024.
-
Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning
Authors:
Arshiya Khan,
Guannan Liu,
Xing Gao
Abstract:
Large Language Models (LLMs) face significant challenges in detecting and repairing vulnerable code, particularly when dealing with vulnerabilities involving multiple aspects, such as variables, code flows, and code structures. In this study, we utilize GitHub Copilot as the LLM and focus on buffer overflow vulnerabilities. Our experiments reveal a notable gap in Copilot's abilities when dealing with buffer overflow vulnerabilities, with a 76% vulnerability detection rate but only a 15% vulnerability repair rate. To address this issue, we propose context-aware prompt tuning techniques designed to enhance LLM performance in repairing buffer overflow vulnerabilities. By injecting a sequence of domain knowledge about the vulnerability, including various security and code contexts, we demonstrate that Copilot's successful repair rate increases to 63%, a more than fourfold improvement over repairs without domain knowledge.
Submitted 26 September, 2024;
originally announced September 2024.
-
Neural P$^3$M: A Long-Range Interaction Modeling Enhancer for Geometric GNNs
Authors:
Yusong Wang,
Chaoran Cheng,
Shaoning Li,
Yuxuan Ren,
Bin Shao,
Ge Liu,
Pheng-Ann Heng,
Nanning Zheng
Abstract:
Geometric graph neural networks (GNNs) have emerged as powerful tools for modeling molecular geometry. However, they encounter limitations in effectively capturing long-range interactions in large molecular systems. To address this challenge, we introduce Neural P$^3$M, a versatile enhancer of geometric GNNs that expands the scope of their capabilities by incorporating mesh points alongside atoms and reimagining traditional mathematical operations in a trainable manner. Neural P$^3$M exhibits flexibility across a wide range of molecular systems and demonstrates remarkable accuracy in predicting energies and forces, outperforming on benchmarks such as the MD22 dataset. It also achieves an average improvement of 22% on the OE62 dataset while integrating with various architectures.
Submitted 26 September, 2024;
originally announced September 2024.
-
Dataset Distillation-based Hybrid Federated Learning on Non-IID Data
Authors:
Xiufang Shi,
Wei Zhang,
Mincheng Wu,
Guangyi Liu,
Zhenyu Wen,
Shibo He,
Tejal Shah,
Rajiv Ranjan
Abstract:
In federated learning, the heterogeneity of client data has a great impact on the performance of model training. Many heterogeneity issues in this process arise from non-independently and identically distributed (Non-IID) data. This study focuses on the issue of label distribution skew. To address it, we propose a hybrid federated learning framework called HFLDD, which integrates dataset distillation to generate approximately independent and identically distributed (IID) data, thereby improving the performance of model training. In particular, we partition the clients into heterogeneous clusters, where the data labels among different clients within a cluster are unbalanced while the data labels among different clusters are balanced. The cluster headers collect distilled data from the corresponding cluster members, and conduct model training in collaboration with the server. This training process resembles traditional federated learning on IID data, and hence effectively alleviates the impact of Non-IID data on model training. Furthermore, we compare our proposed method with typical baseline methods on public datasets. Experimental results demonstrate that when the data labels are severely imbalanced, the proposed HFLDD outperforms the baseline methods in terms of both test accuracy and communication cost.
Submitted 25 September, 2024;
originally announced September 2024.
-
Optically Coherent Nitrogen-Vacancy Centers in HPHT Treated Diamonds
Authors:
Yuan-Han Tang,
Xiaoran Zhang,
Kang-Yuan Liu,
Fan Xia,
Huijie Zheng,
Xiaobing Liu,
Xin-Yu Pan,
Heng Fan,
Gang-Qin Liu
Abstract:
As a point defect with unique spin and optical properties, the nitrogen-vacancy (NV) center in diamond has attracted much attention in the fields of quantum sensing, quantum simulation, and quantum networks. The optical properties of an NV center are crucial for all these quantum applications. However, NV centers fabricated by destructive methods such as electron irradiation or ion implantation usually exhibit poor optical coherence. In this work, we demonstrate a non-destructive method to fabricate optically coherent NV centers. High-purity single-crystal diamonds are annealed under high pressure and high temperature (1700 $^{\circ}$C, 5.5 GPa), and individually resolvable NV centers with narrow PLE linewidths (<100 MHz) are produced. The high-pressure condition prevents the conversion of diamond to graphite during high-temperature annealing, significantly expanding the parameter space for creating high-performance artificial defects for quantum information science. These findings deepen our understanding of NV center formation in diamond and have implications for the optimization of color centers in solids, including silicon carbide and hexagonal boron nitride.
Submitted 25 September, 2024;
originally announced September 2024.
-
Search for $B_{(s)}^{*0}\toμ^+μ^-$ in $B_c^+\toπ^+μ^+μ^-$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1113 additional authors not shown)
Abstract:
A search for the very rare $B^{*0}\toμ^+μ^-$ and $B_{s}^{*0}\toμ^+μ^-$ decays is conducted by analysing the $B_c^+\to π^+μ^+μ^-$ process. The analysis uses proton-proton collision data collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9$\text{\,fb}^{-1}$. The signal signatures correspond to simultaneous peaks in the $μ^+μ^-$ and $π^+μ^+μ^-$ invariant masses. No evidence for an excess of events over background is observed for either signal decay mode. Upper limits at the $90\%$ confidence level are set on the branching fractions relative to that for $B_c^+\to J\mskip -3mu/\mskip -2muψπ^+$ decays, \begin{align*}
{\cal R}_{B^{*0}(μ^+μ^-)π^+/J\mskip -3mu/\mskip -2muψπ^+} &< 3.8\times 10^{-5}\ \text{ and }
{\cal R}_{B_{s}^{*0}(μ^+μ^-)π^+/J\mskip -3mu/\mskip -2muψπ^+} &< 5.0\times 10^{-5}\,. \end{align*}
Submitted 25 September, 2024;
originally announced September 2024.
-
Provably Efficient Exploration in Inverse Constrained Reinforcement Learning
Authors:
Bo Yue,
Jian Li,
Guiliang Liu
Abstract:
To obtain the optimal constraints in complex environments, Inverse Constrained Reinforcement Learning (ICRL) seeks to recover these constraints from expert demonstrations in a data-driven manner. Existing ICRL algorithms collect training samples from an interactive environment. However, the efficacy and efficiency of these sampling strategies remain unknown. To bridge this gap, we introduce a strategic exploration framework with guaranteed efficiency. Specifically, we define a feasible constraint set for ICRL problems and investigate how expert policy and environmental dynamics influence the optimality of constraints. Motivated by our findings, we propose two exploratory algorithms to achieve efficient constraint inference via 1) dynamically reducing the bounded aggregate error of cost estimation and 2) strategically constraining the exploration policy. Both algorithms are theoretically grounded with tractable sample complexity. We empirically demonstrate the performance of our algorithms under various environments.
Submitted 30 September, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
Search for $D^0\to K^-ηe^+ν_e$, $D^+\to K_S^0 ηe^+ν_e$ and $D^+\to ηηe^+ν_e$ decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 7.93 fb$^{-1}$, collected at the center-of-mass energy of 3.773 GeV with the BESIII detector, we search for the semileptonic decays $D^0\to K^-ηe^+ν_e$, $D^+\to K_S^0 ηe^+ν_e$ and $D^+\to ηηe^+ν_e$ for the first time. We present evidence for $D^0\to K^-ηe^+ν_e$ with a significance of $3.3σ$. The branching fraction of $D^0\to K^-ηe^+ν_e$ is measured to be $(0.84_{-0.34}^{+0.29}\pm0.22)\times 10^{-4}$. Here, the first uncertainties are statistical and the second ones are systematic. No significant signals are observed for the decays $D^+\to K_S^0 ηe^+ν_e$ and $D^+\to ηηe^+ν_e$ and we set the upper limits on their branching fractions.
Submitted 24 September, 2024; v1 submitted 23 September, 2024;
originally announced September 2024.