-
UniVision: A Unified Framework for Vision-Centric 3D Perception
Authors:
Yu Hong,
Qian Liu,
Huayuan Cheng,
Danjiao Ma,
Hang Dai,
Yu Wang,
Guangzhi Cao,
Yong Ding
Abstract:
The past few years have witnessed the rapid development of vision-centric 3D perception in autonomous driving. Although the 3D perception models share many structural and conceptual similarities, there still exist gaps in their feature representations, data formats, and objectives, posing challenges for unified and efficient 3D perception framework design. In this paper, we present UniVision, a simple and efficient framework that unifies two major tasks in vision-centric 3D perception, i.e., occupancy prediction and object detection. Specifically, we propose an explicit-implicit view transform module for complementary 2D-3D feature transformation. We propose a local-global feature extraction and fusion module for efficient and adaptive voxel and BEV feature extraction, enhancement, and interaction. Further, we propose a joint occupancy-detection data augmentation strategy and a progressive loss weight adjustment strategy, which enable efficient and stable training of the multi-task framework. We conduct extensive experiments for different perception tasks on four public benchmarks, including nuScenes LiDAR segmentation, nuScenes detection, OpenOccupancy, and Occ3D. UniVision achieves state-of-the-art results with +1.5 mIoU, +1.8 NDS, +1.5 mIoU, and +1.8 mIoU gains on each benchmark, respectively. We believe that the UniVision framework can serve as a high-performance baseline for the unified vision-centric 3D perception task. The code will be available at https://github.com/Cc-Hy/UniVision.
Submitted 13 January, 2024;
originally announced January 2024.
-
First observation of the decay $Λ^+_c\to nK^{0}_{S}π^+π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (630 additional authors not shown)
Abstract:
Based on 4.5 fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated at center-of-mass energies between $4599.53$ MeV and $4698.82$ MeV with the BESIII detector, the decay $Λ_{c}^{+}\to nK_{S}^{0}π^+π^0$ is observed for the first time with a significance of $9.2σ$. The branching fraction is measured to be $(0.85\pm0.13\pm0.03)\%$, where the first uncertainty is statistical and the second systematic, which differs from the theoretical prediction based on isospin by 4.4$σ$. This indicates that there may be resonant contributions or some unknown dynamics in this decay.
Submitted 28 March, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
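As a reading aid for the branching fraction quoted in the abstract above, the following minimal sketch combines the statistical and systematic uncertainties into a single total. It assumes the two uncertainties are independent and can be added in quadrature, which is a common convention but is not stated in the abstract; the function name `combine_in_quadrature` is hypothetical.

```python
import math


def combine_in_quadrature(stat: float, syst: float) -> float:
    """Total uncertainty, assuming independent statistical and systematic parts."""
    return math.hypot(stat, syst)


# Values quoted in the abstract, in percent: (0.85 +/- 0.13 +/- 0.03)%
central, stat, syst = 0.85, 0.13, 0.03

total = combine_in_quadrature(stat, syst)
relative = total / central

print(f"B = ({central:.2f} +/- {total:.2f})%  (relative uncertainty {relative:.1%})")
```

The statistical term dominates, so the total is only slightly larger than 0.13. The 4.4σ tension mentioned in the abstract cannot be reproduced here because the theoretical prediction is not quoted.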
-
Leveraging Frequency Domain Learning in 3D Vessel Segmentation
Authors:
Xinyuan Wang,
Chengwei Pan,
Hongming Dai,
Gangming Zhao,
Jinpeng Li,
Xiao Zhang,
Yizhou Yu
Abstract:
Coronary microvascular disease constitutes a substantial risk to human health. Employing computer-aided analysis and diagnostic systems, medical professionals can intervene early in disease progression, with 3D vessel segmentation serving as a crucial component. Nevertheless, conventional U-Net architectures tend to yield incoherent and imprecise segmentation outcomes, particularly for small vessel structures. While models with attention mechanisms, such as Transformers and large convolutional kernels, demonstrate superior performance, their extensive computational demands during training and inference lead to increased time complexity. In this study, we leverage Fourier domain learning as a substitute for multi-scale convolutional kernels in 3D hierarchical segmentation models, which can reduce computational expenses while preserving global receptive fields within the network. Furthermore, a zero-parameter frequency domain fusion method is designed to improve the skip connections in U-Net architecture. Experimental results on a public dataset and an in-house dataset indicate that our novel Fourier transformation-based network achieves remarkable dice performance (84.37\% on ASACA500 and 80.32\% on ImageCAS) in tubular vessel segmentation tasks and substantially reduces computational requirements without compromising global receptive fields.
Submitted 11 January, 2024;
originally announced January 2024.
-
On the Three Demons in Causality in Finance: Time Resolution, Nonstationarity, and Latent Factors
Authors:
Xinshuai Dong,
Haoyue Dai,
Yewen Fan,
Songyao Jin,
Sathyamoorthy Rajendran,
Kun Zhang
Abstract:
Financial data is generally time series in essence and thus suffers from three fundamental issues: the mismatch in time resolution, the time-varying property of the distribution - nonstationarity, and causal factors that are important but unknown/unobserved. In this paper, we follow a causal perspective to systematically look into these three demons in finance. Specifically, we reexamine these issues in the context of causality, which gives rise to a novel and inspiring understanding of how the issues can be addressed. Following this perspective, we provide systematic solutions to these problems, which hopefully would serve as a foundation for future research in the area.
Submitted 12 January, 2024; v1 submitted 28 December, 2023;
originally announced January 2024.
-
Large Language Models for Robotics: Opportunities, Challenges, and Perspectives
Authors:
Jiaqi Wang,
Zihao Wu,
Yiwei Li,
Hanqi Jiang,
Peng Shu,
Enze Shi,
Huawen Hu,
Chong Ma,
Yiheng Liu,
Xuhui Wang,
Yincheng Yao,
Xuan Liu,
Huaqin Zhao,
Zhengliang Liu,
Haixing Dai,
Lin Zhao,
Bao Ge,
Xiang Li,
Tianming Liu,
Shu Zhang
Abstract:
Large language models (LLMs) have undergone significant expansion and have been increasingly integrated across various domains. Notably, in the realm of robot task planning, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with complex environments, text-only LLMs often face challenges due to a lack of compatibility with robotic visual perception. This study provides a comprehensive overview of the emerging integration of LLMs and multimodal LLMs into various robotic tasks. Additionally, we propose a framework that utilizes multimodal GPT-4V to enhance embodied task planning through the combination of natural language instructions and robot visual perceptions. Our results, based on diverse datasets, indicate that GPT-4V effectively enhances robot performance in embodied tasks. This extensive survey and evaluation of LLMs and multimodal LLMs across a variety of robotic tasks enriches the understanding of LLM-centric embodied intelligence and provides forward-looking insights toward bridging the gap in Human-Robot-Environment interaction.
Submitted 8 January, 2024;
originally announced January 2024.
-
Pre-insertion resistors temperature prediction based on improved WOA-SVR
Authors:
Honghe Dai,
Site Mo,
Haoxin Wang,
Nan Yin,
Songhai Fan,
Bixiong Li
Abstract:
The pre-insertion resistors (PIR) within high-voltage circuit breakers are critical components and warm up by generating Joule heat when an electric current flows through them. Elevated temperature can lead to temporary closure failure and, in severe cases, the rupture of the PIR. To accurately predict the temperature of the PIR, this study combines finite element simulation techniques with Support Vector Regression (SVR) optimized by an Improved Whale Optimization Algorithm (IWOA) approach. The IWOA includes Tent mapping, a convergence factor based on the sigmoid function, and the Ornstein-Uhlenbeck variation strategy. The IWOA-SVR model is compared with the SSA-SVR and WOA-SVR. The results reveal that the prediction accuracies of the IWOA-SVR model were 90.2% and 81.5% (above 100$^\circ$C) in the 3$^\circ$C temperature deviation range and 96.3% and 93.4% (above 100$^\circ$C) in the 4$^\circ$C temperature deviation range, surpassing the performance of the comparative models. This research demonstrates that the proposed method can realize online monitoring of the PIR temperature, which can effectively prevent thermal faults in the PIR and provide a basis for opening and closing the circuit breaker within a short period.
Submitted 7 January, 2024;
originally announced January 2024.
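The abstract above mentions Tent mapping as one ingredient of the IWOA. A minimal sketch of how a tent map might seed an initial population is given below. The abstract does not specify the exact map variant, its parameter, or how it is applied, so the skew tent map with breakpoint `a = 0.7`, the seed `x0`, the search bounds, and the function names are all assumptions made for illustration.

```python
from typing import List


def tent_map_sequence(x0: float, n: int, a: float = 0.7) -> List[float]:
    """Iterate the skew tent map n times starting from x0 in (0, 1).

    x -> x / a              if x < a
    x -> (1 - x) / (1 - a)  otherwise
    The breakpoint a = 0.7 is an illustrative choice, not taken from the paper.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    seq = []
    x = x0
    for _ in range(n):
        x = x / a if x < a else (1.0 - x) / (1.0 - a)
        seq.append(x)
    return seq


def init_population(n_agents: int, lower: float, upper: float,
                    x0: float = 0.37) -> List[float]:
    """Scale chaotic values in [0, 1] to a search interval [lower, upper]."""
    chaos = tent_map_sequence(x0, n_agents)
    return [lower + c * (upper - lower) for c in chaos]


# Hypothetical search range for one SVR hyperparameter (e.g. the penalty C).
population = init_population(n_agents=5, lower=0.1, upper=100.0)
print(population)
```

A chaotic sequence of this kind is often used instead of uniform random sampling to spread the initial candidates more evenly over the search interval.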
-
Partial Wave Analysis of $J/ψ\rightarrow γγφ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (603 additional authors not shown)
Abstract:
Using a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events collected with the BESIII detector at the BEPCII collider, a partial wave analysis on the decay $γγφ$ is performed to investigate the intermediate resonances in $J/ψ\rightarrowγX, X\rightarrowγφ$. The resonances $f_{1}(1285)$, $η(1405)$, $f_{1}(1420)$, $f_{1}(1510)$, $f_{2}(1525)$, $X(1835)$, $f_{2}(1950)$, $f_{2}(2010)$, $f_{0}(2200)$ and $η_{c}$ are observed with statistical significance greater than 5$σ$. The product branching fractions $\mathcal{B}(J/ψ\rightarrowγX, X\rightarrow γφ)$ are reported. The resonance parameters of $η(1405)$ and $X(1835)$ are also measured.
Submitted 1 January, 2024;
originally announced January 2024.
-
Observation of $\mathcal R(3810)$ in $e^+e^-\rightarrow {\rm hadrons}$ and Improved Measurements of the Resonance Parameters of $\mathcal R(3760)$ and $\mathcal R(3780)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (596 additional authors not shown)
Abstract:
We report the measurement of the cross sections for $e^+e^-\rightarrow {\rm hadrons}$ at center-of-mass (c.m.) energies from 3.645 to 3.871 GeV. We observe a new resonance $\mathcal R(3810)$ in the cross sections for the first time, and observe the $\mathcal R(3760)$ resonance with high significance in the cross sections. The $\mathcal R(3810)$ has a mass of $(3804.5 \pm 0.9 \pm 0.9)$ MeV/$c^2$, a total width of $(5.4 \pm 3.5 \pm 3.2)$ MeV, and an electronic partial width of $(19.4 \pm 7.4 \pm 12.1)$ eV. Its significance is $7.7σ$. The $\mathcal R(3810)$ could be interpreted as a hadro-charmonium resonance predicted by Quantum Chromodynamics (QCD). In addition, we measure the mass $(3751.9\pm 3.8\pm 2.8)$ MeV/$c^2$, the total width $(32.8 \pm 5.8 \pm 8.7)$ MeV, and the electronic partial width $(184\pm 75\pm 86)$ eV with improved precision for the $\mathcal R(3760)$. Furthermore, for the $\mathcal R(3780)$ we measure the mass $(3778.7\pm 0.5\pm 0.3)$ MeV/$c^2$ and total width $(20.3 \pm 0.8 \pm 1.7)$ MeV with improved precision, and the electronic partial width $(265\pm 69\pm 83)$ eV. The $\mathcal R(3780)$ can be interpreted as the $1^3D_1$ state of charmonium. Its mass and total width differ significantly from the corresponding fitted values given by the Particle Data Group in 2022 by 7.1 and 3.2 times the uncertainties for $ψ(3770)$, respectively. $ψ(3770)$ has been interpreted as the $1^3D_1$ state for 45 years.
Submitted 30 December, 2023;
originally announced January 2024.
-
Search for a massless particle beyond the Standard Model in the $Σ^+\rightarrow p+{\rm invisible}$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
A massless particle beyond the Standard Model is searched for in the two-body decay $Σ^+\rightarrow p+{\rm invisible}$ using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected at a center-of-mass energy of $\sqrt{s}=3.097$ GeV with the BESIII detector at the BEPCII collider. No significant signal is observed, and the upper limit on the branching fraction $B(Σ^+\rightarrow p+{\rm invisible})$ is determined to be $3.2\times10^{-5}$ at the 90% confidence level. This is the first search for a flavor-changing neutral current process with missing energy in hyperon decays which plays an important role in constraining new physics models.
Submitted 5 April, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Observation of $χ_{cJ}\to 3(K^+K^-)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (632 additional authors not shown)
Abstract:
By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decay processes $χ_{cJ} \to 3(K^+K^-)$ ($J=0,1,2$) are observed for the first time with statistical significances of 8.2$σ$, 8.1$σ$, and 12.4$σ$, respectively. The product branching fractions of $ψ(3686)\toγχ_{cJ}$, $χ_{cJ}\to 3(K^+K^-)$ are presented and the branching fractions of $χ_{cJ}\to 3(K^+K^-)$ decays are determined to be
$\mathcal{B}_{χ_{c0}\to 3(K^+K^-)}$=$(10.7\pm1.8\pm1.1)$$\times10^{-6}$,
$\mathcal{B}_{χ_{c1}\to 3(K^+K^-)}$=$(4.2\pm0.9\pm0.5)$$\times10^{-6}$, and
$\mathcal{B}_{χ_{c2}\to 3(K^+K^-)}$=$(7.2\pm1.1\pm0.8)$$\times10^{-6}$,
where the first uncertainties are statistical and the second are systematic.
Submitted 26 December, 2023;
originally announced December 2023.
-
Medical Report Generation based on Segment-Enhanced Contrastive Representation Learning
Authors:
Ruoqing Zhao,
Xi Wang,
Hongliang Dai,
Pan Gao,
Piji Li
Abstract:
Automated radiology report generation has the potential to improve radiology reporting and alleviate the workload of radiologists. However, the medical report generation task poses unique challenges due to the limited availability of medical data and the presence of data bias. To maximize the utility of available data and reduce data bias, we propose MSCL (Medical image Segmentation with Contrastive Learning), a framework that utilizes the Segment Anything Model (SAM) to segment organs, abnormalities, bones, etc., and can pay more attention to the meaningful ROIs in the image to get better visual representations. Then we introduce a supervised contrastive loss that assigns more weight to reports that are semantically similar to the target while training. The design of this loss function aims to mitigate the impact of data bias and encourage the model to capture the essential features of a medical image and generate high-quality reports. Experimental results demonstrate the effectiveness of our proposed model, where we achieve state-of-the-art performance on the IU X-Ray public dataset.
Submitted 25 December, 2023;
originally announced December 2023.
-
BiSwift: Bandwidth Orchestrator for Multi-Stream Video Analytics on Edge
Authors:
Lin Sun,
Weijun Wang,
Tingting Yuan,
Liang Mi,
Haipeng Dai,
Yunxin Liu,
Xiaoming Fu
Abstract:
High-definition (HD) cameras for surveillance and road traffic have experienced tremendous growth, demanding intensive computation resources for real-time analytics. Recently, offloading frames from the front-end device to the back-end edge server has shown great promise. In multi-stream competitive environments, efficient bandwidth management and proper scheduling are crucial to ensure both high inference accuracy and high throughput. To achieve this goal, we propose BiSwift, a bi-level framework that scales the concurrent real-time video analytics by a novel adaptive hybrid codec integrated with multi-level pipelines, and a global bandwidth controller for multiple video streams. The lower-level front-back-end collaborative mechanism (called adaptive hybrid codec) locally optimizes the accuracy and accelerates end-to-end video analytics for a single stream. The upper-level scheduler aims to ensure accuracy fairness among multiple streams via the global bandwidth controller. The evaluation shows that BiSwift is able to perform real-time object detection on 9 streams with an edge device equipped only with an NVIDIA RTX3070 (8G) GPU. BiSwift improves accuracy by 10%$\sim$21% and achieves 1.2$\sim$9$\times$ throughput compared with the state-of-the-art video analytics pipelines.
Submitted 4 February, 2024; v1 submitted 25 December, 2023;
originally announced December 2023.
-
Search for the decay $χ_{c1}(3872)\toπ^{+}π^{-}χ_{c1}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
Using a data sample corresponding to an integrated luminosity of 10.9 fb$^{-1}$ collected at center-of-mass energies from 4.16 to 4.34 GeV with the BESIII detector, we search for the decay $χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}$ in the radiative production $e^{+}e^{-} \to γχ_{c1}(3872)$. No significant signal is observed, and the ratio for the branching fraction of $χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}$ to $χ_{c1}(3872) \to π^{+}π^{-}J/ψ$ is measured as $\mathcal{R}\equiv\frac{\mathcal{B}[χ_{c1}(3872) \to π^{+}π^{-}χ_{c1}]}{\mathcal{B}[χ_{c1}(3872) \to π^{+}π^{-} J/ψ]}<0.18$ at 90$\%$ confidence level. The upper limit on the product of the cross section $σ[e^{+}e^{-}\toγχ_{c1}(3872)]$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^{+}π^{-}χ_{c1}]$ at each center-of-mass energy is also given. These measurements favor the non-conventional charmonium nature of the $χ_{c1}(3872)$ state.
Submitted 21 December, 2023;
originally announced December 2023.
-
Measurements of $Σ$ electromagnetic form factors in the time-like region using the untagged initial-state radiation technique
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (626 additional authors not shown)
Abstract:
The process $e^{+}e^{-}\toΣ^{+}\barΣ^{-}$ is studied from threshold up to 3.04 GeV/$c^2$ via the initial-state radiation technique using data with an integrated luminosity of 12.0 fb$^{-1}$, collected at center-of-mass energies between 3.773 and 4.258 GeV with the BESIII detector at the BEPCII collider. The pair production cross sections and the effective form factors of $Σ$ are measured in eleven $Σ^{+}\barΣ^{-}$ invariant mass intervals from threshold to 3.04 GeV/$c^2$. The results are consistent with the previous results from Belle and BESIII. Furthermore, the branching fractions of the decays $J/ψ\toΣ^{+}\barΣ^{-}$ and $ψ(3686)\toΣ^{+}\barΣ^{-}$ are determined and the obtained results are consistent with the previous results of BESIII.
Submitted 19 December, 2023;
originally announced December 2023.
-
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Authors:
Ziqian Zeng,
Yihuai Hong,
Hongliang Dai,
Huiping Zhuang,
Cen Chen
Abstract:
Early Exiting is one of the most popular methods to achieve efficient inference. Current early exiting methods adopt the (weighted) sum of the cross entropy loss of all internal classifiers during training, requiring all these classifiers to predict all instances correctly. However, during inference, as long as one internal classifier predicts an instance correctly, it can accelerate without losing accuracy. Thus, there is a notable gap between training and inference. We propose ConsistentEE, an early exiting method that is consistent in training and inference. ConsistentEE formulates the early exiting process as a reinforcement learning problem. A policy network is added to decide whether an instance should exit or continue. The training objective of ConsistentEE only requires each instance to be predicted correctly by one internal classifier. Additionally, we introduce the concept of Memorized Layer to measure the hardness of an instance. We incorporate the memorized layer into the reward function design, which allows "easy" instances to focus more on acceleration and "hard" instances to focus more on accuracy. Experimental results show that our method outperforms other baselines on various natural language understanding and generation tasks.
Submitted 7 April, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
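The inference-time behavior described in the ConsistentEE abstract above, where a policy decides at each layer whether an instance exits or continues, can be sketched roughly as follows. This is a generic early-exit loop under stated assumptions, not the authors' implementation: the layers, internal classifiers, and exit policies are stand-in callables, and the 0.5 threshold and the function name `early_exit_inference` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def early_exit_inference(
    x: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    classifiers: Sequence[Callable[[Vector], int]],
    exit_policies: Sequence[Callable[[Vector], float]],
    threshold: float = 0.5,
) -> Tuple[int, int]:
    """Return (prediction, index of the layer at which the instance exited).

    After each layer, the policy returns an exit probability; the instance
    exits at the first layer whose probability reaches the threshold, or at
    the final layer if no earlier exit is taken.
    """
    hidden = x
    last = len(layers) - 1
    for i, (layer, clf, policy) in enumerate(zip(layers, classifiers, exit_policies)):
        hidden = layer(hidden)
        if i == last or policy(hidden) >= threshold:
            return clf(hidden), i
    raise ValueError("at least one layer is required")


# Toy stand-ins: each layer adds 1 to every feature, the policy exits once the
# feature sum reaches 4, and the classifier predicts 1 for a positive sum.
toy_layers = [lambda h: [v + 1.0 for v in h]] * 3
toy_classifiers = [lambda h: int(sum(h) > 0)] * 3
toy_policies = [lambda h: 1.0 if sum(h) >= 4 else 0.0] * 3

print(early_exit_inference([0.0, 0.0], toy_layers, toy_classifiers, toy_policies))
print(early_exit_inference([-5.0, -5.0], toy_layers, toy_classifiers, toy_policies))
```

In the toy run, the first input exits at the second layer while the second input runs through all three, which mirrors the idea that easier instances leave the network earlier.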
-
Observation of significant flavor-SU(3) breaking in the kaon wave function at $12~{\rm GeV}^2<Q^2<25~{\rm GeV}^2$ and discovery of the charmless decay $ψ(3770)\to K_S^0K_L^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (607 additional authors not shown)
Abstract:
We present cross sections for the reaction $e^+e^-\to K_S^0K_L^0$ at center-of-mass energies ranging from 3.51 GeV to 4.95 GeV using data samples collected in the BESIII experiment, corresponding to a total integrated luminosity of 26.5 fb$^{-1}$. The ratio of neutral-to-charged kaon form factors at large momentum transfers ($12~{\rm GeV}^2<Q^2<25~{\rm GeV}^2$) is determined to be $0.21\pm 0.01$, which indicates a small but significant effect of flavor-SU(3) breaking in the kaon wave function, and consequently excludes the possibility that flavor-SU(3) breaking is the primary reason for the strong experimental violation of the pQCD prediction $|F(π^{\pm})|/|F(K^{\pm})|=f^2_π/f^2_{K}$, where $F(π^{\pm})$ and $F(K^{\pm})$ are the form factors, and $f_π$ and $f_{K}$ are the decay constants of charged pions and kaons, respectively. We also observe a significant signal for the charmless decay $ψ(3770)\to K_S^0K_L^0$ for the first time. Within a $1σ$ contour of the likelihood value, the branching fraction for $ψ(3770)\to K_S^0K_L^0$ is determined to be ${\cal B}=(2.63_{-1.59}^{+1.40})\times 10^{-5}$, and the relative phase between the continuum and $ψ(3770)$ amplitudes is $φ=(-0.39_{-0.10}^{+0.05})π$. The branching fraction is in good agreement with the $\mathcal{S}$- and $\mathcal{D}$-wave charmonia mixing scheme proposed in the interpretation of the "$ρπ$ puzzle" between $J/ψ$ and $ψ(3686)$ decays.
Submitted 18 December, 2023;
originally announced December 2023.
-
AEDFL: Efficient Asynchronous Decentralized Federated Learning with Heterogeneous Devices
Authors:
Ji Liu,
Tianshi Che,
Yang Zhou,
Ruoming Jin,
Huaiyu Dai,
Dejing Dou,
Patrick Valduriez
Abstract:
Federated Learning (FL) has made significant advances recently, enabling collaborative model training on distributed data over edge devices. Iterative gradient or model exchanges between devices and the centralized server in the standard FL paradigm suffer from severe efficiency bottlenecks on the server. While enabling collaborative training without a central server, existing decentralized FL approaches either focus on the synchronous mechanism that deteriorates FL convergence or ignore device staleness with an asynchronous mechanism, resulting in inferior FL accuracy. In this paper, we propose an Asynchronous Efficient Decentralized FL framework, i.e., AEDFL, in heterogeneous environments with three unique contributions. First, we propose an asynchronous FL system model with an efficient model aggregation method for improving the FL convergence. Second, we propose a dynamic staleness-aware model update approach to achieve superior accuracy. Third, we propose an adaptive sparse training method to reduce communication and computation costs without significant accuracy degradation. Extensive experimentation on four public datasets and four models demonstrates the strength of AEDFL in terms of accuracy (up to 16.3% higher), efficiency (up to 92.9% faster), and computation costs (up to 42.3% lower).
Submitted 18 December, 2023;
originally announced December 2023.
-
Robust Estimation of Nonlinear Properties of Quantum Processes
Authors:
Yuqing Wang,
Guoding Liu,
Zhenhuan Liu,
Yifan Tang,
Xiongfeng Ma,
Hao Dai
Abstract:
Accurate and robust estimation of quantum process properties is crucial for quantum information processing and quantum many-body physics. Combining classical shadow tomography and randomized benchmarking, Helsen et al. introduced a method to estimate the linear properties of quantum processes. In this work, we focus on the estimation protocols of nonlinear process properties that are robust to state preparation and measurement errors. We introduce two protocols, both utilizing random gate sequences but employing different post-processing methods, which make them suitable for measuring different nonlinear properties. The first protocol offers a robust and sound method to estimate the out-of-time-ordered correlation, as demonstrated numerically in an Ising model. The second protocol estimates unitarity, effectively characterizing the incoherence of quantum channels. We expect the two protocols to be useful tools for exploring quantum many-body physics and characterizing quantum processes.
Submitted 15 December, 2023;
originally announced December 2023.
-
Measurements of Born Cross Sections for $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2595)^- + {\rm c.c.}$ and $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2625)^- + {\rm c.c.}$ at $\sqrt{s}=$4918.0 and 4950.9 MeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (620 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector operating at the BEPCII collider, the Born cross sections of $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2595)^- + \rm{c.c.}$ and $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2625)^- + \rm{c.c.}$ are measured for the first time at center-of-mass energies of $\sqrt{s}=4918.0$ and 4950.9 MeV. Non-zero cross sections are observed very close to the production threshold. The measured Born cross sections of $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2625)^- + \rm{c.c.}$ are about 2 to 3 times as large as those of $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2595)^- + \rm{c.c.}$, indicating that exotic structure potentially exists in the excited charmed baryons. The Born cross sections are $15.6\pm3.1\pm0.9$ pb and $29.4\pm3.7\pm2.7$ pb for $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2595)^- + \rm{c.c.}$, and are $43.4\pm4.0\pm4.1$ pb and $76.8\pm6.5\pm4.2$ pb for $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2625)^- +\rm{c.c.}$ at $\sqrt s=4918.0$ and 4950.9 MeV, respectively. Based on the polar angle distributions of the $\barΛ_{c}(2625)^-$ and $Λ_{c}(2625)^+$, the form-factor ratios $\sqrt{|G_{E}|^2 + 3|G_{M}|^2}/|G_{C}|$ are determined for $e^+e^-\to Λ_{c}^+ \barΛ_{c}(2625)^- + \rm{c.c.}$ for the first time, which are $5.95\pm4.07\pm0.15$ and $0.94\pm0.32\pm0.02$ at $\sqrt s=4918.0$ and 4950.9 MeV, respectively. In all cases, the first uncertainties are statistical and the second are systematic.
Submitted 8 May, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
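As a quick arithmetic check on the numbers quoted in the abstract above, the following Python sketch computes the ratio of the $Λ_{c}(2625)$ to $Λ_{c}(2595)$ Born cross sections at each energy. Combining the statistical and systematic uncertainties in quadrature and treating the two channels as uncorrelated are simplifying assumptions made here for illustration, not the collaboration's procedure; the function names are hypothetical.

```python
import math

# Born cross sections in pb as quoted in the abstract: (value, stat, syst).
XS_2595 = {4918.0: (15.6, 3.1, 0.9), 4950.9: (29.4, 3.7, 2.7)}
XS_2625 = {4918.0: (43.4, 4.0, 4.1), 4950.9: (76.8, 6.5, 4.2)}


def total_uncertainty(stat, syst):
    """Combine stat and syst in quadrature (assumes they are independent)."""
    return math.hypot(stat, syst)


def ratio_with_uncertainty(num, den):
    """Ratio num/den with naive error propagation (assumes no correlation)."""
    n, n_stat, n_syst = num
    d, d_stat, d_syst = den
    ratio = n / d
    relative = math.hypot(
        total_uncertainty(n_stat, n_syst) / n,
        total_uncertainty(d_stat, d_syst) / d,
    )
    return ratio, ratio * relative


for energy in sorted(XS_2595):
    r, dr = ratio_with_uncertainty(XS_2625[energy], XS_2595[energy])
    print(f"sqrt(s) = {energy} MeV: ratio = {r:.2f} +/- {dr:.2f}")
```

Both central values fall between 2 and 3, in line with the statement in the abstract.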
-
Individualized Deepfake Detection Exploiting Traces Due to Double Neural-Network Operations
Authors:
Mushfiqur Rahman,
Runze Liu,
Chau-Wai Wong,
Huaiyu Dai
Abstract:
In today's digital landscape, journalists urgently require tools to verify the authenticity of facial images and videos depicting specific public figures before incorporating them into news stories. Existing deepfake detectors are not optimized for this detection task when an image is associated with a specific and identifiable individual. This study focuses on the deepfake detection of facial images of individual public figures. We propose to condition the proposed detector on the identity of the identified individual given the advantages revealed by our theory-driven simulations. While most detectors in the literature rely on perceptible or imperceptible artifacts present in deepfake facial images, we demonstrate that the detection performance can be improved by exploiting the idempotency property of neural networks. In our approach, the training process involves double neural-network operations where we pass an authentic image through a deepfake simulating network twice. Experimental results show that the proposed method improves the area under the curve (AUC) from 0.92 to 0.94 and reduces its standard deviation by 17\%. For evaluating the detection performance of individual public figures, a facial image dataset with individuals' names is required, a criterion not met by the current deepfake datasets. To address this, we curated a dataset comprising 32k images featuring 45 public figures, which we intend to release to the public after the paper is published.
Submitted 13 December, 2023;
originally announced December 2023.
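The idempotency idea mentioned in the abstract above can be illustrated with a toy example: if an operation f satisfies f(f(x)) = f(x), then an input that has already passed through f is left unchanged by a second pass, while an untouched input generally is not. The Python sketch below uses simple value quantization as a hypothetical stand-in for the deepfake-simulating network; it is not the authors' model or training procedure, and the names `simulate` and `residual` are assumptions introduced for illustration.

```python
def simulate(pixels, step=8):
    """Toy idempotent operation: snap each value to the nearest multiple of step."""
    return [step * round(p / step) for p in pixels]


def residual(pixels, step=8):
    """Mean absolute change caused by one more pass through simulate()."""
    out = simulate(pixels, step)
    return sum(abs(a - b) for a, b in zip(out, pixels)) / len(pixels)


authentic = [3, 17, 42, 101, 200, 255]  # made-up pixel values
once = simulate(authentic)              # processed once
twice = simulate(once)                  # processed twice

print("second pass changes nothing:", twice == once)
print("residual of authentic input:", residual(authentic))
print("residual of processed input:", residual(once))
```

In this toy setting the residual is zero for an already-processed input and positive for the untouched one, which is the kind of trace a detector could exploit.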
-
Search for $D^{0}\to K_{S}^{0} K^{-} e^{+}ν_{e}$, $D^{+}\to K_{S}^{0} K_{S}^{0} e^{+}ν_{e}$, and $D^{+}\to K^{+}K^{-} e^{+}ν_{e}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (604 additional authors not shown)
Abstract:
A search has been performed for the semileptonic decays $D^{0}\to K_{S}^{0} K^{-} e^{+}ν_{e}$, $D^{+}\to K_{S}^{0} K_{S}^{0} e^{+}ν_{e}$ and $D^{+}\to K^{+}K^{-} e^{+}ν_{e}$, using $7.9~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$ GeV by the BESIII detector operating at the BEPCII collider. No significant signals are observed, and upper limits are set at the 90\% confidence level of $2.13\times10^{-5}$, $1.54\times10^{-5}$ and $2.10\times10^{-5}$ for the branching fractions of $D^{0}\to K_{S}^{0} K^{-} e^{+}ν_{e}$, $D^{+}\to K_{S}^{0} K_{S}^{0} e^{+}ν_{e}$ and $D^{+}\to K^{+}K^{-} e^{+}ν_{e}$, respectively.
Submitted 10 December, 2023;
originally announced December 2023.
-
Dance of Channel and Sequence: An Efficient Attention-Based Approach for Multivariate Time Series Forecasting
Authors:
Haoxin Wang,
Yipeng Mo,
Nan Yin,
Honghe Dai,
Bixiong Li,
Songhai Fan,
Site Mo
Abstract:
In recent developments, predictive models for multivariate time series analysis have exhibited commendable performance through the adoption of the prevalent principle of channel independence. Nevertheless, it is imperative to acknowledge the intricate interplay among channels, which fundamentally influences the outcomes of multivariate predictions. Consequently, the notion of channel independence, while offering utility to a certain extent, becomes increasingly impractical, leading to information degradation. In response to this pressing concern, we present CSformer, an innovative framework characterized by a meticulously engineered two-stage self-attention mechanism. This mechanism is purposefully designed to enable the segregated extraction of sequence-specific and channel-specific information, while sharing parameters to promote synergy and mutual reinforcement between sequences and channels. Simultaneously, we introduce sequence adapters and channel adapters, ensuring the model's ability to discern salient features across various dimensions. Rigorous experimentation, spanning multiple real-world datasets, underscores the robustness of our approach, consistently establishing its position at the forefront of predictive performance across all datasets. This augmentation substantially enhances the capacity for feature extraction inherent to multivariate time series data, facilitating a more comprehensive exploitation of the available information.
Submitted 11 December, 2023;
originally announced December 2023.
-
From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained
Authors:
Hongliang Dai,
Ziqian Zeng
Abstract:
For the task of fine-grained entity typing (FET), due to the use of a large number of entity types, it is usually considered too costly to manually annotate a training dataset that contains an ample number of examples for each type. A common way to address this problem is to use distantly annotated training data that contains incorrect labels. However, the performance of models trained solely with such data can be limited by the errors in the automatic annotation. Recently, a few approaches have departed from this conventional way. However, lacking sufficient direct entity typing supervision may also cause them to yield inferior performance. In this paper, we propose a new approach that avoids the need to create distantly labeled data whenever there is a new type schema. We first train an entity typing model that has an extremely broad type coverage by using the ultra-fine entity typing data. Then, when there is a need to produce a model for a newly designed fine-grained entity type schema, we can simply fine-tune the previously trained model with a small number of examples annotated under this schema. Experimental results show that our approach achieves outstanding performance for FET under the few-shot setting. It can also outperform state-of-the-art weak-supervision-based methods after fine-tuning the model with only a small manually annotated training set.
Submitted 11 December, 2023;
originally announced December 2023.
-
FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-aware Model Update
Authors:
Ji Liu,
Juncheng Jia,
Tianshi Che,
Chao Huo,
Jiaxiang Ren,
Yang Zhou,
Huaiyu Dai,
Dejing Dou
Abstract:
As a promising approach to deal with distributed data, Federated Learning (FL) has achieved major advances in recent years. FL enables collaborative model training by exploiting the raw data dispersed in multiple edge devices. However, the data is generally non-independent and identically distributed, i.e., statistical heterogeneity, and the edge devices significantly differ in terms of both computation and communication capacity, i.e., system heterogeneity. The statistical heterogeneity leads to severe accuracy degradation while the system heterogeneity significantly prolongs the training process. In order to address the heterogeneity issue, we propose an Asynchronous Staleness-aware Model Update FL framework, i.e., FedASMU, with two novel methods. First, we propose an asynchronous FL system model with a dynamic model aggregation method between updated local models and the global model on the server for superior accuracy and high efficiency. Then, we propose an adaptive local model adjustment method by aggregating the fresh global model with local models on devices to further improve the accuracy. Extensive experimentation with 6 models and 5 public datasets demonstrates that FedASMU significantly outperforms baseline approaches in terms of accuracy (0.60% to 23.90% higher) and efficiency (3.54% to 97.98% faster).
Submitted 10 December, 2023;
originally announced December 2023.
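To make the notion of staleness-aware aggregation in the abstract above concrete, here is a minimal Python sketch of a generic asynchronous server update in which a local model is mixed into the global model with a weight that shrinks as the update becomes staler. The polynomial discount and the parameters `base` and `power` are assumptions chosen for illustration; this is not the dynamic aggregation method proposed in FedASMU, whose details the abstract does not give.

```python
def staleness_weight(staleness, base=0.5, power=1.0):
    """Mixing weight that decays with staleness (hypothetical polynomial discount)."""
    return base / (1.0 + staleness) ** power


def aggregate(global_model, local_model, staleness):
    """Blend a possibly stale local model into the global model on the server."""
    w = staleness_weight(staleness)
    return [(1.0 - w) * g + w * l for g, l in zip(global_model, local_model)]


global_model = [0.0, 0.0]
local_model = [1.0, 1.0]

# Staleness = number of global updates since the device fetched its model.
fresh = aggregate(global_model, local_model, staleness=0)
stale = aggregate(global_model, local_model, staleness=3)
print("fresh update:", fresh)
print("stale update:", stale)
```

A fresh update moves the global model further than a stale one, which limits the influence of devices that trained on an outdated global model.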
-
Determination of spin-parity quantum numbers of X(2370) as $0^{-+}$ from $J/ψ\rightarrowγK^{0}_{S}K^{0}_{S}η^{\prime}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (605 additional authors not shown)
Abstract:
Based on $(10087\pm44)\times10^{6}$ $J/ψ$ events collected with the BESIII detector, a partial wave analysis of the decay $J/ψ\rightarrowγK^{0}_{S}K^{0}_{S}η^{\prime}$ is performed. The mass and width of the $X(2370)$ are measured to be $2395 \pm 11 ({\rm stat})^{+26}_{-94}({\rm syst})\ \mathrm{MeV}/c^{2}$ and $188^{+18}_{-17}({\rm stat})^{+124}_{-33}({\rm syst})~\mathrm{MeV}$, respectively. The corresponding product branching fraction is $\mathcal{B}[J/ψ\rightarrowγX(2370)] \times \mathcal{B}[X(2370) \rightarrow f_{0}(980)η^{\prime}] \times \mathcal{B}[f_{0}(980) \rightarrow K^{0}_{S}K^{0}_{S}] = \left( 1.31 \pm 0.22 ({\rm stat})^{+2.85}_{-0.84}({\rm syst}) \right) \times 10^{-5}$. The statistical significance of the $X(2370)$ is greater than $11.7σ$ and the spin-parity is determined to be $0^{-+}$ for the first time. The measured mass and spin-parity of the $X(2370)$ are consistent with the predictions of the lightest pseudoscalar glueball.
Submitted 6 May, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Holistic Evaluation of GPT-4V for Biomedical Imaging
Authors:
Zhengliang Liu,
Hanqi Jiang,
Tianyang Zhong,
Zihao Wu,
Chong Ma,
Yiwei Li,
Xiaowei Yu,
Yutong Zhang,
Yi Pan,
Peng Shu,
Yanjun Lyu,
Lu Zhang,
Junjie Yao,
Peixin Dong,
Chao Cao,
Zhenxiang Xiao,
Jiaqi Wang,
Huan Zhao,
Shaochen Xu,
Yaonai Wei,
Jingyuan Chen,
Haixing Dai,
Peilong Wang,
Hao He,
Zewei Wang
, et al. (25 additional authors not shown)
Abstract:
In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and more. Tasks include modality recognition, anatomy localization, disease diagnosis, report generation, and lesion detection. The extensive experiments provide insights into GPT-4V's strengths and weaknesses. Results show GPT-4V's proficiency in modality and anatomy recognition but difficulty with disease diagnosis and localization. GPT-4V excels at diagnostic report generation, indicating strong image captioning skills. While promising for biomedical imaging AI, GPT-4V requires further enhancement and validation before clinical deployment. We emphasize responsible development and testing for trustworthy integration of biomedical AGI. This rigorous evaluation of GPT-4V on diverse medical images advances understanding of multimodal large language models (LLMs) and guides future work toward impactful healthcare applications.
Submitted 10 November, 2023;
originally announced December 2023.
-
Amplitude Analysis of the Decays $D^0\toπ^+π^-π^+π^-$ and $π^+π^-π^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (620 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ taken at the center-of-mass energy $\sqrt{s}=3.773$~GeV with the BESIII detector, a joint amplitude analysis is performed on the decays $D^0\toπ^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$). The fit fractions of individual components are obtained, and large interferences among the dominant components of $D^{0}\to a_{1}(1260)π$, $D^{0}\toπ(1300)π$, $D^{0}\toρ(770)ρ(770)$ and $D^{0}\to2(ππ)_{S}$ are found in both channels. With the obtained amplitude model, the $CP$-even fractions of $D^0\to π^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$) are determined to be $(75.2\pm1.1_{\rm stat.}\pm1.5_{\rm syst.})\%$ and $(68.9\pm1.5_{\rm stat.}\pm 2.4_{\rm syst.})\%$, respectively. The branching fractions of $D^0\to π^+π^-π^+π^-$ and $D^0\toπ^+π^-π^0π^0$(non-$η$) are measured to be $(0.688\pm0.010_{\rm stat.}\pm 0.010_{\rm syst.})\%$ and $(0.951\pm0.025_{\rm stat.}\pm 0.021_{\rm syst.})\%$, respectively. The amplitude analysis provides an important model for binning strategy in the measurements of the strong phase parameters of $D^0 \to 4π$ when used to determine the CKM angle $γ(φ_{3})$ via the $B^{-}\to D K^{-}$ decay.
Submitted 3 April, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Gene-MOE: A sparsely gated prognosis and classification framework exploiting pan-cancer genomic information
Authors:
Xiangyu Meng,
Xue Li,
Qing Yang,
Huanhuan Dai,
Lian Qiao,
Hongzhen Ding,
Long Hao,
Xun Wang
Abstract:
Benefiting from the advancements in deep learning, various genomic analytical techniques, such as survival analysis, classification of tumors and their subtypes, and exploration of specific pathways, have significantly enhanced our understanding of the biological mechanisms driving cancer. However, the overfitting issue, arising from the limited number of patient samples, poses a challenge in improving the accuracy of genome analysis by deepening the neural network. Furthermore, it remains uncertain whether novel approaches such as the sparsely gated mixture of experts (MOE) and self-attention mechanisms can improve the accuracy of genomic analysis. In this paper, we introduce a novel sparsely gated RNA-seq analysis framework called Gene-MOE. This framework exploits the potential of the MOE layers and the proposed mixture of attention expert (MOAE) layers to enhance the analysis accuracy. Additionally, it addresses overfitting challenges by integrating pan-cancer information from 33 distinct cancer types through pre-training. We pre-trained Gene-MOE on the TCGA pan-cancer RNA-seq dataset with 33 cancer types. Subsequently, we conducted experiments involving cancer classification and survival analysis based on the pre-trained Gene-MOE. According to the survival analysis results on 14 cancer types, Gene-MOE outperformed state-of-the-art models on 12 cancer types. Through detailed feature analysis, we found that the Gene-MOE model could learn rich feature representations of high-dimensional genes. According to the classification results, the total accuracy of the classification model for 33 cancer classifications reached 95.8%, representing the best performance compared to state-of-the-art models. These results indicate that Gene-MOE holds strong potential for use in cancer classification and survival analysis.
Submitted 18 December, 2023; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Measurement of Branching Fractions for $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ and $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (603 additional authors not shown)
Abstract:
Based on 4.5 fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated at center-of-mass energies between $4.600\,\mathrm{GeV}$ and $4.699\,\mathrm{GeV}$ with the BESIII detector, we measure the absolute branching fraction of the Cabibbo-favored decay $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ with the precision improved by a factor of 2.8 and report the first evidence for the singly-Cabibbo-suppressed decay $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$. The branching fractions for $Λ_{c}^{+} \rightarrow n K_{S}^{0} π^{+}$ and $Λ_{c}^{+} \rightarrow n K_{S}^{0} K^{+}$ are determined to be $(1.86\pm0.08\pm0.04)\times10^{-2}$ and $\left(4.3^{+1.9}_{-1.5}\pm0.3\right)\times10^{-4}$, respectively, where the first uncertainties are statistical and the second ones are systematic.
Submitted 28 November, 2023;
originally announced November 2023.
-
First observation of $Λ_c^+\rightarrowΛK^+π^0$ and evidence of $Λ_c^+\rightarrowΛK^+π^+π^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
We present the first observation of the singly Cabibbo-suppressed decay $Λ_c^+ \rightarrow ΛK^+π^0$ with a significance of $5.7σ$ and the first evidence of $Λ_c^+ \rightarrow ΛK^+π^+π^-$ decay with a significance of $3.1σ$, based on $e^+e^-$ annihilation data recorded by the BESIII detector at the BEPCII collider. The data correspond to an integrated luminosity of $6.4~{\rm fb^{-1}}$, in the center-of-mass energy range from $4.600~{\rm GeV}$ to $4.950~{\rm GeV}$. We determine the branching fractions of $Λ_c^+ \rightarrow ΛK^+π^0$ and $Λ_c^+ \rightarrow ΛK^+π^+π^-$ relative to their Cabibbo-favored counterparts to be $\frac{\mathcal{B}(Λ_c^+ \rightarrow ΛK^+π^0)}{\mathcal{B}(Λ_c^+ \rightarrow Λπ^+π^0)} = (2.09\pm0.39_{\mathrm{stat.}}\pm0.07_{\mathrm{syst.}}) \times 10^{-2}$ and $\frac{\mathcal{B}(Λ_c^+ \rightarrow ΛK^+π^+π^-)}{\mathcal{B}(Λ_c^+ \rightarrow Λπ^+π^+π^-)} = (1.13\pm0.41_{\mathrm{stat.}}\pm0.06_{\mathrm{syst.}}) \times 10^{-2}$, respectively. Moreover, by combining our measured result with the world average of $\mathcal{B}(Λ^+_c\to Λπ^+π^0)$, we obtain the branching fraction $\mathcal{B}(Λ_c^+ \to ΛK^+π^0) = (1.49\pm0.27_{\mathrm{stat.}}\pm0.05_{\mathrm{syst.}}\pm0.08_{\mathrm{ref.}}) \times 10^{-3}$. This result significantly departs from theoretical predictions based on quark $SU(3)$ flavor symmetry, which is underpinned by the presumption of meson pair $S$-wave amplitude dominance.
Submitted 25 February, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
-
Improved measurement of the decays $η' \to π^{+}π^{-}π^{+(0)}π^{-(0)}$ and search for the rare decay $η' \to 4π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (606 additional authors not shown)
Abstract:
Using a sample of 10 billion $J/ψ$ events collected with the BESIII detector, the decays $η' \to π^{+}π^{-}π^{+}π^{-}$, $η' \to π^{+}π^{-}π^{0}π^{0}$ and $η' \to 4 π^{0}$ are studied via the process $J/ψ\toγη'$. The branching fractions of $η' \to π^{+}π^{-}π^{+}π^{-}$ and $η' \to π^{+}π^{-}π^{0}π^{0}$ are measured to be $( 8.56 \pm 0.25({\rm stat.}) \pm 0.23({\rm syst.}) ) \times {10^{ - 5}}$ and $(2.12 \pm 0.12({\rm stat.}) \pm 0.10({\rm syst.})) \times {10^{ - 4}}$, respectively, which are consistent with previous measurements but with improved precision. No significant $η' \to 4 π^{0}$ signal is observed, and the upper limit on the branching fraction of this decay is determined to be $1.24 \times {10^{-5}}$ at the $90\%$ confidence level. In addition, an amplitude analysis of $η' \to π^{+}π^{-}π^{+}π^{-}$ is performed to extract the doubly virtual isovector form factor $α$ for the first time. The measured value, $α=1.22 \pm 0.33({\rm stat.}) \pm 0.04({\rm syst.})$, is in agreement with the prediction of the VMD model.
Submitted 21 November, 2023;
originally announced November 2023.
-
Open Set Dandelion Network for IoT Intrusion Detection
Authors:
Jiashu Wu,
Hao Dai,
Kenneth B. Kent,
Jerome Yen,
Chengzhong Xu,
Yang Wang
Abstract:
As IoT devices become widespread, it is crucial to protect them from malicious intrusions. However, the data scarcity of IoT limits the applicability of traditional intrusion detection methods, which are highly data-dependent. To address this, in this paper we propose the Open-Set Dandelion Network (OSDN) based on unsupervised heterogeneous domain adaptation in an open-set manner. The OSDN model performs intrusion knowledge transfer from the knowledge-rich source network intrusion domain to facilitate more accurate intrusion detection for the data-scarce target IoT intrusion domain. Under the open-set setting, it can also detect newly-emerged target domain intrusions that are not observed in the source domain. To achieve this, the OSDN model forms the source domain into a dandelion-like feature space in which each intrusion category is compactly grouped and different intrusion categories are separated, i.e., simultaneously emphasising inter-category separability and intra-category compactness. The dandelion-based target membership mechanism then forms the target dandelion. Then, the dandelion angular separation mechanism achieves better inter-category separability, and the dandelion embedding alignment mechanism further aligns both dandelions in a finer manner. To promote intra-category compactness, the discriminating sampled dandelion mechanism is used. Assisted by the intrusion classifier trained using both known and generated unknown intrusion knowledge, a semantic dandelion correction mechanism emphasises easily-confused categories and guides better inter-category separability. Holistically, these mechanisms form the OSDN model that effectively performs intrusion knowledge transfer to benefit IoT intrusion detection. Comprehensive experiments on several intrusion datasets verify the effectiveness of the OSDN model, outperforming three state-of-the-art baseline methods by 16.9%.
Submitted 7 January, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Study of the decay $J/ψ\to φπ^{0}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (604 additional authors not shown)
Abstract:
Based on $(10.09 \pm 0.04) \times 10^9$ $J/ψ$ events collected with the BESIII detector operating at the BEPCII collider, a partial wave analysis of the decay $J/ψ\to φπ^{0}η$ is performed. We observe for the first time two new structures on the $φη$ invariant mass distribution, with statistical significances of $24.0σ$ and $16.9σ$; the first with $J^{\rm PC}$ = $1^{+-}$, mass M = (1911 $\pm$ 6 (stat.) $\pm$ 14 (sys.))~MeV/$c^{2}$, and width $Γ= $ (149 $\pm$ 12 (stat.) $\pm$ 23 (sys.))~MeV, the second with $J^{\rm PC}$ = $1^{--}$, mass M = (1996 $\pm$ 11 (stat.) $\pm$ 30 (sys.))~MeV/$c^{2}$, and width $Γ$ = (148 $\pm$ 16 (stat.) $\pm$ 66 (sys.))~MeV. These measurements provide important input for the strangeonium spectrum. In addition, the $f_0(980)-a_0(980)^0$ mixing signal in $J/ψ\to φf_0(980) \to φa_0(980)^0$ and the corresponding electromagnetic decay $J/ψ\to φa_0(980)^0$ are measured with improved precision, providing crucial information to understand the nature of $a_0(980)^0$ and $f_0(980)$.
Submitted 14 November, 2023; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Evidence of the Singly Cabibbo Suppressed decay $Λ_c^+\to pπ^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
Evidence for the singly Cabibbo suppressed decay $Λ_c^+\to pπ^0$ is reported for the first time with a statistical significance of $3.7σ$ based on 6.0 $\mathrm{fb}^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.600 and 4.843 GeV with the BESIII detector at the BEPCII collider. The absolute branching fraction of $Λ_c^+\to pπ^0$ is measured to be $(1.56^{+0.72}_{-0.58}\pm0.20)\times 10^{-4}$. Combining with the branching fraction of $Λ_c^+\to nπ^+$, $(6.6\pm1.3)\times10^{-4}$, the ratio of the branching fractions of $Λ_c^+\to nπ^+$ and $Λ_c^+\to pπ^0$ is calculated to be $3.2^{+2.2}_{-1.2}$. As an important input for the theoretical models describing the decay mechanisms of charmed baryons, our result indicates that the non-factorizable contributions play an essential role and their interference with the factorizable contributions should not be significant. In addition, the absolute branching fraction of $Λ_c^+\to pη$ is measured to be $(1.63\pm0.31_{\rm stat}\pm0.11_{\rm syst}) \times10^{-3}$.
Submitted 3 June, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Observation and branching fraction measurement of the decay $J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0} + c.c.$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (602 additional authors not shown)
Abstract:
The first observation of the decays $J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0}$ and $J\!/\!ψ\rightarrow p \barΣ^{-} K_{S}^{0}$ is reported using $(10087\pm44)\times10^{6}$ $J\!/\!ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of each channel are determined to be $\mathcal{B}(J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0})=(1.361 \pm 0.006 \pm 0.025) \times 10^{-4}$ and $\mathcal{B}(J\!/\!ψ\rightarrow p \barΣ^{-} K_{S}^{0})=(1.352 \pm 0.006 \pm 0.025) \times 10^{-4}$. The combined result is $\mathcal{B}(J\!/\!ψ\rightarrow \bar{p} Σ^{+} K_{S}^{0} +c.c.)=(2.725 \pm 0.009 \pm 0.050) \times 10^{-4}$, where the first uncertainty is statistical and the second systematic. The results presented are in good agreement with the branching fractions of the isospin partner decay $J\!/\!ψ\rightarrow p K^- \barΣ^0 + c.c.$.
Submitted 14 November, 2023; v1 submitted 10 November, 2023;
originally announced November 2023.
-
SQLPrompt: In-Context Text-to-SQL with Minimal Labeled Data
Authors:
Ruoxi Sun,
Sercan Ö. Arik,
Rajarishi Sinha,
Hootan Nakhost,
Hanjun Dai,
Pengcheng Yin,
Tomas Pfister
Abstract:
Text-to-SQL aims to automate the process of generating SQL queries on a database from natural language text. In this work, we propose "SQLPrompt", tailored to improve the few-shot prompting capabilities of Text-to-SQL for Large Language Models (LLMs). Our methods include an innovative prompt design, an execution-based consistency decoding strategy which selects the SQL with the most consistent execution outcome among other SQL proposals, and a method that aims to improve performance by diversifying the SQL proposals during consistency selection with different prompt designs ("MixPrompt") and foundation models ("MixLLMs"). We show that SQLPrompt outperforms previous approaches for in-context learning with little labeled data by a large margin, closing the gap with state-of-the-art finetuning on thousands of labeled data.
Submitted 6 November, 2023;
originally announced November 2023.
-
Measurement of the absolute branching fraction of the three-body decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ and search for $Λ_{c}^+ \to nK^+π^0$, $Σ^{0}K^{+}π^{0}$ and $ΛK^{+}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (600 additional authors not shown)
Abstract:
The Cabibbo-favored decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ is studied for the first time using 6.1 fb$^{-1}$ of $e^+e^-$ collision data at center-of-mass energies between 4.600 and 4.840 GeV, collected with the BESIII detector at the BEPCII collider. With a double-tag method, the branching fraction of the three-body decay $Λ_{c}^+ \to Ξ^{0}K^{+}π^{0}$ is measured to be $(7.79 \pm 1.46 \pm 0.71) \times 10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively. The branching fraction of the two-body decay $Λ_{c}^+ \to Ξ(1530)^{0}K^+$ is $(5.99\pm1.04\pm0.29)\times10^{-3}$, which is consistent with the previous result of $(5.02\pm0.99\pm0.31)\times 10^{-3}$. In addition, the upper limit on the branching fraction of the doubly Cabibbo-suppressed decay $Λ_{c}^+ \to nK^+π^0$ is $7.1 \times 10^{-4}$ at the 90$\%$ confidence level. The upper limits on the branching fractions of $Λ_{c}^+ \to Σ^{0}K^{+}π^{0}$ and $ΛK^{+}π^{0}$ are also determined to be $1.8\times 10^{-3}$ and $2.0 \times 10^{-3}$, respectively.
Submitted 8 May, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Search for a muonphilic scalar $X_{0}$ or vector $X_{1}$ via $J/ψ\toμ^+μ^-+\rm{invisible}$ decays at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
A light scalar $X_{0}$ or vector $X_{1}$ particle has been introduced as a possible explanation for the $(g-2)_μ$ anomaly and dark matter phenomena.
Using $(8.998\pm 0.039)\times10^9$ $J/ψ$ events collected by the BESIII detector, we search for a light muonphilic scalar $X_{0}$ or vector $X_{1}$ in the processes $J/ψ\toμ^+μ^- X_{0,1}$ with invisible $X_{0,1}$ decays. No obvious signal is found, and the upper limits on the coupling $g_{0,1}'$ between the muon and the $X_{0,1}$ particles are set to be between $1.1\times10^{-3}$ and $1.0\times10^{-2}$ for the $X_{0,1}$ mass in the range of $1<M(X_{0,1})<1000$ MeV$/c^2$ at the 90$\%$ confidence level.
Submitted 18 February, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval
Authors:
Jiayi Chen,
Hanjun Dai,
Bo Dai,
Aidong Zhang,
Wei Wei
Abstract:
Visually-rich document entity retrieval (VDER), which extracts key information (e.g. date, address) from document images like invoices and receipts, has become an important topic in industrial NLP applications. The emergence of new document types at a constant pace, each with its unique entity types, presents a unique challenge: many documents contain unseen entity types that occur only a couple of times. Addressing this challenge requires models to be able to learn entities in a few-shot manner. However, prior works on few-shot VDER mainly address the problem at the document level with a predefined global entity space, which doesn't account for the entity-level few-shot scenario: target entity types are locally personalized by each task and entity occurrences vary significantly among documents. To address this unexplored scenario, this paper studies a novel entity-level few-shot VDER task. The challenges lie in the uniqueness of the label space for each task and the increased complexity of out-of-distribution (OOD) contents. To tackle this novel task, we present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization that distinguishes between in-task and out-of-task distribution. Specifically, we adopt a hierarchical decoder (HC) and employ contrastive learning (ContrastProtoNet) to achieve this goal. Furthermore, we introduce a new dataset, FewVEX, to boost future research in the field of entity-level few-shot VDER. Experimental results demonstrate that our approaches significantly improve the robustness of popular meta-learning baselines.
Submitted 8 December, 2023; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Observation of the Anomalous Shape of $X(1840)$ in $J/ψ\rightarrow γ3(π^+ π^-)$ Indicating a Second Resonance Near $p\bar{p}$ Threshold
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Using a sample of $(10087\pm44)\times 10^6$ $J/ψ$ events, which is about 45 times larger than that previously analyzed, a further investigation of the $J/ψ\rightarrow γ3(π^+π^-)$ decay is performed. A significant distortion at 1.84 GeV/$c^2$ in the line-shape of the $3(π^+π^-)$ invariant mass spectrum is observed for the first time, which could be resolved by two overlapping resonant structures, $X(1840)$ and $X(1880)$. The new state $X(1880)$ is observed with a statistical significance larger than $10σ$. The mass and width of $X(1880)$ are determined to be $1882.1\pm1.7\pm0.7$ MeV/$c^2$ and $30.7\pm5.5 \pm2.4$ MeV, respectively, which indicates the existence of a $p\bar{p}$ bound state.
Submitted 15 April, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Does or did the supernova remnant Cassiopeia A operate as a PeVatron?
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
For decades, supernova remnants (SNRs) have been considered the prime sources of Galactic cosmic rays (CRs). But whether SNRs can accelerate CR protons to PeV energies and thus dominate the CR flux up to the knee is currently under intensive theoretical and phenomenological debate. The direct test of the ability of SNRs to operate as CR PeVatrons can be provided by ultrahigh-energy (UHE; $E_γ\geq 100$~TeV) $γ$-rays. In this context, the historical SNR Cassiopeia A (Cas A) is considered one of the most promising targets for UHE observations. This paper presents the observation of Cas A and its vicinity by the LHAASO KM2A detector. The exceptional sensitivity of LHAASO KM2A in the UHE band, combined with the young age of Cas A, enabled us to derive stringent model-independent limits on the energy budget of UHE protons and nuclei accelerated by Cas A at any epoch after the explosion. The results challenge the prevailing paradigm that Cas A-type SNRs are major suppliers of PeV CRs in the Milky Way.
Submitted 25 October, 2023;
originally announced October 2023.
-
Study of the doubly Cabibbo-suppressed decays $D^+_s\to K^+K^+π^-$ and $D^+_s\to K^+K^+π^-π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko
, et al. (604 additional authors not shown)
Abstract:
Based on 7.33 fb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, the experimental studies of the doubly Cabibbo-suppressed decays $D^+_s\to K^+K^+π^-$ and $D^+_s\to K^+K^+π^-π^0$ are reported. We determine the absolute branching fraction of $D^+_s\to K^+K^+π^-$ to be (${1.23^{+0.28}_{-0.25}}({\rm stat})\pm0.06({\rm syst})$) $\times 10^{-4}$. No significant signal of $D^+_s\to K^+K^+π^-π^0$ is observed and the upper limit on its decay branching fraction at 90\% confidence level is set to be $1.7\times10^{-4}$.
Submitted 24 October, 2023;
originally announced October 2023.
-
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization
Authors:
Tianshi Che,
Ji Liu,
Yang Zhou,
Jiaxiang Ren,
Jiwen Zhou,
Victor S. Sheng,
Huaiyu Dai,
Dejing Dou
Abstract:
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. However, the training process of Large Language Models (LLMs) generally incurs the update of significant parameters, which limits the applicability of FL techniques to tackle the LLMs in real scenarios. Prompt tuning can significantly reduce the number of parameters to update, but it either incurs performance degradation or low training efficiency. The straightforward utilization of prompt tuning in the FL often raises non-trivial communication costs and dramatically degrades performance. In addition, the decentralized data is generally non-Independent and Identically Distributed (non-IID), which brings client drift problems and thus poor performance. This paper proposes a Parameter-efficient prompt Tuning approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and effective FL of LLMs. First, an efficient partial prompt tuning approach is proposed to improve performance and efficiency simultaneously. Second, a novel adaptive optimization method is developed to address the client drift problems on both the device and server sides to enhance performance further. Extensive experiments based on 10 datasets demonstrate the superb performance (up to 60.8\% in terms of accuracy) and efficiency (up to 97.59\% in terms of training time) of FedPepTAO compared with 9 baseline approaches. Our code is available at https://github.com/llm-eff/FedPepTAO.
Submitted 11 February, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Observation of the $ψ(3686)$ decays into $Σ^{+}\barΣ^{-}ω$ and $Σ^{+}\barΣ^{-}φ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Based on $(27.08\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the $ψ(3686)\toΣ^{+}\barΣ^{-}ω$ and $Σ^{+}\barΣ^{-}φ$ decays are observed for the first time with statistical significances of 13.8$σ$ and 7.6$σ$, respectively. The corresponding branching fractions are measured to be $\mathcal{B}(ψ(3686)\toΣ^{+}\barΣ^{-}ω)=(1.90 \pm 0.18 \pm 0.21) \times 10^{-5}$ and $\mathcal{B}(ψ(3686)\toΣ^{+}\barΣ^{-}φ)=(2.96 \pm 0.54 \pm 0.41) \times 10^{-6}$, where the first uncertainties are statistical and the second systematic.
Submitted 23 October, 2023;
originally announced October 2023.
-
Measurement of the cross sections for $e^+e^-\toηπ^+π^-$ at center-of-mass energies between 2.00 and 3.08 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (605 additional authors not shown)
Abstract:
Using data samples collected at center-of-mass energies between 2.000 and 3.080 GeV with the BESIII detector operating at the BEPCII collider, a partial-wave analysis is performed on the process $e^+e^-\toηπ^+π^-$. In addition to the dominant $e^+e^-\toρη$ component, the $e^+e^-\to a_2(1320)π$ process is also sizeable, contributing up to 24% of the total reaction. The measured cross sections of the process $e^+e^-\toηπ^+π^-$ are systematically higher than those of BaBar by more than $3σ$ at center-of-mass energies between 2.000 and 2.300 GeV. In the cross section lineshape for $e^+e^-\to a_2(1320)π$, a resonant structure is observed with a significance of $5.5σ$, with $M=(2044\pm31\pm4)$ MeV/$c^2$, $Γ=(163\pm69\pm24)$ MeV and $\mathcal{B_{R}}\cdotΓ_{e^+e^-}^{R}=(34.6\pm17.1\pm6.0)$ eV or $(137.1\pm73.3\pm2.1)$ eV. In the cross section lineshape for $e^+e^-\toρη$, evidence of a dip structure around 2180 MeV/$c^2$ is observed with a statistical significance of $3.0σ$.
Submitted 28 November, 2023; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Very high energy gamma-ray emission beyond 10 TeV from GRB 221009A
Authors:
Zhen Cao,
F. Aharonian,
Q. An,
A. Axikegu,
Y. X. Bai,
Y. W. Bao,
D. Bastieri,
X. J. Bi,
Y. J. Bi,
J. T. Cai,
Q. Cao,
W. Y. Cao,
Zhe Cao,
J. Chang,
J. F. Chang,
A. M. Chen,
E. S. Chen,
Liang Chen,
Lin Chen,
Long Chen,
M. J. Chen,
M. L. Chen,
Q. H. Chen,
S. H. Chen,
S. Z. Chen
, et al. (255 additional authors not shown)
Abstract:
The highest energy gamma-rays from gamma-ray bursts (GRBs) have important implications for their radiation mechanism. Here we report for the first time the detection of gamma-rays up to 13 TeV from the brightest GRB 221009A by the Large High Altitude Air-shower Observatory (LHAASO). The LHAASO-KM2A detector registered more than 140 gamma-rays with energies above 3 TeV during 230$-$900 s after the trigger. The intrinsic energy spectrum of gamma-rays can be described by a power-law after correcting for extragalactic background light (EBL) absorption. Such a hard spectrum challenges the synchrotron self-Compton (SSC) scenario of relativistic electrons for the afterglow emission above several TeV. Observations of gamma-rays up to 13 TeV from a source with a measured redshift of z=0.151 hint at more transparency in intergalactic space than previously expected. Alternatively, one may invoke new physics such as Lorentz Invariance Violation (LIV) or an axion origin of very high energy (VHE) signals.
Submitted 22 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Search for $J/ψ$ weak decays containing $D$ meson
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (600 additional authors not shown)
Abstract:
Using a sample of about 10 billion $J/ψ$ events with the BESIII detector, we search for the weak decays of $J/ψ\to \bar{D}^0π^0 + c.c.$, $J/ψ\to \bar{D}^0η+ c.c.$, $J/ψ\to \bar{D}^0ρ^0 + c.c.$, $J/ψ\to D^-π^+ + c.c.$, and $J/ψ\to D^-ρ^+ + c.c.$. Since no significant signal is observed, we set the upper limits of the branching fractions of these decays to be $\mathcal{B}(J/ψ\to \bar{D}^0π^0 + c.c.) < 4.7 \times 10^{-7}$, $\mathcal{B}(J/ψ\to \bar{D}^0η+ c.c.) < 6.8 \times 10^{-7}$, $\mathcal{B}(J/ψ\to \bar{D}^0ρ^0 + c.c.) < 5.2 \times 10^{-7}$, $\mathcal{B}(J/ψ\to D^-π^+ + c.c.) < 7.0 \times 10^{-8}$, and $\mathcal{B}(J/ψ\to D^-ρ^+ + c.c.) < 6.0 \times 10^{-7}$ at the 90\% confidence level.
Submitted 11 October, 2023;
originally announced October 2023.
-
Large Language Models can Learn Rules
Authors:
Zhaocheng Zhu,
Yuan Xue,
Xinyun Chen,
Denny Zhou,
Jian Tang,
Dale Schuurmans,
Hanjun Dai
Abstract:
When prompted with a few examples and intermediate steps, large language models (LLMs) have demonstrated impressive performance in various reasoning tasks. However, prompting methods that rely on implicit knowledge in an LLM often generate incorrect answers when the implicit knowledge is wrong or inconsistent with the task. To tackle this problem, we present Hypotheses-to-Theories (HtT), a framework that learns a rule library for reasoning with LLMs. HtT contains two stages, an induction stage and a deduction stage. In the induction stage, an LLM is first asked to generate and verify rules over a set of training examples. Rules that appear and lead to correct answers sufficiently often are collected to form a rule library. In the deduction stage, the LLM is then prompted to employ the learned rule library to perform reasoning to answer test questions. Experiments on relational reasoning, numerical reasoning and concept learning problems show that HtT improves existing prompting methods, with an absolute gain of 10-30% in accuracy. The learned rules are also transferable to different models and to different forms of the same problem.
Submitted 24 April, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data
Authors:
Tianyang Zhong,
Wei Zhao,
Yutong Zhang,
Yi Pan,
Peixin Dong,
Zuowei Jiang,
Xiaoyan Kui,
Youlan Shang,
Li Yang,
Yaonai Wei,
Longtao Yang,
Hao Chen,
Huan Zhao,
Yuxiao Liu,
Ning Zhu,
Yiwei Li,
Yisong Wang,
Jiaqi Yao,
Jiaqi Wang,
Ying Zeng,
Lei He,
Chao Zheng,
Zhixue Zhang,
Ming Li,
Zhengliang Liu
, et al. (17 additional authors not shown)
Abstract:
Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to the current methods under massive data volume, mainly because the style and normativity of radiology reports differ markedly among institutions, body regions inspected and radiologists. Recently, the advent of large language models (LLMs) has offered great potential for recognizing signs of health conditions. To resolve the above problem, we collaborate with the Second Xiangya Hospital in China and propose ChatRadio-Valuer based on the LLM, a tailored model for automatic radiology report generation that learns generalizable representations and provides a basis pattern for model adaptation in sophisticated analysts' cases. Specifically, ChatRadio-Valuer is trained based on the radiology reports from a single institution by means of supervised fine-tuning, and then adapted to disease diagnosis tasks for human multi-system evaluation (i.e., chest, abdomen, muscle-skeleton, head, and maxillofacial & neck) from six different institutions in clinical-level events. The clinical dataset utilized in this study encompasses a total of 332,673 observations. From the comprehensive results on engineering indicators, clinical efficacy and deployment cost metrics, it can be shown that ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4, in terms of disease diagnosis from radiology reports. ChatRadio-Valuer provides an effective avenue to boost model generalization performance and alleviate the annotation workload of experts to enable the promotion of clinical AI applications in radiology reports.
Submitted 9 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Measurement of $e^{+}e^{-}\rightarrowηJ/ψ$ Cross Section from $\sqrt{s}=$ 3.808 GeV to 4.951 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (608 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of 22.42 fb$^{-1}$ collected by the BESIII detector operating at the BEPCII storage ring, we measure the cross sections of the $e^{+}e^{-}\rightarrowηJ/ψ$ process at center-of-mass energies from 3.808 to 4.951 GeV. Three structures are observed in the line shape of the measured cross sections. A maximum-likelihood fit with $ψ(4040)$, two additional resonances, and a non-resonant component is performed. The mass and width of the first additional state are $(4219.7\pm2.5\pm4.5) \rm{MeV}/\rm{c}^2$ and $(80.7\pm4.4\pm1.4) \rm{MeV}$, respectively, consistent with the $ψ(4230)$. For the second state, the mass and width are $(4386\pm13\pm17) \rm{MeV}/\rm{c}^2$ and $(177\pm32\pm13) \rm{MeV}$, respectively, consistent with the $ψ(4360)$. The first uncertainties are statistical and the second ones are systematic. The statistical significance of $ψ(4040)$ is $8.0σ$ and those for $ψ(4230)$ and $ψ(4360)$ are more than $10.0σ$.
Submitted 5 October, 2023;
originally announced October 2023.