arXiv:2307.00777 [pdf, ps, other]

GA-DRL: Graph Neural Network-Augmented Deep Reinforcement Learning for DAG Task Scheduling over Dynamic Vehicular Clouds

Authors: Zhang Liu, Lianfen Huang, Zhibin Gao, Manman Luo, Seyyedali Hosseinalipour, Huaiyu Dai

Abstract: Vehicular clouds (VCs) are modern platforms for processing of computation-intensive tasks over vehicles. Such tasks are often represented as directed acyclic graphs (DAGs) consisting of interdependent vertices/subtasks and directed edges. In this paper, we propose a graph neural network-augmented deep reinforcement learning scheme (GA-DRL) for scheduling DAG tasks over dynamic VCs. In doing so, we first model the VC-assisted DAG task scheduling as a Markov decision process. We then adopt a multi-head graph attention network (GAT) to extract the features of DAG subtasks. Our developed GAT enables a two-way aggregation of the topological information in a DAG task by simultaneously considering predecessors and successors of each subtask. We further introduce non-uniform DAG neighborhood sampling through codifying the scheduling priority of different subtasks, which makes our developed GAT generalizable to completely unseen DAG task topologies. Finally, we augment GAT into a double deep Q-network learning module to conduct subtask-to-vehicle assignment according to the extracted features of subtasks, while considering the dynamics and heterogeneity of the vehicles in VCs. Through simulating various DAG tasks under real-world movement traces of vehicles, we demonstrate that GA-DRL outperforms existing benchmarks in terms of DAG task completion time.

Submitted 3 July, 2023; originally announced July 2023.

Comments: 15 pages, 12 figures, regular journal

arXiv:2306.14079 [pdf, other]

Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching

Authors: H. J. Terry Suh, Glen Chou, Hongkai Dai, Lujie Yang, Abhishek Gupta, Russ Tedrake

Abstract: Gradient-based methods enable efficient search capabilities in high dimensions. However, in order to apply them effectively in offline optimization paradigms such as offline Reinforcement Learning (RL) or Imitation Learning (IL), we require a more careful consideration of how uncertainty estimation interplays with first-order methods that attempt to minimize them. We study smoothed distance to data as an uncertainty metric, and claim that it has two beneficial properties: (i) it allows gradient-based methods that attempt to minimize uncertainty to drive iterates to data as smoothing is annealed, and (ii) it facilitates analysis of model bias with Lipschitz constants. As distance to data can be expensive to compute online, we consider settings where we need to amortize this computation. Instead of learning the distance however, we propose to learn its gradients directly as an oracle for first-order optimizers. We show these gradients can be efficiently learned with score-matching techniques by leveraging the equivalence between distance to data and data likelihood. Using this insight, we propose Score-Guided Planning (SGP), a planning algorithm for offline RL that utilizes score-matching to enable first-order planning in high-dimensional problems, where zeroth-order methods were unable to scale, and ensembles were unable to overcome local minima. Website: https://sites.google.com/view/score-guided-planning/home

Submitted 16 October, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

Comments: Glen Chou, Hongkai Dai, and Lujie Yang contributed equally to this work. Accepted to CoRL 2023

arXiv:2306.11892 [pdf, other]

Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications

Authors: Saed Rezayi, Zhengliang Liu, Zihao Wu, Chandra Dhakal, Bao Ge, Haixing Dai, Gengchen Mai, Ninghao Liu, Chen Zhen, Tianming Liu, Sheng Li

Abstract: This paper explores new frontiers in agricultural natural language processing by investigating the effectiveness of using food-related text corpora for pretraining transformer-based language models. In particular, we focus on the task of semantic matching, which involves establishing mappings between food descriptions and nutrition data. To accomplish this, we fine-tune a pre-trained transformer-based language model, AgriBERT, on this task, utilizing an external source of knowledge, such as the FoodOn ontology. To advance the field of agricultural NLP, we propose two new avenues of exploration: (1) utilizing GPT-based models as a baseline and (2) leveraging ChatGPT as an external source of knowledge. ChatGPT has been shown to be a strong baseline in many NLP tasks, and we believe it has the potential to improve our model in the task of semantic matching and enhance our model's understanding of food-related concepts and relationships. Additionally, we experiment with other applications, such as cuisine prediction based on food ingredients, and expand the scope of our research to include other NLP tasks beyond semantic matching. Overall, this paper provides promising avenues for future research in this field, with potential implications for improving the performance of agricultural NLP applications.

Submitted 20 June, 2023; originally announced June 2023.

arXiv:2306.11730 [pdf, other]

Segment Anything Model (SAM) for Radiation Oncology

Authors: Lian Zhang, Zhengliang Liu, Lu Zhang, Zihao Wu, Xiaowei Yu, Jason Holmes, Hongying Feng, Haixing Dai, Xiang Li, Quanzheng Li, Dajiang Zhu, Tianming Liu, Wei Liu

Abstract: In this study, we evaluate the performance of the Segment Anything Model (SAM) in clinical radiotherapy. Our results indicate that SAM's 'segment anything' mode can achieve clinically acceptable segmentation results in most organs-at-risk (OARs) with Dice scores higher than 0.7. SAM's 'box prompt' mode further improves the Dice scores by 0.1 to 0.5. Considering the size of the organ and the clarity of its boundary, SAM displays better performance for large organs with clear boundaries but performs worse for smaller organs with unclear boundaries. Given that SAM, a model pre-trained purely on natural images, can handle the delineation of OARs from medical images with clinically acceptable accuracy, these results highlight SAM's robust generalization capabilities with consistent accuracy in automatic segmentation for radiotherapy. In other words, SAM can achieve delineation of different OARs at different sites using a generic automatic segmentation model. SAM's generalization capabilities across different disease sites suggest that it is technically feasible to develop a generic model for automatic segmentation in radiotherapy.

Submitted 4 July, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

arXiv:2306.10319 [pdf, ps, other]

doi 10.1103/PhysRevD.108.112012

Precise measurement of the branching fractions of $J/ψ\rightarrow\barΛπ^{+}Σ^{-}+c.c.$ and $J/ψ\rightarrow\barΛπ^{-}Σ^{+}+c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (600 additional authors not shown)

Abstract: Based on a data sample of $(10087\pm44)\times10^6$ $J/ψ$ events collected with the BESIII detector, the branching fraction of $J/ψ\rightarrow\barΛπ^{+}Σ^{-}+c.c.$ is measured to be $(1.221\pm 0.002\pm 0.038)\times10^{-3}$, and the branching fraction of its isospin partner mode $J/ψ\rightarrow\barΛπ^{-}Σ^{+}+c.c.$ is measured to be $(1.244\pm 0.002\pm 0.045)\times10^{-3}$ with improved precision. Here the first uncertainties are statistical and the second ones systematic. The isospin symmetry of the $Σ$ baryon in charmonium hadronic decay and the "$12\%$ rule" are tested, and no violation is found. The potential of using these channels as $Σ$ baryon sources for nuclear physics research is studied, and the momentum and angular distributions of these sources are provided.

Submitted 24 December, 2023; v1 submitted 17 June, 2023; originally announced June 2023.

arXiv:2306.10095 [pdf, other]

AD-AutoGPT: An Autonomous GPT for Alzheimer's Disease Infodemiology

Authors: Haixing Dai, Yiwei Li, Zhengliang Liu, Lin Zhao, Zihao Wu, Suhang Song, Ye Shen, Dajiang Zhu, Xiang Li, Sheng Li, Xiaobai Yao, Lu Shi, Quanzheng Li, Zhuo Chen, Donglan Zhang, Gengchen Mai, Tianming Liu

Abstract: In this pioneering study, inspired by AutoGPT, the state-of-the-art open-source application based on the GPT-4 large language model, we develop a novel tool called AD-AutoGPT which can conduct data collection, processing, and analysis about complex health narratives of Alzheimer's Disease in an autonomous manner via users' textual prompts. We collated comprehensive data from a variety of news sources, including the Alzheimer's Association, BBC, Mayo Clinic, and the National Institute on Aging since June 2022, leading to the autonomous execution of robust trend analyses, intertopic distance maps visualization, and identification of salient terms pertinent to Alzheimer's Disease. This approach has yielded not only a quantifiable metric of relevant discourse but also valuable insights into public focus on Alzheimer's Disease. This application of AD-AutoGPT in public health signifies the transformative potential of AI in facilitating a data-rich understanding of complex health narratives like Alzheimer's Disease in an autonomous manner, setting the groundwork for future AI-driven investigations in global health landscapes.

Submitted 16 June, 2023; originally announced June 2023.

Comments: 20 pages, 4 figures

MSC Class: 68T01; 68T50; 92C50

ACM Class: I.2.7; I.2.1; J.3

arXiv:2306.08937 [pdf, other]

DocumentNet: Bridging the Data Gap in Document Pre-Training

Authors: Lijun Yu, Jin Miao, Xiaoyu Sun, Jiayi Chen, Alexander G. Hauptmann, Hanjun Dai, Wei Wei

Abstract: Document understanding tasks, in particular, Visually-rich Document Entity Retrieval (VDER), have gained significant attention in recent years thanks to their broad applications in enterprise AI. However, publicly available data have been scarce for these tasks due to strict privacy constraints and high annotation costs. To make things worse, the non-overlapping entity spaces from different datasets hinder the knowledge transfer between document types. In this paper, we propose a method to collect massive-scale and weakly labeled data from the web to benefit the training of VDER models. The collected dataset, named DocumentNet, does not depend on specific document types or entity sets, making it universally applicable to all VDER tasks. The current DocumentNet consists of 30M documents spanning nearly 400 document types organized in a four-level ontology. Experiments on a set of broadly adopted VDER tasks show significant improvements when DocumentNet is incorporated into the pre-training for both classic and few-shot learning settings. With the recent emergence of large language models (LLMs), DocumentNet provides a large data source to extend their multi-modal capabilities for VDER.

Submitted 26 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: EMNLP 2023

arXiv:2306.08666 [pdf, other]

Radiology-GPT: A Large Language Model for Radiology

Authors: Zhengliang Liu, Aoxiao Zhong, Yiwei Li, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Peng Shu, Cheng Chen, Sekeun Kim, Haixing Dai, Lin Zhao, Lichao Sun, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Xiang Li, Quanzheng Li, Tianming Liu

Abstract: We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst for future developments in clinical NLP. The successful implementation of Radiology-GPT is indicative of the potential of localizing generative large language models, specifically tailored for distinctive medical specialties, while ensuring adherence to privacy standards such as HIPAA. The prospect of developing individualized, large-scale language models that cater to specific needs of various hospitals presents a promising direction. The fusion of conversational competence and domain-specific knowledge in these models is set to foster future development in healthcare AI. A demo of Radiology-GPT is available at https://huggingface.co/spaces/allen-eric/radiology-gpt.

Submitted 19 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2306.05194 [pdf, ps, other]

doi 10.1103/PhysRevD.108.092003

Precision Measurements of $D_s^+ \to ηe^+ ν_e$ and $D_s^+ \to η^\prime e^+ ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (591 additional authors not shown)

Abstract: Precision measurements of the semileptonic decays $D_s^+ \to ηe^+ ν_e$ and $D_s^+ \to η^\prime e^+ ν_e$ are performed with 7.33\,fb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector. The branching fractions obtained are $\mathcal{B}(D_s^+ \to ηe^{+} ν_e)$ = $(2.255\pm0.039_{\rm stat}\pm 0.051_{\rm syst})\%$ and $\mathcal{B}(D_s^+ \to η^{\prime} e^{+} ν_e)$ = $(0.810\pm0.038_{\rm stat}\pm 0.024_{\rm syst})\%$. Combining these results with the $\mathcal{B}(D^+\toηe^+ ν_e)$ and $\mathcal{B}(D^+\toη^\prime e^+ ν_e)$ obtained from previous BESIII measurements, the $η-η^\prime$ mixing angle in the quark flavor basis is determined to be $φ_{\rm P} = (40.0\pm2.0_{\rm stat}\pm0.6_{\rm syst})^\circ$. Moreover, from the fits to the partial decay rates of $D_s^+ \to ηe^+ ν_e$ and $D_s^+ \to η^\prime e^+ ν_e$, the products of the hadronic transition form factors $f_+^{η^{(\prime)}}(0)$ and the modulus of the $c\to s$ Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ are determined by using different hadronic transition form factor parametrizations. Based on the two-parameter series expansion, the products $f^η_+(0)|V_{cs}| = 0.4519\pm0.0071_{\rm stat}\pm0.0065_{\rm syst}$ and $f^{η^\prime}_+(0)|V_{cs}| = 0.525\pm0.024_{\rm stat}\pm0.009_{\rm syst}$ are extracted. All results determined in this work supersede those measured in the previous BESIII analyses based on the 3.19 fb$^{-1}$ subsample of data at 4.178 GeV.

Submitted 27 October, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

Journal ref: Physical Review D 108, 092003 (2023)

arXiv:2306.04106 [pdf]

Prediction of nanobubble-assisted focused ultrasound-induced blood-brain barrier opening with machine learning

Authors: Wenjing Li, Chenchen Bing, Haixin Dai, Rajiv Chopra, Qian Wang, Bingbing Cheng

Abstract: Novel approaches for predicting the outcomes of blood-brain barrier (BBB) opening with focused ultrasound (FUS) and microbubbles are highly desired. This study aims to explore machine learning-based methods for reliably predicting the FUS-induced BBB opening efficacy and safety. Methods: Sixteen female rats were used in this study. An acoustic feedback-controlled FUS system (f0: 0.5MHz) was used for the BBB opening with the infusion of custom-made nanobubbles/Definity. Evans Blue was injected for the BBB opening efficacy verification and the brain tissue was harvested for the safety assessment. Acoustic emissions were recorded, preprocessed and fed into three machine learning models for BBB opening outcomes prediction. Conventional stable and inertial cavitation dose were also calculated. Results: Among the tested machine learning models, a modified Support Vector Data Description (mSVDD) model achieved the best performance in the BBB opening efficacy and safety prediction with an accuracy of 85.0+/-16.6% and 62.5+/-12.8%, respectively. Conventional stable and inertial cavitation dose-based prediction has a prediction accuracy of 80.0% in efficacy and 34.3% in safety, respectively. The mSVDD model trained with the overall bubble response (0-2 MHz) performed better than that trained with the ultra-harmonic bubble response (0.7-0.8 MHz) in both efficacy prediction (85.0+/-16.6% vs 76.0+/-8.0%, p=0.04) and safety prediction (62.5+/-12.8% vs 55.0+/-10.7%, p>0.05). It is also found that the mSVDD model trained with nanobubble data cannot be directly applied to Definity. Conclusion: Our investigations demonstrated that it is feasible to achieve a reliable prediction of FUS-induced BBB opening outcomes with machine learning and acoustic signals from stimulated nanobubbles. This study provided a new approach for the prediction of FUS-BBB opening outcomes with a clinical translation potential.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2306.02624 [pdf, ps, other]

Study of $Λ_c^+\rightarrow Λμ^+ν_μ$ and Test of Lepton Flavor Universality with $Λ_c^+\rightarrow Λ\ell^+ν_{\ell}$ Decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (604 additional authors not shown)

Abstract: The measurement of the Cabibbo-favored semileptonic decay $Λ_c^+\rightarrow Λμ^+ν_μ$ is reported using $4.5~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at center-of-mass energies ranging from 4.600~GeV to 4.699~GeV. The branching fraction of the decay is measured to be $\mathcal{B}(Λ_c^+\rightarrow Λμ^+ν_μ)=(3.48\pm0.14_{\rm stat.}\pm0.10_{\rm syst.})\%$, three times more precise than the prior world average result. Tests of lepton flavor universality using $Λ_c^+\rightarrow Λ\ell^+ν_{\ell}$ ($\ell=e, μ$) decays are reported for the first time, based on measurements of the differential decay rates and the forward-backward asymmetries in separate four-momentum transfer regions. The results are compatible with Standard Model predictions. Furthermore, we improve the determination of the form-factor parameters in $Λ_c^+\rightarrow Λ\ell^+ν_{\ell}$ decays, which provide stringent tests and calibration for lattice quantum chromodynamics (LQCD) calculations.

Submitted 5 June, 2023; originally announced June 2023.

Comments: 11 pages, 5 figures

arXiv:2306.02049 [pdf, other]

LambdaBeam: Neural Program Search with Higher-Order Functions and Lambdas

Authors: Kensen Shi, Hanjun Dai, Wen-Ding Li, Kevin Ellis, Charles Sutton

Abstract: Search is an important technique in program synthesis that allows for adaptive strategies such as focusing on particular search directions based on execution results. Several prior works have demonstrated that neural models are effective at guiding program synthesis searches. However, a common drawback of those approaches is the inability to handle iterative loops, higher-order functions, or lambda functions, thus limiting prior neural searches from synthesizing longer and more general programs. We address this gap by designing a search algorithm called LambdaBeam that can construct arbitrary lambda functions that compose operations within a given DSL. We create semantic vector representations of the execution behavior of the lambda functions and train a neural policy network to choose which lambdas to construct during search, and pass them as arguments to higher-order functions to perform looping computations. Our experiments show that LambdaBeam outperforms neural, symbolic, and LLM-based techniques in an integer list manipulation domain.

Submitted 28 October, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

arXiv:2306.00739 [pdf, other]

SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

Authors: Ruoxi Sun, Sercan Ö. Arik, Alex Muzio, Lesly Miculicich, Satya Gundabathula, Pengcheng Yin, Hanjun Dai, Hootan Nakhost, Rajarishi Sinha, Zifeng Wang, Tomas Pfister

Abstract: Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data. This paper introduces the SQL-PaLM framework, a comprehensive solution for understanding and enhancing Text-to-SQL using LLMs in the learning regimes of few-shot prompting and instruction fine-tuning. With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error filtering. With instruction fine-tuning, we delve deep in understanding the critical paradigms that influence the performance of tuned LLMs. In particular, we investigate how performance can be improved through expanded training data coverage and diversity, synthetic data augmentation, and integrating query-specific database content. We propose a test-time selection method to further refine accuracy by integrating SQL outputs from multiple paradigms with execution feedback as guidance. Additionally, we tackle the practical challenge of navigating intricate databases with a significant number of tables and columns, proposing efficient techniques for accurately selecting relevant database elements to enhance Text-to-SQL performance. Our holistic approach yields substantial advancements in Text-to-SQL, as demonstrated on two key public benchmarks, Spider and BIRD. Through comprehensive ablations and error analyses, we shed light on the strengths and weaknesses of our framework, offering valuable insights into Text-to-SQL's future work.

Submitted 30 March, 2024; v1 submitted 26 May, 2023; originally announced June 2023.

arXiv:2305.17030 [pdf, other]

doi 10.3847/1538-4365/acfd29

The First LHAASO Catalog of Gamma-Ray Sources

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022. This catalog represents the main result from the most sensitive large coverage gamma-ray survey of the sky above 1 TeV, covering declination from $-$20$^{\circ}$ to 80$^{\circ}$. In total, the catalog contains 90 sources with an extended size smaller than $2^\circ$ and a significance of detection at $> 5σ$. Based on our source association criteria, 32 new TeV sources are proposed in this study. Among the 90 sources, 43 sources are detected with ultra-high energy ($E > 100$ TeV) emission at $> 4σ$ significance level. We provide the position, extension, and spectral characteristics of all the sources in this catalog.

Submitted 27 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 40 pages, 13 figures, 4 tables

Journal ref: The Astrophysical Journal Supplement Series, 271 (2024) 25

arXiv:2305.17010 [pdf, other]

Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets

Authors: Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan

Abstract: Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact algorithms, making them a tempting domain to apply machine learning methods. The highly structured constraints in these problems can hinder either optimization or sampling directly in the solution space. On the other hand, GFlowNets have recently emerged as a powerful machinery to efficiently sample from composite unnormalized densities sequentially and have the potential to amortize such solution-searching processes in CO, as well as generate diverse solution candidates. In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose to train conditional GFlowNets to sample from the solution space. Efficient training techniques are also developed to benefit long-range credit assignment. Through extensive experiments on a variety of different CO tasks with synthetic and realistic data, we demonstrate that GFlowNet policies can efficiently find high-quality solutions. Our implementation is open-sourced at https://github.com/zdhNarsil/GFlowNet-CombOpt.

Submitted 20 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: Accepted by NeurIPS 2023 as spotlight

arXiv:2305.15879 [pdf, other]

Amplitude analysis and branching fraction measurement of the decay $D^{+} \to K_S^0π^+π^0π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (601 additional authors not shown)

Abstract: Using 2.93 $\rm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy 3.773\,GeV, we perform the first amplitude analysis of the decay $D^+\to K_S^0π^+π^0π^0$ and determine the relative magnitudes and phases of different intermediate processes. The absolute branching fraction of $D^+\to K_S^0π^+π^0π^0$ is measured to be $(2.888\pm0.058_{\rm stat.}\pm0.069_{\rm syst.})\%$. The dominant intermediate processes are $D^+\to K_S^0a_1(1260)^+(\to ρ^+π^0)$ and $D^+\to \bar{K}^{*0}ρ^+$, with branching fractions of $(8.66\pm1.04_{\rm stat.}\pm1.39_{\rm syst.})\!\times \!10^{-3}$ and $(9.70\pm0.81_{\rm stat.}\pm0.53_{\rm syst.})\!\times \!10^{-3}$, respectively.

Submitted 5 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.14926 [pdf, other]

Universal Self-Adaptive Prompting

Authors: Xingchen Wan, Ruoxi Sun, Hootan Nakhost, Hanjun Dai, Julian Martin Eisenschlos, Sercan O. Arik, Tomas Pfister

Abstract: A hallmark of modern large language models (LLMs) is their impressive general zero-shot and few-shot abilities, often elicited through in-context learning (ICL) via prompting. However, while highly coveted and being the most general, zero-shot performances in LLMs are still typically weaker due to the lack of guidance and the difficulty of applying existing automatic prompt design methods in general tasks when ground-truth labels are unavailable. In this study, we address this by presenting Universal Self-Adaptive Prompting (USP), an automatic prompt design approach specifically tailored for zero-shot learning (while compatible with few-shot). Requiring only a small amount of unlabeled data and an inference-only LLM, USP is highly versatile: to achieve universal prompting, USP categorizes a possible NLP task into one of the three possible task types and then uses a corresponding selector to select the most suitable queries and zero-shot model-generated responses as pseudo-demonstrations, thereby generalizing ICL to the zero-shot setup in a fully automated way. We evaluate USP with PaLM and PaLM 2 models and demonstrate performances that are considerably stronger than standard zero-shot baselines and often comparable to or even superior to few-shot baselines across more than 40 natural language understanding, natural language generation, and reasoning tasks.

Submitted 20 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: EMNLP 2023 (Main). 10 pages, 5 figures, 4 tables (26 pages, 9 figures and 13 tables including references and appendices)

arXiv:2305.14631 [pdf, other]

Determination of spin and parity of $D^{*}_{(s)}$ mesons

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (598 additional authors not shown)

Abstract: The spin and parity of the charmed mesons $D_{s}^{*+}$, $D^{*0}$ and $D^{*+}$ are determined for the first time to be $J^P=1^{-}$ with significances greater than 10$σ$ over other hypotheses of $2^{+}$ and $3^{-}$, using an $e^+e^-$ collision data sample with an integrated luminosity of 3.19 fb$^{-1}$ collected by the BESIII detector at a center-of-mass energy of 4.178 GeV. Different spin-parity hypotheses for $D_{s}^{*+}$, $D^{*0}$, and $D^{*+}$ mesons are tested via a helicity amplitude analysis of the processes $e^+e^-\to D^{*+}_{s}D^{-}_{s}$, $D^{*0}D^{0}$ and $D^{*+}D^{-}$, with $D^{*+}_{s}\to D^{+}_{s} γ$, $D^{*0}\to D^{0}π^{0}$, and $D^{*+}\to D^{+}π^{0}$. The results confirm the quark model predictions.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.14106 [pdf, other]

Better Zero-Shot Reasoning with Self-Adaptive Prompting

Authors: Xingchen Wan, Ruoxi Sun, Hanjun Dai, Sercan O. Arik, Tomas Pfister

Abstract: Modern large language models (LLMs) have demonstrated impressive capabilities at sophisticated tasks, often through step-by-step reasoning similar to humans. This is made possible by their strong few and zero-shot abilities -- they can effectively learn from a handful of handcrafted, completed responses ("in-context examples"), or are prompted to reason spontaneously through specially designed triggers. Nonetheless, some limitations have been observed. First, performance in the few-shot setting is sensitive to the choice of examples, whose design requires significant human effort. Moreover, given the diverse downstream tasks of LLMs, it may be difficult or laborious to handcraft per-task labels. Second, while the zero-shot setting does not require handcrafting, its performance is limited due to the lack of guidance to the LLMs. To address these limitations, we propose Consistency-based Self-adaptive Prompting (COSP), a novel prompt design method for LLMs. Requiring neither handcrafted responses nor ground-truth labels, COSP selects and builds the set of examples from the LLM zero-shot outputs via carefully designed criteria that combine consistency, diversity and repetition. In the zero-shot setting for three different LLMs, we show that using only LLM predictions, COSP improves performance up to 15% compared to zero-shot baselines and matches or exceeds few-shot baselines for a range of reasoning tasks.

Submitted 23 May, 2023; originally announced May 2023.

Comments: Findings of the Association for Computational Linguistics: ACL 2023. 10 pages, 2 tables, 4 figures (20 pages, 8 tables, 7 figures including references and appendices)

arXiv:2305.13041 [pdf, ps, other]

doi 10.1109/TSP.2023.3282071

Distributed Learning over Networks with Graph-Attention-Based Personalization

Authors: Zhuojun Tian, Zhaoyang Zhang, Zhaohui Yang, Richeng Jin, Huaiyu Dai

Abstract: In conventional distributed learning over a network, multiple agents collaboratively build a common machine learning model. However, due to the underlying non-i.i.d. data distribution among agents, the unified learning model becomes inefficient for each agent to process its locally accessible data. To address this problem, we propose a graph-attention-based personalized training algorithm (GATTA) for distributed deep learning. The GATTA enables each agent to train its local personalized model while exploiting its correlation with neighboring nodes and utilizing their useful information for aggregation. In particular, the personalized model in each agent is composed of a global part and a node-specific part. By treating each agent as one node in a graph and the node-specific parameters as its features, the benefits of the graph attention mechanism can be inherited. Namely, instead of aggregation based on averaging, it learns the specific weights for different neighboring nodes without requiring prior knowledge about the graph structure or the neighboring nodes' data distribution. Furthermore, relying on the weight-learning procedure, we develop a communication-efficient GATTA by skipping the transmission of information with small aggregation weights. Additionally, we theoretically analyze the convergence properties of GATTA for non-convex loss functions. Numerical results validate the excellent performances of the proposed algorithms in terms of convergence and communication cost.

Submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted for publication in IEEE TSP; with supplementary details for the derivations

arXiv:2305.12166 [pdf, other]

Production of doubly-charged $Δ$ baryon in $e^{+}e^{-}$ annihilation at energies from 2.3094 to 2.6464 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (579 additional authors not shown)

Abstract: The processes $e^{+}e^{-} \to Δ^{++}\barΔ^{--}$ and $e^{+}e^{-}\to Δ^{++} \bar{p} π^{-} + c.c.$ are studied for the first time with $179~{\rm pb}^{-1}$ of $e^{+}e^{-}$ annihilation data collected with the BESIII detector at center-of-mass energies from $2.3094$ GeV to $2.6464$ GeV. No significant signal for the $e^{+}e^{-}\to Δ^{++}\barΔ^{--}$ process is observed and the upper limit of the Born cross section is estimated at each energy point. For the process $e^{+}e^{-} \to Δ^{++} \bar{p} π^{-} + c.c.$, a significant signal is observed at center-of-mass energies near 2.6454 GeV and the corresponding Born cross section is reported.

Submitted 14 July, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

Comments: 10 pages, 4 figures

arXiv:2305.11682 [pdf, other]

Search for a scalar partner of the $X(3872)$ via $ψ(3770)$ decays into $γηη'$ and $γπ^{+}π^{-}J/ψ$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (599 additional authors not shown)

Abstract: Using a data sample corresponding to an integrated luminosity of 2.93 fb$^{-1}$ collected at a center-of-mass energy of 3.773~GeV with the BESIII detector at the BEPCII collider, we search for a scalar partner of the $X(3872)$, denoted as $X(3700)$, via $ψ(3770)\to γηη'$ and $γπ^{+}π^{-}J/ψ$ processes. No significant signals are observed and the upper limits of the product branching fractions $ {\cal B}(ψ(3770)\toγX(3700))\cdot {\cal B}(X(3700)\to ηη')$ and ${\cal B}(ψ(3770)\toγX(3700))\cdot {\cal B}(X(3700)\toπ^{+}π^{-}J/ψ)$ are determined at the 90\% confidence level, for the narrow $X(3700)$ with a mass ranging from 3710 to 3740 MeV/$c^2$, which are from 0.8 to 1.8 $(\times 10^{-5})$ and 0.9 to 3.4 $(\times 10^{-5})$, respectively.

Submitted 6 September, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

arXiv:2305.10254 [pdf, other]

SAM for Poultry Science

Authors: Xiao Yang, Haixing Dai, Zihao Wu, Ramesh Bist, Sachin Subedi, Jin Sun, Guoyu Lu, Changying Li, Tianming Liu, Lilong Chai

Abstract: In recent years, the agricultural industry has witnessed significant advancements in artificial intelligence (AI), particularly with the development of large-scale foundational models. Among these foundation models, the Segment Anything Model (SAM), introduced by Meta AI Research, stands out as a groundbreaking solution for object segmentation tasks. While SAM has shown success in various agricultural applications, its potential in the poultry industry, specifically in the context of cage-free hens, remains relatively unexplored. This study aims to assess the zero-shot segmentation performance of SAM on representative chicken segmentation tasks, including part-based segmentation and the use of infrared thermal images, and to explore chicken-tracking tasks by using SAM as a segmentation tool. The results demonstrate SAM's superior performance compared to SegFormer and SETR in both whole and part-based chicken segmentation. SAM-based object tracking also provides valuable data on the behavior and movement patterns of broiler birds. The findings of this study contribute to a better understanding of SAM's potential in poultry science and lay the foundation for future advancements in chicken segmentation and tracking.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2305.09218 [pdf, other]

doi 10.1103/PhysRevD.108.L031106

Tests of $CP$ symmetry in the entangled $Ξ^0-\barΞ^0$ Pairs

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (599 additional authors not shown)

Abstract: The $J/ψ\to Ξ^0 \barΞ^{0}$ process and subsequent decays are investigated using $(10087 \pm 44)\times 10^6$ $J/ψ$ events collected at the BESIII experiment. The decay parameters of $Ξ^0$ and $\barΞ^0$ are measured with greatly improved precision over previous measurements to be $α_Ξ = -0.3750 \pm 0.0034 \pm 0.0016$, $\barα_Ξ = 0.3790 \pm 0.0034 \pm 0.0021$, $φ_Ξ = 0.0051 \pm 0.0096 \pm 0.0018$~rad, $\barφ_Ξ = -0.0053 \pm 0.0097 \pm 0.0019$~rad, where the first and the second uncertainties are statistical and systematic, respectively. From these measurements, precise $CP$ symmetry tests in $Ξ^0$ decay are performed, and $A^Ξ_{CP} = (-5.4 \pm 6.5 \pm 3.1) \times 10^{-3}$ and $Δφ^Ξ_{CP} = (-0.1 \pm 6.9 \pm 0.9) \times 10^{-3}$~rad are consistent with $CP$ conservation. The sequential decay also enables a separation of weak and strong phase differences, which are found for the first time to be $ξ_{P}-ξ_{S} = (0.0 \pm 1.7 \pm 0.2) \times 10^{-2}$~rad and $δ_{P}-δ_{S} = (-1.3 \pm 1.7 \pm 0.4)\times 10^{-2}$~rad, respectively. In addition, we measure the $Λ$ decay parameters and test $CP$ symmetry in $Λ$ decays.

Submitted 14 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.07231 [pdf, ps, other]

doi 10.1103/PhysRevD.108.012006

Search for baryon and lepton number violating decays of $Ξ^{0}$ hyperons

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (591 additional authors not shown)

Abstract: Using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we report the first search for the baryon and lepton number violating decays $Ξ^{0} \rightarrow K^{-} e^{+}$ with $Δ(B-L)=0$ and $Ξ^{0} \rightarrow K^{+} e^{-}$ with $|Δ(B-L)|=2$, where $B$ ($L$) is the baryon (lepton) number. While no signal is observed, the upper limits on the branching fractions of these two decays are set to $\mathcal B(Ξ^{0} \rightarrow K^{-} e^{+})<3.6\times10^{-6}$ and $\mathcal B(Ξ^{0} \rightarrow K^{+} e^{-})<1.9\times10^{-6}$ at the 90\% confidence level, respectively. These results offer a direct probe of baryon number violating interactions involving a strange quark.

Submitted 11 May, 2023; originally announced May 2023.

Comments: 8 pages, 5 figures

Journal ref: Phys. Rev. D 108, 012006 (2023)

arXiv:2305.05372 [pdf, other]

doi 10.1103/PhysRevLett.131.151001

Measurement of ultra-high-energy diffuse gamma-ray emission of the Galactic plane from 10 TeV to 1 PeV with LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The diffuse Galactic $γ$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γ$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer array of the Large High Altitude Air Shower Observatory (LHAASO). Diffuse emissions from the inner ($15^{\circ}<l<125^{\circ}$, $|b|<5^{\circ}$) and outer ($125^{\circ}<l<235^{\circ}$, $|b|<5^{\circ}$) Galactic plane are detected with $29.1σ$ and $12.7σ$ significance, respectively. The outer Galactic plane diffuse emission is detected for the first time in the very- to ultra-high-energy domain ($E>10$~TeV). The energy spectrum in the inner Galaxy regions can be described by a power-law function with an index of $-2.99\pm0.04$, which is different from the curved spectrum as expected from hadronic interactions between locally measured cosmic rays and the line-of-sight integrated gas content. Furthermore, the measured flux is higher by a factor of $\sim3$ than the prediction. A similar spectrum with an index of $-2.99\pm0.07$ is found in the outer Galaxy region, and the absolute flux for $10\lesssim E\lesssim60$ TeV is again higher than the prediction for hadronic cosmic ray interactions. The latitude distributions of the diffuse emission are consistent with the gas distribution, while the longitude distributions show clear deviation from the gas distribution. The LHAASO measurements imply that either additional emission sources exist or cosmic ray intensities have spatial variations.

Submitted 19 August, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: 12 pages, 8 figures, 5 tables; accepted for publication in Physical Review Letters; source mask file provided as ancillary file

Journal ref: Phys. Rev. Lett. 131, 151001 (2023)

arXiv:2305.04568 [pdf, other]

doi 10.1103/PhysRevLett.131.121801

Search for $\barΛ$-$Λ$ oscillations in the decay $J/ψ\to p K^- \barΛ+c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, H. Cai, X. Cai, et al. (437 additional authors not shown)

Abstract: We report the first search for $\barΛ$--$Λ$ oscillations in the decay $J/ψ\to p K^- \barΛ + c.c.$ by analyzing $1.31\times10^9$ $J/ψ$ events accumulated with the BESIII detector at the BEPCII collider. The $J/ψ$ events are produced using $e^+e^-$ collisions at a center of mass energy $\sqrt{s}= 3.097$~GeV. No evidence for hyperon oscillations is observed. The upper limit for the oscillation rate of $\barΛ$ to $Λ$ hyperons is determined to be $\mathcal{P}(Λ)=\frac{\mathcal{B}(J/ψ\to pK^-Λ+c.c.)}{\mathcal{B}(J/ψ\to pK^-\barΛ+c.c.)}<4.4\times10^{-6}$ corresponding to an oscillation parameter $δm_{Λ\barΛ}$ of less than $3.8\times10^{-18}$~GeV at the 90\% confidence level.

Submitted 31 August, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: 7 pages, 1 figure

Journal ref: Phys.Rev.Lett. 131 (2023) 12, 121801

arXiv:2305.03975 [pdf, other]

doi 10.1103/PhysRevD.108.032003

Determination of the $C\!P$-even fraction of $D^0\rightarrow K_S^0π^+π^-π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (598 additional authors not shown)

Abstract: Quantum-correlated $D\bar{D}$ pairs collected by the BESIII experiment at the $ψ(3770)$ resonance, corresponding to an integrated luminosity of 2.93 fb$^{-1}$, are used to study the $D^0 \rightarrow K^{0}_Sπ^{+} π^{-} π^{0}$ decay mode. The $C\!P$-even fraction of $D^0 \rightarrow K^{0}_Sπ^{+} π^{-} π^{0}$ decays is determined to be $0.235\pm 0.010\pm 0.002$, where the first uncertainty is statistical and the second is systematic.

Submitted 6 May, 2023; originally announced May 2023.

arXiv:2305.02162 [pdf, other]

doi 10.1103/PhysRevA.108.012427

Approximate quantum error correction, covariance symmetry, and their relation

Authors: Hao Dai

Abstract: To perform reliable quantum computation, quantum error correction is indispensable. In certain cases, continuous covariance symmetry of the physical system can make exact error correction impossible. In this work we study the approximate error correction and covariance symmetry from the information-theoretic perspective. For general encoding and noise channels, we define a quantity named infidelity to characterize the performance of the approximate quantum error correction and quantify the noncovariance of an encoding channel with respect to a general Lie group from the asymmetry measure of the corresponding Choi state. In particular, when the encoding channel is isometric, we derive a trade-off relation between infidelity and noncovariance. Furthermore, we calculate the average infidelity and noncovariance measure for a type of random code.

Submitted 24 August, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

arXiv:2305.01368 [pdf, other]

Single-shot spatial instability and electric control of polariton condensates at room temperature

Authors: Ying Gao, Xuekai Ma, Xiaokun Zhai, Chunzi Xing, Meini Gao, Haitao Dai, Hao Wu, Tong Liu, Yuan Ren, Xiao Wang, Anlian Pan, Wei Hu, Stefan Schumacher, Tingge Gao

Abstract: In planar microcavities, the transverse-electric and transverse-magnetic (TE-TM) mode splitting of cavity photons arises due to their different penetration into the Bragg mirrors and can result in optical spin-orbit coupling (SOC). In this work, we find that in a liquid crystal (LC) microcavity filled with perovskite microplates, the pronounced TE-TM splitting gives rise to a strong SOC that leads to the spatial instability of microcavity polariton condensates under single-shot excitation. Spatially varying hole burning and mode competition occur between polarization components, leading to different condensate profiles from shot to shot. The single-shot polariton condensates become stable when the SOC vanishes as the TE and TM modes are spectrally well separated from each other, which can be achieved by application of an electric field to our LC microcavity with electrically tunable anisotropy. Our findings are well reproduced and traced back to their physical origin by our detailed numerical simulations. With the electrical manipulation, our work reveals how the shot-to-shot spatial instability of spatial polariton profiles can be engineered in anisotropic microcavities at room temperature, which will benefit the development of stable polariton-based optoelectronic and light-emitting devices.

Submitted 2 May, 2023; originally announced May 2023.

arXiv:2305.00894 [pdf, other]

doi 10.1088/1674-1137/ad597b

Searching for $^{76}$Ge neutrinoless double beta decay with the CDEX-1B experiment

Authors: B. T. Zhang, J. Z. Wang, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, X. Y. Guo, L. He, S. M. He, J. W. Hu, H. X. Huang, T. C. Huang, H. T. Jia, X. Jiang, et al. (60 additional authors not shown)

Abstract: We operated a p-type point contact high purity germanium (PPCGe) detector (CDEX-1B, 1.008 kg) in the China Jinping Underground Laboratory (CJPL) for 500.3 days to search for neutrinoless double beta ($0νββ$) decay of $^{76}$Ge. A total of 504.3 kg$\cdot$day effective exposure data was accumulated. The anti-coincidence and the multi/single-site event (MSE/SSE) discrimination methods were used to suppress the background in the energy region of interest (ROI, 1989$-$2089 keV for this work) with a factor of 23. A background level of 0.33 counts/(keV$\cdot$kg$\cdot$yr) was realized. The lower limit on the half life of $^{76}$Ge $0νββ$ decay was constrained as $T_{1/2}^{0ν}\ > \ {1.0}\times 10^{23}\ \rm yr\ (90\% \ C.L.)$, corresponding to the upper limits on the effective Majorana neutrino mass: $\langle m_{ββ}\rangle < $3.2$-$7.5$\ \mathrm{eV}$.

Submitted 22 September, 2024; v1 submitted 1 May, 2023; originally announced May 2023.

Comments: 11 pages, 12 figures, 2 tables. Version updated to match CPC version

Journal ref: Chin. Phys. C 48, 101001 (2024)

arXiv:2304.14670 [pdf, other]

Prompt Engineering for Healthcare: Methodologies and Applications

Authors: Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, Chenxi Yue, Haiyang Zhang, Yiheng Liu, Yi Pan, Zhengliang Liu, Lichao Sun, Xiang Li, Bao Ge, Xi Jiang, Dajiang Zhu, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

Abstract: Prompt engineering is a critical technique in the field of natural language processing that involves designing and optimizing the prompts used to input information into models, aiming to enhance their performance on specific tasks. With the recent advancements in large language models, prompt engineering has shown significant superiority across various domains and has become increasingly important in the healthcare domain. However, there is a lack of comprehensive reviews specifically focusing on prompt engineering in the medical field. This review introduces the latest advances in prompt engineering in the field of natural language processing for the medical field. First, we outline the development of prompt engineering and emphasize its significant contributions to healthcare natural language processing applications such as question-answering systems, text summarization, and machine translation. With the continuous improvement of general large language models, the importance of prompt engineering in the healthcare domain is becoming increasingly prominent. The aim of this article is to provide useful resources and bridges for healthcare natural language processing researchers to better explore the application of prompt engineering in this field. We hope that this review can provide new ideas and inspiration for research and application in medical natural language processing.

Submitted 23 March, 2024; v1 submitted 28 April, 2023; originally announced April 2023.

arXiv:2304.14655 [pdf, other]

doi 10.1103/PhysRevLett.131.191802

Test of $C\!P$ Symmetry in Hyperon to Neutron Decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (598 additional authors not shown)

Abstract: The quantum entangled $J/ψ\to Σ^{+}\barΣ^{-}$ pairs from $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events taken by the BESIII detector are used to study the non-leptonic two-body weak decays $Σ^{+} \to n π^{+}$ and $\barΣ^{-} \to \bar{n} π^{-}$. The $C\!P$-odd weak decay parameters of the decays $Σ^{+} \to n π^{+}$ ($α_{+}$) and $\barΣ^{-} \to \bar{n} π^{-}$ ($\barα_{-}$) are determined to be $-0.0565\pm0.0047_{\rm stat}\pm0.0022_{\rm syst}$ and $0.0481\pm0.0031_{\rm stat}\pm0.0019_{\rm syst}$, respectively. The decay parameter $\barα_{-}$ is measured for the first time, and the accuracy of $α_{+}$ is improved by a factor of four compared to the previous results. The simultaneously determined decay parameters allow the first precision $C\!P$ symmetry test for any hyperon decay with a neutron in the final state with the measurement of $A_{C\!P}=(α_{+}+\barα_{-})/(α_{+}-\barα_{-})=-0.080\pm0.052_{\rm stat}\pm0.028_{\rm syst}$. Assuming $C\!P$ conservation, the average decay parameter is determined as $\left< α_{+}\right>=(α_{+}- \barα_{-})/2 = -0.0506\pm0.0026_{\rm stat}\pm0.0019_{\rm syst}$, while the ratios $α_{+}/α_{0}$ and $\barα_{-}/\barα_{0}$ are $-0.0490\pm0.0032_{\rm stat}\pm0.0021_{\rm syst}$ and $-0.0571\pm0.0053_{\rm stat}\pm0.0032_{\rm syst}$, where $α_{0}$ and $\barα_{0}$ are the decay parameters of the decays $Σ^{+} \to p π^{0}$ and $\barΣ^{-} \to \bar{p} π^{0}$, respectively.

Submitted 28 April, 2023; originally announced April 2023.

arXiv:2304.13921 [pdf, ps, other]

doi 10.1103/PhysRevLett.130.251902

First study of reaction $Ξ^{0}n\rightarrowΞ^{-}p$ using $Ξ^0$-nucleus scattering at an electron-positron collider

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (593 additional authors not shown)

Abstract: Using $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected with the BESIII detector at the BEPCII storage ring, the process $Ξ^{0}n\rightarrowΞ^{-}p$ is studied, where the $Ξ^0$ baryon is produced in the process $J/ψ\rightarrowΞ^0\barΞ^0$ and the neutron is a component of the $^9\rm{Be}$, $^{12}\rm{C}$ and $^{197}\rm{Au}$ nuclei in the beam pipe. A clear signal is observed with a statistical significance of $7.1σ$. The cross section of the reaction $Ξ^0+{^9\rm{Be}}\rightarrowΞ^-+p+{^8\rm{Be}}$ is determined to be $σ(Ξ^0+{^9\rm{Be}}\rightarrowΞ^-+p+{^8\rm{Be}})=(22.1\pm5.3_{\rm{stat}}\pm4.5_{\rm{sys}})$ mb at the $Ξ^0$ momentum of $0.818$ GeV/$c$, where the first uncertainty is statistical and the second is systematic. No significant $H$-dibaryon signal is observed in the $Ξ^-p$ final state. This is the first study of hyperon-nucleon interactions in electron-positron collisions and opens up a new direction for such research.

Submitted 28 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: 13 pages, 7 figures, with Supplemental Material

arXiv:2304.13446 [pdf, other]

doi 10.1016/j.jsv.2023.117700

An efficient multiple harmonic balance method for computing quasi-periodic responses of nonlinear systems

Authors: Qisi Wang, Zipu Yan, Honghua Dai

Abstract: Quasi-periodic responses composed of multiple base frequencies widely exist in science and engineering problems. The multiple harmonic balance (MHB) method is one of the most commonly used approaches for such problems. However, it is limited to low-order estimations due to complex symbolic operations in practical use. Many variants have been developed to improve the MHB method, among which the time domain MHB-like methods are regarded as crucial improvements because of their high efficiency and simple derivation. But there is still one main drawback remaining to be addressed. The time domain MHB-like methods suffer from non-physical solutions, which have been shown to be caused by aliasing (mixtures of the high-order into the low-order harmonics). Inspired by the collocation-based harmonic balancing framework recently established by our group, we herein propose a reconstruction multiple harmonic balance (RMHB) method to reconstruct the conventional MHB method using discrete time domain collocations. Our study shows that the relation between the MHB and time domain MHB-like methods is determined by an aliasing matrix, which is non-zero when aliasing occurs. On this basis, a conditional equivalence is established to form the RMHB method. Three numerical examples demonstrate that this new method is more robust and efficient than the state-of-the-art methods.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 25 pages, 12 figures, and 5 tables. Accepted manuscript

Journal ref: Journal of Sound and Vibration, Volume 554, 23 June 2023, 117700
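
The aliasing mentioned in the abstract above (high-order harmonics mixing into low-order ones on a discrete time grid) can be illustrated with a minimal single-frequency sketch. This is not the RMHB method or its aliasing matrix; the helper `sample_cos` and the grid size `n = 8` are hypothetical choices made only for illustration.

```python
import math

def sample_cos(k, n):
    """Sample cos(k*t) at n equispaced collocation points on [0, 2*pi)."""
    return [math.cos(k * 2.0 * math.pi * j / n) for j in range(n)]

n = 8                        # number of collocation points (illustrative)
low = sample_cos(1, n)       # first harmonic
high = sample_cos(n - 1, n)  # harmonic n-1, above the Nyquist limit n/2

# On this grid both harmonics produce the same samples, so a scheme that
# only sees collocation values cannot tell them apart: harmonic n-1 is
# aliased onto harmonic 1.
max_diff = max(abs(a - b) for a, b in zip(low, high))
print(max_diff)  # agrees up to floating-point rounding
```

The identity behind this is cos((n-1)·2πj/n) = cos(2πj − 2πj/n) = cos(2πj/n) for every integer j, which is why increasing the harmonic order without enlarging the grid can introduce spurious low-order content.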

arXiv:2304.12533 [pdf, other]

Approximate Optimal Controller Synthesis for Cart-Poles and Quadrotors via Sums-of-Squares

Authors: Lujie Yang, Hongkai Dai, Alexandre Amice, Russ Tedrake

Abstract: Sums-of-squares (SOS) optimization is a promising tool to synthesize certifiable controllers for nonlinear dynamical systems. Building upon prior works, we demonstrate that SOS can synthesize dynamic controllers with bounded suboptimal performance for various underactuated robotic systems by finding good approximations of the value function. We summarize a unified SOS framework to synthesize both under- and over-approximations of the value function for continuous-time, control-affine systems, use these approximations to generate approximate optimal controllers, and perform regional analysis on the closed-loop system driven by these controllers. We then extend the formulation to handle hybrid systems with contacts. We demonstrate that our method can generate tight under- and over-approximations of the value function with low-degree polynomials, which are used to provide stabilizing controllers for continuous-time systems including the inverted pendulum, the cart-pole, and the quadrotor as well as a hybrid system, the planar pusher. To the best of our knowledge, this is the first time that an SOS-based time-invariant controller can swing up and stabilize a cart-pole, and push the planar slider to the desired pose.

Submitted 31 July, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.12159 [pdf, ps, other]

First Experimental Study of the Purely Leptonic Decay $D_s^{*+}\to e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (604 additional authors not shown)

Abstract: Using $7.33~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken with the BESIII detector at the BEPCII collider, we report the first experimental study of the purely leptonic decay $D_s^{*+}\to e^+ν_e$. A signal for the decay $D_s^{*+}\to e^+ν_e$ is observed with a statistical significance of $2.9σ$. The branching fraction of ${D_s^{*+}\to e^+ν_e}$ is measured to be $(2.1{^{+1.2}_{-0.9}}_{\rm stat.}\pm0.2_{\rm syst.})\times 10^{-5}$, corresponding to an upper limit of $4.0\times10^{-5}$ at the 90\% confidence level. Taking the total width of the $D_s^{*+}$~(($0.070\pm0.028$) keV) predicted by lattice quantum chromodynamics as input, the decay constant of the $D^{*+}_s$ is determined to be $f_{D_s^{*+}}=(213.6{^{+61.0}_{-45.8}}_{\rm stat.}\pm43.9_{\rm syst.})$ MeV, corresponding to an upper limit of 353.8 MeV at the 90\% confidence level.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.11567 [pdf, other]

doi 10.2196/48904

Differentiate ChatGPT-generated and Human-written Medical Texts

Authors: Wenxiong Liao, Zhengliang Liu, Haixing Dai, Shaochen Xu, Zihao Wu, Yiyang Zhang, Xiaoke Huang, Dajiang Zhu, Hongmin Cai, Tianming Liu, Xiang Li

Abstract: Background: Large language models such as ChatGPT are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the Internet. However, medical texts such as clinical notes and diagnoses require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to healthcare and the general public. Objective: This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence Generated Content) in medicine. We focus on analyzing the differences between medical texts written by human experts and generated by ChatGPT, and designing machine learning workflows to effectively detect and differentiate medical texts generated by ChatGPT. Methods: We first construct a suite of datasets containing medical texts written by human experts and generated by ChatGPT. In the next step, we analyze the linguistic features of these two types of content and uncover differences in vocabulary, part-of-speech, dependency, sentiment, perplexity, etc. Finally, we design and implement machine learning methods to detect medical text generated by ChatGPT. Results: Medical texts written by humans are more concrete, more diverse, and typically contain more useful information, while medical texts generated by ChatGPT pay more attention to fluency and logic, and usually express general terminologies rather than effective information specific to the context of the problem. A BERT-based model can effectively detect medical texts generated by ChatGPT, and the F1 exceeds 95%.

Submitted 23 April, 2023; originally announced April 2023.

arXiv:2304.11083 [pdf]

Time Reversal Enabled Fiber-Optic Time Synchronization

Authors: Yufeng Chen, Hongfei Dai, Wenlin Li, Fangmin Wang, Bo Wang, Lijun Wang

Abstract: Over the past few decades, fiber-optic time synchronization (FOTS) has provided fundamental support for the efficient operation of modern society. Looking toward the future beyond fifth-generation/sixth-generation (B5G/6G) scenarios and very large radio telescope arrays, developing high-precision, low-complexity and scalable FOTS technology is crucial for building a large-scale time synchronization network. However, the traditional two-way FOTS method needs a data layer to exchange time delay information. This increases the complexity of the system and makes it impossible to realize multiple-access time synchronization. In this paper, a time reversal enabled FOTS method is proposed. It measures the clock difference between two locations without involving a data layer, which can reduce the complexity of the system. Moreover, it can also achieve multiple-access time synchronization along the fiber link. Tests over a 230 km fiber link have been carried out to demonstrate the high performance of the proposed method.

Submitted 14 April, 2023; originally announced April 2023.
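
For context on why the traditional two-way scheme in the abstract above needs a data layer, the following is a minimal sketch of the textbook two-way time-transfer relation, not the time reversal method the paper proposes. It assumes a symmetric link delay; the function name `two_way_offset` and all numeric values are hypothetical and chosen only for illustration.

```python
def two_way_offset(t_a_send, t_b_recv, t_b_send, t_a_recv):
    """Estimate clock offset (B minus A) and one-way delay.

    Assumes the fiber delay is identical in both directions.
    t_a_* are read on clock A, t_b_* on clock B.
    """
    forward = t_b_recv - t_a_send    # equals delay + offset
    backward = t_a_recv - t_b_send   # equals delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay

# Illustrative values in nanoseconds (not taken from the paper).
true_offset_ns = 250
true_delay_ns = 1_150_000

t_a_send = 0
t_b_recv = t_a_send + true_delay_ns + true_offset_ns
t_b_send = t_b_recv + 10_000                      # B replies a little later
t_a_recv = t_b_send - true_offset_ns + true_delay_ns

offset_ns, delay_ns = two_way_offset(t_a_send, t_b_recv, t_b_send, t_a_recv)
print(offset_ns, delay_ns)
```

The computation needs timestamps recorded at both sites, so each site must transmit its readings to the other. That exchange is the data layer whose removal the abstract describes.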

arXiv:2304.10515 [pdf, other]

CP-CNN: Core-Periphery Principle Guided Convolutional Neural Network

Authors: Lin Zhao, Haixing Dai, Zihao Wu, Dajiang Zhu, Tianming Liu

Abstract: The evolution of convolutional neural networks (CNNs) can be largely attributed to the design of their architecture, i.e., the network wiring pattern. Neural architecture search (NAS) advances this by automating the search for the optimal network architecture, but the resulting network instance may not generalize well in different tasks. To overcome this, exploring network design principles that are generalizable across tasks is a more practical solution. In this study, we explore a novel brain-inspired design principle based on the core-periphery property of the human brain network to guide the design of CNNs. Our work draws inspiration from recent studies suggesting that artificial and biological neural networks may have common principles in optimizing network architecture. We implement the core-periphery principle in the design of network wiring patterns and the sparsification of the convolution operation. The resulting core-periphery principle guided CNNs (CP-CNNs) are evaluated on three different datasets. The experiments demonstrate the effectiveness and superiority of CP-CNNs compared to CNNs and ViT-based methods. Overall, our work contributes to the growing field of brain-inspired AI by incorporating insights from the human brain into the design of neural networks.

Submitted 26 March, 2023; originally announced April 2023.

arXiv:2304.09405 [pdf, other]

doi 10.1007/JHEP09(2023)125

Measurement of branching fractions of $Λ_{c}^{+}$ decays to $Σ^{+} K^{+} K^{-}$, $Σ^{+}φ$ and $Σ^{+} K^{+} π^{-}(π^{0})$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (601 additional authors not shown)

Abstract: Based on 4.5 fb$^{-1}$ data taken at seven center-of-mass energies ranging from 4.600 to 4.699 GeV with the BESIII detector at the BEPCII collider, we measure the branching fractions of $Λ_{c}^{+}\rightarrowΣ^{+}+hadrons$ relative to $Λ_{c}^{+}\rightarrow Σ^+ π^+ π^-$. Combining with the world average branching fraction of $Λ_{c}^{+}\rightarrow Σ^+ π^+ π^-$, their branching fractions are measured to be $(0.377\pm0.042\pm0.018\pm0.021)\%$ for $Λ_{c}^{+}\rightarrowΣ^{+} K^{+} K^{-}$, $(0.200\pm0.023\pm0.010\pm0.011)\%$ for $Λ_{c}^{+}\rightarrowΣ^{+} K^{+} π^{-}$, $(0.414\pm0.080\pm0.029\pm0.023)\%$ for $Λ_{c}^{+}\rightarrowΣ^{+}φ$ and $(0.197\pm0.036\pm0.008\pm0.011)\%$ for $Λ_{c}^{+}\rightarrowΣ^{+}K^{+} K^{-}$(non-$φ$). In all the above results, the first uncertainties are statistical, the second are systematic and the third are from external input of the branching fraction of $Λ_{c}^{+}\rightarrow Σ^+ π^+ π^-$. Since no signal for $Λ_{c}^{+}\rightarrowΣ^{+} K^{+} π^{-}π^{0}$ is observed, the upper limit of its branching fraction is determined to be 0.11\% at the 90$\%$ confidence level.

Submitted 30 August, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

Journal ref: JHEP09(2023)125

arXiv:2304.09138 [pdf, other]

Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task

Authors: Zihao Wu, Lu Zhang, Chao Cao, Xiaowei Yu, Haixing Dai, Chong Ma, Zhengliang Liu, Lin Zhao, Gang Li, Wei Liu, Quanzheng Li, Dinggang Shen, Xiang Li, Dajiang Zhu, Tianming Liu

Abstract: Recently, ChatGPT and GPT-4 have emerged and gained immense global attention due to their unparalleled performance in language processing. Despite demonstrating impressive capability in various open-domain tasks, their adequacy in highly specific fields like radiology remains untested. Radiology presents unique linguistic phenomena distinct from open-domain data due to its specificity and complexity. Assessing the performance of large language models (LLMs) in such specific domains is crucial not only for a thorough evaluation of their overall performance but also for providing valuable insights into future model design directions: whether model design should be generic or domain-specific. To this end, in this study, we evaluate the performance of ChatGPT/GPT-4 on a radiology NLI task and compare it to other models fine-tuned specifically on task-related data samples. We also conduct a comprehensive investigation on ChatGPT/GPT-4's reasoning ability by introducing varying levels of inference difficulty. Our results show that 1) GPT-4 outperforms ChatGPT in the radiology NLI task; 2) other specifically fine-tuned models require significant amounts of data samples to achieve comparable performance to ChatGPT/GPT-4. These findings demonstrate that constructing a generic model that is capable of solving various tasks across different domains is feasible.

Submitted 18 April, 2023; originally announced April 2023.

arXiv:2304.07783 [pdf, other]

doi 10.1103/PhysRevD.108.032004

Cross section measurements of $e^+e^- \to ΦK^+ K^-$ and $e^+ e^- \to ΦK_S^0 K_S^0$ at center-of-mass energies between 3.7730 GeV and 4.7008 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (600 additional authors not shown)

Abstract: Based on 22.7 fb$^{-1}$ of $e^+e^-$ annihilation data collected at 33 different center-of-mass energies between 3.7730 GeV and 4.7008 GeV with the BESIII detector at the BEPCII collider, Born cross sections of the two processes $e^+e^-\to φK^+ K^-$ and $e^+ e^- \to φK_{S}^{0} K_{S}^{0}$ are measured for the first time. No indication of resonant production through an intermediate vector state $V$ is observed, and the upper limits on the product of the electronic width $Γ_{e^+e^-}$ and the branching fraction $Br(V\rightarrow φK \bar{K})$ of the processes $e^+e^- \to V \to φK^+ K^-$ and $e^+e^- \to V \to φK_S^0K_S^0$ at the $90\%$ confidence level are obtained for a large parameter space in resonance masses and widths. For the current world average mass and width of the $ψ(4230)$ of $m=4.2187$ GeV$/c^2$ and $Γ=44$ MeV, we set upper limits on the $φK^+ K^-$ and $φK_S^0K_S^0$ final states of 1.75 eV and 0.47 eV at the $90\%$ confidence level, respectively.

Submitted 16 April, 2023; originally announced April 2023.

arXiv:2304.06136 [pdf, other]

AGI for Agriculture

Authors: Guoyu Lu, Sheng Li, Gengchen Mai, Jin Sun, Dajiang Zhu, Lilong Chai, Haijian Sun, Xianqiao Wang, Haixing Dai, Ninghao Liu, Rui Xu, Daniel Petti, Changying Li, Tianming Liu, Changying Li

Abstract: Artificial General Intelligence (AGI) is poised to revolutionize a variety of sectors, including healthcare, finance, transportation, and education. Within healthcare, AGI is being utilized to analyze clinical medical notes, recognize patterns in patient data, and aid in patient management. Agriculture is another critical sector that impacts the lives of individuals worldwide. It serves as a foundation for providing food, fiber, and fuel, yet faces several challenges, such as climate change, soil degradation, water scarcity, and food security. AGI has the potential to tackle these issues by enhancing crop yields, reducing waste, and promoting sustainable farming practices. It can also help farmers make informed decisions by leveraging real-time data, leading to more efficient and effective farm management. This paper delves into the potential future applications of AGI in agriculture, such as agriculture image processing, natural language processing (NLP), robotics, knowledge graphs, and infrastructure, and their impact on precision livestock and precision crops. By leveraging the power of AGI, these emerging technologies can provide farmers with actionable insights, allowing for optimized decision-making and increased productivity. The transformative potential of AGI in agriculture is vast, and this paper aims to highlight its potential to revolutionize the industry.

Submitted 12 April, 2023; originally announced April 2023.

arXiv:2304.03297 [pdf, other]

Neural Operator Learning for Ultrasound Tomography Inversion

Authors: Haocheng Dai, Michael Penwarden, Robert M. Kirby, Sarang Joshi

Abstract: Neural operator learning as a means of mapping between complex function spaces has garnered significant attention in the field of computational science and engineering (CS&E). In this paper, we apply neural operator learning to the time-of-flight ultrasound computed tomography (USCT) problem. We learn the mapping between time-of-flight (TOF) data and the heterogeneous sound speed field using a full-wave solver to generate the training data. This novel application of operator learning circumvents the need to solve the computationally intensive iterative inverse problem. The operator learns the non-linear mapping offline and predicts the heterogeneous sound field with a single forward pass through the model. This is the first time operator learning has been used for ultrasound tomography and is the first step in potential real-time predictions of soft tissue distribution for tumor identification in breast imaging.

Submitted 28 May, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

Comments: 4 pages, 1 figure

arXiv:2304.00179 [pdf]

Shedding Light on Rechargeable Na/Cl$_2$ Battery

Authors: Guanzhou Zhu, Peng Liang, Cheng-Liang Huang, Shu-Chi Wu, Cheng-Chia Huang, Yuan-Yao Li, Shi-Kai Jiang, Wei-Hsiang Huang, Jiachen Li, Feifei Wang, Bing-Joe Hwang, Hongjie Dai

Abstract: Advancing new ideas of rechargeable batteries represents an important path to meeting the ever-increasing energy storage needs. Recently we showed rechargeable sodium/chlorine (Na/Cl$_2$) (or lithium/chlorine Li/Cl$_2$) batteries that used a Na (or Li) metal negative electrode, a microporous amorphous carbon nanosphere (aCNS) positive electrode and an electrolyte containing dissolved AlCl$_3$ and fluoride additives in thionyl chloride (SOCl$_2$). The main battery redox reaction involved conversion between NaCl and Cl$_2$ trapped in the carbon positive electrode, delivering a cyclable capacity of up to 1200 mAh g$^{-1}$ (based on positive electrode mass) at a ~ 3.5 V discharge voltage. Here, we discovered by X-ray photoelectron spectroscopy (XPS) that upon charging a Na/Cl$_2$ battery, chlorination of carbon in the positive electrode occurred to form C-Cl accompanied by molecular Cl$_2$ infiltrating the porous aCNS, consistent with Cl$_2$ probed by mass spectrometry. Synchrotron X-ray diffraction observed the development of graphitic ordering in the initially amorphous aCNS under battery charging when the carbon matrix was oxidized/chlorinated and infiltrated with Cl$_2$. The C-Cl, Cl$_2$ species and graphitic ordering were reversible upon discharge, accompanied by NaCl formation. The results revealed redox conversion between NaCl and Cl$_2$, reversible graphitic ordering/amorphization of carbon through battery charge/discharge, and for the first time probed trapped Cl$_2$ in porous carbon by XPS.

Submitted 31 March, 2023; originally announced April 2023.

Comments: 30 pages, 9 figures

arXiv:2304.00137 [pdf, ps, other]

doi 10.1103/PhysRevD.109.L121101

Measurement of the cosmic p+He energy spectrum from 50 GeV to 0.5 PeV with the DAMPE space mission

Authors: DAMPE Collaboration, F. Alemanno, C. Altomare, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De Benedittis, I. De Mitri, F. de Palma, M. Deliyergiyev, et al. (130 additional authors not shown)

Abstract: Recent observations of the light component of the cosmic-ray spectrum have revealed unexpected features that motivate further and more precise measurements up to the highest energies. The Dark Matter Particle Explorer is a satellite-based cosmic-ray experiment that has been operational since December 2015, continuously collecting data on high-energy cosmic particles with very good statistics, energy resolution, and particle identification capabilities. In this work, the latest measurements of the energy spectrum of proton+helium in the energy range from 46 GeV to 464 TeV are presented. Among the most distinctive features of the spectrum, a spectral hardening at 600 GeV has been observed, along with a softening at 29 TeV measured with a 6.6σ significance. Moreover, the detector features and the analysis approach allowed for the extension of the spectral measurement up to the sub-PeV region. Even if with small statistical significance due to the low number of events, data suggest a new spectral hardening at about 150 TeV.

Submitted 14 August, 2024; v1 submitted 31 March, 2023; originally announced April 2023.

Comments: Published on PRD

arXiv:2303.15935 [pdf, other]

When Brain-inspired AI Meets AGI

Authors: Lin Zhao, Lu Zhang, Zihao Wu, Yuzhong Chen, Haixing Dai, Xiaowei Yu, Zhengliang Liu, Tuo Zhang, Xintao Hu, Xi Jiang, Xiang Li, Dajiang Zhu, Dinggang Shen, Tianming Liu

Abstract: Artificial General Intelligence (AGI) has been a long-standing goal of humanity, with the aim of creating machines capable of performing any intellectual task that humans can do. To achieve this, AGI researchers draw inspiration from the human brain and seek to replicate its principles in intelligent machines. Brain-inspired artificial intelligence is a field that has emerged from this endeavor, combining insights from neuroscience, psychology, and computer science to develop more efficient and powerful AI systems. In this article, we provide a comprehensive overview of brain-inspired AI from the perspective of AGI. We begin with the current progress in brain-inspired AI and its extensive connection with AGI. We then cover the important characteristics for both human intelligence and AGI (e.g., scaling, multimodality, and reasoning). We discuss important technologies toward achieving AGI in current AI systems, such as in-context learning and prompt tuning. We also investigate the evolution of AGI systems from both algorithmic and infrastructural perspectives. Finally, we explore the limitations and future of AGI.

Submitted 28 March, 2023; originally announced March 2023.

arXiv:2303.15569 [pdf, ps, other]

Core-Periphery Principle Guided Redesign of Self-Attention in Transformers

Authors: Xiaowei Yu, Lu Zhang, Haixing Dai, Yanjun Lyu, Lin Zhao, Zihao Wu, David Liu, Tianming Liu, Dajiang Zhu

Abstract: Designing more efficient, reliable, and explainable neural network architectures is critical to studies that are based on artificial intelligence (AI) techniques. Previous studies, by post-hoc analysis, have found that the best-performing ANNs surprisingly resemble biological neural networks (BNNs), which indicates that ANNs and BNNs may share some common principles to achieve optimal performance in either machine learning or cognitive/behavior tasks. Inspired by this phenomenon, we proactively instill organizational principles of BNNs to guide the redesign of ANNs. We leverage the Core-Periphery (CP) organization, which is widely found in human brain networks, to guide the information communication mechanism in the self-attention of vision transformer (ViT) and name this novel framework CP-ViT. In CP-ViT, the attention operation between nodes is defined by a sparse graph with a Core-Periphery structure (CP graph), where the core nodes are redesigned and reorganized to play an integrative role and serve as a center for other periphery nodes to exchange information. We evaluated the proposed CP-ViT on multiple public datasets, including medical image datasets (INbreast) and natural image datasets. Interestingly, by incorporating the BNN-derived principle (CP structure) into the redesign of ViT, our CP-ViT outperforms other state-of-the-art ANNs.
In general, our work advances the state of the art in three aspects: 1) This work provides novel insights for brain-inspired AI: we can utilize the principles found in BNNs to guide and improve our ANN architecture design; 2) We show that there exist sweet spots of CP graphs that lead to CP-ViTs with significantly improved performance; and 3) The core nodes in CP-ViT correspond to task-related meaningful and important image patches, which can significantly enhance the interpretability of the trained deep model.

Submitted 27 March, 2023; originally announced March 2023.

Comments: Core-periphery, functional brain networks, ViT

arXiv:2303.14816 [pdf, other]

Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers

Authors: Zhou Huang, Hang Dai, Tian-Zhu Xiang, Shuo Wang, Huai-Xin Chen, Jie Qin, Huan Xiong

Abstract: Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection. However, they suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders, which are not conducive to camouflaged object detection that explores subtle cues from indistinguishable backgrounds. To address these issues, in this paper, we propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode locality-enhanced neighboring transformer features through progressive shrinking for camouflaged object detection. Specifically, we propose a non-local token enhancement module (NL-TEM) that employs the non-local mechanism to let neighboring tokens interact and to explore graph-based high-order relations within tokens, enhancing the local representations of transformers. Moreover, we design a feature shrinkage decoder (FSD) with adjacent interaction modules (AIM), which progressively aggregates adjacent transformer features through a layer-by-layer shrinkage pyramid to accumulate imperceptible but effective cues as much as possible for object information decoding. Extensive quantitative and qualitative experiments demonstrate that the proposed model significantly outperforms the existing 24 competitors on three challenging COD benchmark datasets under six widely-used evaluation metrics. Our code is publicly available at https://github.com/ZhouHuang23/FSPNet.

Submitted 26 March, 2023; originally announced March 2023.

Comments: CVPR 2023. Project webpage at: https://tzxiang.github.io/project/COD-FSPNet/index.html

Showing 251–300 of 1,242 results for author: Dai, H