doi 10.1109/ITSC55140.2022.9922539

TJ4DRadSet: A 4D Radar Dataset for Autonomous Driving

Authors: Lianqing Zheng, Zhixiong Ma, Xichan Zhu, Bin Tan, Sen Li, Kai Long, Weiqi Sun, Sihan Chen, Lu Zhang, Mengyue Wan, Libo Huang, Jie Bai

Abstract: The next-generation high-resolution automotive radar (4D radar) can provide additional elevation measurement and denser point clouds, which has great potential for 3D sensing in autonomous driving. In this paper, we introduce a dataset named TJ4DRadSet with 4D radar points for autonomous driving research. The dataset was collected in various driving scenarios, with a total of 7757 synchronized frames in 44 consecutive sequences, which are well annotated with 3D bounding boxes and track ids. We provide a 4D radar-based 3D object detection baseline for our dataset to demonstrate the effectiveness of deep learning methods for 4D radar point clouds. The dataset can be accessed via the following link: https://github.com/TJRadarLab/TJ4DRadSet.

Submitted 27 July, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: 2022 IEEE International Intelligent Transportation Systems Conference (ITSC 2022)

arXiv:2204.12847 [pdf, other]

Query2Particles: Knowledge Graph Reasoning with Particle Embeddings

Authors: Jiaxin Bai, Zihao Wang, Hongming Zhang, Yangqiu Song

Abstract: Answering complex logical queries on incomplete knowledge graphs (KGs) with missing edges is a fundamental and important task for knowledge graph reasoning. The query embedding method is proposed to answer these queries by jointly encoding queries and entities to the same embedding space. Then the answer entities are selected according to the similarities between the entity embeddings and the query embedding. As the answers to a complex query are obtained from a combination of logical operations over sub-queries, the embeddings of the answer entities may not always follow a uni-modal distribution in the embedding space. Thus, it is challenging to simultaneously retrieve a set of diverse answers from the embedding space using a single and concentrated query representation such as a vector or a hyper-rectangle. To better cope with queries with diversified answers, we propose Query2Particles (Q2P), a complex KG query answering method. Q2P encodes each query into multiple vectors, named particle embeddings. By doing so, the candidate answers can be retrieved from different areas over the embedding space using the maximal similarities between the entity embeddings and any of the particle embeddings. Meanwhile, the corresponding neural logic operations are defined to support its reasoning over arbitrary first-order logic queries. The experiments show that Query2Particles achieves state-of-the-art performance on the complex query answering tasks on FB15k, FB15K-237, and NELL knowledge graphs.

Submitted 27 April, 2022; originally announced April 2022.

Comments: Findings of NAACL-2022
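
Note: the retrieval rule described in the abstract above (score each entity by its maximal similarity to any particle embedding) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot product as the similarity measure, the toy 2-D embeddings, and the function names `particle_score` and `rank_entities` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product, used here as an assumed similarity measure."""
    return sum(a * b for a, b in zip(u, v))


def particle_score(entity: Sequence[float],
                   particles: List[Sequence[float]]) -> float:
    """Score an entity by its best match against any particle embedding."""
    return max(dot(entity, p) for p in particles)


def rank_entities(entities: Dict[str, Sequence[float]],
                  particles: List[Sequence[float]],
                  k: int) -> List[str]:
    """Return the names of the k highest-scoring entities (ties keep input order)."""
    ranked = sorted(entities,
                    key=lambda name: -particle_score(entities[name], particles))
    return ranked[:k]


# Hypothetical toy data: two particles pointing at different regions of the space.
particles = [[1.0, 0.0], [0.0, 1.0]]
entities = {
    "a": [1.0, 0.0],    # close to the first particle
    "b": [0.0, 1.0],    # close to the second particle
    "c": [-1.0, -1.0],  # far from both
    "d": [0.5, 0.5],    # moderately close to both
}
top_two = rank_entities(entities, particles, k=2)
```

Because the score takes a maximum over particles, entities "a" and "b" both rank highly even though they lie in different regions, which a single query vector could not achieve with these toy values.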

arXiv:2204.03963 [pdf]

Mixing and flow transition in an optimized electrokinetic turbulent micromixer

Authors: Keyi Nan, Yanxia Shi, Tianyun Zhao, Xiaowei Tang, Yueqiang Zhu, Kaige Wang, Jintao Bai, Wei Zhao

Abstract: The micromixer is a key element in lab-on-a-chip devices, with broad applications in the analysis and measurement of chemistry and engineering. Previous investigations reported that electrokinetic (EK) turbulence could be realized in a Y-type micromixer with a cross-sectional dimension on the order of 100 $μ$m. Although ultrafast turbulent mixing can be generated at a bulk flow Reynolds number of O(1), the micromixer has not been optimized. In this investigation, we systematically investigated the influence of electric field intensity, AC frequency, electric conductivity ratio, and channel width at the entrance on the mixing effect and transition electric Rayleigh number in the "Y" type electrokinetic micromixer. We find that optimal mixing is realized in a 350 $μ$m wide micromixer, under a 100 kHz, $1.14\times10^{5}$ V/m AC electric field, with an electric conductivity ratio of 1:3000. Under these conditions, a maximum degree of mixedness of 0.93 can be achieved at 84 $μ$m from the entrance and 100 ms. A further investigation of the critical electric field and the critical electric Rayleigh number indicates that the most unstable condition of EK flow instability is inconsistent with that of optimal mixing in EK turbulence. To predict the evolution of EK flow under high $Ra_{e}$, it is necessary to apply a computational turbulence model instead of linear instability analysis.

Submitted 11 April, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: 9 pages, 7 figures

arXiv:2204.00993 [pdf, other]

Improving Vision Transformers by Revisiting High-frequency Components

Authors: Jiawang Bai, Li Yuan, Shu-Tao Xia, Shuicheng Yan, Zhifeng Li, Wei Liu

Abstract: Transformer models have shown promising effectiveness in dealing with various vision tasks. However, compared with training Convolutional Neural Network (CNN) models, training Vision Transformer (ViT) models is more difficult and relies on a large-scale training set. To explain this observation, we hypothesize that ViT models are less effective in capturing the high-frequency components of images than CNN models, and verify it by a frequency analysis. Inspired by this finding, we first investigate the effects of existing techniques for improving ViT models from a new frequency perspective, and find that the success of some techniques (e.g., RandAugment) can be attributed to better usage of the high-frequency components. Then, to compensate for this insufficient ability of ViT models, we propose HAT, which directly augments high-frequency components of images via adversarial training. We show that HAT can consistently boost the performance of various ViT models (e.g., +1.2% for ViT-B, +0.5% for Swin-B), and in particular raises the advanced model VOLO-D5 to 87.3% using only ImageNet-1K data. The superiority is also maintained on out-of-distribution data and transfers to downstream tasks. The code is available at: https://github.com/jiawangbai/HAT.

Submitted 27 July, 2022; v1 submitted 3 April, 2022; originally announced April 2022.

Comments: Accepted to ECCV2022; Code: https://github.com/jiawangbai/HAT

arXiv:2203.15134 [pdf, other]

doi 10.1093/mnras/stac918

Detection of strong scattering close to the eclipse region of PSR B1957+20

Authors: J. T. Bai, S. Dai, Q. J. Zhi, W. A. Coles, D. Li, W. W. Zhu, G. Hobbs, G. J. Qiao, N. Wang, J. P. Yuan, M. D. Filipovic, J. B. Wang, Z. C. Pan, L. H. Shang, S. J. Dang, S. Q. Wang, C. C. Miao

Abstract: We present the first measurement of pulse scattering close to the eclipse region of PSR B1957+20, which is in a compact binary system with a low-mass star. We measured pulse scattering time-scales up to 0.2 ms close to the eclipse and showed that it scales with the dispersion measure (DM) excess roughly as $τ \propto Δ{\rm DM}^{2}$. Our observations provide the first evidence of strong scattering due to multi-path propagation effects in the eclipsing material. We show that Kolmogorov turbulence in the eclipsing material with an inner scale of $\sim100$ m and an outer scale of the size of the eclipse region can naturally explain the observation. Our results show that the eclipsing material in such systems can be highly turbulent and suggest that scattering is one of the main eclipsing mechanisms at around 1.4 GHz.

Submitted 28 March, 2022; originally announced March 2022.

Comments: 8 pages, 4 figures, MNRAS accepted

arXiv:2203.08350 [pdf, other]

doi 10.1109/TCDS.2022.3222350

A Squeeze-and-Excitation and Transformer based Cross-task System for Environmental Sound Recognition

Authors: Jisheng Bai, Jianfeng Chen, Mou Wang, Muhammad Saad Ayub

Abstract: Environmental sound recognition (ESR) is an emerging research topic in audio pattern recognition. Many tasks are presented to resort to computational models for ESR in real-life applications. However, current models are usually designed for individual tasks, and are not robust and applicable to other tasks. Cross-task models, which promote unified knowledge modeling across various tasks, have not been thoroughly investigated. In this article, we propose a cross-task model for three different tasks of ESR: 1) acoustic scene classification; 2) urban sound tagging; and 3) anomalous sound detection. An architecture named SE-Trans is presented that uses attention mechanism-based Squeeze-and-Excitation and Transformer encoder modules to learn the channelwise relationship and temporal dependencies of the acoustic features. FMix is employed as the data augmentation method that improves the performance of ESR. Evaluations for the three tasks are conducted on the recent databases of detection and classification of acoustic scenes and event challenges. The experimental results show that the proposed cross-task model achieves state-of-the-art performance on all tasks. Further analysis demonstrates that the proposed cross-task model can effectively utilize acoustic knowledge across different ESR tasks.

Submitted 21 November, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

arXiv:2203.07570 [pdf, other]

doi 10.3847/1538-4357/ac6109

A quasar shedding its dust cocoon at redshift 2

Authors: Weimin Yi, W. N. Brandt, Q. Ni, Luis C. Ho, Bin Luo, Wei Yan, D. P. Schneider, Jeremiah D. Paul, Richard M. Plotkin, Jinyi Yang, Feige Wang, Zhicheng He, Chen Chen, Xue-Bing Wu, Jin-Ming Bai

Abstract: We present the first near-IR spectroscopy and joint analyses of multi-wavelength observations for SDSS J082747.14+425241.1, a dust-reddened, weak broad emission-line quasar (WLQ) undergoing a remarkable broad absorption line (BAL) transformation. The systemic redshift is more precisely measured to be $z=2.070\pm0.001$ using H$β$ compared to $z=2.040\pm0.003$ using Mg II from the literature, signifying an extreme Mg II blueshift of $2140\pm530$ km s$^{-1}$ relative to H$β$. Using the H$β$-based single-epoch scaling relation with a systematic uncertainty of 0.3 dex, its black hole (BH) mass and Eddington ratio are estimated to be $M_{\rm BH}\sim6.1\times10^8M_\odot$ and $λ_{\rm Edd}\sim0.71$, indicative of being in a rapidly accreting phase. Our investigations confirm the WLQ nature and the LoBAL$\rightarrow$HiBAL transformation, along with a factor of 2 increase in the Mg II+Fe II emission strength and a decrease of 0.1 in $E(B-V)$ over two decades. The kinetic power of this LoBAL wind at $R\sim$15 pc from its BH is estimated to be $\sim$43\% of the Eddington luminosity, sufficient for quasar feedback upon its host galaxy albeit with an order-of-magnitude uncertainty. This quasar provides a clear example of the long-sought scenario where LoBAL quasars are surrounded by dust cocoons, and wide-angle nuclear winds play a key role in the transition for red quasars evolving into the commonly seen blue quasars.

Submitted 14 March, 2022; originally announced March 2022.

Comments: 11 pages, 6 figures, accepted for publication in ApJ

arXiv:2203.04084 [pdf]

The Effects of Transverse Inclination on Aeroelastic Cantilever Prisms: Phenomenology, Unsteady Force, and the Base Intensification Phenomenon

Authors: Zengshun Chen, Jie Bai, Cruz Y Li, Yemeng Xu, Jianmin Hua, Xuanyi Xue

Abstract: The transverse inclination is a probable scenario when inclined structures experience an inflow of altered attack angles. This work investigates the effects of transverse inclination on an aeroelastic prism through forced-vibration wind tunnel experiments. The aerodynamic characteristics are tri-parametrically evaluated under different wind speeds, inclination angles, and oscillation amplitudes. Results show that transverse inclination fundamentally changes the wake phenomenology by impinging the fixed-end horseshoe vortex and breaking the separation symmetry. The aftermath is a bi-polar, once-for-all change in the aerodynamics near the prism base. The suppression of the horseshoe vortex unleashes the Kármán vortex, which significantly increases the unsteady crosswind force. After the initial morphology switch, the aerodynamics become independent of inclination angle and oscillation amplitude and depend solely on wind speed. The structure's upper portion does not feel the effect, so this phenomenon is called Base Intensification. The phenomenon only projects notable impacts on the low-speed and VIV regime and is indifferent in the high-speed, quasi-steady Galloping regime. In practice, Base Intensification will disrupt the pedestrian-level wind environment through the unleashed Bénard-Kármán vortex shedding, making it erratic and gusty. Moreover, it increases the aerodynamic load at a structure base by as much as 4.3 times. Since fixed-end stiffness prevents elastic dissipation, the load translates to massive stress, making detection trickier and failures, if they are to occur, more sudden, extreme, and without any warnings. The 4.3-time amplification also surpasses the safety factor of many standard designs, so transverse inclination deserves engineering attention.

Submitted 14 February, 2022; originally announced March 2022.

Comments: 49 pages, 20 figures, under review at Journal of Wind Engineering and Industrial Aerodynamics

arXiv:2203.00328 [pdf, other]

BERT-LID: Leveraging BERT to Improve Spoken Language Identification

Authors: Yuting Nie, Junhong Zhao, Wei-Qiang Zhang, Jinfeng Bai

Abstract: Language identification is the task of automatically determining the identity of a language conveyed by a spoken segment. It has a profound impact on the multilingual interoperability of an intelligent speech system. Although language identification attains high accuracy on medium or long utterances (>3s), the performance on short utterances (<=1s) is still far from satisfactory. We propose a BERT-based language identification system (BERT-LID) to improve language identification performance, especially on short-duration speech segments. We extend the original BERT model by taking the phonetic posteriorgrams (PPG) derived from the front-end phone recognizer as input. We then deploy the optimal deep classifier after it for language identification. Our BERT-LID model can improve the baseline accuracy by about 6.5% on long-segment identification and 19.9% on short-segment identification, demonstrating the effectiveness of BERT-LID for language identification.

Submitted 11 October, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

Comments: accepted by ISCSLP 2022

arXiv:2202.13694 [pdf, ps, other]

Quotients of Palindromic and Antipalindromic Numbers

Authors: James Haoyu Bai, Joseph Meleshko, Samin Riasat, Jeffrey Shallit

Abstract: A natural number N is said to be palindromic if its binary representation reads the same forwards and backwards. In this paper we study the quotients of two palindromic numbers and answer some basic questions about the resulting sets of integers and rational numbers. For example, we show that the following problem is algorithmically decidable: given an integer N, determine if we can write N = A/B for palindromic numbers A and B. Given that N is representable, we find a bound on the size of the numerator of the smallest representation. We prove that the set of unrepresentable integers has positive density in N. We also obtain similar results for quotients of antipalindromic numbers (those for which the first half of the binary representation is the reverse complement of the second half). We also provide examples, numerical data, and a number of intriguing conjectures and open problems.

Submitted 28 February, 2022; originally announced February 2022.
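
Note: the definitions in the abstract above can be tried out with a small brute-force sketch. It is not the paper's decision procedure: the function names and the search limit `max_denominator` are assumptions made for illustration, and a failed bounded search does not by itself prove that N is unrepresentable (the bound proved in the paper is not reproduced here).

```python
from typing import Optional, Tuple


def is_binary_palindrome(n: int) -> bool:
    """True if the binary representation of n reads the same both ways."""
    bits = bin(n)[2:]  # strip the '0b' prefix
    return bits == bits[::-1]


def find_palindromic_quotient(
    n: int, max_denominator: int = 1 << 12
) -> Optional[Tuple[int, int]]:
    """Search for palindromic A, B with n == A / B, trying B = 1, 2, ...

    Returns (A, B) for the smallest B that works, or None if no B up to
    max_denominator (an arbitrary illustrative limit) works.
    """
    for b in range(1, max_denominator + 1):
        if is_binary_palindrome(b) and is_binary_palindrome(n * b):
            return n * b, b
    return None


three = find_palindromic_quotient(3)    # 3 is 11 in binary, so B = 1 works
eleven = find_palindromic_quotient(11)  # 11 is 1011; 33 = 100001 and 3 = 11
two = find_palindromic_quotient(2)      # no representation is found
```

Every positive binary palindrome is odd (its leading and trailing bits are both 1), so for even N the product N·B is even and can never be palindromic; the search therefore finds nothing for N = 2 at any limit.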

arXiv:2202.08949 [pdf, ps, other]

doi 10.3847/1538-4357/ac559b

Measuring the Virial Factor in SDSS DR5 Quasars with Redshifted H$β$ and Fe ii Broad Emission Lines

Authors: H. T. Liu, Hai-Cheng Feng, Sha-Sha Li, J. M. Bai

Abstract: Under the hypothesis of gravitational redshift induced by the central supermassive black hole, and based on line widths and shifts of redward shifted H$β$ and Fe ii broad emission lines for a sample of 1973 $z<0.8$ SDSS DR5 quasars, we measured the virial factor in determining supermassive black hole masses, usually estimated by the reverberation mapping (RM) method or the relevant secondary methods. The virial factor had been believed to be from the geometric effect of the broad-line region. The measured virial factor of Fe ii is larger than that of H$β$ for 98% of these quasars. The virial factor is very different from object to object and for different emission lines. For most of these quasars, the virial factor of H$β$ is larger than the averages that were usually used in determining the masses of black holes. There are three positive correlations among the measured virial factor of H$β$, dimensionless accretion rate and Fe ii/H$β$ line ratio. A positive three-dimensional correlation is found among these three quantities, and this correlation indicates that the virial factor is likely dominated by the dimensionless accretion rate and metallicity. A negative correlation is found between the redward shift of H$β$ and the scaled size of the broad-line region radius in units of the gravitational radius of the black hole. This negative correlation is expected naturally if the redward shift of H$β$ is mainly from the gravity of the black hole. Radiation pressure from the accretion disk may be a significant contributor to the virial factor.

Submitted 17 February, 2022; originally announced February 2022.

Comments: 22 pages, 9 figures, 2 tables, accepted for publication in ApJ

Journal ref: 2022, ApJ, 928, 60

arXiv:2202.06300 [pdf, other]

doi 10.1007/s11432-022-3576-9

Deep Graph Learning for Spatially-Varying Indoor Lighting Prediction

Authors: Jiayang Bai, Jie Guo, Chenchen Wan, Zhenyu Chen, Zhen He, Shan Yang, Piaopiao Yu, Yan Zhang, Yanwen Guo

Abstract: Lighting prediction from a single image is becoming increasingly important in many vision and augmented reality (AR) applications in which shading and shadow consistency between virtual and real objects should be guaranteed. However, this is a notoriously ill-posed problem, especially for indoor scenarios, because of the complexity of indoor luminaires and the limited information involved in 2D images. In this paper, we propose a graph learning-based framework for indoor lighting estimation. At its core is a new lighting model (dubbed DSGLight) based on depth-augmented Spherical Gaussians (SG) and a Graph Convolutional Network (GCN) that infers the new lighting representation from a single LDR image of limited field-of-view. Our lighting model builds 128 evenly distributed SGs over the indoor panorama, with each SG encoding the lighting and the depth around that node. The proposed GCN then learns the mapping from the input image to DSGLight. Compared with existing lighting models, our DSGLight encodes both direct lighting and indirect environmental lighting more faithfully and compactly. It also makes network training and inference more stable. The estimated depth distribution enables temporally stable shading and shadows under spatially-varying lighting. Through thorough experiments, we show that our method clearly outperforms existing methods both qualitatively and quantitatively.

Submitted 13 February, 2022; originally announced February 2022.

arXiv:2202.02796 [pdf, other]

GLPanoDepth: Global-to-Local Panoramic Depth Estimation

Authors: Jiayang Bai, Shuichang Lai, Haoyu Qin, Jie Guo, Yanwen Guo

Abstract: In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field-of-view, providing much more complete descriptions of the scene than perspective images. However, fully-convolutional networks that most current solutions rely on fail to capture rich global contexts from the panorama. To address this issue and also the distortion of equirectangular projection in the panorama, we propose Cubemap Vision Transformers (CViT), a new transformer-based architecture that can model long-range dependencies and extract distortion-free global features from the panorama. We show that cubemap vision transformers have a global receptive field at every stage and can provide globally coherent predictions for spherical signals. To preserve important local features, we further design a convolution-based branch in our pipeline (dubbed GLPanoDepth) and fuse global features from cubemap vision transformers at multiple scales. This global-to-local strategy allows us to fully exploit useful global and local features in the panorama, achieving state-of-the-art performance in panoramic depth estimation.

Submitted 8 February, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

arXiv:2201.10797 [pdf, other]

An Automated Question-Answering Framework Based on Evolution Algorithm

Authors: Sinan Tan, Hui Xue, Qiyu Ren, Huaping Liu, Jing Bai

Abstract: Building a deep learning model for a Question-Answering (QA) task requires a lot of human effort; it may take several months to carefully tune various model architectures and find the best one. It is even harder to find different excellent models for multiple datasets. Recent works show that the best model structure is related to the dataset used, and a single model cannot adapt to all tasks. In this paper, we propose an automated Question-Answering framework, which can automatically adjust the network architecture for multiple datasets. Our framework is based on an innovative evolution algorithm, which is stable and suitable for the multiple-dataset scenario. The evolution algorithm for search combines prior knowledge into the initial population and uses a performance estimator to avoid inefficient mutation by predicting the performance of candidate model architectures. The prior knowledge used in the initial population can improve the final result of the evolution algorithm. The performance estimator can quickly filter out models with bad performance in the population as the number of trials increases, to speed up the convergence. Our framework achieves 78.9 EM and 86.1 F1 on SQuAD 1.1, and 69.9 EM and 72.5 F1 on SQuAD 2.0. On the NewsQA dataset, the found model achieves 47.0 EM and 62.9 F1.

Submitted 26 January, 2022; originally announced January 2022.

Comments: In Proceedings of the AAAI 2019 Workshop (WS13) on Reasoning and Complex Question-Answering (RCQA-19) https://researcher.watson.ibm.com/researcher/view_group.php?id=9632

arXiv:2201.10068 [pdf, other]

doi 10.1063/5.0086357

Continuous-frequency weak electric field measurement with Rydberg atoms

Authors: Jinlian Hu, Huaqiang Li, Rong Song, Jingxu Bai, Yuechun Jiao, Jianming Zhao, Suotang Jia

Abstract: We demonstrate a continuous-frequency electric field measurement based on the far off-resonant AC Stark effect in a Rydberg atomic vapor cell. In this configuration, a strong far off-resonant field, denoted as a local oscillator (LO) field, acts as a gain, shifting the Rydberg level to a high-sensitivity region. An incident weak signal field with a few hundred kHz difference from the LO field is mixed with the LO field in the Rydberg system to generate an intermediate frequency (IF) signal, which is read out by Rydberg electromagnetically induced transparency (Rydberg-EIT) spectroscopy. Unlike resonant EIT-AT spectra, we realize the electric field measurement of the signal frequency from 2 GHz to 5 GHz using a single Rydberg state. The minimum detectable field strength is down to 2.31~$μ$V/cm and the linear dynamic range is over 65~dB. The minimum detectable field is comparable with that of a resonant microwave-dressed Rydberg heterodyne receiver using the same system, which is 1.45~$μ$V/cm. We also show that the system has an inherent polarization selectivity feature. Our method can provide a high sensitivity of electric field measurement and be extended to arbitrary frequency measurements.

Submitted 24 January, 2022; originally announced January 2022.

Comments: 5 pages, 4 figures

Journal ref: Appl. Phys. Lett. 121, 014002 (2022);

arXiv:2201.00370 [pdf, ps, other]

Decoding Nonbinary LDPC Codes via Proximal-ADMM Approach (include convergence proofs)

Authors: Yongchao Wang, Jing Bai

Abstract: In this paper, we focus on decoding nonbinary low-density parity-check (LDPC) codes in Galois fields of characteristic two via the proximal alternating direction method of multipliers (proximal-ADMM). By exploiting Flanagan/Constant-Weighting embedding techniques and the decomposition technique based on three-variables parity-check equations, two efficient proximal-ADMM decoders for nonbinary LDPC codes are proposed. We show that both of them are theoretically guaranteed to converge to some stationary point of the decoding model, and that the computational complexity of each in every proximal-ADMM iteration scales linearly with the LDPC code's length and the size of the considered Galois field. Moreover, the decoder based on the Constant-Weight embedding technique satisfies the favorable property of codeword symmetry. Simulation results demonstrate their effectiveness in comparison with state-of-the-art LDPC decoders.

Submitted 2 January, 2022; originally announced January 2022.

arXiv:2201.00016 [pdf, other]

TransLog: A Unified Transformer-based Framework for Log Anomaly Detection

Authors: Hongcheng Guo, Xingyu Lin, Jian Yang, Yi Zhuang, Jiaqi Bai, Tieqiao Zheng, Bo Zhang, Zhoujun Li

Abstract: Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps). Given log data from varied domains, retraining the whole network for each unknown domain is inefficient in real industrial scenarios, especially for low-resource domains. However, previous deep models merely focused on extracting the semantics of log sequences within the same domain, leading to poor generalization on multi-domain logs. Therefore, we propose a unified Transformer-based framework for log anomaly detection (TransLog), which comprises a pretraining stage and an adapter-based tuning stage. Our model is first pretrained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer the pretrained model to the target domain via adapter-based tuning. The proposed method is evaluated on three public datasets, including one source domain and two target domains. The experimental results demonstrate that our simple yet efficient approach, with fewer trainable parameters and lower training costs in the target domain, achieves state-of-the-art performance on three benchmarks.

Submitted 16 January, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

Comments: 6 pages

arXiv:2112.15015 [pdf, other]

Measuring and Sampling: A Metric-guided Subgraph Learning Framework for Graph Neural Network

Authors: Jiyang Bai, Yuxiang Ren, Jiawei Zhang

Abstract: Graph neural networks (GNNs) have shown convincing performance in learning powerful node representations that preserve both node attributes and graph structural information. However, many GNNs encounter problems in effectiveness and efficiency when they are designed with a deeper network structure or handle large graphs. Several sampling algorithms have been proposed for improving and accelerating the training of GNNs, yet they do not examine the source of the GNN performance gain. Measuring the information within graph data can help sampling algorithms keep high-value information while removing redundant information and even noise. In this paper, we propose a Metric-Guided (MeGuide) subgraph learning framework for GNNs. MeGuide employs two novel metrics, Feature Smoothness and Connection Failure Distance, to guide subgraph sampling and mini-batch based training. Feature Smoothness is designed for analyzing node features in order to retain the most valuable information, while Connection Failure Distance measures structural information to control the size of subgraphs. We demonstrate the effectiveness and efficiency of MeGuide in training various GNNs on multiple datasets.

Submitted 30 December, 2021; originally announced December 2021.

arXiv:2112.13197 [pdf, other]

doi 10.1145/3488560.3498524

Learning Multi-granularity User Intent Unit for Session-based Recommendation

Authors: Jiayan Guo, Yaming Yang, Xiangchen Song, Yuan Zhang, Yujing Wang, Jing Bai, Yan Zhang

Abstract: Session-based recommendation aims to predict a user's next action based on previous actions in the current session. The major challenge is to capture authentic and complete user preferences in the entire session. Recent work utilizes a graph structure to represent the entire session and adopts Graph Neural Networks to encode session information. This modeling choice has proved effective and achieved remarkable results. However, most existing studies only consider each item within the session independently and do not capture session semantics from a high-level perspective. This limitation often leads to severe information loss and increases the difficulty of capturing long-range dependencies within a session. Intuitively, compared with individual items, a session snippet, i.e., a group of locally consecutive items, can provide supplemental user intents that are hardly captured by existing methods. In this work, we propose to learn multi-granularity consecutive user intent units to improve recommendation performance. Specifically, we propose the Multi-granularity Intent Heterogeneous Session Graph, which captures the interactions between intent units of different granularity and relieves the burden of long dependencies. Moreover, we propose the Intent Fusion Ranking (IFR) module to compose the recommendation results from user intents of various granularity. Compared with current methods that only leverage intents from individual items, IFR benefits from user intents of different granularity to generate a more accurate and comprehensive session representation, thus eventually boosting recommendation performance. We conduct extensive experiments on five session-based recommendation datasets, and the results demonstrate the effectiveness of our method.

Submitted 10 January, 2022; v1 submitted 25 December, 2021; originally announced December 2021.
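
As a rough illustration of the "session snippet" notion in the abstract above (a group of locally consecutive items), the Python sketch below enumerates consecutive item groups at several granularities. The function name `intent_units`, the `max_granularity` parameter, and the sliding-window scheme are assumptions made for this illustration only; the paper's actual construction of intent units may differ.

```python
from typing import Dict, List, Sequence, Tuple


def intent_units(session: Sequence[str],
                 max_granularity: int = 3) -> Dict[int, List[Tuple[str, ...]]]:
    """Group a session into consecutive snippets of length 1..max_granularity.

    Hypothetical helper: a plain sliding window, not the authors' method.
    """
    units: Dict[int, List[Tuple[str, ...]]] = {}
    for k in range(1, max_granularity + 1):
        # Slide a window of size k over the session; each window is one snippet.
        units[k] = [tuple(session[i:i + k]) for i in range(len(session) - k + 1)]
    return units


# Example session of four clicked items (made-up item names).
session = ["phone", "case", "charger", "cable"]
units = intent_units(session, max_granularity=2)
```

Here `units[1]` holds the individual items and `units[2]` holds the pairs of adjacent items, so coarser granularities expose longer-range context with fewer units.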

arXiv:2112.10332 [pdf, ps, other]

Active Reconfigurable Intelligent Surface Aided Secure Transmission

Authors: Limeng Dong, Hui-Ming Wang, Jiale Bai

Abstract: The Reconfigurable Intelligent Surface (RIS) has drawn great attention in academia and industry due to its passive, low-power nature, and has been used in physical layer security to enhance secure transmission. However, due to the double fading effect on the reflecting channel link between transmitter and user, RIS achieves only a limited secrecy performance gain compared with the case without RIS. In this correspondence, we propose a novel active RIS design to enhance secure wireless transmission, where the reflecting elements in the RIS not only adjust the phase shift but also amplify the amplitude of signals. To solve the non-convex secrecy rate optimization based on this design, an efficient alternating optimization algorithm is proposed to jointly optimize the beamformer at the transmitter and the reflecting coefficient matrix at the RIS. Simulation results show that, with the aid of the active RIS design, the impact of the double fading effect can be effectively relieved, resulting in a significantly higher secrecy performance gain compared with existing solutions with passive RIS and without RIS.

Submitted 19 December, 2021; originally announced December 2021.

Comments: Accepted by IEEE TVT

arXiv:2112.09574 [pdf]

Super-resolution reconstruction of cytoskeleton image based on A-net deep learning network

Authors: Qian Chen, Haoxin Bai, Bingchen Che, Tianyun Zhao, Ce Zhang, Kaige Wang, Jintao Bai, Wei Zhao

Abstract: To date, live-cell imaging at the nanometer scale remains challenging. Even though super-resolution microscopy methods have enabled visualization of subcellular structures below the optical resolution limit, the spatial resolution is still far from sufficient for the structural reconstruction of biomolecules in vivo (e.g., the ~24 nm thickness of a microtubule fiber). In this study, we propose an A-net network and show that the resolution of cytoskeleton images captured by a confocal microscope can be significantly improved by combining the A-net deep learning network with the DWDC algorithm based on a degradation model. Utilizing the DWDC algorithm to construct new datasets and taking advantage of the A-net neural network's features (i.e., considerably fewer layers), we successfully removed the noise and flocculent structures that originally interfered with the cellular structure in the raw image, and improved the spatial resolution by 10 times using a relatively small dataset. We therefore conclude that the proposed algorithm, which combines the A-net neural network with the DWDC method, is a suitable and universal approach for extracting structural details of biomolecules, cells and organs from low-resolution images.

Submitted 17 December, 2021; originally announced December 2021.

Comments: The manuscript has 17 pages, 10 figures and 58 references

arXiv:2112.00976 [pdf, other]

Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification

Authors: Junwen Bai, Shufeng Kong, Carla P. Gomes

Abstract: Multi-label classification (MLC) is a prediction task where each sample can have more than one label. We propose a novel contrastive learning boosted multi-label prediction model based on a Gaussian mixture variational autoencoder (C-GMVAE), which learns a multimodal prior space and employs a contrastive loss. Many existing methods introduce extra complex neural modules like graph neural networks to capture the label correlations, in addition to the prediction modules. We find that by using contrastive learning in the supervised setting, we can exploit label information effectively in a data-driven manner, and learn meaningful feature and label embeddings which capture the label correlations and enhance the predictive power. Our method also adopts the idea of learning and aligning latent spaces for both features and labels. In contrast to previous works based on a unimodal prior, C-GMVAE imposes a Gaussian mixture structure on the latent space, to alleviate the posterior collapse and over-regularization issues. C-GMVAE outperforms existing methods on multiple public datasets and can often match other models' full performance with only 50% of the training data. Furthermore, we show that the learnt embeddings provide insights into the interpretation of label-label interactions.

Submitted 9 June, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

Comments: Accepted to ICML 2022

arXiv:2111.08900 [pdf, other]

A GNN-RNN Approach for Harnessing Geospatial and Temporal Information: Application to Crop Yield Prediction

Authors: Joshua Fan, Junwen Bai, Zhiyun Li, Ariel Ortiz-Bobea, Carla P. Gomes

Abstract: Climate change is posing new challenges to crop-related concerns including food insecurity, supply stability and economic planning. As one of the central challenges, crop yield prediction has become a pressing task in the machine learning field. Despite its importance, the prediction task is exceptionally complicated since crop yields depend on various factors such as weather, land surface, and soil quality, as well as their interactions. In recent years, machine learning models have been successfully applied in this domain. However, these models either restrict their tasks to a relatively small region, or study only one or a few years, which makes them hard to generalize spatially and temporally. In this paper, we introduce a novel graph-based recurrent neural network for crop yield prediction, which incorporates both geographical and temporal knowledge in the model and further boosts predictive power. Our method is trained, validated, and tested on over 2000 counties from 41 states in the US mainland, covering years from 1981 to 2019. As far as we know, this is the first machine learning method that embeds geographical knowledge in crop yield prediction and predicts crop yields at county level nationwide. We also lay a solid foundation for comparison with other machine learning baselines by applying well-known linear models, tree-based models, and deep learning methods and comparing their performance. Experiments show that our proposed method consistently outperforms the existing state-of-the-art methods on various metrics, validating the effectiveness of geospatial and temporal information.

Submitted 21 January, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

Comments: Fixed typo. 14 pages, 9 figures, accepted at AAAI-22 Social Impact Track

arXiv:2111.08137 [pdf, other]

Joint Unsupervised and Supervised Training for Multilingual ASR

Authors: Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath

Abstract: Self-supervised training has shown promising gains in pretraining models and facilitating downstream finetuning for speech recognition tasks such as multilingual ASR. Most existing methods adopt a 2-stage scheme where the self-supervised loss is optimized in the first pretraining stage, and the standard supervised finetuning resumes in the second stage. In this paper, we propose an end-to-end (E2E) Joint Unsupervised and Supervised Training (JUST) method to combine the supervised RNN-T loss and the self-supervised contrastive and masked language modeling (MLM) losses. We validate its performance on the public dataset Multilingual LibriSpeech (MLS), which includes 8 languages and is extremely imbalanced. On MLS, we explore (1) JUST trained from scratch, and (2) JUST finetuned from a pretrained checkpoint. Experiments show that JUST can consistently outperform other existing state-of-the-art methods, and beat the monolingual baseline by a significant margin, demonstrating JUST's capability of handling low-resource languages in multilingual ASR. Our average WER over all languages outperforms the average monolingual baseline by 33.3%, and the state-of-the-art 2-stage XLSR by 32%. On low-resource languages like Polish, our WER is less than half of the monolingual baseline and even beats the supervised transfer learning method which uses external supervision.

Submitted 15 November, 2021; originally announced November 2021.
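
To make the contrast with the 2-stage scheme concrete, the Python sketch below combines the three losses named in the abstract above into one objective that is optimized in a single stage. The weighted-sum form, the function name `joint_loss`, and the weights `w_contrastive` and `w_mlm` are assumptions for illustration; the abstract does not specify how the losses are combined.

```python
def joint_loss(rnnt_loss: float,
               contrastive_loss: float,
               mlm_loss: float,
               w_contrastive: float = 1.0,
               w_mlm: float = 1.0) -> float:
    """Single training objective mixing supervised and self-supervised terms.

    Assumed form: a weighted sum. The weights are hypothetical hyperparameters.
    """
    return rnnt_loss + w_contrastive * contrastive_loss + w_mlm * mlm_loss


# Made-up per-batch loss values, used only to exercise the function.
total = joint_loss(rnnt_loss=2.0, contrastive_loss=0.5, mlm_loss=0.25)
```

In a 2-stage scheme, only the self-supervised terms would be optimized first and only the supervised term afterwards; here all three contribute to every update.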

arXiv:2111.00636 [pdf, ps, other]

doi 10.3847/1538-3881/ac1ce7

A SysTematic seaRch fOr Dual Agns in meRgINg Galaxies (ASTRO-DARING) III: results from the SDSS spectroscopic surveys

Authors: Yang-wei Zhang, Yang Huang, Jin-ming Bai, Xiao-wei Liu, Jian-guo Wang, Xiao-bo Dong

Abstract: As the third installment in a series systematically searching for dual active galactic nuclei (AGNs) amongst merging galaxies, we present the results of 20 dual AGNs found by using the SDSS fiber spectra. To reduce the flux contamination from both the fiber aperture and seeing effects, the angular separation of the two cores in our merging galaxy pairs sample is restricted to be larger than 3 arcsec. By careful analysis of the emission lines, 20 dual AGNs are identified from 61 merging galaxies with both of their cores observed by the SDSS spectroscopic surveys. 15 of them are identified for the first time. The identification efficiency is about 32.79% (20/61), comparable to our former results (16 dual AGNs identified from 41 merging galaxies) based on long-slit spectroscopy. Interestingly, two of the 20 dual AGNs show two prominent cores in radio images, and their radio powers identify them as radio-excess AGNs. So far, 31 dual AGNs have been found by our project, and this is currently the largest dual AGN sample constructed with a consistent approach. This sample, together with more candidates from ongoing observations, is of vital importance for studying AGN physics and the coevolution between supermassive black holes and their host galaxies.

Submitted 8 November, 2021; v1 submitted 31 October, 2021; originally announced November 2021.

Comments: 14 pages, 26 figures, accepted by AJ
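
The identification efficiencies quoted in the abstract above can be checked with a few lines of Python. The function name `identification_efficiency` is a hypothetical helper; the counts (20 of 61 for the SDSS sample, 16 of 41 for the earlier long-slit sample) are taken from the abstract.

```python
def identification_efficiency(confirmed: int, observed: int) -> float:
    """Percentage of observed merging galaxies confirmed as dual AGNs."""
    return 100.0 * confirmed / observed


sdss = identification_efficiency(20, 61)       # SDSS fiber spectra sample
long_slit = identification_efficiency(16, 41)  # earlier long-slit sample
print(f"SDSS: {sdss:.2f}%, long-slit: {long_slit:.2f}%")
```

The two rates differ by roughly six percentage points, which is the sense in which the abstract calls them comparable.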

arXiv:2111.00635 [pdf, ps, other]

doi 10.3847/1538-3881/ac2deb

A SysTematic seaRch fOr Dual Agns in meRgINg Galaxies (ASTRO-DARING) II: first results from long-slit spectroscopic observations

Authors: Yang-wei Zhang, Yang Huang, Jin-ming Bai, Xiao-wei Liu, Jian-guo Wang, Xiao-bo Dong

Abstract: Building a large sample of kiloparsec (kpc)-scale dual active galactic nuclei (AGNs) amongst merging galaxies is of vital importance for understanding the co-evolution between host galaxies and their central supermassive black holes (SMBHs). To build such a sample, we have developed an innovative method of systematically searching for and identifying dual AGNs amongst kpc-scale merging galaxies and selected 222 candidates at redshifts $\leqslant$ 0.25. All the selected candidates have FIRST radio detection and at least one of the two cores previously revealed as an AGN spectroscopically. We report the first results from A SysTematic seaRch fOr Dual Agns in meRgINg Galaxies (ASTRO-DARING), which consist of spatially resolved long-slit spectroscopic observations of 41 targets selected from our merging galaxies sample, carried out between November 2014 and February 2017 using the Yunnan Faint Object Spectrograph and Camera (YFOSC) mounted on the 2.4 meter telescope in Lijiang of Yunnan Observatories. Of these, 16 are likely dual AGNs, 15 of which are newly identified. The efficiency of ASTRO-DARING is thus nearly 40 per cent. With this method, we plan to build the first-ever sample of more than 50 dual AGNs constructed using a consistent approach. Further analysis of the dual AGN sample shall provide vital clues for understanding the co-evolution of galaxies and SMBHs.

Submitted 8 November, 2021; v1 submitted 31 October, 2021; originally announced November 2021.

Comments: 30 pages, 61 figures, accepted by AJ

arXiv:2110.12091 [pdf, other]

Contrastively Disentangled Sequential Variational Autoencoder

Authors: Junwen Bai, Weiran Wang, Carla Gomes

Abstract: Self-supervised disentangled representation learning is a critical task in sequence modeling. The learnt representations contribute to better model interpretability as well as data generation, and improve the sample efficiency for downstream tasks. We propose a novel sequence representation learning method, named Contrastively Disentangled Sequential Variational Autoencoder (C-DSVAE), to extract and separate the static (time-invariant) and dynamic (time-variant) factors in the latent space. Different from previous sequential variational autoencoder methods, we use a novel evidence lower bound which maximizes the mutual information between the input and the latent factors, while penalizing the mutual information between the static and dynamic factors. We leverage contrastive estimations of the mutual information terms in training, together with simple yet effective augmentation techniques, to introduce additional inductive biases. Our experiments show that C-DSVAE significantly outperforms the previous state-of-the-art methods on multiple metrics.

Submitted 22 October, 2021; originally announced October 2021.

Comments: Accepted by NeurIPS 2021

arXiv:2110.08752 [pdf, other]

doi 10.3847/1538-4357/ac323f

SN 2015bq: A Luminous Type Ia Supernova with Early Flux Excess

Authors: Liping Li, Jujia Zhang, Benzhong Dai, Wenxiong Li, Xiaofeng Wang, Qian Zhai, Jinming Bai

Abstract: We present optical and ultraviolet (UV) observations of a luminous type Ia supernova (SN Ia), SN 2015bq, characterized by an early flux excess. This SN reaches a B-band absolute magnitude of $M_B = -19.68 \pm 0.41$ mag and a peak bolometric luminosity of $L = (1.75 \pm 0.37) \times 10^{43}$ erg s$^{-1}$, with a relatively small post-maximum decline rate [$Δm_{15}(B) = 0.82 \pm 0.05$ mag]. The flux excess observed in the light curves of SN 2015bq a few days after the explosion, seen especially in the UV bands, might be due to the radioactive decay of $^{56}$Ni mixed into the surface. The radiation from the decay of the surface $^{56}$Ni heats the outer layer of this SN. It produces a blue $U-B$ color followed by monotonic reddening in the early phase, dominant iron-group lines, and weak absorption features of intermediate-mass elements in the early spectra. The scenario of enhanced $^{56}$Ni in the surface is consistent with a large amount of $^{56}$Ni ($M_{ \rm ^{56}{\rm Ni}}$ = 0.97 $\pm 0.20$ $M_{\odot}$) synthesized during the explosion. The properties of SN 2015bq are found to lie between those of SN 1991T and SN 1999aa, suggesting the latter two subclasses of SNe Ia may have a common origin.

Submitted 17 October, 2021; originally announced October 2021.

Comments: 18 pages, 12 figures, accepted for publication in ApJ

arXiv:2110.03888 [pdf, other]

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining

Authors: Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang

Abstract: Recent rapid developments in deep learning algorithms, distributed training, and even hardware design for large models have enabled training extreme-scale models, such as GPT-3 and Switch Transformer, possessing hundreds of billions or even trillions of parameters. However, under limited resources, extreme-scale model training, which requires enormous amounts of compute and memory, suffers from frustratingly low efficiency in model convergence. In this paper, we propose a simple training strategy called "Pseudo-to-Real" for large models with a high memory footprint. Pseudo-to-Real is compatible with large models built from sequential layers. We demonstrate a practice of pretraining an unprecedented 10-trillion-parameter model, an order of magnitude larger than the state-of-the-art, on solely 512 GPUs within 10 days. Besides demonstrating the application of Pseudo-to-Real, we also provide a technique, Granular CPU offloading, to manage CPU memory for training large models and maintain high GPU utilization. Fast training of extreme-scale models on a decent amount of resources can yield a much smaller carbon footprint and contribute to greener AI.

Submitted 25 October, 2021; v1 submitted 8 October, 2021; originally announced October 2021.

Comments: 14 pages, 4 figures

arXiv:2110.03484 [pdf, other]

Creating Training Sets via Weak Indirect Supervision

Authors: Jieyu Zhang, Bohan Wang, Xiangchen Song, Yujing Wang, Yaming Yang, Jing Bai, Alexander Ratner

Abstract: Creating labeled training sets has become one of the major roadblocks in machine learning. To address this, recent Weak Supervision (WS) frameworks synthesize training labels from multiple potentially noisy supervision sources. However, existing frameworks are restricted to supervision sources that share the same output space as the target task. To extend the scope of usable sources, we formulate Weak Indirect Supervision (WIS), a new research problem for automatically synthesizing training labels based on indirect supervision sources that have different output label spaces. To overcome the challenge of mismatched output spaces, we develop a probabilistic modeling approach, PLRM, which uses user-provided label relations to model and leverage indirect supervision sources. Moreover, we provide a theoretically-principled test of the distinguishability of PLRM for unseen labels, along with a generalization bound. On both image and text classification tasks as well as an industrial advertising application, we demonstrate the advantages of PLRM by outperforming baselines by a margin of 2%-9%.

Submitted 14 March, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: ICLR 2022

arXiv:2110.02059 [pdf, other]

doi 10.1145/3459637.3482279

Multi-Relational Graph based Heterogeneous Multi-Task Learning in Community Question Answering

Authors: Zizheng Lin, Haowen Ke, Ngo-Yin Wong, Jiaxin Bai, Yangqiu Song, Huan Zhao, Junpeng Ye

Abstract: Various data mining tasks have been proposed to study Community Question Answering (CQA) platforms like Stack Overflow. The relatedness between some of these tasks provides useful learning signals to each other via Multi-Task Learning (MTL). However, due to the high heterogeneity of these tasks, few existing works manage to jointly solve them in a unified framework. To tackle this challenge, we develop a multi-relational graph based MTL model called Heterogeneous Multi-Task Graph Isomorphism Network (HMTGIN) which efficiently solves heterogeneous CQA tasks. In each training forward pass, HMTGIN embeds the input CQA forum graph by an extension of Graph Isomorphism Network and skip connections. The embeddings are then shared across all task-specific output layers to compute respective losses. Moreover, two cross-task constraints based on the domain knowledge about tasks' relationships are used to regularize the joint learning. In the evaluation, the embeddings are shared among different task-specific output layers to make corresponding predictions. To the best of our knowledge, HMTGIN is the first MTL model capable of tackling CQA tasks from the aspect of multi-relational graphs. To evaluate HMTGIN's effectiveness, we build a novel large-scale multi-relational graph CQA dataset with over two million nodes from Stack Overflow. Extensive experiments show that: (1) HMTGIN is superior to all baselines on five tasks; (2) the proposed MTL strategy and cross-task constraints have substantial advantages.

Submitted 3 September, 2021; originally announced October 2021.

Comments: Full paper of CIKM 2021

arXiv:2110.00973 [pdf, other]

Graph Pointer Neural Networks

Authors: Tianmeng Yang, Yujing Wang, Zhihan Yue, Yaming Yang, Yunhai Tong, Jing Bai

Abstract: Graph Neural Networks (GNNs) have shown advantages in various graph-based applications. Most existing GNNs assume strong homophily of graph structure and apply permutation-invariant local aggregation of neighbors to learn a representation for each node. However, they fail to generalize to heterophilic graphs, where most neighboring nodes have different labels or features, and the relevant nodes are distant. A few recent studies attempt to address this problem by combining multiple hops of hidden representations of central nodes (i.e., multi-hop-based approaches) or sorting the neighboring nodes based on attention scores (i.e., ranking-based approaches). However, these approaches have some apparent limitations. On the one hand, multi-hop-based approaches do not explicitly distinguish relevant nodes from a large number of multi-hop neighborhoods, leading to a severe over-smoothing problem. On the other hand, ranking-based models do not jointly optimize node ranking with end tasks and result in sub-optimal solutions. In this work, we present Graph Pointer Neural Networks (GPNN) to tackle the challenges mentioned above. We leverage a pointer network to select the most relevant nodes from a large number of multi-hop neighborhoods, which constructs an ordered sequence according to the relationship with the central node. 1D convolution is then applied to extract high-level features from the node sequence. The pointer-network-based ranker in GPNN is jointly optimized with the other parts in an end-to-end manner. Extensive experiments are conducted on six public node classification datasets with heterophilic graphs. The results show that GPNN significantly improves on the classification performance of state-of-the-art methods. In addition, analyses also reveal the advantage of the proposed GPNN in filtering out irrelevant neighbors and reducing over-smoothing.

Submitted 3 January, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

arXiv:2109.12296 [pdf, other]

Jointly Learning to Repair Code and Generate Commit Message

Authors: Jiaqi Bai, Long Zhou, Ambrosio Blanco, Shujie Liu, Furu Wei, Ming Zhou, Zhoujun Li

Abstract: We propose a novel task of jointly repairing program code and generating commit messages. Code repair and commit message generation are two essential and related tasks for software development. However, existing work usually performs the two tasks independently. We construct a multilingual triple dataset including buggy code, fixed code, and commit messages for this novel task. We provide cascaded models as baselines, which are enhanced with different training approaches, including the teacher-student method, the multi-task method, and the back-translation method. To deal with the error propagation problem of the cascaded method, a joint model is proposed that can both repair the code and generate the commit message in a unified framework. Experimental results show that the enhanced cascaded model with the teacher-student method and the multi-task learning method achieves the best score on different metrics of automated code repair, and the joint model performs better than the cascaded model on commit message generation.

Submitted 25 September, 2021; originally announced September 2021.

Comments: Accepted to the 2021 Conference on Empirical Methods in Natural Language Processing

arXiv:2109.08868 [pdf, other]

Backdoor Attack on Hash-based Image Retrieval via Clean-label Data Poisoning

Authors: Kuofeng Gao, Jiawang Bai, Bin Chen, Dongxian Wu, Shu-Tao Xia

Abstract: A backdoored deep hashing model is expected to behave normally on original query images and return the images with the target label when a specific trigger pattern is present. To this end, we propose the confusing perturbations-induced backdoor attack (CIBA). It injects a small number of poisoned images with the correct label into the training data, which makes the attack hard to detect. To craft the poisoned images, we first propose the confusing perturbations to disturb the hashing code learning. As such, the hashing model can learn more about the trigger. The confusing perturbations are imperceptible and generated by optimizing the intra-class dispersion and inter-class shift in the Hamming space. We then employ the targeted adversarial patch as the backdoor trigger to improve the attack performance. We have conducted extensive experiments to verify the effectiveness of our proposed CIBA. Our code is available at https://github.com/KuofengGao/CIBA.

Submitted 2 September, 2023; v1 submitted 18 September, 2021; originally announced September 2021.

Comments: Accepted by BMVC 2023

arXiv:2109.03773 [pdf, other]

Approximate Factor Models with Weaker Loadings

Authors: Jushan Bai, Serena Ng

Abstract: Pervasive cross-section dependence is increasingly recognized as a characteristic of economic data and the approximate factor model provides a useful framework for analysis. Assuming a strong factor structure where $\Lambda^{0\prime}\Lambda^{0}/N^α$ is positive definite in the limit when $α=1$, early work established convergence of the principal component estimates of the factors and loadings up to a rotation matrix. This paper shows that the estimates are still consistent and asymptotically normal when $α\in(0,1]$, albeit at slower rates and under additional assumptions on the sample size. The results hold whether $α$ is constant or varies across factor loadings. The framework developed for heterogeneous loadings and the simplified proofs, which can also be used in strong factor analysis, are of independent interest.

Submitted 12 February, 2023; v1 submitted 8 September, 2021; originally announced September 2021.
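
Illustrative note on the entry above: the principal component estimates mentioned in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code. It covers only the textbook strong-factor case ($α=1$) with the common normalization $F'F/T = I_r$; the function name `pc_factors`, the simulated panel, and all parameter values are hypothetical.

```python
import numpy as np

def pc_factors(X, r):
    """Principal component estimates for a T x N panel X with r factors.

    Returns F_hat (T x r), normalized so that F_hat' F_hat / T = I_r,
    and the loadings L_hat (N x r) obtained by least squares given F_hat.
    """
    T, _ = X.shape
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    F_hat = np.sqrt(T) * U[:, :r]   # scaled left singular vectors
    L_hat = X.T @ F_hat / T         # regression of X on F_hat
    return F_hat, L_hat

# Hypothetical simulated panel: X = F L' + small idiosyncratic noise.
rng = np.random.default_rng(0)
T, N, r = 200, 100, 2
F = rng.standard_normal((T, r))
L = rng.standard_normal((N, r))
X = F @ L.T + 0.1 * rng.standard_normal((T, N))

F_hat, L_hat = pc_factors(X, r)

# Factors and loadings are identified only up to a rotation, so compare
# the common component F L', which is invariant to that rotation.
C = F @ L.T
C_hat = F_hat @ L_hat.T
rel_err = np.linalg.norm(C_hat - C) / np.linalg.norm(C)
```

Because the factors are recovered only up to a rotation matrix, the sketch checks the common component rather than `F_hat` itself.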

arXiv:2109.02046 [pdf, ps, other]

Attentive Knowledge-aware Graph Convolutional Networks with Collaborative Guidance for Personalized Recommendation

Authors: Yankai Chen, Yaming Yang, Yujing Wang, Jing Bai, Xiangchen Song, Irwin King

Abstract: To alleviate data sparsity and cold-start problems of traditional recommender systems (RSs), incorporating knowledge graphs (KGs) to supplement auxiliary information has attracted considerable attention recently. However, simply integrating KGs in current KG-based RS models does not necessarily guarantee improved recommendation performance, and may even weaken the holistic model capability. This is because the construction of these KGs is independent of the collection of historical user-item interactions; hence, information in these KGs may not always be helpful for recommendation to all users. In this paper, we propose attentive Knowledge-aware Graph convolutional networks with Collaborative Guidance for personalized Recommendation (CG-KGR). CG-KGR is a novel knowledge-aware recommendation model that enables ample and coherent learning of KGs and user-item interactions, via our proposed Collaborative Guidance Mechanism. Specifically, CG-KGR first encapsulates historical interactions to interactive information summarization. Then CG-KGR utilizes it as guidance to extract information out of KGs, which eventually provides more precise personalized recommendation. We conduct extensive experiments on four real-world datasets over two recommendation tasks, i.e., Top-K recommendation and Click-Through rate (CTR) prediction. The experimental results show that the CG-KGR model significantly outperforms recent state-of-the-art models by 1.4-27.0% in terms of Recall metric on Top-K recommendation.

Submitted 2 January, 2022; v1 submitted 5 September, 2021; originally announced September 2021.

arXiv:2108.11125 [pdf, ps, other]

A New Insight on Augmented Lagrangian Method and Its Extensions

Authors: Jianchao Bai

Abstract: Motivated by the recent work [He-Yuan, Balanced Augmented Lagrangian Method for Convex Programming, arXiv: 2108.08554v1, (2021)], a novel Augmented Lagrangian Method (ALM) has been proposed for solving a family of convex optimization problems subject to equality or inequality constraints. This new method is then extended to solve the multi-block separable convex optimization problem, and two related primal-dual hybrid gradient algorithms are also discussed. Preliminary and some new convergence results are established with the aid of variational analysis for both the saddle point of the problem and the first-order optimality conditions of involved subproblems.

Submitted 23 November, 2021; v1 submitted 25 August, 2021; originally announced August 2021.

Comments: 15 pages

arXiv:2108.09440 [pdf, other]

Unsupervised Local Discrimination for Medical Images

Authors: Huai Chen, Renzhen Wang, Xiuying Wang, Jieyu Li, Qu Fang, Hui Li, Jianhao Bai, Qing Peng, Deyu Meng, Lisheng Wang

Abstract: Contrastive learning, which aims to capture general representation from unlabeled images to initialize the medical analysis models, has been proven effective in alleviating the high demand for expensive annotations. Current methods mainly focus on instance-wise comparisons to learn the global discriminative features, while overlooking the local details needed to distinguish tiny anatomical structures, lesions, and tissues. To address this challenge, in this paper, we propose a general unsupervised representation learning framework, named local discrimination (LD), to learn local discriminative features for medical images by closely embedding semantically similar pixels and identifying regions of similar structures across different images. Specifically, this model is equipped with an embedding module for pixel-wise embedding and a clustering module for generating segmentation. These two modules are unified through optimizing our novel region discrimination loss function in a mutually beneficial mechanism, which enables our model to reflect structure information as well as measure pixel-wise and region-wise similarity. Furthermore, based on LD, we propose a center-sensitive one-shot landmark localization algorithm and a shape-guided cross-modality segmentation model to foster the generalizability of our model. When transferred to downstream tasks, the representation learned by our method shows better generalization, outperforming representations from 18 state-of-the-art (SOTA) methods and winning 9 out of all 12 downstream tasks. Especially for the challenging lesion segmentation tasks, the proposed method achieves significantly better performance. The source code is publicly available at https://github.com/HuaiChen-1994/LDLearning.

Submitted 17 August, 2022; v1 submitted 21 August, 2021; originally announced August 2021.

Comments: 18 pages, 12 figures

arXiv:2108.02578 [pdf, other]

doi 10.1038/s41598-022-12647-x

Neural network-based prediction of the secret-key rate of quantum key distribution

Authors: Min-Gang Zhou, Zhi-Ping Liu, Wen-Bo Liu, Chen-Long Li, Jun-Lin Bai, Yi-Ran Xue, Yao Fu, Hua-Lei Yin, Zeng-Bing Chen

Abstract: Numerical methods are widely used to calculate the secure key rate of many quantum key distribution protocols in practice, but they consume many computing resources and are too time-consuming. In this work, we take the homodyne detection discrete-modulated continuous-variable quantum key distribution (CV-QKD) as an example, and construct a neural network that can quickly predict the secure key rate based on the experimental parameters and experimental results. Compared to traditional numerical methods, the speed of the neural network is improved by several orders of magnitude. Importantly, the predicted key rates are not only highly accurate but also highly likely to be secure. This allows the secure key rate of discrete-modulated CV-QKD to be extracted in real time on a low-power platform. Furthermore, our method is versatile and can be extended to quickly calculate the complex secure key rates of various other unstructured quantum key distribution protocols.

Submitted 29 May, 2022; v1 submitted 5 August, 2021; originally announced August 2021.

Comments: 12 pages, 5 figures, 2 tables

Journal ref: Sci. Rep. 12, 8879 (2022)

arXiv:2107.08765 [pdf, other]

Adaptive Transfer Learning on Graph Neural Networks

Authors: Xueting Han, Zhenhuan Huang, Bang An, Jing Bai

Abstract: Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data. Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation. However, there is an inherent gap between self-supervised tasks and downstream tasks in terms of optimization objective and training data. Conventional pre-training methods may not be effective enough for knowledge transfer since they do not make any adaptation for downstream tasks. To solve such problems, we propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task. Our methods adaptively select and combine different auxiliary tasks with the target task in the fine-tuning stage. We design an adaptive auxiliary loss weighting model to learn the weights of auxiliary tasks by quantifying the consistency between auxiliary tasks and the target task. In addition, we learn the weighting model through meta-learning. Our methods can be applied to various transfer learning approaches; they perform well not only in multi-task learning but also in pre-training and fine-tuning. Comprehensive experiments on multiple downstream tasks demonstrate that the proposed methods can effectively combine auxiliary tasks with the target task and significantly improve the performance compared to state-of-the-art methods.

Submitted 20 July, 2021; v1 submitted 19 July, 2021; originally announced July 2021.

arXiv:2106.05655 [pdf, other]

doi 10.3847/1538-4357/ac116e

Reverberation Mapping of Two Luminous Quasars: the Broad-line Region Structure and Black Hole Mass

Authors: Sha-Sha Li, Sen Yang, Zi-Xu Yang, Yong-Jie Chen, Yu-Yang Songsheng, He-Zhen Liu, Pu Du, Bin Luo, Zhe Yu, Chen Hu, Bo-Wei Jiang, Dong-Wei Bao, Wei-Jian Guo, Zhi-Xiang Zhang, Yan-Rong Li, Ming Xiao, Kai-Xing Lu, Luis C. Ho, Jing-Min Bai, Wei-Hao Bian, Jesús Aceituno, Takeo Minezaki, Mitsuru Kokubo, Jian-Min Wang

Abstract: We report the results of a multi-year spectroscopic and photometric monitoring campaign of two luminous quasars, PG 0923+201 and PG 1001+291, both located at the high-luminosity end of the broad-line region (BLR) size-luminosity relation with optical luminosities above $10^{45}~{\rm erg~s^{-1}}$. PG 0923+201 is monitored for the first time, and PG 1001+291 was previously monitored but our campaign has a much longer temporal baseline. We detect time lags of variations of the broad H$β$, H$γ$, and Fe II lines with respect to those of the 5100 Å continuum. The velocity-resolved delay map of H$β$ in PG 0923+201 indicates a complicated structure with a mix of Keplerian disk-like motion and outflow, and the map of H$β$ in PG 1001+291 shows a signature of Keplerian disk-like motion. Assuming a virial factor of $f_{\rm BLR}=1$ and FWHM line widths, we measure the black hole mass to be $118_{-16}^{+11}\times 10^7 M_{\odot}$ for PG 0923+201 and $3.33_{-0.54}^{+0.62}\times 10^7 M_{\odot}$ for PG 1001+291. Their respective accretion rates are estimated to be $0.21_{-0.07}^{+0.06} \times L_{\rm Edd}\,c^{-2}$ and $679_{-227}^{+259}\times L_{\rm Edd}\,c^{-2}$, indicating that PG 0923+201 is a sub-Eddington accretor and PG 1001+291 is a super-Eddington accretor. While the H$β$ time lag of PG 0923+201 agrees with the size-luminosity relation, the time lag of PG 1001+291 shows a significant deviation, confirming that in high-luminosity AGN the BLR size depends on both luminosity and Eddington ratio. Black hole mass estimates from single AGN spectra will be over-estimated at high luminosities and redshifts if this effect is not taken into account.

Submitted 10 June, 2021; originally announced June 2021.

Comments: 21 pages, 14 figures, accepted

Journal ref: 2021, The Astrophysical Journal

arXiv:2105.13145 [pdf, other]

doi 10.1093/mnras/stab1573

GRB 140102A: Insight into Prompt Spectral Evolution and Early Optical Afterglow Emission

Authors: Rahul Gupta, S. R. Oates, S. B. Pandey, A. J. Castro-Tirado, Jagdish C. Joshi, Y. -D. Hu, A. F. Valeev, B. B. Zhang, Z. Zhang, Amit Kumar, A. Aryan, A. Lien, B. Kumar, Ch. Cui, Ch. Wang, Dimple, D. Bhattacharya, E. Sonbas, J. Bai, J. C. Tello, J. Gorosabel, J. M. Castro Cerón, J. R. F. Porto, K. Misra, M. De Pasquale, et al. (16 additional authors not shown)

Abstract: We present and perform a detailed analysis of multi-wavelength observations of GRB 140102A, an optically bright GRB with an observed reverse shock (RS) signature. Observations of this GRB were acquired with the BOOTES-4 robotic telescope and the Fermi and Swift missions. Time-resolved spectroscopy of the prompt emission shows that the peak energy ($E_{\rm p}$) tracks intensity and the low-energy spectral index seems to follow the intensity for the first episode, whereas this tracking behavior is less clear during the second episode. The fit to the afterglow light curves shows that the early optical afterglow can be described with RS emission and is consistent with the thin shell scenario of the constant ambient medium. The late time afterglow decay is also consistent with the prediction of the external forward shock (FS) model. We determine the properties of the shocks, Lorentz factor, magnetization parameters, and ambient density of GRB 140102A, and compare these parameters with another 12 GRBs, consistent with having RS produced by thin shells in an ISM-like medium. The value of the magnetization parameter ($R_{\rm B} \approx 18$) indicates a moderately magnetized baryonic dominant jet composition for GRB 140102A. We also report the host galaxy photometric observations of GRB 140102A obtained with 10.4m GTC, 3.5m CAHA, and 3.6m DOT telescopes and find the host (photo $z$ = $2.8^{+0.7}_{-0.9}$) to be a high mass, star-forming galaxy with a star formation rate of $20 \pm 10~M_{\odot}~{\rm yr}^{-1}$.

Submitted 27 May, 2021; originally announced May 2021.

Comments: 27 pages, 16 figures, 12 tables, accepted for publication in MNRAS

arXiv:2105.07209 [pdf, other]

Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos

Authors: Lei Sun, Jia Wang, Kailun Yang, Kaikai Wu, Xiangdong Zhou, Kaiwei Wang, Jian Bai

Abstract: Aerial pixel-wise scene perception of the surrounding environment is an important task for UAVs (Unmanned Aerial Vehicles). Previous research works mainly adopt conventional pinhole cameras or fisheye cameras as the imaging device. However, these imaging systems cannot achieve a large Field of View (FoV), small size, and low weight at the same time. To this end, we design a UAV system with a Panoramic Annular Lens (PAL), which has the characteristics of small size, low weight, and a 360-degree annular FoV. A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing. In addition, we present the first drone-perspective panoramic scene segmentation dataset Aerial-PASS, with annotated labels of track, field, and others. A comprehensive variety of experiments shows that the designed system performs satisfactorily in aerial panoramic scene parsing. In particular, our proposed model strikes an excellent trade-off between segmentation performance and inference speed, validated on both public street-scene and our established aerial-scene datasets.

Submitted 15 May, 2021; originally announced May 2021.

Comments: Our dataset will be made publicly available at: http://wangkaiwei.org/downloadeg.html

arXiv:2105.05840 [pdf, other]

doi 10.3847/1538-4357/ac2159

AGN STORM 2: I. First results: A Change in the Weather of Mrk 817

Authors: Erin Kara, Missagh Mehdipour, Gerard A. Kriss, Edward M. Cackett, Nahum Arav, Aaron J. Barth, Doyee Byun, Michael S. Brotherton, Gisella De Rosa, Jonathan Gelbord, Juan V. Hernandez Santisteban, Chen Hu, Jelle Kaastra, Hermine Landt, Yan-Rong Li, Jake A. Miller, John Montano, Ethan Partington, Jesus Aceituno, Jin-Ming Bai, Dongwei Bao, Misty C. Bentz, Thomas G. Brink, Doron Chelouche, Yong-Jie Chen, et al. (47 additional authors not shown)

Abstract: We present the first results from the ongoing, intensive, multi-wavelength monitoring program of the luminous Seyfert 1 galaxy Mrk 817. While this AGN was, in part, selected for its historically unobscured nature, we discovered that the X-ray spectrum is highly absorbed, and there are new blueshifted, broad and narrow UV absorption lines, which suggest that a dust-free, ionized obscurer located at the inner broad line region partially covers the central source. Despite the obscuration, we measure UV and optical continuum reverberation lags consistent with a centrally illuminated Shakura-Sunyaev thin accretion disk, and measure reverberation lags associated with the optical broad line region, as expected. However, in the first 55 days of the campaign, when the obscuration was becoming most extreme, we observe a de-coupling of the UV continuum and the UV broad emission line variability. The correlation recovers in the next 42 days of the campaign, as Mrk 817 enters a less obscured state. The short CIV and Ly alpha lags suggest that the accretion disk extends beyond the UV broad line region.

Submitted 12 May, 2021; originally announced May 2021.

Comments: 28 pages, 14 figures, submitted to ApJ. Comments welcome

arXiv:2105.05021 [pdf, ps, other]

doi 10.1038/s41550-021-01395-z

A Peculiarly Short-duration Gamma-Ray Burst from Massive Star Core Collapse

Authors: B. -B. Zhang, Z. -K. Liu, Z. -K. Peng, Y. Li, H. -J. Lü, J. Yang, Y. -S. Yang, Y. -H. Yang, Y. -Z. Meng, J. -H. Zou, H. -Y. Ye, X. -G. Wang, J. -R. Mao, X. -H. Zhao, J. -M. Bai, A. J. Castro-Tirado, Y. -D. Hu, Z. -G. Dai, E. -W. Liang, B. Zhang

Abstract: Gamma-ray bursts (GRBs) have been phenomenologically classified into long and short populations based on the observed bimodal distribution of duration. Multi-wavelength and multi-messenger observations in recent years have revealed that in general long GRBs originate from massive star core collapse events, whereas short GRBs originate from binary neutron star mergers. It has been known that the duration criterion is sometimes unreliable, and multi-wavelength criteria are needed to identify the physical origin of a particular GRB. Some apparently long GRBs have been suggested to have a neutron star merger origin, whereas some apparently short GRBs have been attributed to genuinely long GRBs whose short, bright emission is slightly above the detector's sensitivity threshold. Here we report the comprehensive analysis of the multi-wavelength data of the bright short GRB 200826A. Characterized by a sharp pulse, this burst shows a duration of 1 second and no evidence of an underlying longer-duration event. Its other observational properties, such as its spectral behaviors, total energy, and host galaxy offset, are, however, inconsistent with those of other short GRBs believed to originate from binary neutron star mergers. Rather, these properties resemble those of long GRBs. This burst confirms the existence of short duration GRBs with stellar core-collapse origin, and presents some challenges to the existing models.

Submitted 11 May, 2021; originally announced May 2021.

Comments: Nature Astronomy, the authors' version; 24 pages, 7 figures, 3 tables

Journal ref: Nature Astronomy (2021)

arXiv:2104.14830 [pdf, other]

Scaling End-to-End Models for Large-Scale Multilingual ASR

Authors: Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai

Abstract: Building ASR models across many languages is a challenging multi-task learning problem due to large variations and heavily unbalanced data. Existing work has shown positive transfer from high resource to low resource languages. However, degradations on high resource languages are commonly observed due to interference from the heterogeneous multilingual data and reduction in per-language capacity. We conduct a capacity study on a 15-language task, with the amount of data per language varying from 7.6K to 53.5K hours. We adopt GShard [1] to efficiently scale up to 10B parameters. Empirically, we find that (1) scaling the number of model parameters is an effective way to solve the capacity bottleneck - our 500M-param model already outperforms monolingual baselines and scaling it to 1B and 10B brought further quality gains; (2) larger models are not only more data efficient, but also more efficient in terms of training cost as measured in TPU days - the 1B-param model reaches the same accuracy at 34% of training time as the 500M-param model; (3) given a fixed capacity budget, adding depth works better than width and large encoders do better than large decoders; (4) with continuous training, they can be adapted to new languages and domains.

Submitted 11 September, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

Comments: ASRU 2021

arXiv:2103.16826 [pdf, other]

doi 10.1088/1674-4527/21/8/204

Radial stellar populations of AGN-host dwarf galaxies in SDSS-IV MaNGA survey

Authors: Wei Cai, Yinghe Zhao, Jin-Ming Bai

Abstract: Based on MaNGA integral field unit (IFU) spectroscopy, we identify 60 AGN candidates, which have stellar masses $M_{\star}\leqslant5\times10^{9}\,M_{\odot}$ and show AGN ionization signatures in the BPT diagram. For these AGN candidates, we derive the spatially resolved stellar population with the stellar population synthesis code STARLIGHT and measure the gradients of the mean stellar age and metallicity. We find that the gradients of mean stellar age (metallicity) of individual AGN-host dwarfs are diverse in 0-0.5 Re, 0.5-1 Re and 0-1 Re. However, the overall behavior of the mean stellar age (metallicity) profiles tends to be flat, as the median values of the gradients are close to zero. We further study the overall behavior of the mean stellar age (metallicity) by plotting the co-added radial profiles for the AGN sample and comparing them with a control sample with similar stellar mass. We find that the median values of light-weighted mean stellar ages of the AGN sample are as old as 2-3 Gyr within 2 Re, which are about 4-7 times older than those of the control sample. Meanwhile, most of the AGN candidates are low-level AGNs, as only eight sources have $L_{\rm [O\,III]}>10^{39.5}$ erg s$^{-1}$. Hence, the AGNs in dwarf galaxies might accelerate the evolution of galaxies by accelerating the consumption of the gas, resulting in an overall quenching of the dwarf galaxies, and the AGNs also become weak due to the lack of gas. The median values of mass-weighted mean stellar age of both samples within 2 Re are similar and as old as about 10 Gyr, indicating that the stellar mass is mainly contributed by old stellar populations. The gradients of co-added mean stellar metallicity for both samples tend to be negative but close to zero, and the similar mean stellar metallicity profiles for both samples indicate that the chemical evolution of the host galaxy is not strongly influenced by the AGN.

Submitted 31 March, 2021; originally announced March 2021.

Comments: 25 pages, 12 figures, accepted by RAA

arXiv:2103.16752

Iteration complexity analysis of a partial LQP-based alternating direction method of multipliers

Authors: Jianchao Bai, Yuxue Ma, Hao Sun, Miao Zhang

Abstract: In this paper, we consider a prototypical convex optimization problem with multi-block variables and separable structures. By adding the Logarithmic Quadratic Proximal (LQP) regularizer with suitable proximal parameter to each of the first grouped subproblems, we develop a partial LQP-based Alternating Direction Method of Multipliers (ADMM-LQP). The dual variable is updated twice with relatively larger stepsizes than the classical region $(0,\frac{1+\sqrt{5}}{2})$. Using a prediction-correction approach to analyze properties of the iterates generated by ADMM-LQP, we establish its global convergence and sublinear convergence rate of $O(1/T)$ in the new ergodic and nonergodic senses, where $T$ denotes the iteration index. We also extend the algorithm to a nonsmooth composite convex optimization and establish similar convergence results as our ADMM-LQP.

Submitted 30 March, 2021; originally announced March 2021.

Comments: 22 pages

arXiv:2103.16154

Convergence on a symmetric accelerated stochastic ADMM with larger stepsizes

Authors: Jianchao Bai, Deren Han, Hao Sun, Hongchao Zhang

Abstract: In this paper, we develop a symmetric accelerated stochastic Alternating Direction Method of Multipliers (SAS-ADMM) for solving separable convex optimization problems with linear constraints. The objective function is the sum of a possibly nonsmooth convex function and an average function of many smooth convex functions. Our proposed algorithm combines the ideas of ADMM with the techniques of accelerated stochastic gradient methods, possibly with variance reduction, to solve the smooth subproblem. One main feature of SAS-ADMM is that its dual variable is symmetrically updated after each update of the separated primal variable, which allows a more flexible and larger convergence region of the dual variable compared with that of standard deterministic or stochastic ADMM. This new stochastic optimization algorithm is shown to converge ergodically in expectation with an O(1/T) convergence rate, where T is the number of outer iterations. Our preliminary experiments indicate the proposed algorithm is very effective for solving separable optimization problems from big-data applications. Finally, 3-block extensions of the algorithm and its variant of an accelerated stochastic augmented Lagrangian method are discussed in the appendix.

Submitted 19 December, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

Comments: Accepted by CSIAM-AM

arXiv:2103.08205

doi 10.1088/1674-4527/21/7/177

The afterglow emission from a stratified jet in GRB 170817A

Authors: K. F. Cheng, X. H. Zhao, B. B. Zhang, J. M. Bai

Abstract: The afterglow of GRB 170817A has been detected for more than three years, but the origin of the multi-band afterglow light curves remains under debate. A classical top-hat jet model has difficulty producing the observed shallow rise of the afterglow light curves $(F_\nu \propto T^{0.8})$. Here we reconsider the model of stratified ejecta with an energy profile of $E(>\Gamma\beta)=E_0(\Gamma\beta)^{-k}$ as the origin of the afterglow light curves of the burst, where $\Gamma$ and $\beta$ are the Lorentz factor and speed of the ejecta, respectively, and $k$ is the power-law slope of the energy profile. We assume the ejecta are collimated into jets. Two kinds of jet evolution are investigated: a lateral-spreading jet and a non-lateral-spreading jet. We fit the multi-band afterglow light curves, including the X-ray data at one thousand days post-burst, and find that both the spreading and non-spreading jet models can fit the light curves well, but the observed angular size of the source and the apparent velocity of the flux centroid for the spreading jet model are beyond the observation limits, while the non-spreading jet model meets the observation limits. Some of the best-fit parameters for the non-spreading jet model, such as the number density of the circumburst medium $\sim10^{-2}$ cm$^{-3}$ and the total jet kinetic energy $E \sim 4.8\times 10^{51}$ erg, also appear plausible. The best-fit slope of the jet energy profile is $k \sim 7.1$.
Our results suggest that the afterglow of GRB 170817A may arise from the stratified jet and that the lateral spreading of the jet is not significant.

Submitted 15 March, 2021; originally announced March 2021.

Comments: 16 pages, 5 figures, 1 table. Accepted for publication in RAA
