Showing 1–50 of 297 results for author: Hong, X

  1. arXiv:2410.08703  [pdf, other]

    cs.CL cs.AI

    On the token distance modeling ability of higher RoPE attention dimension

    Authors: Xiangyu Hong, Che Jiang, Biqing Qi, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

    Abstract: Length extrapolation algorithms based on Rotary position embedding (RoPE) have shown promising results in extending the context length of language models. However, understanding how position embedding can capture longer-range contextual information remains elusive. Based on the intuition that different dimensions correspond to different frequency of changes in RoPE encoding, we conducted a dimensi…

    Submitted 21 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: Accepted to EMNLP 2024 Findings

  2. arXiv:2409.18339  [pdf, other]

    cs.CL cs.AI

    AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models

    Authors: Xin Hong, Yuan Gong, Vidhyasaharan Sethu, Ting Dang

    Abstract: Recent advancements in Large Language Models (LLMs) have demonstrated great success in many Natural Language Processing (NLP) tasks. In addition to their cognitive intelligence, exploring their capabilities in emotional intelligence is also crucial, as it enables more natural and empathetic conversational AI. Recent studies have shown LLMs' capability in recognizing emotions, but they often focus…

    Submitted 26 September, 2024; originally announced September 2024.

    Comments: 5 pages, 4 figures

  3. arXiv:2409.09391  [pdf, other]

    cs.CV

    Tran-GCN: A Transformer-Enhanced Graph Convolutional Network for Person Re-Identification in Monitoring Videos

    Authors: Xiaobin Hong, Tarmizi Adam, Masitah Ghazali

    Abstract: Person Re-Identification (Re-ID) has gained popularity in computer vision, enabling cross-camera pedestrian recognition. Although the development of deep learning has provided a robust technical foundation for person Re-ID research, most existing person Re-ID methods overlook the potential relationships among local person features, failing to adequately address the impact of pedestrian pose variat…

    Submitted 14 September, 2024; originally announced September 2024.

  4. arXiv:2409.05385  [pdf, other]

    cs.CL cs.AI

    Towards Building a Robust Knowledge Intensive Question Answering Model with Large Language Models

    Authors: Xingyun Hong, Yan Shao, Zhilin Wang, Manni Duan, Jin Xiongnan

    Abstract: The development of LLMs has greatly enhanced the intelligence and fluency of question answering, while the emergence of retrieval enhancement has enabled models to better utilize external information. However, the presence of noise and errors in retrieved information poses challenges to the robustness of LLMs. In this work, to evaluate the model's performance under multiple interferences, we first…

    Submitted 17 September, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

    Comments: This paper has been accepted by NLPCC-2024

  5. arXiv:2408.08412  [pdf, other]

    cs.CV

    Penny-Wise and Pound-Foolish in Deepfake Detection

    Authors: Yabin Wang, Zhiwu Huang, Su Zhou, Adam Prugel-Bennett, Xiaopeng Hong

    Abstract: The diffusion of deepfake technologies has sparked serious concerns about its potential misuse across various domains, prompting the urgent need for robust detection methods. Despite advancement, many current approaches prioritize short-term gains at expense of long-term effectiveness. This paper critiques the overly specialized approach of fine-tuning pre-trained models solely with a penny-wise o…

    Submitted 15 August, 2024; originally announced August 2024.

  6. arXiv:2408.06647  [pdf, other]

    astro-ph.GA astro-ph.HE

    Magnetic Field of the Quasar 1604+159 from Parsec to Kilo-parsec Scale

    Authors: Xu-Zhi Hu, Xiaoyu Hong, Wei Zhao, Liang Chen, Wei-Yang Wang, Linhui Wu

    Abstract: We present a multi-frequency polarimetric study for the quasar 1604+159. The source was observed at the $L$ band with the American Very Long Baseline Array (VLBA) and the $L$, $X$, and $U$ bands with the Very Large Array (VLA). These observations provide different resolutions from mas to arcsec, enabling us to probe the morphology and magnetic field from tens of parsec to hundreds of kilo-parsec s…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 17 pages, accepted for publication in ApJ

  7. arXiv:2408.01980  [pdf, other]

    quant-ph

    Measurement Induced Magic Resources

    Authors: Gongchu Li, Lei Chen, Si-Qi Zhang, Xu-Song Hong, Huaqing Xu, Yuancheng Liu, You Zhou, Geng Chen, Chuan-Feng Li, Alioscia Hamma, Guang-Can Guo

    Abstract: Magic states and magic gates are crucial for achieving universal computation, but some important questions about how magic resources should be implemented to attain quantum advantage have remained unexplored, for instance, in the context of Measurement-based Quantum Computation (MQC) with only single-qubit measurements. This work bridges the gap between MQC and the resource theory of magic by intr…

    Submitted 29 August, 2024; v1 submitted 4 August, 2024; originally announced August 2024.

    Comments: 25 pages, 11 figures

  8. arXiv:2407.19491  [pdf, other]

    cs.CV

    Multi-modal Crowd Counting via Modal Emulation

    Authors: Chenhao Wang, Xiaopeng Hong, Zhiheng Ma, Yupeng Wei, Yabin Wang, Xiaopeng Fan

    Abstract: Multi-modal crowd counting is a crucial task that uses multi-modal cues to estimate the number of people in crowded scenes. To overcome the gap between different modalities, we propose a modal emulation-based two-pass multi-modal crowd-counting framework that enables efficient modal emulation, alignment, and fusion. The framework consists of two key components: a \emph{multi-modal inference} pass…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: This is the preprint version of the paper to appear in BMVC 2024. Please cite the final published version. Code is available at https://github.com/Mr-Monday/Multi-modal-Crowd-Counting-via-Modal-Emulation

  9. arXiv:2407.19078  [pdf, other]

    cs.LG stat.ML

    Practical Marketplace Optimization at Uber Using Causally-Informed Machine Learning

    Authors: Bobby Chen, Siyu Chen, Jason Dowlatabadi, Yu Xuan Hong, Vinayak Iyer, Uday Mantripragada, Rishabh Narang, Apoorv Pandey, Zijun Qin, Abrar Sheikh, Hongtao Sun, Jiaqi Sun, Matthew Walker, Kaichen Wei, Chen Xu, Jingnan Yang, Allen T. Zhang, Guoqing Zhang

    Abstract: Budget allocation of marketplace levers, such as incentives for drivers and promotions for riders, has long been a technical and business challenge at Uber; understanding lever budget changes' impact and estimating cost efficiency to achieve predefined budgets is crucial, with the goal of optimal allocations that maximize business value; we introduce an end-to-end machine learning and optimization…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: To be published in the 2nd Workshop on Causal Inference and Machine Learning in Practice, KDD 2024, August 25 to 29, 2024, Barcelona, Spain, 10 pages

    MSC Class: 62J99

  10. arXiv:2407.11086  [pdf, other]

    cs.LG cs.AI physics.chem-ph

    Pre-training with Fractional Denoising to Enhance Molecular Property Prediction

    Authors: Yuyan Ni, Shikun Feng, Xin Hong, Yuancheng Sun, Wei-Ying Ma, Zhi-Ming Ma, Qiwei Ye, Yanyan Lan

    Abstract: Deep learning methods have been considered promising for accelerating molecular screening in drug discovery and material design. Due to the limited availability of labelled data, various self-supervised molecular pre-training methods have been presented. While many existing methods utilize common pre-training tasks in computer vision (CV) and natural language processing (NLP), they often overlook…

    Submitted 14 July, 2024; originally announced July 2024.

  11. arXiv:2407.09367  [pdf, other]

    cs.CV

    Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation

    Authors: Zhilin Zhu, Xiaopeng Hong, Zhiheng Ma, Weijun Zhuang, Yaohui Ma, Yong Dai, Yaowei Wang

    Abstract: Continual Test-Time Adaptation (CTTA) involves adapting a pre-trained source model to continually changing unsupervised target domains. In this paper, we systematically analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting under continual domain shifts. To address these challenges, we reshape the online data bu…

    Submitted 18 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: This is the preprint version of our paper and supplemental material to appear in ECCV 2024

  12. arXiv:2407.07518  [pdf, other]

    cs.CV

    Multi-modal Crowd Counting via a Broker Modality

    Authors: Haoliang Meng, Xiaopeng Hong, Chenhao Wang, Miao Shang, Wangmeng Zuo

    Abstract: Multi-modal crowd counting involves estimating crowd density from both visual and thermal/depth images. This task is challenging due to the significant gap between these distinct modalities. In this paper, we propose a novel approach by introducing an auxiliary broker modality and on this basis frame the task as a triple-modal learning problem. We devise a fusion-based method to generate this brok…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: This is the preprint version of the paper and supplemental material to appear in ECCV 2024. Please cite the final published version. Code is available at https://github.com/HenryCilence/Broker-Modality-Crowd-Counting

  13. arXiv:2407.01310  [pdf, other]

    cs.LG cs.CV

    Multi-State-Action Tokenisation in Decision Transformers for Multi-Discrete Action Spaces

    Authors: Perusha Moodley, Pramod Kaushik, Dhillu Thambi, Mark Trovinger, Praveen Paruchuri, Xia Hong, Benjamin Rosman

    Abstract: Decision Transformers, in their vanilla form, struggle to perform on image-based environments with multi-discrete action spaces. Although enhanced Decision Transformer architectures have been developed to improve performance, these methods have not specifically addressed this problem of multi-discrete action spaces which hampers existing Decision Transformer architectures from learning good repres…

    Submitted 1 July, 2024; originally announced July 2024.

  14. arXiv:2406.18159  [pdf, other]

    cs.CV cs.GR

    Human-Aware 3D Scene Generation with Spatially-constrained Diffusion Models

    Authors: Xiaolin Hong, Hongwei Yi, Fazhi He, Qiong Cao

    Abstract: Generating 3D scenes from human motion sequences supports numerous applications, including virtual reality and architectural design. However, previous auto-regression-based human-aware 3D scene generation methods have struggled to accurately capture the joint distribution of multiple objects and input humans, often resulting in overlapping object generation in the same space. To address this limit…

    Submitted 20 August, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  15. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Task automation has been greatly empowered by the recent advances in Large Language Models (LLMs) via Python code, where the tasks ranging from software engineering development to general-purpose reasoning. While current benchmarks have shown that LLMs can solve tasks using programs like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks o…

    Submitted 7 October, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)

  16. arXiv:2406.15062  [pdf, other]

    cond-mat.str-el cond-mat.supr-con

    Decoupled static and dynamical charge correlations in La$_{2-x}$Sr$_x$CuO$_4$

    Authors: L. Martinelli, I. Biało, X. Hong, J. Oppliger, C. Lin, T. Schaller, J. Küspert, M. H. Fischer, T. Kurosawa, N. Momono, M. Oda, J. Choi, S. Agrestini, M. Garcia-Fernandez, Ke-Jin Zhou, Q. Wang, J. Chang

    Abstract: The relation between charge order, its quantum fluctuations and optical phonon modes in cuprate superconductors remains an unsolved problem. The exploration of these excitations is however complicated by the presence of twinned domains. Here, we use uniaxial strain in combination with ultra-high-resolution Resonant Inelastic X-ray Scattering (RIXS) at the oxygen K- and copper L3-edges to study the…

    Submitted 15 July, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures

  17. arXiv:2406.07487  [pdf, other]

    cs.CV

    GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

    Authors: Hang Yao, Ming Liu, Haolin Wang, Zhicun Yin, Zifei Yan, Xiaopeng Hong, Wangmeng Zuo

    Abstract: Diffusion models have shown superior performance on unsupervised anomaly detection tasks. Since trained with normal data only, diffusion models tend to reconstruct normal counterparts of test images with certain noises added. However, these methods treat all potential anomalies equally, which may cause two main problems. From the global perspective, the difficulty of reconstructing images with dif…

    Submitted 9 September, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted by ECCV 2024, code and models: https://github.com/hyao1/GLAD. Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract here is shorter than that in the PDF file

  18. arXiv:2406.00334  [pdf, other]

    cs.CV

    Image Captioning via Dynamic Path Customization

    Authors: Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Xiaopeng Hong, Yongjian Wu, Rongrong Ji

    Abstract: This paper explores a novel dynamic network for vision and language tasks, where the inferring structure is customized on the fly for different inputs. Most previous state-of-the-art approaches are static and hand-crafted networks, which not only heavily rely on expert knowledge, but also ignore the semantic diversity of input samples, therefore resulting in suboptimal performance. To address thes…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: TNNLS24

  19. arXiv:2405.20696  [pdf, other]

    quant-ph

    Directly Estimating Mixed-State Entanglement with Bell Measurement Assistance

    Authors: Gong-Chu Li, Lei Chen, Si-Qi Zhang, Xu-Song Hong, You Zhou, Geng Chen, Chuan-Feng Li, Guang-Can Guo

    Abstract: Entanglement plays a fundamental role in quantum physics and information processing. Here, we develop an unbiased estimator for mixed-state entanglement in the few-shot scenario and directly estimate it using random unitary evolution in a photonic system. As a supplement to traditional projective measurements, we incorporate Bell measurements on qubit-pairs, enriching the previous randomized measu…

    Submitted 6 July, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: 5 pages, 4 figures

  20. arXiv:2405.17802  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Multi-level Interaction Modeling for Protein Mutational Effect Prediction

    Authors: Yuanle Mo, Xin Hong, Bowen Gao, Yinjun Jia, Yanyan Lan

    Abstract: Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different si…

    Submitted 27 May, 2024; originally announced May 2024.

  21. arXiv:2404.18456  [pdf, other]

    quant-ph

    Equivalence Checking of Parameterised Quantum Circuits

    Authors: Xin Hong, Wei-Jia Huang, Wei-Chen Chien, Yuan Feng, Min-Hsiu Hsieh, Sanjiang Li, Mingsheng Ying

    Abstract: Parameterised quantum circuits (PQCs) hold great promise for demonstrating quantum advantages in practical applications of quantum computation. Examples of successful applications include the variational quantum eigensolver, the quantum approximate optimisation algorithm, and quantum machine learning. However, before executing PQCs on real quantum devices, they undergo compilation and optimisation…

    Submitted 29 April, 2024; originally announced April 2024.

  22. arXiv:2404.18060  [pdf, other]

    cs.CV cs.LG

    Prompt Customization for Continual Learning

    Authors: Yong Dai, Xiaopeng Hong, Yabin Wang, Zhiheng Ma, Dongmei Jiang, Yaowei Wang

    Abstract: Contemporary continual learning approaches typically select prompts from a pool, which function as supplementary inputs to a pre-trained model. However, this strategy is hindered by the inherent noise of its selection approach when handling increasing tasks. In response to these challenges, we reformulate the prompting approach for continual learning and propose the prompt customization (PC) metho…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: ACM MM

  23. arXiv:2404.01174  [pdf, other]

    cs.CV cs.MM

    SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding

    Authors: Wenrui Li, Xiaopeng Hong, Ruiqin Xiong, Xiaopeng Fan

    Abstract: Temporal video grounding (TVG) is a critical task in video content understanding, requiring precise alignment between video content and natural language instructions. Despite significant advancements, existing methods face challenges in managing confidence bias towards salient objects and capturing long-term dependencies in video sequences. To address these issues, we introduce SpikeMba: a multi-m…

    Submitted 23 May, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  24. arXiv:2404.00989  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    360+x: A Panoptic Multi-modal Scene Understanding Dataset

    Authors: Hao Chen, Yuqi Hou, Chenyuan Qu, Irene Testini, Xiaohan Hong, Jianbo Jiao

    Abstract: Human perception of the world is shaped by a multitude of viewpoints and modalities. While many existing datasets focus on scene understanding from a certain perspective (e.g. egocentric or third-person views), our dataset offers a panoptic perspective (i.e. multiple viewpoints with multiple data modalities). Specifically, we encapsulate third-person panoramic and front views, as well as egocentri…

    Submitted 7 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 (Oral Presentation), Project page: https://x360dataset.github.io/

    Journal ref: The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024

  25. arXiv:2403.20009  [pdf, other]

    cs.CL cs.LG

    On Large Language Models' Hallucination with Regard to Known Facts

    Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

    Abstract: Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual question…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted by NAACL 2024 Main Conference

  26. arXiv:2403.19952  [pdf, ps, other]

    physics.atm-clus

    Theoretical investigation on the optical absorption spectra in cyclo[n]carbons (n=10, 14, 18)

    Authors: Xuhai Hong, Lang Su, Jie Li

    Abstract: The optical absorption spectra of cyclo[n]carbons (n=10, 14, 18) are investigated in the framework of time-dependent density functional theory. The collective plasmon excitations well develop as the increases of the ring size and the symmetry group of cyclo[n]carbons. An increase in intensity for the main peaks with the growing number of atoms in cyclo[n]carbons is observed. With the increase of t…

    Submitted 28 March, 2024; originally announced March 2024.

  27. arXiv:2403.12965  [pdf, other]

    cs.CV

    Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

    Authors: Mengting Chen, Xi Chen, Zhonghua Zhai, Chen Ju, Xuewen Hong, Jinsong Lan, Shuai Xiao

    Abstract: This paper introduces a novel framework for virtual try-on, termed Wear-Any-Way. Different from previous methods, Wear-Any-Way is a customizable solution. Besides generating high-fidelity results, our method supports users to precisely manipulate the wearing style. To achieve this goal, we first construct a strong pipeline for standard virtual try-on, supporting single/multiple garment try-on and…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project Page: https://mengtingchen.github.io/wear-any-way-page/

  28. arXiv:2402.15297  [pdf, other]

    cs.CV cs.LG

    Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling

    Authors: Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, Deyu Meng

    Abstract: This paper focuses on semi-supervised crowd counting, where only a small portion of the training data are labeled. We formulate the pixel-wise density value to regress as a probability distribution, instead of a single deterministic value. On this basis, we propose a semi-supervised crowd-counting model. Firstly, we design a pixel-wise distribution matching loss to measure the differences in the p…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: This is the technical report of a paper that was submitted to IEEE Transactions and is now under review

  29. arXiv:2401.17132  [pdf, other]

    astro-ph.HE

    Evolution of magnetic field of the Quasar 1604+159 at pc scale

    Authors: Xu-Zhi Hu, Xiaoyu Hong, Wei Zhao, Liang Chen, Wei-Yang Wang, Linhui Wu

    Abstract: We have analyzed the total intensity, spectral index, linear polarization, and RM distributions at pc scale for the quasar 1604+159. The source was observed in 2002 and 2020 with the VLBA. Combining the MOJAVE results, we studied the evolution of the magnetic field. We detected a core-jet structure. The jet extends to a distance of ~25 mas. The jet shape varies slightly with time. We divided the s…

    Submitted 1 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 24 pages, 14 figures, accepted for publication in ApJ

  30. arXiv:2401.12164  [pdf, other]

    cs.CV cs.AI

    Semi-supervised segmentation of land cover images using nonlinear canonical correlation analysis with multiple features and t-SNE

    Authors: Hong Wei, James Xiao, Yichao Zhang, Xia Hong

    Abstract: Image segmentation is a clustering task whereby each pixel is assigned a cluster label. Remote sensing data usually consists of multiple bands of spectral images in which there exist semantically meaningful land cover subregions, co-registered with other source data such as LIDAR (LIght Detection And Ranging) data, where available. This suggests that, in order to account for spatial correlation be…

    Submitted 22 January, 2024; originally announced January 2024.

  31. Multimechanism quantum anomalous Hall and Chern number tunable states in germanene (silicene, stanene)/$M$Bi$_2$Te$_4$ heterostructures

    Authors: Zhe Li, Jiatong Zhang, Xiyu Hong, Xiao Feng, Ke He

    Abstract: By constructing germanene (silicene, stanene)/$M$Bi$_2$Te$_4$ ($M$ = 3d-transition elements) heterostructures, we discovered and designed multimechanism quantum-anomalous-Hall (QAH) systems, including $Γ$-based QAH, $K$-$K'$-connected QAH, and valley-polarized $K$- or $K'$-based QAH states via first-principle computations. The unique systems possess a global gap and tunable Chern number. The coexi…

    Submitted 17 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 pages, 4 figures

    Journal ref: Phys. Rev. B 109, 235132 (2024)

  32. arXiv:2401.03870  [pdf, other]

    cs.CV

    Gramformer: Learning Crowd Counting via Graph-Modulated Transformer

    Authors: Hui Lin, Zhiheng Ma, Xiaopeng Hong, Qinnan Shangguan, Deyu Meng

    Abstract: Transformer has been popular in recent crowd counting work since it breaks the limited receptive field of traditional CNNs. However, since crowd images always contain a large number of similar patches, the self-attention mechanism in Transformer tends to find a homogenized solution where the attention maps of almost all patches are identical. In this paper, we address this problem by proposing Gra…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: This is the accepted version of the paper and supplemental material to appear in AAAI 2024. Please cite the final published version. Code is available at https://github.com/LoraLinH/Gramformer

  33. arXiv:2401.02335  [pdf, other]

    cs.CV

    Linguistic Profiling of Deepfakes: An Open Database for Next-Generation Deepfake Detection

    Authors: Yabin Wang, Zhiwu Huang, Zhiheng Ma, Xiaopeng Hong

    Abstract: The emergence of text-to-image generative models has revolutionized the field of deepfakes, enabling the creation of realistic and convincing visual content directly from textual descriptions. However, this advancement presents considerably greater challenges in detecting the authenticity of such content. Existing deepfake detection datasets and methods often fall short in effectively capturing th…

    Submitted 4 January, 2024; originally announced January 2024.

  34. arXiv:2312.14792  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.IT math.PR

    The Rate-Distortion-Perception-Classification Tradeoff: Joint Source Coding and Modulation via Inverse-Domain GANs

    Authors: Junli Fang, João F. C. Mota, Baoshan Lu, Weicheng Zhang, Xuemin Hong

    Abstract: The joint source-channel coding (JSCC) framework leverages deep learning to learn from data the best codes for source and channel coding. When the output signal, rather than being binary, is directly mapped onto the IQ domain (complex-valued), we call the resulting framework joint source coding and modulation (JSCM). We consider a JSCM scenario and show the existence of a strict tradeoff between c…

    Submitted 6 June, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Paper accepted in IEEE Transactions on Signal Processing

  35. arXiv:2312.07867  [pdf, other]

    cs.AI cs.CL

    BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering

    Authors: Xiaojie Hong, Zixin Song, Liangzhi Li, Xiaoli Wang, Feiyan Liu

    Abstract: Medical Visual Question Answering (Med-VQA) is a very important task in healthcare industry, which answers a natural language question with a medical image. Existing VQA techniques in information systems can be directly applied to solving the task. However, they often suffer from (i) the data insufficient problem, which makes it difficult to train the state of the arts (SOTAs) for the domain-speci…

    Submitted 12 December, 2023; originally announced December 2023.

  36. arXiv:2311.12221  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Spinon heat transport in the three-dimensional quantum magnet PbCuTe$_2$O$_6$

    Authors: Xiaochen Hong, Matthias Gillig, Abanoub R. N. Hanna, Shravani Chillal, A. T. M. Nazmul Islam, Bella Lake, Bernd Büchner, Christian Hess

    Abstract: Quantum spin liquids (QSL) are novel phases of matter which remain quantum disordered even at the lowest temperature. They are characterized by emergent gauge fields and fractionalized quasiparticles. Here we show that the sub-Kelvin thermal transport of the three-dimensional $S=1/2$ hyper-hyperkagome quantum magnet PbCuTe$_2$O$_6$ is governed by a sizeable charge-neutral fermionic contribution wh…

    Submitted 20 November, 2023; originally announced November 2023.

  37. arXiv:2311.07311  [pdf, other]

    cs.CL cs.AI

    Do large language models and humans have similar behaviors in causal inference with script knowledge?

    Authors: Xudong Hong, Margarita Ryzhova, Daniel Adrian Biondi, Vera Demberg

    Abstract: Recently, large pre-trained language models (LLMs) have demonstrated superior language understanding abilities, including zero-shot causal reasoning. However, it is unclear to what extent their capabilities are similar to human ones. We here study the processing of an event $B$ in a script-based story, which causally depends on a previous event $A$. In our manipulation, event $A$ is stated, negate…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 15 pages, 3 figures

    ACM Class: I.2.7; I.2.0

  38. arXiv:2311.06126  [pdf, other]

    astro-ph.HE astro-ph.GA

    A centi-pc-scale compact radio core in the nearby galaxy M60

    Authors: Xiaofeng Li, Jun Yang, Xiaopeng Cheng, Mai Liao, Xiaoyu Hong, Liming Dou, Tianle Zhao, Zhongying Fan, Fupeng Zhang, Weirong Huang

    Abstract: M60, an elliptical galaxy located 16.5~Mpc away, has an active nucleus with a very low luminosity and an extremely low accretion rate. Its central supermassive black hole has a mass of $M_{\rm BH}\sim4.5\times10^{9}\, M_{\odot}$ and a Schwarzschild radii corresponding to $R_{\rm S}\sim5.4\,μ\mathrm{as}$. To investigate the nature of its innermost radio nucleus, data from the Very Long Baseline Arr…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 15 pages, 5 figures, 3 tables, accepted for publication in Astrophysical Journal

  39. arXiv:2310.10352  [pdf, other]

    cs.CV

    Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes

    Authors: Yifei Qian, Xiaopeng Hong, Zhongliang Guo, Ognjen Arandjelović, Carl R. Donovan

    Abstract: To alleviate the heavy annotation burden for training a reliable crowd counting model and thus make the model more practicable and accurate by being able to benefit from more data, this paper presents a new semi-supervised method based on the mean teacher framework. When there is a scarcity of labeled data available, the model is prone to overfit local patches. Within such contexts, the convention…

    Submitted 20 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by TCSVT

  40. arXiv:2310.04900  [pdf, other]

    cs.CV

    HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

    Authors: Nina Shvetsova, Anna Kukleva, Xudong Hong, Christian Rupprecht, Bernt Schiele, Hilde Kuehne

    Abstract: Instructional videos are a common source for learning text-video or even multimodal representations by leveraging subtitles extracted with automatic speech recognition systems (ASR) from the audio signal in the videos. However, in contrast to human-annotated captions, both speech and subtitles naturally differ from the visual content of the videos and thus provide only noisy supervision. As a resu…

    Submitted 7 September, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: https://github.com/ninatu/howtocaption

  41. arXiv:2310.04237  [pdf]

    cs.CL

    Written and spoken corpus of real and fake social media postings about COVID-19

    Authors: Ng Bee Chin, Ng Zhi Ee Nicole, Kyla Kwan, Lee Yong Han Dylann, Liu Fang, Xu Hong

    Abstract: This study investigates the linguistic traits of fake news and real news. There are two parts to this study: text data and speech data. The text data for this study consisted of 6420 COVID-19 related tweets re-filtered from Patwa et al. (2021). After cleaning, the dataset contained 3049 tweets, with 2161 labeled as 'real' and 888 as 'fake'. The speech data for this study was collected from TikTok,…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 9 pages, 3 tables

  42. arXiv:2309.04931  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Transport Anisotropy in One-dimensional Graphene Superlattice in the High Kronig-Penney Potential Limit

    Authors: Tianlin Li, Hanying Chen, Kun Wang, Yifei Hao, Le Zhang, Kenji Watanabe, Takashi Taniguchi, Xia Hong

    Abstract: A one-dimensional graphene superlattice subjected to a strong Kronig-Penney (KP) potential is promising for achieving an electron lensing effect, while previous studies utilizing modulated dielectric gates can only yield a moderate, spatially dispersed potential profile. Here, we realize high KP potential modulation of graphene via nanoscale ferroelectric domain gating. Graphene transistors are fabri…

    Submitted 9 September, 2023; originally announced September 2023.

    Comments: 12 pages, 5 figures, and Supplemental Material

    Journal ref: Phys. Rev. Lett. 132, 056204 (2024)

  43. arXiv:2309.00781  [pdf, other]

    cs.LG stat.ML

    Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction

    Authors: Alejandro Rodriguez Dominguez, Muhammad Shahzad, Xia Hong

    Abstract: Multi-modal problems can be effectively addressed using multiple hypothesis frameworks, but integrating these frameworks into learning models poses significant challenges. This paper introduces a Structured Radial Basis Function Network (s-RBFN) as an ensemble of multiple hypothesis predictors for regression. During the training of the predictors, first the centroidal Voronoi tessellations are for…

    Submitted 20 September, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted paper for AI-2024, the Forty-fourth SGAI International Conference on Artificial Intelligence, Cambridge, England, 17-19 December 2024

    MSC Class: 28-08; 28-11; 26B25; 26C15; 46A03; 46T12; 49Q05; 51-08; 60D05; 62J02; 62H10; 62-08; 68W25; 68T07; 68T20

    ACM Class: I.2.1; I.2.6; I.5.1; I.6.4; I.6.5

  44. CSM-H-R: A Context Modeling Framework in Supporting Reasoning Automation for Interoperable Intelligent Systems and Privacy Protection

    Authors: Songhui Yue, Xiaoyan Hong, Randy K. Smith

    Abstract: The automation of High-Level Context (HLC) reasoning across intelligent systems at scale is imperative because of the unceasing accumulation of contextual data, the trend of the fusion of data from multiple sources (e.g., sensors, intelligent systems), and the intrinsic complexity and dynamism of context-based decision-making processes. To mitigate the challenges posed by these issues, we propose…

    Submitted 5 April, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: 13 pages, 10 figures, Keywords: Automation, Context Dynamism, Context Modeling, Context Reasoning, Intelligent System, Interoperability, Privacy Protection, System Integration

  45. arXiv:2308.02993   

    math.CV

    Plurifinely open sets and complex Monge-Ampère measures

    Authors: Nguyen Xuan Hong

    Abstract: The aim of the paper is to investigate the structure of plurifinely open sets. As an application, we will prove an equality on complex Monge-Ampère measures in plurifinely open sets.

    Submitted 13 September, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

    Comments: There is an error in my article. Associate Professor Do Hoang Son pointed it out. I would like to thank Associate Professor Do Hoang Son. I would also like to thank Professor Mohamed for his comments

  46. arXiv:2308.00440  [pdf, other]

    quant-ph

    Decision Diagrams for Symbolic Verification of Quantum Circuits

    Authors: Xin Hong, Wei-Jia Huang, Wei-Chen Chien, Yuan Feng, Min-Hsiu Hsieh, Sanjiang Li, Chia-Shun Yeh, Mingsheng Ying

    Abstract: With the rapid development of quantum computing, automatic verification of quantum circuits becomes more and more important. While several decision diagrams (DDs) have been introduced in quantum circuit simulation and verification, none of them supports symbolic computation. Algorithmic manipulations of symbolic objects, however, have been identified as crucial, if not indispensable, for several v…

    Submitted 1 August, 2023; originally announced August 2023.

  47. arXiv:2307.15899  [pdf, other]

    math.NA

    Exponential DG methods for Vlasov equations

    Authors: Nicolas Crouseilles, Xue Hong

    Abstract: In this work, an exponential Discontinuous Galerkin (DG) method is proposed to numerically solve Vlasov-type equations. The DG method is used for space discretization and is combined with an exponential Lawson Runge-Kutta method for time discretization to achieve high-order accuracy in time and space. In addition to achieving high-order accuracy in time, the use of Lawson methods makes it possible to overcome the stringent…

    Submitted 29 July, 2023; originally announced July 2023.

  48. arXiv:2306.16963  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Phonon thermal transport shaped by strong spin-phonon scattering in a Kitaev material Na$_2$Co$_2$TeO$_6$

    Authors: Xiaochen Hong, Matthias Gillig, Weiliang Yao, Lukas Janssen, Vilmos Kocsis, Sebastian Gass, Yuan Li, Anja U. B. Wolter, Bernd Büchner, Christian Hess

    Abstract: The recent report of a half-quantized thermal Hall effect in the Kitaev material $\alpha$-RuCl$_3$ has sparked a strong debate on whether it is generated by Majorana fermion edge currents or whether other more conventional mechanisms involving magnons or phonons are at its origin. More direct evidence for Majorana fermions which could be expected to arise from a contribution to the longitudinal heat…

    Submitted 29 June, 2023; originally announced June 2023.

    Journal ref: npj Quantum Mater. 9, 18 (2024)

  49. arXiv:2305.12525  [pdf, other]

    astro-ph.HE astro-ph.GA

    Unveiling the small-scale jets in the rapidly growing supermassive black hole IZw1

    Authors: Xiaolong Yang, Su Yao, Luigi C. Gallo, Jun Yang, Luis C. Ho, Minfeng Gu, Willem A. Baan, Jiri Svoboda, Ran Wang, Xiang Liu, Xiaoyu Hong, Xue-Bing Wu, Wei Zhao

    Abstract: Accretion of black holes at near-Eddington or super-Eddington rates is the most powerful episode that drives black hole growth, and it may work in several types of objects. However, the physics of accretion and jet-disc coupling in such a state remains unclear, mainly because the associated jets are not easily detectable due to the extremely weak emission or possibly episodic nature of the jets. O…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: 19 pages, 8 figures and 4 tables, submitted to ApJ. 2nd round referee report received. Comments welcome

  50. arXiv:2305.01928  [pdf, other]

    cs.CV

    Visual Transformation Telling

    Authors: Wanqing Cui, Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

    Abstract: Humans can naturally reason from superficial state differences (e.g. ground wetness) to transformation descriptions (e.g. raining) according to their life experience. In this paper, we propose a new visual reasoning task to test this transformation reasoning ability in real-world scenarios, called Visual Transformation Telling (VTT). Given a series of states (i.e. image…

    Submitted 11 June, 2024; v1 submitted 3 May, 2023; originally announced May 2023.