Showing 1–50 of 374 results for author: Hu, C

  1. arXiv:2410.15882  [pdf, other]

    cs.LG cs.CV cs.RO

    Distributed Learning for UAV Swarms

    Authors: Chen Hu, Hanchi Ren, Jingjing Deng, Xianghua Xie

    Abstract: Unmanned Aerial Vehicle (UAV) swarms are increasingly deployed in dynamic, data-rich environments for applications such as environmental monitoring and surveillance. These scenarios demand efficient data processing while maintaining privacy and security, making Federated Learning (FL) a promising solution. FL allows UAVs to collaboratively train global models without sharing raw data, but challeng…

    Submitted 21 October, 2024; originally announced October 2024.

  2. arXiv:2410.11570  [pdf, other]

    cs.RO eess.SY

    A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction

    Authors: Zhouheng Li, Bei Zhou, Cheng Hu, Lei Xie, Hongye Su

    Abstract: The development of autonomous driving has boosted the research on autonomous racing. However, existing local trajectory planning methods have difficulty planning trajectories with optimal velocity profiles at racetracks with sharp corners, thus weakening the performance of autonomous racing. To address this problem, we propose a local trajectory planning method that integrates Velocity Prediction…

    Submitted 15 October, 2024; originally announced October 2024.

  3. arXiv:2410.10601  [pdf, other]

    cs.RO

    Fully Asynchronous Neuromorphic Perception for Mobile Robot Dodging with Loihi Chips

    Authors: Junjie Jiang, Delei Kong, Chenming Hu, Zheng Fang

    Abstract: Sparse and asynchronous sensing and processing in natural organisms lead to ultra low-latency and energy-efficient perception. Event cameras, known as neuromorphic vision sensors, are designed to mimic these characteristics. However, fully utilizing the sparse and asynchronous event stream remains challenging. Influenced by the mature algorithms of standard cameras, most existing event-based algor…

    Submitted 14 October, 2024; originally announced October 2024.

  4. arXiv:2410.08734  [pdf, other]

    cs.LG cs.CV

    Gradients Stand-in for Defending Deep Leakage in Federated Learning

    Authors: H. Yi, H. Ren, C. Hu, Y. Li, J. Deng, X. Xie

    Abstract: Federated Learning (FL) has become a cornerstone of privacy protection, shifting the paradigm towards localizing sensitive data while only sending model gradients to a central server. This strategy is designed to reinforce privacy protections and minimize the vulnerabilities inherent in centralized data storage systems. Despite its innovative approach, recent empirical studies have highlighted pot…

    Submitted 11 October, 2024; originally announced October 2024.

  5. arXiv:2410.07860  [pdf, other]

    cs.CV

    BA-Net: Bridge Attention in Deep Neural Networks

    Authors: Ronghui Zhang, Runzong Zou, Yue Zhao, Zirui Zhang, Junzhou Chen, Yue Cao, Chuan Hu, Houbing Song

    Abstract: Attention mechanisms, particularly channel attention, have become highly influential in numerous computer vision tasks. Despite their effectiveness, many existing methods primarily focus on optimizing performance through complex attention modules applied at individual convolutional layers, often overlooking the synergistic interactions that can occur across multiple layers. In response to this gap…

    Submitted 10 October, 2024; v1 submitted 10 October, 2024; originally announced October 2024.

  6. arXiv:2410.05740  [pdf, other]

    cs.RO cs.AI eess.SY

    Learning to Race in Extreme Turning Scene with Active Exploration and Gaussian Process Regression-based MPC

    Authors: Guoqiang Wu, Cheng Hu, Wangjia Weng, Zhouheng Li, Yonghao Fu, Lei Xie, Hongye Su

    Abstract: Extreme cornering in racing often induces large side-slip angles, presenting a formidable challenge in vehicle control. To tackle this issue, this paper introduces an Active Exploration with Double GPR (AEDGPR) system. The system initiates by planning a minimum-time trajectory with a Gaussian Process Regression (GPR) compensated model. The planning results show that in the cornering section, the ya…

    Submitted 8 October, 2024; originally announced October 2024.

  7. arXiv:2410.04868  [pdf, other]

    cs.RO eess.SY

    Predictive Spliner: Data-Driven Overtaking in Autonomous Racing Using Opponent Trajectory Prediction

    Authors: Nicolas Baumann, Edoardo Ghignone, Cheng Hu, Benedict Hildisch, Tino Hämmerle, Alessandro Bettoni, Andrea Carron, Lei Xie, Michele Magno

    Abstract: Head-to-head racing against opponents is a challenging and emerging topic in the domain of autonomous racing. We propose Predictive Spliner, a data-driven overtaking planner that learns the behavior of opponents through Gaussian Process (GP) regression, which is then leveraged to compute viable overtaking maneuvers in future sections of the racing track. Experimentally validated on a 1:10 scale au…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Submitted to RA-L

  8. arXiv:2410.03415  [pdf, other]

    cs.CL

    Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation

    Authors: Xinpeng Wang, Chengzhi Hu, Paul Röttger, Barbara Plank

    Abstract: Training a language model to be both helpful and harmless requires careful calibration of refusal behaviours: Models should refuse to follow malicious instructions or give harmful advice (e.g. "how do I kill someone?"), but they should not refuse safe requests, even if they superficially resemble unsafe ones (e.g. "how do I kill a Python process?"). Avoiding such false refusal, as prior work has s…

    Submitted 4 October, 2024; originally announced October 2024.

  9. arXiv:2409.19599  [pdf, other]

    cs.CV

    Gradient is All You Need: Gradient-Based Attention Fusion for Infrared Small Target Detection

    Authors: Chen Hu, Yian Huang, Kexuan Li, Luping Zhang, Yiming Zhu, Yufei Peng, Tian Pu, Zhenming Peng

    Abstract: Infrared small target detection (IRSTD) is widely used in civilian and military applications. However, IRSTD encounters several challenges, including the tendency for small and dim targets to be obscured by complex backgrounds. To address this issue, we propose the Gradient Network (GaNet), which aims to extract and preserve edge and gradient information of small targets. GaNet employs the Gradien…

    Submitted 29 September, 2024; originally announced September 2024.

  10. arXiv:2409.19521  [pdf, other]

    cs.CR cs.LG

    GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks

    Authors: Rongchang Li, Minjie Chen, Chang Hu, Han Chen, Wenpeng Xing, Meng Han

    Abstract: Large Language Models (LLMs) like GPT-4, LLaMA, and Qwen have demonstrated remarkable success across a wide range of applications. However, these models remain inherently vulnerable to prompt injection attacks, which can bypass existing safety mechanisms, highlighting the urgent need for more robust attack detection methods and comprehensive evaluation benchmarks. To address these challenges, we i…

    Submitted 28 September, 2024; originally announced September 2024.

  11. arXiv:2409.14846  [pdf, other]

    cs.AI cs.CV

    A-VL: Adaptive Attention for Large Vision-Language Models

    Authors: Junyang Zhang, Mu Yuan, Ruiguang Zhong, Puhan Luo, Huiyou Zhan, Ningkang Zhang, Chengchen Hu, Xiangyang Li

    Abstract: The Large Vision-Language Model (LVLM) integrates computer vision and natural language processing techniques, offering substantial application potential. However, these models demand extensive resources during inference. Adaptive attention techniques can dynamically reduce computational redundancy and thus improve efficiency. Although current adaptive attention methods significantly reduce the mem…

    Submitted 23 September, 2024; originally announced September 2024.

  12. arXiv:2409.14589  [pdf, other]

    cs.CV

    URSimulator: Human-Perception-Driven Prompt Tuning for Enhanced Virtual Urban Renewal via Diffusion Models

    Authors: Chuanbo Hu, Shan Jia, Xin Li

    Abstract: Tackling Urban Physical Disorder (e.g., abandoned buildings, litter, messy vegetation, graffiti) is essential, as it negatively impacts the safety, well-being, and psychological state of communities. Urban Renewal is the process of revitalizing these neglected and decayed areas within a city to improve the physical environment and quality of life for residents. Effective urban renewal efforts can…

    Submitted 22 September, 2024; originally announced September 2024.

  13. arXiv:2409.12785  [pdf]

    cs.CE cs.AI cs.LG

    Investigation on domain adaptation of additive manufacturing monitoring systems to enhance digital twin reusability

    Authors: Jiarui Xie, Zhuo Yang, Chun-Chun Hu, Haw-Ching Yang, Yan Lu, Yaoyao Fiona Zhao

    Abstract: Powder bed fusion (PBF) is an emerging metal additive manufacturing (AM) technology that enables rapid fabrication of complex geometries. However, defects such as pores and balling may occur and lead to structural unconformities, thus compromising the mechanical performance of the part. This has become a critical challenge for quality assurance as the nature of some defects is stochastic during th…

    Submitted 20 September, 2024; v1 submitted 19 September, 2024; originally announced September 2024.

    Comments: 8 pages, 7 figures, 3 tables. IEEE CASE 2024

  14. arXiv:2409.11100  [pdf, other]

    cs.LG stat.ML

    Fractional Naive Bayes (FNB): non-convex optimization for a parsimonious weighted selective naive Bayes classifier

    Authors: Carine Hue, Marc Boullé

    Abstract: We study supervised classification for datasets with a very large number of input variables. The naïve Bayes classifier is attractive for its simplicity, scalability and effectiveness in many real data applications. When the strong naïve Bayes assumption of conditional independence of the input variables given the target variable is not valid, variable selection and model averaging are two common…

    Submitted 17 September, 2024; originally announced September 2024.

  15. arXiv:2409.08846  [pdf, other]

    cs.CR cs.CL cs.LG

    FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition

    Authors: Zhenhua Xu, Wenpeng Xing, Zhebo Wang, Chang Hu, Chen Jie, Meng Han

    Abstract: Training Large Language Models (LLMs) requires immense computational power and vast amounts of data. As a result, protecting the intellectual property of these models through fingerprinting is essential for ownership authentication. While adding fingerprints to LLMs through fine-tuning has been attempted, it remains costly and unscalable. In this paper, we introduce FP-VEC, a pilot study on using…

    Submitted 13 September, 2024; originally announced September 2024.

  16. arXiv:2409.06226  [pdf, other]

    cs.IR cs.CL cs.LG

    NLP-Powered Repository and Search Engine for Academic Papers: A Case Study on Cyber Risk Literature with CyLit

    Authors: Linfeng Zhang, Changyue Hu, Zhiyu Quan

    Abstract: As the body of academic literature continues to grow, researchers face increasing difficulties in effectively searching for relevant resources. Existing databases and search engines often fall short of providing a comprehensive and contextually relevant collection of academic literature. To address this issue, we propose a novel framework that leverages Natural Language Processing (NLP) techniques…

    Submitted 10 September, 2024; originally announced September 2024.

  17. arXiv:2409.01011  [pdf, other]

    cs.CL cs.CV

    Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts

    Authors: Yingfa Chen, Chenlong Hu, Cong Feng, Chenyang Song, Shi Yu, Xu Han, Zhiyuan Liu, Maosong Sun

    Abstract: This study presents a multi-modal multi-granularity tokenizer specifically designed for analyzing ancient Chinese scripts, focusing on the Chu bamboo slip (CBS) script used during the Spring and Autumn and Warring States period (771-256 BCE) in Ancient China. Considering the complex hierarchical structure of ancient Chinese scripts, where a single character may be a combination of multiple sub-cha…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: 12 pages, 3 figures

  18. arXiv:2409.00727  [pdf, other]

    cs.AI cs.CL cs.IR

    Hound: Hunting Supervision Signals for Few and Zero Shot Node Classification on Text-attributed Graph

    Authors: Yuxiang Wang, Xiao Yan, Shiyu Jin, Quanqing Xu, Chuanhui Yang, Yuanyuan Zhu, Chuang Hu, Bo Du, Jiawei Jiang

    Abstract: Text-attributed graph (TAG) is an important type of graph structured data with text descriptions for each node. Few- and zero-shot node classification on TAGs have many applications in fields such as academia and social networks. However, the two tasks are challenging due to the lack of supervision signals, and existing methods only use the contrastive loss to align graph-based node embedding and…

    Submitted 1 September, 2024; originally announced September 2024.

  19. arXiv:2409.00664  [pdf, other]

    q-bio.NC cs.LG

    Video-based Analysis Reveals Atypical Social Gaze in People with Autism Spectrum Disorder

    Authors: Xiangxu Yu, Mindi Ruan, Chuanbo Hu, Wenqi Li, Lynn K. Paul, Xin Li, Shuo Wang

    Abstract: In this study, we present a quantitative and comprehensive analysis of social gaze in people with autism spectrum disorder (ASD). Diverging from traditional first-person camera perspectives based on eye-tracking technologies, this study utilizes a third-person perspective database from the Autism Diagnostic Observation Schedule, 2nd Edition (ADOS-2) interview videos, encompassing ASD participants…

    Submitted 1 September, 2024; originally announced September 2024.

  20. arXiv:2408.17090  [pdf, other]

    cs.LG cs.AI cs.CV

    FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition

    Authors: Chen Hu, Jingjing Deng, Xianghua Xie, Xiaoke Ma

    Abstract: Federated learning is a machine learning paradigm that enables decentralized clients to collaboratively learn a shared model while keeping all the training data local. While considerable research has focused on federated image generation, particularly Generative Adversarial Networks, Variational Autoencoders have received less attention. In this paper, we address the challenges of non-IID (indepen…

    Submitted 30 August, 2024; originally announced August 2024.

  21. arXiv:2408.13963  [pdf, other]

    cs.CV

    Shifted Window Fourier Transform And Retention For Image Captioning

    Authors: Jia Cheng Hu, Roberto Cavicchioli, Alessandro Capotondi

    Abstract: Image Captioning is an important Language and Vision task that finds application in a variety of contexts, ranging from healthcare to autonomous vehicles. As many real-world applications rely on devices with limited resources, much effort in the field was put into the development of lighter and faster models. However, much of the current optimizations focus on the Transformer architecture in contr…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: Pre-print version of paper accepted for ICONIP 2024

  22. arXiv:2408.13959  [pdf, other]

    cs.CL

    Bidirectional Awareness Induction in Autoregressive Seq2Seq Models

    Authors: Jia Cheng Hu, Roberto Cavicchioli, Alessandro Capotondi

    Abstract: Autoregressive Sequence-To-Sequence models are the foundation of many Deep Learning achievements in major research fields such as Vision and Natural Language Processing. Despite that, they still present significant limitations. For instance, when errors occur in the early steps of the prediction, the whole output is severely affected. Such reliance on previously predicted tokens and the inherent c…

    Submitted 25 August, 2024; originally announced August 2024.

  23. arXiv:2408.12840  [pdf, other]

    cs.LG

    HGNAS: Hardware-Aware Graph Neural Architecture Search for Edge Devices

    Authors: Ao Zhou, Jianlei Yang, Yingjie Qi, Tong Qiao, Yumeng Shi, Cenlin Duan, Weisheng Zhao, Chunming Hu

    Abstract: Graph Neural Networks (GNNs) are becoming increasingly popular for graph-based learning tasks such as point cloud processing due to their state-of-the-art (SOTA) performance. Nevertheless, the research community has primarily focused on improving model expressiveness, lacking consideration of how to design efficient GNN models for edge scenarios with real-time requirements and limited resources. E…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: Accepted by IEEE Transactions on Computers

  24. arXiv:2408.11172  [pdf, other]

    cs.LG cs.AI cs.CL cs.LO

    SubgoalXL: Subgoal-based Expert Learning for Theorem Proving

    Authors: Xueliang Zhao, Lin Zheng, Haige Bo, Changran Hu, Urmish Thakker, Lingpeng Kong

    Abstract: Formal theorem proving, a field at the intersection of mathematics and computer science, has seen renewed interest with advancements in large language models (LLMs). This paper introduces SubgoalXL, a novel approach that synergizes subgoal-based proofs with expert learning to enhance LLMs' capabilities in formal theorem proving within the Isabelle environment. SubgoalXL addresses two critical chal…

    Submitted 20 August, 2024; originally announced August 2024.

  25. arXiv:2408.09851  [pdf, other]

    cs.NI eess.SY

    ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication

    Authors: Zhe Chen, Chao Hu, Tianyue Zheng, Hangcheng Cao, Yanbing Yang, Yen Chu, Hongbo Jiang, Jun Luo

    Abstract: Whereas Wi-Fi communications have been exploited for sensing purpose for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of integrated sensing and communication (ISAC) within Wi-Fi framework. In this paper, we aim to re-design WiFi so that monostatic sensing (mimicking radar) can be achieved over the multistatic communicati…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 14 pages, 22 figures

  26. arXiv:2408.07413  [pdf, other]

    cs.CL

    Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models

    Authors: Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao

    Abstract: Knowledge editing aims to update outdated or incorrect knowledge in large language models (LLMs). However, current knowledge editing methods have limited scalability for lifelong editing. This study explores the fundamental reason why knowledge editing fails in lifelong editing. We begin with the closed-form solution derived from linear associative memory, which underpins state-of-the-art knowledg…

    Submitted 14 August, 2024; originally announced August 2024.

  27. arXiv:2408.06592  [pdf, other]

    cs.CV

    ActiveNeRF: Learning Accurate 3D Geometry by Active Pattern Projection

    Authors: Jianyu Tao, Changping Hu, Edward Yang, Jing Xu, Rui Chen

    Abstract: NeRFs have achieved incredible success in novel view synthesis. However, the accuracy of the implicit geometry is unsatisfactory because the passive static environmental illumination has low spatial frequency and cannot provide enough information for accurate geometry reconstruction. In this work, we propose ActiveNeRF, a 3D geometry reconstruction framework, which improves the geometry quality of…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 18 pages, 10 figures

  28. arXiv:2408.04886  [pdf]

    cs.PF

    Automated PMC-based Power Modeling Methodology for Modern Mobile GPUs

    Authors: Pranab Dash, Y. Charlie Hu, Abhilash Jindal

    Abstract: The rise of machine learning workload on smartphones has propelled GPUs into one of the most power-hungry components of modern smartphones and elevates the need for optimizing the GPU power draw by mobile apps. Optimizing the power consumption of mobile GPUs in turn requires accurate estimation of their power draw during app execution. In this paper, we observe that the prior-art, utilization-freq…

    Submitted 9 August, 2024; originally announced August 2024.

  29. arXiv:2408.01691  [pdf, other]

    cs.LG cs.AI

    TreeCSS: An Efficient Framework for Vertical Federated Learning

    Authors: Qinbo Zhang, Xiao Yan, Yukai Ding, Quanqing Xu, Chuang Hu, Xiaokai Zhou, Jiawei Jiang

    Abstract: Vertical federated learning (VFL) considers the case that the features of data samples are partitioned over different participants. VFL consists of two main steps, i.e., identify the common data samples for all participants (alignment) and train model using the aligned data samples (training). However, when there are many participants and data samples, both alignment and training become slow. As s…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 16 pages, 7 figures

  30. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  31. arXiv:2407.18487  [pdf, other]

    cs.CV

    SMPISD-MTPNet: Scene Semantic Prior-Assisted Infrared Ship Detection Using Multi-Task Perception Networks

    Authors: Chen Hu, Xiaogang Dong, Yian Huang, Lele Wang, Liang Xu, Tian Pu, Zhenming Peng

    Abstract: Infrared ship detection (IRSD) has received increasing attention in recent years due to the robustness of infrared images to adverse weather. However, a large number of false alarms may occur in complex scenes. To address these challenges, we propose the Scene Semantic Prior-Assisted Multi-Task Perception Network (SMPISD-MTPNet), which includes three stages: scene semantic extraction, deep feature…

    Submitted 25 July, 2024; originally announced July 2024.

  32. Adaptive Differentially Private Structural Entropy Minimization for Unsupervised Social Event Detection

    Authors: Zhiwei Yang, Yuecen Wei, Haoran Li, Qian Li, Lei Jiang, Li Sun, Xiaoyan Yu, Chunming Hu, Hao Peng

    Abstract: Social event detection refers to extracting relevant message clusters from social media data streams to represent specific events in the real world. Social event detection is important in numerous areas, such as opinion analysis, social safety, and decision-making. Most current methods are supervised and require access to large amounts of data. These methods need prior knowledge of the events and…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Accepted to ACM CIKM 2024

  33. arXiv:2407.16996  [pdf, other]

    cs.CE math.AT

    Quotient complex (QC)-based machine learning for 2D perovskite design

    Authors: Chuan-Shen Hu, Rishikanta Mayengbam, Kelin Xia, Tze Chien Sum

    Abstract: With remarkable stability and exceptional optoelectronic properties, two-dimensional (2D) halide layered perovskites hold immense promise for revolutionizing photovoltaic technology. Presently, inadequate representations have substantially impeded the design and discovery of 2D perovskites. In this context, we introduce a novel computational topology framework termed the quotient complex (QC), whi…

    Submitted 24 July, 2024; originally announced July 2024.

  34. arXiv:2407.15959  [pdf, other]

    cs.HC

    "It's a Good Idea to Put It Into Words": Writing `Rudders' in the Initial Stages of Visualization Design

    Authors: Chase Stokes, Clara Hu, Marti A. Hearst

    Abstract: Written language is a useful tool for non-visual creative activities like writing essays and planning searches. This paper investigates the integration of written language into the visualization design process. We create the idea of a 'writing rudder,' which acts as a guiding force or strategy for the design. Via an interview study of 24 working visualization designers, we first established that…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures, accepted for IEEE VIS conference 2024

    ACM Class: H.5.0

  35. arXiv:2407.15848  [pdf, other]

    cs.CV

    BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-scale Scenes

    Authors: Chih-Hai Su, Chih-Yao Hu, Shr-Ruei Tsai, Jie-Ying Lee, Chin-Yang Lin, Yu-Lun Liu

    Abstract: While Neural Radiance Fields (NeRFs) have demonstrated exceptional quality, their protracted training duration remains a limitation. Generalizable and MVS-based NeRFs, although capable of mitigating training time, often incur tradeoffs in quality. This paper presents a novel approach called BoostMVSNeRFs to enhance the rendering quality of MVS-based NeRFs in large-scale scenes. We first identify l…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: SIGGRAPH 2024 Conference Papers. Project page: https://su-terry.github.io/BoostMVSNeRFs/

  36. arXiv:2407.15098  [pdf, other]

    cs.CR cs.LG

    SeqMIA: Sequential-Metric Based Membership Inference Attack

    Authors: Hao Li, Zheng Li, Siyuan Wu, Chengrui Hu, Yutong Ye, Min Zhang, Dengguo Feng, Yang Zhang

    Abstract: Most existing membership inference attacks (MIAs) utilize metrics (e.g., loss) calculated on the model's final state, while recent advanced attacks leverage metrics computed at various stages, including both intermediate and final stages, throughout the model training. Nevertheless, these attacks often process multiple intermediate states of the metric independently, ignoring their time-dependent…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM CCS 2024

  37. arXiv:2407.14335  [pdf, other]

    econ.GN cs.CE cs.CR q-fin.CP stat.CO

    Quantifying the Blockchain Trilemma: A Comparative Analysis of Algorand, Ethereum 2.0, and Beyond

    Authors: Yihang Fu, Mingwei Jing, Jiaolun Zhou, Peilin Wu, Ye Wang, Luyao Zhang, Chuang Hu

    Abstract: Blockchain technology is essential for the digital economy and metaverse, supporting applications from decentralized finance to virtual assets. However, its potential is constrained by the "Blockchain Trilemma," which necessitates balancing decentralization, security, and scalability. This study evaluates and compares two leading proof-of-stake (PoS) systems, Algorand and Ethereum 2.0, against the…

    Submitted 19 July, 2024; originally announced July 2024.

  38. arXiv:2407.11750  [pdf, other]

    cs.CV

    Cycle Contrastive Adversarial Learning for Unsupervised image Deraining

    Authors: Chen Zhao, Weiling Cai, ChengWei Hu, Zheng Yuan

    Abstract: To tackle the difficulties in fitting paired real-world data for single image deraining (SID), recent unsupervised methods have achieved notable success. However, these methods often struggle to generate high-quality, rain-free images due to a lack of attention to semantic representation and image content, resulting in ineffective separation of content from the rain layer. In this paper, we propos…

    Submitted 16 July, 2024; originally announced July 2024.

  39. arXiv:2407.06309  [pdf, other]

    cs.CY cs.AI

    Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps

    Authors: Chuanbo Hu, Bin Liu, Minglei Yin, Yilu Zhou, Xin Li

    Abstract: Mobile applications (Apps) could expose children to inappropriate themes such as sexual content, violence, and drug use. Maturity rating offers a quick and effective method for potential users, particularly guardians, to assess the maturity levels of apps. Determining accurate maturity ratings for mobile apps is essential to protect children's health in today's saturated digital marketplace. Exist…

    Submitted 8 July, 2024; originally announced July 2024.

  40. arXiv:2407.04929  [pdf, other]

    cs.RO

    Toward Precise Robotic Weed Flaming Using a Mobile Manipulator with a Flamethrower

    Authors: Di Wang, Chengsong Hu, Shuangyu Xie, Joe Johnson, Hojun Ji, Yingtao Jiang, Muthukumar Bagavathiannan, Dezhen Song

    Abstract: Robotic weed flaming is a new and environmentally friendly approach to weed removal in the agricultural field. Using a mobile manipulator equipped with a flamethrower, we design a new system and algorithm to enable effective weed flaming, which requires robotic manipulation with a soft and deformable end effector, as the thermal coverage of the flame is affected by dynamic or unknown environmental…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: IROS 2024

  41. arXiv:2407.01168  [pdf, other]

    cs.CV cs.AI

    Multi-View Black-Box Physical Attacks on Infrared Pedestrian Detectors Using Adversarial Infrared Grid

    Authors: Kalibinuer Tiliwalidi, Chengyin Hu, Weiwen Shi

    Abstract: While extensive research exists on physical adversarial attacks within the visible spectrum, studies on such techniques in the infrared spectrum are limited. Infrared object detectors are vital in modern technological applications but are susceptible to adversarial attacks, posing significant security threats. Previous studies using physical perturbations like light bulb arrays and aerogels for wh…

    Submitted 8 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  42. arXiv:2406.18067  [pdf, other]

    cs.CL eess.AS

    Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification

    Authors: Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng

    Abstract: The diverse nature of dialects presents challenges for models trained on specific linguistic patterns, rendering them susceptible to errors when confronted with unseen or out-of-distribution (OOD) data. This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored specifically for OOD detection in dialects. By integrating a generative model and the energy margin loss, our appro…

    Submitted 26 June, 2024; originally announced June 2024.

  43. arXiv:2406.18065  [pdf, other]

    eess.AS cs.SD

    On Calibration of Speech Classification Models: Insights from Energy-Based Model Investigations

    Authors: Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng

    Abstract: For speech classification tasks, deep learning models often achieve high accuracy but exhibit shortcomings in calibration, manifesting as classifiers exhibiting overconfidence. The significance of calibration lies in its critical role in guaranteeing the reliability of decision-making within deep learning systems. This study explores the effectiveness of Energy-Based Models in calibrating confiden…

    Submitted 26 June, 2024; originally announced June 2024.

  44. arXiv:2406.18007  [pdf, other]

    cs.MM

    Deep Mamba Multi-modal Learning

    Authors: Jian Zhu, Xin Zou, Yu Cui, Zhangmin Huang, Chenshu Hu, Bo Lyu

    Abstract: Inspired by the excellent performance of Mamba networks, we propose a novel Deep Mamba Multi-modal Learning (DMML). It can be used to achieve the fusion of multi-modal features. We apply DMML to the field of multimedia retrieval and propose an innovative Deep Mamba Multi-modal Hashing (DMMH) method. It combines the advantages of algorithm accuracy and inference speed. We validated the effectivenes…

    Submitted 9 April, 2024; originally announced June 2024.

    Comments: Deep Mamba Multi-modal Learning; Deep Mamba Multi-modal Hashing

  45. arXiv:2406.17565  [pdf, other]

    cs.DC

    MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool

    Authors: Cunchen Hu, Heyang Huang, Junhao Hu, Jiang Xu, Xusheng Chen, Tao Xie, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan

    Abstract: Large language model (LLM) serving has transformed from stateless to stateful systems, utilizing techniques like context caching and disaggregated inference. These optimizations extend the lifespan and domain of the KV cache, necessitating a new architectural approach. We present MemServe, a unified system that integrates both inter-request and intra-request optimizations. MemServe introduces MemP…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  46. arXiv:2406.17219  [pdf, other]

    cs.CV

    Facial Identity Anonymization via Intrinsic and Extrinsic Attention Distraction

    Authors: Zhenzhong Kuang, Xiaochen Yang, Yingjie Shen, Chao Hu, Jun Yu

    Abstract: The unprecedented capture and application of face images raise increasing concerns on anonymization to fight against privacy disclosure. Most existing methods may suffer from the problem of excessive change of the identity-independent information or insufficient identity protection. In this paper, we present a new face anonymization approach by distracting the intrinsic and extrinsic identity atte…

    Submitted 6 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: Zhenzhong Kuang, Xiaochen Yang, Yingjie Shen, Chao Hu, Jun Yu; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 12406-12415 Date of Conference: 17-21 June 2024 Conference Location: Seattle, USA

  47. arXiv:2406.13361  [pdf, other]

    cs.CL cs.LG

    Improving Zero-Shot Cross-Lingual Transfer via Progressive Code-Switching

    Authors: Zhuoran Li, Chunming Hu, Junfan Chen, Zhijun Chen, Xiaohui Guo, Richong Zhang

    Abstract: Code-switching is a data augmentation scheme mixing words from multiple languages into source lingual text. It has achieved considerable generalization performance of cross-lingual transfer tasks by aligning cross-lingual contextual word representations. However, uncontrolled and over-replaced code-switching would augment dirty samples to model training. In other words, the excessive code-switchin…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures, 6 tables. Accepted by International Joint Conference on Artificial Intelligence (IJCAI 2024)

  48. arXiv:2406.13268  [pdf, other]

    eess.AS cs.SD

    CEC: A Noisy Label Detection Method for Speaker Recognition

    Authors: Yao Shen, Yingying Gao, Yaqian Hao, Chenguang Hu, Fulin Zhang, Junlan Feng, Shilei Zhang

    Abstract: Noisy labels are inevitable, even in well-annotated datasets. The detection of noisy labels is of significant importance to enhance the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). These metrics are calculated through Cros…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  49. arXiv:2406.08751  [pdf, other]

    cs.AI

    3D Building Generation in Minecraft via Large Language Models

    Authors: Shiying Hu, Zengrong Huang, Chengpeng Hu, Jialin Liu

    Abstract: Recently, procedural content generation has exhibited considerable advancements in the domain of 2D game level generation such as Super Mario Bros. and Sokoban through large language models (LLMs). To further validate the capabilities of LLMs, this paper explores how LLMs contribute to the generation of 3D buildings in a sandbox game, Minecraft. We propose a Text to Building in Minecraft (T2BM) mo…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted by the IEEE Conference on Games

  50. arXiv:2406.08216  [pdf, ps, other]

    cs.SE

    A Software Engineering Perspective on Testing Large Language Models: Research, Practice, Tools and Benchmarks

    Authors: Sinclair Hudson, Sophia Jit, Boyue Caroline Hu, Marsha Chechik

    Abstract: Large Language Models (LLMs) are rapidly becoming ubiquitous both as stand-alone tools and as components of current and future software systems. To enable usage of LLMs in the high-stake or safety-critical systems of 2030, they need to undergo rigorous testing. Software Engineering (SE) research on testing Machine Learning (ML) components and ML-based systems has systematically explored many topic…

    Submitted 12 June, 2024; originally announced June 2024.