Showing 1–50 of 73 results for author: Fan, B

  1. arXiv:2409.10890  [pdf, other]

    eess.IV cs.CV

    SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance

    Authors: Shun Zou, Mingya Zhang, Bingjian Fan, Zhengyi Zhou, Xiuguo Zou

    Abstract: Skin lesion segmentation is a crucial method for identifying early skin cancer. In recent years, both convolutional neural network (CNN) and Transformer-based methods have been widely applied. Moreover, combining CNN and Transformer effectively integrates global and local relationships, but remains limited by the quadratic complexity of Transformer. To address this, we propose a hybrid architectur…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: Submitted to ACCV2024 workshop

  2. arXiv:2408.01044  [pdf, other]

    cs.CV

    Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model

    Authors: Yang Jin, Lei Zhang, Shi Yan, Bin Fan, Binglu Wang

    Abstract: Gaze object prediction (GOP) aims to predict the category and location of the object that a human is looking at. Previous methods utilized box-level supervision to identify the object that a person is looking at, but struggled with semantic ambiguity, i.e., a single box may contain several items since objects are close together. The Vision foundation model (VFM) has improved in object segmentation u…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV2024

  3. arXiv:2407.18483  [pdf]

    cs.CL cs.AI

    A Role-specific Guided Large Language Model for Ophthalmic Consultation Based on Stylistic Differentiation

    Authors: Laiyi Fu, Binbin Fan, Hongkai Du, Yanxiang Feng, Chunhua Li, Huping Song

    Abstract: Ophthalmology consultations are crucial for diagnosing, treating, and preventing eye diseases. However, the growing demand for consultations exceeds the availability of ophthalmologists. By leveraging large pre-trained language models, we can design effective dialogues for specific scenarios, aiding in consultations. Traditional fine-tuning strategies for question-answering tasks are impractical d…

    Submitted 31 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  4. arXiv:2407.06115  [pdf, other]

    cs.CV cs.AI cs.CL

    Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline

    Authors: Qi Jia, Baoyu Fan, Cong Xu, Lu Liu, Liang Jin, Guoguang Du, Zhenhua Guo, Yaqian Zhao, Xuanjing Huang, Rengang Li

    Abstract: Existing video multi-modal sentiment analysis mainly focuses on the sentiment expression of people within the video, yet often neglects the induced sentiment of viewers while watching the videos. Induced sentiment of viewers is essential for inferring the public response to videos and has broad application in analyzing public societal sentiment, the effectiveness of advertising, and other areas. The micro…

    Submitted 15 May, 2024; originally announced July 2024.

  5. arXiv:2407.05098  [pdf, other]

    cs.LG cs.AI

    FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning

    Authors: Boyu Fan, Chenrui Wu, Xiang Su, Pan Hui

    Abstract: Despite extensive research into data heterogeneity in federated learning (FL), system heterogeneity remains a significant yet often overlooked challenge. Traditional FL approaches typically assume homogeneous hardware resources across FL clients, implying that clients can train a global model within a comparable time frame. However, in practical FL systems, clients often have heterogeneous resourc…

    Submitted 15 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024

  6. arXiv:2407.01007  [pdf, other]

    cs.CV

    GMT: A Robust Global Association Model for Multi-Target Multi-Camera Tracking

    Authors: Huijie Fan, Tinghui Zhao, Qiang Wang, Baojie Fan, Yandong Tang, LianQing Liu

    Abstract: In the task of multi-target multi-camera (MTMC) tracking of pedestrians, the data association problem is a key issue and main challenge, especially with complications arising from camera movements, lighting variations, and obstructions. However, most MTMC models adopt two-step approaches, thus heavily depending on the results of the first-step tracking in practical applications. Moreover, the same…

    Submitted 1 July, 2024; originally announced July 2024.

  7. arXiv:2406.05962  [pdf, other]

    cs.DC cs.DB

    Data Caching for Enterprise-Grade Petabyte-Scale OLAP

    Authors: Chunxu Tang, Bin Fan, Jing Zhao, Chen Liang, Yi Wang, Beinan Wang, Ziyue Qiu, Lu Qiu, Bowen Ding, Shouzhuo Sun, Saiguang Che, Jiaming Mai, Shouwei Chen, Yu Zhu, Jianjian Xie, Yutian Sun, Yao Li, Yangjun Zhang, Ke Wang, Mingmin Chen

    Abstract: With the exponential growth of data and evolving use cases, petabyte-scale OLAP data platforms are increasingly adopting a model that decouples compute from storage. This shift, evident in organizations like Uber and Meta, introduces operational challenges including massive, read-heavy I/O traffic with potential throttling, as well as skewed and fragmented data access patterns. Addressing these ch…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted to the USENIX Annual Technical Conference (USENIX ATC) 2024

  8. arXiv:2405.18882  [pdf, other]

    cs.CV

    DecomCAM: Advancing Beyond Saliency Maps through Decomposition and Integration

    Authors: Yuguang Yang, Runtang Guo, Sheng Wu, Yimi Wang, Linlin Yang, Bo Fan, Jilong Zhong, Juan Zhang, Baochang Zhang

    Abstract: Interpreting complex deep networks, notably pre-trained vision-language models (VLMs), is a formidable challenge. Current Class Activation Map (CAM) methods highlight regions revealing the model's decision-making basis but lack clear saliency maps and detailed interpretability. To bridge this gap, we propose DecomCAM, a novel decomposition-and-integration method that distills shared patterns from…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by Neurocomputing journal

  9. Public Technologies Transforming Work of the Public and the Public Sector

    Authors: Seyun Kim, Bonnie Fan, Willa Yunqi Yang, Jessie Ramey, Sarah E Fox, Haiyi Zhu, John Zimmerman, Motahhare Eslami

    Abstract: Technologies adopted by the public sector have transformed the work practices of employees in public agencies by creating different means of communication and decision-making. Although much of the recent research in the future of work domain has concentrated on the effects of technological advancements on public sector employees, the influence on work practices of external stakeholders engaging wi…

    Submitted 28 May, 2024; originally announced May 2024.

  10. arXiv:2405.15225  [pdf, other]

    cs.CV

    Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection

    Authors: Yajing Liu, Shijun Zhou, Xiyao Liu, Chunhui Hao, Baojie Fan, Jiandong Tian

    Abstract: Single-source domain generalization (SDG) for object detection is a challenging yet essential task as the distribution bias of the unseen domain degrades the algorithm performance significantly. However, existing methods attempt to extract domain-invariant features, neglecting that the biased data leads the network to learn biased features that are non-causal and poorly generalizable. To this end,…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  11. arXiv:2405.08245  [pdf]

    cs.CV cs.AI

    Progressive enhancement and restoration for mural images under low-light and defected conditions based on multi-receptive field strategy

    Authors: Xiameng Wei, Binbin Fan, Ying Wang, Yanxiang Feng, Laiyi Fu

    Abstract: Ancient murals are valuable cultural heritage with great archaeological value. They provide insights into ancient religions, ceremonies, folklore, among other things through their content. However, due to long-term oxidation and inadequate protection, ancient murals have suffered continuous damage, including peeling and mold etc. Additionally, since ancient murals were typically painted indoors, t…

    Submitted 16 July, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  12. arXiv:2405.04909  [pdf, other]

    cs.CV cs.AI

    Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

    Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

    Abstract: Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Though existing notable efforts have resulted in impressive performance improvements, a gap persists in scene cognition and understanding of the complex traffic semantics. This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) without explici…

    Submitted 8 May, 2024; originally announced May 2024.

  13. arXiv:2404.05960  [pdf, other]

    cs.CV

    EasyTrack: Efficient and Compact One-stream 3D Point Clouds Tracker

    Authors: Baojie Fan, Wuyang Zhou, Kai Wang, Shijun Zhou, Fengyu Xu, Jiandong Tian

    Abstract: Most 3D single object trackers (SOT) in point clouds follow the two-stream multi-stage 3D Siamese or motion tracking paradigms, which process the template and search area point clouds with two parallel branches, built on supervised point cloud backbones. In this work, beyond typical 3D Siamese or motion tracking, we propose a neat and compact one-stream transformer 3D SOT paradigm from the nove…

    Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  14. arXiv:2404.00360  [pdf, other]

    cs.CV

    Reusable Architecture Growth for Continual Stereo Matching

    Authors: Chenghao Zhang, Gaofeng Meng, Bin Fan, Kun Tian, Zhaoxiang Zhang, Shiming Xiang, Chunhong Pan

    Abstract: The remarkable performance of recent stereo depth estimation models benefits from the successful use of convolutional neural networks to regress dense disparity. Akin to most tasks, this needs gathering training data that covers a number of heterogeneous scenes at deployment time. However, training samples are typically acquired continuously in practical applications, making the capability to lear…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Extended version of CVPR 2022 paper "Continual Stereo Matching of Continuous Driving Scenes with Growing Architecture" - Accepted to TPAMI in 2024

  15. arXiv:2403.16374  [pdf, other]

    cs.LG cs.CV cs.RO

    ProIn: Learning to Predict Trajectory Based on Progressive Interactions for Autonomous Driving

    Authors: Yinke Dong, Haifeng Yuan, Hongkun Liu, Wei Jing, Fangzhen Li, Hongmin Liu, Bin Fan

    Abstract: Accurate motion prediction of pedestrians, cyclists, and other surrounding vehicles (all called agents) is very important for autonomous driving. Most existing works capture map information through a one-stage interaction with the map by vector-based attention, to provide map constraints for social interaction and multi-modal differentiation. However, these methods have to encode all required map rul…

    Submitted 24 March, 2024; originally announced March 2024.

  16. arXiv:2403.15156  [pdf, other]

    cs.RO cs.CV eess.SY

    Infrastructure-Assisted Collaborative Perception in Automated Valet Parking: A Safety Perspective

    Authors: Yukuan Jia, Jiawen Zhang, Shimeng Lu, Baokang Fan, Ruiqing Mao, Sheng Zhou, Zhisheng Niu

    Abstract: Environmental perception in Automated Valet Parking (AVP) has been a challenging task due to severe occlusions in parking garages. Although Collaborative Perception (CP) can be applied to broaden the field of view of connected vehicles, the limited bandwidth of vehicular communications restricts its application. In this work, we propose a BEV feature-based CP network architecture for infrastructur…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 7 pages, 7 figures, 4 tables, accepted by IEEE VTC2024-Spring

  17. arXiv:2403.14910  [pdf, other]

    cs.CV

    Defying Imbalanced Forgetting in Class Incremental Learning

    Authors: Shixiong Xu, Gaofeng Meng, Xing Nie, Bolin Ni, Bin Fan, Shiming Xiang

    Abstract: We observe a high level of imbalance in the accuracy of different classes in the same old task for the first time. This intriguing phenomenon, discovered in replay-based Class Incremental Learning (CIL), highlights the imbalanced forgetting of learned classes, as their accuracy is similar before the occurrence of catastrophic forgetting. This discovery remains previously unidentified due to the re…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: AAAI2024

  18. arXiv:2402.09836  [pdf, other]

    cs.AI

    Chain-of-Planned-Behaviour Workflow Elicits Few-Shot Mobility Generation in LLMs

    Authors: Chenyang Shao, Fengli Xu, Bingbing Fan, Jingtao Ding, Yuan Yuan, Meng Wang, Yong Li

    Abstract: The powerful reasoning capabilities of large language models (LLMs) have brought revolutionary changes to many fields, but their performance in human behaviour generation has not yet been extensively explored. This gap likely emerges because the internal processes governing behavioral intentions cannot be solely explained by abstract reasoning. Instead, they are also influenced by a multitude of f…

    Submitted 5 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  19. arXiv:2402.01158  [pdf, other]

    cs.CL

    LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning

    Authors: Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Han Ma, Yaofei Duan, Yanlan Kang, Songhua Yang, Baoyu Fan, Tao Tan

    Abstract: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts. Existing AI-generated text detection models, such as those based on BERT and RoBERTa, are prone to in-domain over-fitting, leading to poor out-of-domain (OOD) detection performance. In this paper, we first collected Chinese text responses gen…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 17 pages, 13 tables, 7 figures

  20. arXiv:2312.12091  [pdf, other]

    cs.DC

    Model-Heterogeneous Federated Learning for Internet of Things: Enabling Technologies and Future Directions

    Authors: Boyu Fan, Siyang Jiang, Xiang Su, Pan Hui

    Abstract: Internet of Things (IoT) interconnects a massive number of devices, generating heterogeneous data with diverse characteristics. IoT data emerges as a vital asset for data-intensive IoT applications, such as healthcare, smart city and predictive maintenance, harnessing the vast volume of heterogeneous data to its maximum advantage. These applications leverage different Artificial Intelligence (AI)…

    Submitted 19 December, 2023; originally announced December 2023.

  21. arXiv:2312.11930  [pdf, other]

    cs.RO eess.SY

    InPTC: Integrated Planning and Tube-Following Control for Prescribed-Time Collision-Free Navigation of Wheeled Mobile Robots

    Authors: Xiaodong Shao, Bin Zhang, Hui Zhi, Jose Guadalupe Romero, Bowen Fan, Qinglei Hu, David Navarro-Alarcon

    Abstract: In this article, we propose a novel approach, called InPTC (Integrated Planning and Tube-Following Control), for prescribed-time collision-free navigation of wheeled mobile robots in a compact convex workspace cluttered with static, sufficiently separated, and convex obstacles. A path planner with prescribed-time convergence is presented based upon Bouligand's tangent cones and time scale transfor…

    Submitted 27 August, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  22. arXiv:2312.11497  [pdf, other]

    cs.HC cs.CY

    The Public Algorithms Survey in Allegheny County

    Authors: Yu-Ru Lin, Beth Schwanke, Rosta Farzan, Bonnie Fan, Motahhare Eslami, Hong Shen, Sarah Fox

    Abstract: This survey study focuses on public opinion regarding the use of algorithmic decision-making in government sectors, specifically in Allegheny County, Pennsylvania. Algorithms are becoming increasingly prevalent in various public domains, including both routine and high-stakes government functions. Despite their growing use, public sentiment remains divided, with concerns about privacy and accuracy…

    Submitted 7 December, 2023; originally announced December 2023.

  23. arXiv:2312.10713  [pdf, other]

    cs.CV cs.AI

    Synthesizing Black-box Anti-forensics DeepFakes with High Visual Quality

    Authors: Bing Fan, Shu Hu, Feng Ding

    Abstract: DeepFake, an AI technology for creating facial forgeries, has garnered global attention. Amid such circumstances, forensics researchers focus on developing defensive algorithms to counter these threats. In contrast, there are techniques developed for enhancing the aggressiveness of DeepFake, e.g., through anti-forensics attacks, to disrupt forensic detectors. However, such attacks often sacrifice…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at ICASSP 2024

    ACM Class: I.4.5

  24. arXiv:2312.07132  [pdf, other]

    cs.CV cs.MM

    Image Content Generation with Causal Reasoning

    Authors: Xiaochuan Li, Baoyu Fan, Runze Zhang, Liang Jin, Di Wang, Zhenhua Guo, Yaqian Zhao, Rengang Li

    Abstract: The emergence of ChatGPT has once again sparked research in generative artificial intelligence (GAI). While people have been amazed by the generated results, they have also noticed the reasoning potential reflected in the generated textual content. However, this current ability for causal reasoning is primarily limited to the domain of language generation, such as in models like GPT-3. In visual m…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by the 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024) in December 2023

  25. arXiv:2311.11971  [pdf, other]

    cs.CV

    LiDAR-HMR: 3D Human Mesh Recovery from LiDAR

    Authors: Bohao Fan, Wenzhao Zheng, Jianjiang Feng, Jie Zhou

    Abstract: In recent years, point cloud perception tasks have been garnering increasing attention. This paper presents the first attempt to estimate 3D human body mesh from sparse LiDAR point clouds. We found that the major challenge in estimating human pose and mesh from point clouds lies in the sparsity, noise, and incompletion of LiDAR point clouds. Facing these challenges, we propose an effective sparse-…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Code is available at: https://github.com/soullessrobot/LiDAR-HMR/

  26. arXiv:2311.00156  [pdf, other]

    cs.DC cs.DB

    Rethinking the Cloudonomics of Efficient I/O for Data-Intensive Analytics Applications

    Authors: Chunxu Tang, Yi Wang, Bin Fan, Beinan Wang, Shouwei Chen, Ziyue Qiu, Chen Liang, Jing Zhao, Yu Zhu, Mingmin Chen, Zhongting Hu

    Abstract: This paper explores a prevailing trend in the industry: migrating data-intensive analytics applications from on-premises to cloud-native environments. We find that the unique cost models associated with cloud-based storage necessitate a more nuanced understanding of optimizing performance. Specifically, based on traces collected from Uber's Presto fleet in production, we argue that common I/O opti…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures

  27. arXiv:2309.13411  [pdf, other]

    cs.LG cs.AI cs.CV

    Towards Attributions of Input Variables in a Coalition

    Authors: Xinhao Zheng, Huiqi Deng, Bo Fan, Quanshi Zhang

    Abstract: This paper aims to develop a new attribution method to explain the conflict between individual variables' attributions and their coalition's attribution from a fully new perspective. First, we find that the Shapley value can be reformulated as the allocation of Harsanyi interactions encoded by the AI model. Second, based on the re-allocation of interactions, we extend the Shapley value to the attribut…

    Submitted 28 November, 2023; v1 submitted 23 September, 2023; originally announced September 2023.

  28. arXiv:2308.00628  [pdf, other]

    cs.CV cs.AI cs.LG

    Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes

    Authors: Bohao Fan, Siqi Wang, Wenxuan Guo, Wenzhao Zheng, Jianjiang Feng, Jie Zhou

    Abstract: 3D human pose estimation in outdoor environments has garnered increasing attention recently. However, prevalent 3D human pose datasets pertaining to outdoor scenes lack diversity, as they predominantly utilize only one type of modality (RGB image or pointcloud), and often feature only one individual within each scene. This limited scope of dataset infrastructure considerably hinders the variabilit…

    Submitted 6 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: Code and data will be released on https://github.com/soullessrobot/Human-M3-Dataset

  29. arXiv:2306.12377  [pdf, ps, other]

    cs.LG cs.CG cs.CR

    Geometric Algorithms for $k$-NN Poisoning

    Authors: Diego Ihara Centurion, Karine Chubarian, Bohan Fan, Francesco Sgherzi, Thiruvenkadam S Radhakrishnan, Anastasios Sidiropoulos, Angelo Straight

    Abstract: We propose a label poisoning attack on geometric data sets against $k$-nearest neighbor classification. We provide an algorithm that can compute an $\varepsilon n$-additive approximation of the optimal poisoning in $n\cdot 2^{2^{O(d+k/\varepsilon)}}$ time for a given data set $X \in \mathbb{R}^d$, where $|X| = n$. Our algorithm achieves its objectives through the application of multi-scale random…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: 14 pages, 1 figure

  30. arXiv:2305.09783  [pdf, other]

    q-fin.CP cs.CE cs.LG

    Deep Learning for Solving and Estimating Dynamic Macro-Finance Models

    Authors: Benjamin Fan, Edward Qiao, Anran Jiao, Zhouzhou Gu, Wenhao Li, Lu Lu

    Abstract: We develop a methodology that utilizes deep learning to simultaneously solve and estimate canonical continuous-time general equilibrium models in financial economics. We illustrate our method in two examples: (1) industrial dynamics of firms and (2) macroeconomic models with financial frictions. Through these applications, we illustrate the advantages of our method: generality, simultaneous soluti…

    Submitted 5 May, 2023; originally announced May 2023.

  31. arXiv:2304.10771  [pdf, other]

    cs.CV

    A Revisit of the Normalized Eight-Point Algorithm and A Self-Supervised Deep Solution

    Authors: Bin Fan, Yuchao Dai, Yongduek Seo, Mingyi He

    Abstract: The normalized eight-point algorithm has been widely viewed as the cornerstone in two-view geometry computation, where the seminal Hartley's normalization has greatly improved the performance of the direct linear transformation algorithm. A natural question is, whether there exists and how to find other normalization methods that may further improve the performance as per each input sample. In thi…

    Submitted 15 January, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: Accepted by Visual Intelligence

  32. arXiv:2302.11661  [pdf, other]

    cs.RO

    Linear Kinematics for General Constant Curvature and Torsion Manipulators

    Authors: Bill Fan, Farhan Rozaidi, Capprin Bass, Gina Olson, Melinda Malley, Ross L Hatton

    Abstract: We present a novel general model that unifies the kinematics of constant curvature and constant twist continuum manipulators. Combining this kinematics with energy-based physics, we derive a linear mapping from actuator configuration to manipulator deformation that is analogous to traditional robot forward kinematics. Our model generalizes across manipulators with different sizes, types of bending…

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: Accepted for presentation at the 6th IEEE-RAS International Conference on Soft Robotics (RoboSoft 2023)

  33. arXiv:2302.10465  [pdf, other]

    cs.CV

    A Flexible Multi-view Multi-modal Imaging System for Outdoor Scenes

    Authors: Meng Zhang, Wenxuan Guo, Bohao Fan, Yifan Chen, Jianjiang Feng, Jie Zhou

    Abstract: Multi-view imaging systems enable uniform coverage of 3D space and reduce the impact of occlusion, which is beneficial for 3D object detection and tracking accuracy. However, existing imaging systems built with multi-view cameras or depth sensors are limited by the small applicable scene and complicated composition. In this paper, we propose a wireless multi-view multi-modal 3D imaging system gene…

    Submitted 21 February, 2023; originally announced February 2023.

  34. arXiv:2301.06657   

    cs.CV

    Free Lunch for Generating Effective Outlier Supervision

    Authors: Sen Pei, Jiaxi Sun, Richard Yi Da Xu, Bin Fan, Shiming Xiang, Gaofeng Meng

    Abstract: When deployed in practical applications, computer vision systems will encounter numerous unexpected images (i.e., out-of-distribution data). Due to the potentially raised safety risks, these aforementioned unseen data should be carefully identified and handled. Generally, existing approaches in dealing with out-of-distribution (OOD) detection mainly focus on the statistical difference betwe…

    Submitted 17 January, 2024; v1 submitted 16 January, 2023; originally announced January 2023.

    Comments: We have rewritten this paper, and published as "Image Background Serves as Good Proxy for Out-of-distribution Data" arXiv:2307.00519

  35. arXiv:2211.10889  [pdf, other]

    cs.DB

    Metadata Caching in Presto: Towards Fast Data Processing

    Authors: Beinan Wang, Chunxu Tang, Rongrong Zhong, Bin Fan, Yi Wang, Jasmine Wang, Shouwei Chen, Bowen Ding, Lu Zhang

    Abstract: Presto is an open-source distributed SQL query engine for OLAP, aiming for "SQL on everything". Since being open-sourced in 2013, Presto has been consistently gaining popularity in large-scale data analytics and attracting adoption from a wide range of enterprises. From the development and operation of Presto, we witnessed a significant amount of CPU consumption on parsing column-oriented data files in…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: 5 pages, 8 figures

  36. Learning a Task-specific Descriptor for Robust Matching of 3D Point Clouds

    Authors: Zhiyuan Zhang, Yuchao Dai, Bin Fan, Jiadai Sun, Mingyi He

    Abstract: Existing learning-based point feature descriptors are usually task-agnostic, which pursue describing the individual 3D point clouds as accurately as possible. However, the matching task aims at describing the corresponding points consistently across different 3D point clouds. Therefore these overly accurate features may play a counterproductive role due to the inconsistent point feature representations…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 2022

  37. Searching Dense Point Correspondences via Permutation Matrix Learning

    Authors: Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Bin Fan, Qi Liu

    Abstract: Although 3D point cloud data has received widespread attention as a general form of 3D signal expression, applying point clouds to the task of dense correspondence estimation between 3D shapes has not been investigated widely. Furthermore, even in the few existing 3D point cloud-based methods, an important and widely acknowledged principle, i.e., one-to-one matching, is usually ignored. In respon…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE Signal Processing Letters (SPL) 2022

  38. arXiv:2210.03040  [pdf, other]

    cs.CV

    Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video

    Authors: Bin Fan, Yuchao Dai, Hongdong Li

    Abstract: A single rolling-shutter (RS) image may be viewed as a row-wise combination of a sequence of global-shutter (GS) images captured by a (virtual) moving GS camera within the exposure duration. Although RS cameras are widely used, the RS effect causes obvious image distortion especially in the presence of fast camera motion, hindering downstream computer vision tasks. In this paper, we propose to inv…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 16 Pages, 14 Figures

  39. SoccerNet 2022 Challenges Results

    Authors: Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, et al. (69 additional authors not shown)

    Abstract: The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on det…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Accepted at ACM MMSports 2022

  40. arXiv:2208.00817  [pdf, other]

    cs.CV

    DSLA: Dynamic smooth label assignment for efficient anchor-free object detection

    Authors: Hu Su, Yonghao He, Rui Jiang, Jiabin Zhang, Wei Zou, Bin Fan

    Abstract: Anchor-free detectors basically formulate object detection as dense classification and regression. For popular anchor-free detectors, it is common to introduce an individual prediction branch to estimate the quality of localization. The following inconsistencies are observed when we delve into the practices of classification and quality estimation. Firstly, for some adjacent samples which are assi…

    Submitted 29 September, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: single column, 33 pages, 7 figures, accepted by Pattern Recognition

  41. arXiv:2207.02968  [pdf, other]

    stat.ML cs.LG

    Unsupervised Manifold Alignment with Joint Multidimensional Scaling

    Authors: Dexiong Chen, Bowen Fan, Carlos Oliver, Karsten Borgwardt

    Abstract: We introduce Joint Multidimensional Scaling, a novel approach for unsupervised manifold alignment, which maps datasets from two different domains, without any known correspondences between data instances across the datasets, to a common low-dimensional Euclidean space. Our approach integrates Multidimensional Scaling (MDS) and Wasserstein Procrustes analysis into a joint optimization problem to si…

    Submitted 16 February, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

    Comments: ICLR 2023, see https://openreview.net/forum?id=lUpjsrKItz4

  42. arXiv:2206.13741  [pdf, other]

    cs.NI eess.SP

    Social-aware Cooperative Caching in Fog Radio Access Networks

    Authors: Baotian Fan, Yanxiang Jiang, Fu-Chun Zheng, Mehdi Bennis, Xiaohu You

    Abstract: In this paper, the cooperative caching problem in fog radio access networks (F-RANs) is investigated to jointly optimize the transmission delay and energy consumption. Exploiting the potential social relationships among fog access points (F-APs), we firstly propose a clustering scheme based on hedonic coalition game (HCG) to improve the potential cooperation gain. Then, considering that the optimi…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: 6 pages, 5 figures. This paper has been accepted by IEEE ICC 2022

  43. arXiv:2205.12912  [pdf, other]

    cs.CV

    Context-Aware Video Reconstruction for Rolling Shutter Cameras

    Authors: Bin Fan, Yuchao Dai, Zhiyuan Zhang, Qi Liu, Mingyi He

    Abstract: With the ubiquity of rolling shutter (RS) cameras, it is becoming increasingly attractive to recover the latent global shutter (GS) video from two consecutive RS frames, which also places a higher demand on realism. Existing solutions, using deep neural networks or optimization, achieve promising performance. However, these methods generate intermediate GS frames through image warping based on the…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)

  44. arXiv:2204.12935  [pdf, other]

    cs.CL cs.AI

    AdaCoach: A Virtual Coach for Training Customer Service Agents

    Authors: Shuang Peng, Shuai Zhu, Minghui Yang, Haozhou Huang, Dan Liu, Zujie Wen, Xuelian Li, Biao Fan

    Abstract: With the development of online business, customer service agents gradually play a crucial role as an interface between the companies and their customers. Most companies spend a lot of time and effort on hiring and training customer service agents. To this end, we propose AdaCoach: A Virtual Coach for Training Customer Service Agents, to promote the ability of newly hired service agents before they…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: 5 pages

  45. VRNet: Learning the Rectified Virtual Corresponding Points for 3D Point Cloud Registration

    Authors: Zhiyuan Zhang, Jiadai Sun, Yuchao Dai, Bin Fan, Mingyi He

    Abstract: 3D point cloud registration is fragile to outliers, which are labeled as the points without corresponding points. To handle this problem, a widely adopted strategy is to estimate the relative pose based only on some accurate correspondences, which is achieved by building correspondences on the identified inliers or by selecting reliable ones. However, these approaches are usually complicated and t…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology

  46. arXiv:2203.07003  [pdf, other]

    cs.CV cs.AI

    MTLDesc: Looking Wider to Describe Better

    Authors: Changwei Wang, Rongtao Xu, Yuyang Zhang, Shibiao Xu, Weiliang Meng, Bin Fan, Xiaopeng Zhang

    Abstract: Limited by the locality of convolutional neural networks, most existing local features description methods only learn local descriptors with local information and lack awareness of global and surrounding spatial context. In this work, we focus on making local descriptors "look wider to describe better" by learning local Descriptors with More Than just Local information (MTLDesc). Specifically, we…

    Submitted 14 March, 2022; originally announced March 2022.

  47. arXiv:2112.11648  [pdf, other]

    cs.CV

    Out-of-distribution Detection with Boundary Aware Learning

    Authors: Sen Pei, Xin Zhang, Bin Fan, Gaofeng Meng

    Abstract: There is an increasing need to determine whether inputs are out-of-distribution (OOD) for safely deploying machine learning models in the open world scenario. Typical neural classifiers are based on the closed world assumption, where the training data and the test data are drawn i.i.d. from the same distribution, and as a result, give over-confident predictions even faced with …

    Submitted 8 July, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Journal ref: ECCV 2022 Poster

  48. arXiv:2108.06036  [pdf, other]

    cs.LG

    An Information-theoretic Perspective of Hierarchical Clustering

    Authors: Yicheng Pan, Feng Zheng, Bingchen Fan

    Abstract: A combinatorial cost function for hierarchical clustering was introduced by Dasgupta. It has been generalized by Cohen-Addad et al. to a general form named admissible function. In this paper, we investigate hierarchical clustering from the information-theoretic perspective and formulate a new objective function. We also establish the rela…

    Submitted 12 August, 2021; originally announced August 2021.

  49. arXiv:2108.04775  [pdf, other]

    cs.CV cs.RO

    SUNet: Symmetric Undistortion Network for Rolling Shutter Correction

    Authors: Bin Fan, Yuchao Dai, Mingyi He

    Abstract: The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism, leading to image distortions if the camera moves during image acquisition. In this paper, we present a novel deep network to solve the generic rolling shutter correction problem with two consecutive frames. Our pipeline is symmetrically designed to predict the global shutter image corresponding to the intermedia…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: Accepted by IEEE International Conference on Computer Vision (ICCV) 2021

  50. arXiv:2103.12542  [pdf, other]

    cs.CR

    EmgAuth: Unlocking Smartphones with EMG Signals

    Authors: Boyu Fan, Xiang Su, Jianwei Niu, Pan Hui

    Abstract: Screen lock is a critical security feature for smartphones to prevent unauthorized access. Although various screen unlocking technologies, including fingerprint and facial recognition, have been widely adopted, they still have some limitations. For example, fingerprints can be stolen by special material stickers and facial recognition systems can be cheated by 3D-printed head models. In this paper…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: 13 pages, 16 figures