Showing 1–28 of 28 results for author: Qu, D

  1. arXiv:2409.19499  [pdf, other]

    cs.RO

    Fast-UMI: A Scalable and Hardware-Independent Universal Manipulation Interface

    Authors: Ziniu Wu, Tianyu Wang, Zhaxizhuoma, Chuyue Guan, Zhongjie Jia, Shuai Liang, Haoming Song, Delin Qu, Dong Wang, Zhigang Wang, Nieqing Cao, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: Collecting real-world manipulation trajectory data involving robotic arms is essential for developing general-purpose action policies in robotic manipulation, yet such data remains scarce. Existing methods face limitations such as high costs, labor intensity, hardware dependencies, and complex setup requirements involving SLAM algorithms. In this work, we introduce Fast-UMI, an interface-mediated…

    Submitted 28 September, 2024; originally announced September 2024.

  2. arXiv:2408.15428  [pdf, other]

    cs.CV

    HEAD: A Bandwidth-Efficient Cooperative Perception Approach for Heterogeneous Connected and Autonomous Vehicles

    Authors: Deyuan Qu, Qi Chen, Yongqi Zhu, Yihao Zhu, Sergei S. Avedisov, Song Fu, Qing Yang

    Abstract: In cooperative perception studies, there is often a trade-off between communication bandwidth and perception performance. While current feature fusion solutions are known for their excellent object detection performance, transmitting the entire sets of intermediate feature maps requires substantial bandwidth. Furthermore, these fusion approaches are typically limited to vehicles that use identical…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV 2024 Workshop

  3. arXiv:2408.13024  [pdf, other]

    cs.CV

    Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding

    Authors: Xianqiang Gao, Pingrui Zhang, Delin Qu, Dong Wang, Zhigang Wang, Yan Ding, Bin Zhao, Xuelong Li

    Abstract: 3D Object Affordance Grounding aims to predict the functional regions on a 3D object and has laid the foundation for a wide range of applications in robotics. Recent advances tackle this problem via learning a mapping between 3D regions and a single human-object interaction image. However, the geometric structure of the 3D object and the object in the human-object interaction image are not always…

    Submitted 23 August, 2024; originally announced August 2024.

  4. arXiv:2406.16038  [pdf, other]

    cs.CV

    LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control

    Authors: Delin Qu, Qizhi Chen, Pingrui Zhang, Xianqiang Gao, Bin Zhao, Dong Wang, Xuelong Li

    Abstract: This paper aims to advance the progress of physical world interactive scene reconstruction by extending the interactive object reconstruction from single object level to complex scene level. To this end, we first construct one simulated and one real scene-level physical interaction dataset containing 28 scenes with multiple interactive objects per scene. Furthermore, to accurately model the intera…

    Submitted 23 June, 2024; originally announced June 2024.

  5. Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse…

    Submitted 9 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  6. arXiv:2406.02916  [pdf, other]

    cs.RO

    Real-time Motion Planning for autonomous vehicles in dynamic environments

    Authors: Mohammad Dehghani Tezerjani, Dominic Carrillo, Deyuan Qu, Sudip Dhakal, Amir Mirzaeinia, Qing Yang

    Abstract: Recent advancements in self-driving car technologies have enabled them to navigate autonomously through various environments. However, one of the critical challenges in autonomous vehicle operation is trajectory planning, especially in dynamic environments with moving obstacles. This research aims to tackle this challenge by proposing a robust algorithm tailored for autonomous cars operating in dy…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 8 pages

  7. arXiv:2312.16141  [pdf, other]

    cs.CV

    VirtualPainting: Addressing Sparsity with Virtual Points and Distance-Aware Data Augmentation for 3D Object Detection

    Authors: Sudip Dhakal, Dominic Carrillo, Deyuan Qu, Michael Nutt, Qing Yang, Song Fu

    Abstract: In recent times, there has been a notable surge in multimodal approaches that decorate raw LiDAR point clouds with camera-derived features to improve object detection performance. However, we found that these methods still grapple with the inherent sparsity of LiDAR point cloud data, primarily because fewer points are enriched with camera-derived features for sparsely distributed objects. We pres…

    Submitted 26 December, 2023; originally announced December 2023.

  8. arXiv:2312.10818  [pdf, other]

    cs.CV

    Facial Emotion Recognition using CNN in PyTorch

    Authors: Deyuan Qu, Sudip Dhakal, Dominic Carrillo

    Abstract: In this project, we have implemented a model to recognize real-time facial emotions given the camera images. Current approaches would read all data and input it into their model, which has high space complexity. Our model is based on the Convolutional Neural Network utilizing the PyTorch library. We believe our implementation will significantly improve the space complexity and provide a useful con…

    Submitted 17 December, 2023; originally announced December 2023.

  9. arXiv:2312.04822  [pdf, other]

    cs.CV

    SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles

    Authors: Deyuan Qu, Qi Chen, Tianyu Bai, Hongsheng Lu, Heng Fan, Hao Zhang, Song Fu, Qing Yang

    Abstract: Cooperative perception for connected and automated vehicles is traditionally achieved through the fusion of feature maps from two or more vehicles. However, the absence of feature maps shared from other vehicles can lead to a significant decline in 3D object detection performance for cooperative perception models compared to standalone 3D detection models. This drawback impedes the adoption of coo…

    Submitted 26 August, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted by IROS 2024

  10. arXiv:2311.15766  [pdf, other]

    cs.CL

    Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges

    Authors: Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, Weiqiang Zhang

    Abstract: In recent years, large language models (LLMs) have spurred a new research paradigm in natural language processing. Despite their excellent capability in knowledge-based question answering and reasoning, their potential to retain faulty or even harmful knowledge poses risks of malicious application. The challenge of mitigating this issue and transforming these models into purer assistants is crucia…

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Work in progress

  11. arXiv:2311.11700  [pdf, other]

    cs.CV

    GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting

    Authors: Chi Yan, Delin Qu, Dan Xu, Bin Zhao, Zhigang Wang, Dong Wang, Xuelong Li

    Abstract: In this paper, we introduce GS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping (SLAM) system. It facilitates a better balance between efficiency and accuracy. Compared to recent SLAM methods employing neural implicit representations, our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup…

    Submitted 7 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted to CVPR 2024 (highlight). Project Page: https://gs-slam.github.io/

  12. arXiv:2311.11013  [pdf, other]

    cs.CV

    Implicit Event-RGBD Neural SLAM

    Authors: Delin Qu, Chi Yan, Dong Wang, Jie Yin, Dan Xu, Bin Zhao, Xuelong Li

    Abstract: Implicit neural SLAM has achieved remarkable progress recently. Nevertheless, existing methods face significant challenges in non-ideal scenarios, such as motion blur or lighting variation, which often lead to issues like convergence failures, localization drifts, and distorted mapping. To address these challenges, we propose EN-SLAM, the first event-RGBD implicit neural SLAM framework, which eff…

    Submitted 17 March, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Accepted at CVPR 2024

  13. arXiv:2310.02050  [pdf, other]

    cs.CL cs.CV

    Tuning Large language model for End-to-end Speech Translation

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Xiaolin Jiao

    Abstract: With the emergence of large language models (LLMs), multimodal models based on LLMs have demonstrated significant potential. Models such as LLaSM, X-LLM, and SpeechGPT exhibit an impressive ability to comprehend and generate human instructions. However, their performance often falters when faced with complex tasks like end-to-end speech translation (E2E-ST), a cross-language and cross-modal transl…

    Submitted 3 October, 2023; originally announced October 2023.

  14. DropDim: A Regularization Method for Transformer Networks

    Authors: Hao Zhang, Dan Qu, Keji Shao, Xukui Yang

    Abstract: We introduce DropDim, a structured dropout method designed for regularizing the self-attention mechanism, which is a key component of the transformer. In contrast to the general dropout method, which randomly drops neurons, DropDim drops part of the embedding dimensions. In this way, the semantic information can be completely discarded. Thus, the excessive coadapting between different embedding dim…

    Submitted 20 April, 2023; originally announced April 2023.

    Journal ref: IEEE SIGNAL PROCESSING LETTERS, VOL. 29, 2022

  15. Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Wei-Qiang Zhang

    Abstract: The end-to-end speech translation (E2E-ST) model has gradually become a mainstream paradigm due to its low latency and less error propagation. However, it is non-trivial to train such a model well due to the task complexity and data scarcity. The speech-and-text modality differences result in the E2E-ST model performance usually inferior to the corresponding machine translation (MT) model. Based o…

    Submitted 20 April, 2023; originally announced April 2023.

    Journal ref: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 31, 2023

  16. arXiv:2304.10295  [pdf, other]

    cs.CL cs.SD eess.AS

    Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li

    Abstract: Existing techniques often attempt to make knowledge transfer from a powerful machine translation (MT) to speech translation (ST) model with some elaborate techniques, which often requires transcription as extra input during training. However, transcriptions are not always available, and how to improve the ST model performance without transcription, i.e., data efficiency, has rarely been studied in…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted by ICASSP 2023

  17. arXiv:2303.18125  [pdf, other]

    cs.CV

    Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction

    Authors: Delin Qu, Yizhen Lao, Zhigang Wang, Dong Wang, Bin Zhao, Xuelong Li

    Abstract: This paper addresses the problem of rolling shutter correction in complex nonlinear and dynamic scenes with extreme occlusion. Existing methods suffer from two main drawbacks. Firstly, they face challenges in estimating the accurate correction field due to the uniform velocity assumption, leading to significant image correction errors under complex motion. Secondly, the drastic occlusion in dynami…

    Submitted 15 August, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: Accepted at ICCV 2023

  18. arXiv:2209.08503  [pdf, other]

    cs.CV

    Revisiting Rolling Shutter Bundle Adjustment: Toward Accurate and Fast Solution

    Authors: Bangyan Liao, Delin Qu, Yifei Xue, Huiqing Zhang, Yizhen Lao

    Abstract: We propose a robust and fast bundle adjustment solution that estimates the 6-DoF pose of the camera and the geometry of the environment based on measurements from a rolling shutter (RS) camera. This tackles the challenges in the existing works, namely relying on additional sensors, high frame rate video as input, restrictive assumptions on camera motion, readout direction, and poor efficiency. To…

    Submitted 18 April, 2023; v1 submitted 18 September, 2022; originally announced September 2022.

    Comments: Accepted to CVPR 2023

  19. arXiv:2101.07116  [pdf, other]

    cs.CV cs.AI

    LNSMM: Eye Gaze Estimation With Local Network Share Multiview Multitask

    Authors: Yong Huang, Ben Chen, Daiming Qu

    Abstract: Eye gaze estimation has become increasingly significant in computer vision. In this paper, we systematically study the mainstream eye gaze estimation methods and propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously. First, we construct a local sharing network for feature extraction of gaze points and gaze directions estimation, which can reduce network computati…

    Submitted 18 January, 2021; originally announced January 2021.

  20. arXiv:2009.13631  [pdf, other]

    cs.DB

    Tempura: A General Cost Based Optimizer Framework for Incremental Data Processing (Extended Version)

    Authors: Zuozhi Wang, Kai Zeng, Botong Huang, Wei Chen, Xiaozong Cui, Bo Wang, Ji Liu, Liya Fan, Dachuan Qu, Zhenyu Hou, Tao Guan, Chen Li, Jingren Zhou

    Abstract: Incremental processing is widely adopted in many applications, ranging from incremental view maintenance, stream computing, to recently emerging progressive data warehouse and intermittent query processing. Despite many algorithms developed on this topic, none of them can produce an incremental plan that always achieves the best performance, since the optimal plan is data dependent. In this paper,…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: 19 pages, 8 figures. The short version of this paper is accepted at VLDB 2021 (PVLDB Volume 14, Issue 1)

    ACM Class: H.2.4

  21. arXiv:1905.00005  [pdf, ps, other]

    eess.SP cs.IT

    Optimal Preamble Length for Spectral Efficiency in Grant-Free RA with Massive MIMO

    Authors: Jie Ding, Daiming Qu, Hao Jiang

    Abstract: Grant-free random access (RA) with massive MIMO is a promising RA technique for massive access with low signaling overhead. In the grant-free RA with massive MIMO, preamble length has a critical impact on the performance of the system. In this paper, the optimal preamble length is investigated to maximize spectral efficiency (SE) of the grant-free RA with massive MIMO, where effects of the preambl…

    Submitted 29 April, 2019; originally announced May 2019.

    Comments: Accepted by IEEE ICEIC 2019. arXiv admin note: text overlap with arXiv:1805.08345

  22. arXiv:1810.04458   

    cs.IT

    Cluster Pairwise Error Probability and Construction of Parity-Check-Concatenated Polar Codes

    Authors: Tao Wang, Daiming Qu, Tao Jiang

    Abstract: A successive cancellation list (SCL) decoder with limited list size for polar codes cannot be analyzed as a successive cancellation (SC) decoder, nor as a maximum likelihood (ML) decoder, due to the complicated decoding errors caused by path elimination. To address this issue, an analytical tool, named cluster pairwise error probability (CPEP), is proposed in this paper to measure the competit…

    Submitted 21 March, 2019; v1 submitted 10 October, 2018; originally announced October 2018.

    Comments: There are some errors in Algorithm 1 of Page 6

  23. arXiv:1809.07535  [pdf, ps, other]

    cs.IT

    Multiple Preambles for High Success Rate of Grant-Free Random Access with Massive MIMO

    Authors: Hao Jiang, Daiming Qu, Jie Ding, Tao Jiang

    Abstract: Grant-free random access (RA) with massive MIMO is a promising RA technique with low signaling overhead that provides significant benefits in increasing the channel reuse efficiency. Since user equipment (UE) detection and channel estimation in grant-free RA rely solely on the received preambles, preamble designs that enable high success rate of UE detection and channel estimation are very much in…

    Submitted 20 September, 2018; originally announced September 2018.

  24. arXiv:1806.09836  [pdf, ps, other]

    cs.IT

    Virtual Carrier Sensing Based Random Access in Massive MIMO Systems

    Authors: Jie Ding, Daiming Qu, Hao Jiang, Tao Jiang

    Abstract: The 5th generation mobile communication systems aim to support massive access for future wireless applications. Unfortunately, wireless resource scarcity in random access (RA) is a fundamental bottleneck for enabling massive access. To address this problem, we propose a virtual carrier sensing (VCS) based RA scheme in massive MIMO systems. The essence of the proposed scheme lies in exploiting wire…

    Submitted 26 June, 2018; originally announced June 2018.

  25. arXiv:1805.08345  [pdf, ps, other]

    cs.IT

    Success Probability of Grant-Free Random Access with Massive MIMO

    Authors: Jie Ding, Daiming Qu, Hao Jiang, Tao Jiang

    Abstract: Massive MIMO opens up new avenues for enabling highly efficient random access (RA) by offering an abundance of spatial degrees of freedom. In this paper, we investigate the grant-free RA with massive MIMO and derive the analytic expressions of success probability of the grant-free RA for conjugate beamforming and zero-forcing beamforming techniques. With the derived analytic expressions, we further sh…

    Submitted 21 May, 2018; originally announced May 2018.

  26. arXiv:1802.03706  [pdf, other]

    cs.IT

    FDM-Structured Preamble Optimization for Channel Estimation in MIMO-OQAM/FBMC Systems

    Authors: Wenfeng Liu, Da Chen, Kai Luo, Tao Jiang, Daiming Qu

    Abstract: In this paper, we consider the problem of preamble design in multiple-input multiple-output (MIMO) systems employing offset quadrature amplitude modulation based filter bank multicarrier (OQAM/FBMC) and propose a preamble optimization method for the frequency division multiplexing (FDM)-structured preamble. Specifically, we formulate an optimization problem to determine the frequency division mult…

    Submitted 11 February, 2018; originally announced February 2018.

    Comments: 11 pages, 7 figures

  27. Downlink Precoding with Mixed Statistical and Imperfect Instantaneous CSI for Massive MIMO Systems

    Authors: Shuang Qiu, Da Chen, Daiming Qu, Kai Luo, Tao Jiang

    Abstract: In this paper, the feasibility of a new downlink transmission mode in massive multi-input multi-output (MIMO) systems is investigated with two types of users, i.e., the users with only statistical channel state information (CSI) and the users with imperfect instantaneous CSI. The problem of downlink precoding design with mixed utilization of statistical and imperfect instantaneous CSI is addressed…

    Submitted 30 November, 2017; v1 submitted 29 November, 2017; originally announced November 2017.

    Comments: 14 pages, 9 figures, transactions

  28. arXiv:1601.00413  [pdf, ps, other]

    cs.IT

    Improving Bandwidth Efficiency of FBMC-OQAM Through Virtual Symbols

    Authors: Daiming Qu, Fang Wang, Tao Jiang, Behrouz Farhang-Boroujeny

    Abstract: Filter bank multicarrier (FBMC) systems that are based on offset quadrature amplitude modulation (OQAM), namely, FBMC-OQAM, have been criticized for their inefficiency in the use of spectral resources, because of the long ramp-up and ramp-down tails at the beginning and the end of each data packet, respectively. We propose a novel method for shortening these tails. By appending a set of virtual (i…

    Submitted 4 January, 2016; originally announced January 2016.