Showing 1–50 of 247 results for author: Lyu, S

  1. arXiv:2410.15318  [pdf, other]

    cs.NE cs.AI cs.LG

    SNAP: Stopping Catastrophic Forgetting in Hebbian Learning with Sigmoidal Neuronal Adaptive Plasticity

    Authors: Tianyi Xu, Patrick Zheng, Shiyan Liu, Sicheng Lyu, Isabeau Prémont-Schwarz

    Abstract: Artificial Neural Networks (ANNs) suffer from catastrophic forgetting, where the learning of new tasks causes the catastrophic forgetting of old tasks. Existing Machine Learning (ML) algorithms, including those using Stochastic Gradient Descent (SGD) and Hebbian Learning, typically update their weights linearly with experience, i.e., independently of their current strength. This contrasts with biolo…

    Submitted 20 October, 2024; originally announced October 2024.

    Comments: 6 pages, 11 figures, accepted at Montréal AI and Neuroscience (MAIN) 2024 conference

  2. arXiv:2410.11502  [pdf, other]

    cs.LG cs.AI cs.NE

    Offline Model-Based Optimization by Learning to Rank

    Authors: Rong-Xi Tan, Ke Xue, Shen-Huan Lyu, Haopu Shang, Yao Wang, Yaoyuan Wang, Sheng Fu, Chao Qian

    Abstract: Offline model-based optimization (MBO) aims to identify a design that maximizes a black-box function using only a fixed, pre-collected dataset of designs and their corresponding scores. A common approach in offline MBO is to train a regression-based surrogate model by minimizing mean squared error (MSE) and then find the best design within this surrogate model by different optimizers (e.g., gradie…

    Submitted 15 October, 2024; originally announced October 2024.

  3. arXiv:2410.06126  [pdf, other]

    cs.CV

    $\textit{X}^2$-DFD: A framework for e${X}$plainable and e${X}$tendable Deepfake Detection

    Authors: Yize Chen, Zhiyuan Yan, Siwei Lyu, Baoyuan Wu

    Abstract: Detecting deepfakes has become an important task. Most existing detection methods provide only real/fake predictions without offering human-comprehensible explanations. Recent studies leveraging MLLMs for deepfake detection have shown improvements in explainability. However, the performance of pre-trained MLLMs (e.g., LLaVA) remains limited due to a lack of understanding of their capabilities for…

    Submitted 8 October, 2024; originally announced October 2024.

  4. arXiv:2409.19681  [pdf, other]

    cs.CV

    Simple and Fast Distillation of Diffusion Models

    Authors: Zhenyu Zhou, Defang Chen, Can Wang, Chun Chen, Siwei Lyu

    Abstract: Diffusion-based generative models have demonstrated powerful performance across various tasks, but this comes at the cost of slow sampling speed. To achieve both efficient and high-quality synthesis, various distillation-based accelerated sampling methods have been developed recently. However, they generally require time-consuming fine tuning with elaborate designs to achieve satisfactory…

    Submitted 29 September, 2024; originally announced September 2024.

    Comments: Accepted by NeurIPS 2024

  5. arXiv:2409.19365  [pdf, other]

    cs.CV cs.AI

    Conditional Image Synthesis with Diffusion Models: A Survey

    Authors: Zheyuan Zhan, Defang Chen, Jian-Ping Mei, Zhenghe Zhao, Jiawei Chen, Chun Chen, Siwei Lyu, Can Wang

    Abstract: Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity…

    Submitted 3 October, 2024; v1 submitted 28 September, 2024; originally announced September 2024.

  6. arXiv:2409.09638  [pdf, other]

    cs.MM

    Multi-view Hypergraph-based Contrastive Learning Model for Cold-Start Micro-video Recommendation

    Authors: Sisuo Lyu, Xiuze Zhou, Xuming Hu

    Abstract: With the widespread use of mobile devices and the rapid growth of micro-video platforms such as TikTok and Kwai, the demand for personalized micro-video recommendation systems has significantly increased. Micro-videos typically contain diverse information, such as textual metadata, visual cues (e.g., cover images), and dynamic video content, significantly affecting user interaction and engagement…

    Submitted 15 September, 2024; originally announced September 2024.

  7. arXiv:2408.16305  [pdf, other]

    cs.CV

    Semantics-Oriented Multitask Learning for DeepFake Detection: A Joint Embedding Approach

    Authors: Mian Zou, Baosheng Yu, Yibing Zhan, Siwei Lyu, Kede Ma

    Abstract: In recent years, the multimedia forensics and security community has seen remarkable progress in multitask learning for DeepFake (i.e., face forgery) detection. The prevailing strategy has been to frame DeepFake detection as a binary classification problem augmented by manipulation-oriented auxiliary tasks. This strategy focuses on learning features specific to face manipulations, which exhibit li…

    Submitted 29 August, 2024; originally announced August 2024.

  8. arXiv:2408.13787  [pdf, other]

    cs.LG cs.DC

    Mask-Encoded Sparsification: Mitigating Biased Gradients in Communication-Efficient Split Learning

    Authors: Wenxuan Zhou, Zhihao Qu, Shen-Huan Lyu, Miao Cai, Baoliu Ye

    Abstract: This paper introduces a novel framework designed to achieve a high compression ratio in Split Learning (SL) scenarios where resource-constrained devices are involved in large-scale model training. Our investigations demonstrate that compressing feature maps within SL leads to biased gradients that can negatively impact the convergence rates and diminish the generalization capabilities of the resul…

    Submitted 26 September, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

    Journal ref: Proceedings of the 27th European Conference on Artificial Intelligence, 2024

  9. arXiv:2408.07703  [pdf, other]

    cs.CV

    Knowledge Distillation with Refined Logits

    Authors: Wujie Sun, Defang Chen, Siwei Lyu, Genlang Chen, Chun Chen, Can Wang

    Abstract: Recent research on knowledge distillation has increasingly focused on logit distillation because of its simplicity, effectiveness, and versatility in model compression. In this paper, we introduce Refined Logit Distillation (RLD) to address the limitations of current logit distillation methods. Our approach is motivated by the observation that even high-performing teacher models can make incorrect…

    Submitted 19 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

    Comments: 11 pages, 7 figures

  10. arXiv:2408.04300  [pdf, other]

    eess.IV cs.CV

    An Explainable Non-local Network for COVID-19 Diagnosis

    Authors: Jingfu Yang, Peng Huang, Jing Hu, Shu Hu, Siwei Lyu, Xin Wang, Jun Guo, Xi Wu

    Abstract: The CNN has achieved excellent results in the automatic classification of medical images. In this study, we propose a novel deep residual 3D attention non-local network (NL-RAN) to classify CT images, including COVID-19, common pneumonia, and normal cases, to perform rapid and explainable COVID-19 diagnosis. We built a deep residual 3D attention non-local network that could achieve end-to-end training. The…

    Submitted 8 August, 2024; originally announced August 2024.

  11. arXiv:2408.02191  [pdf, other]

    cs.CV

    Dense Feature Interaction Network for Image Inpainting Localization

    Authors: Ye Yao, Tingfeng Han, Shan Jia, Siwei Lyu

    Abstract: Image inpainting, which is the task of filling in missing areas in an image, is a common image editing technique. Inpainting can be used to conceal or alter image contents in malicious manipulation of images, driving the need for research in image inpainting detection. Existing methods mostly rely on a basic encoder-decoder structure, which often results in a high number of false positives or miss…

    Submitted 4 August, 2024; originally announced August 2024.

  12. arXiv:2407.21788  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Vision-Language Model Based Handwriting Verification

    Authors: Mihir Chauhan, Abhishek Satbhai, Mohammad Abuzar Hashemi, Mir Basheer Ali, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

    Abstract: Handwriting Verification is critical in document forensics. Deep learning based approaches often face skepticism from forensic document examiners due to their lack of explainability and reliance on extensive training data and handcrafted features. This paper explores using Vision Language Models (VLMs), such as OpenAI's GPT-4o and Google's PaliGemma, to address these challenges. By leveraging th…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: 4 Pages, 1 Figure, 1 Table, Accepted as Short paper at Irish Machine Vision and Image Processing (IMVIP) Conference

  13. Enhancement of deltaful two-pion exchange nuclear forces

    Authors: Haiming Chen, Rui Peng, Songlin Lyu, Bingwei Long

    Abstract: The role of the delta isobar degrees of freedom in nucleon-nucleon scattering is revisited. We attempt to understand why the dimensionally regularized two-pion exchanges with the explicit delta isobar are much stronger than the ones with spectral function regularization. When the cutoff value of spectral function regularization is varied, the isoscalar central component exhibits a rather large cuto…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 16 pages, 6 figures

    Journal ref: Communications in Theoretical Physics, Volume 76, Number 9 (2024)

  14. arXiv:2407.05108  [pdf, other]

    cs.LG stat.ML

    The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest

    Authors: Shen-Huan Lyu, Jin-Hui Wu, Qin-Cheng Zheng, Baoliu Ye

    Abstract: Random forests are classical ensemble algorithms that construct multiple randomized decision trees and aggregate their predictions using naive averaging. \citet{zhou2019deep} further propose a deep forest algorithm with multi-layer forests, which outperforms random forests in various tasks. The performance of deep forests is related to three hyperparameters in practice: depth, width, and tree size…

    Submitted 6 July, 2024; originally announced July 2024.

    Journal ref: In: Proceedings of the 27th European Conference on Artificial Intelligence, 2024

  15. arXiv:2407.03107  [pdf]

    cs.HC cs.GR cs.MM

    Design of a UE5-based digital twin platform

    Authors: Shaoqiu Lyu, Muzhi Wang, Sunrui Zhang, Shengzhi Wang

    Abstract: Because the learning and construction costs of current mainstream 3D scene engines are too high, this thesis proposes a digital twin platform design program based on Unreal Engine 5 (UE5). It aims to provide a universal platform construction design process to effectively reduce the learning cost of large-scale scene construction. Taking an actual project of a unit as an example, the overall cycle work of…

    Submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2406.16943  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing

    Authors: Shengzhe Lyu, Yongliang Chen, Di Duan, Renqi Jia, Weitao Xu

    Abstract: In the realm of smart sensing with the Internet of Things, earable devices are empowered with the capability of multi-modality sensing and intelligence of context-aware computing, leading to their wide usage in Human Activity Recognition (HAR). Nonetheless, unlike the movements captured by Inertial Measurement Unit (IMU) sensors placed on the upper or lower body, those motion signals obtained from e…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: accepted by 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT)

  17. arXiv:2406.10427  [pdf, other]

    cs.LG cs.CR

    Adaptive Randomized Smoothing: Certifying Multi-Step Defences against Adversarial Examples

    Authors: Saiyue Lyu, Shadab Shaikh, Frederick Shpilevskiy, Evan Shelhamer, Mathias Lécuyer

    Abstract: We propose Adaptive Randomized Smoothing (ARS) to certify the predictions of our test-time adaptive models against adversarial examples. ARS extends the analysis of randomized smoothing using f-Differential Privacy to certify the adaptive composition of multiple steps. For the first time, our theory covers the sound adaptive composition of general and high-dimensional functions of noisy input. We…

    Submitted 14 June, 2024; originally announced June 2024.

  18. arXiv:2406.04745  [pdf, other]

    cs.LG cs.CV

    Confidence-aware Contrastive Learning for Selective Classification

    Authors: Yu-Chang Wu, Shen-Huan Lyu, Haopu Shang, Xiangyu Wang, Chao Qian

    Abstract: Selective classification enables models to make predictions only when they are sufficiently confident, aiming to enhance safety and reliability, which is important in high-stakes scenarios. Previous methods mainly use deep neural networks and focus on modifying the architecture of classification layers to enable the model to estimate the confidence of its prediction. This work provides a generaliz…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  19. arXiv:2406.01112  [pdf, other]

    cs.CV

    BACON: Bayesian Optimal Condensation Framework for Dataset Distillation

    Authors: Zheng Zhou, Hongbo Zhao, Guangliang Cheng, Xiangtai Li, Shuchang Lyu, Wenquan Feng, Qi Zhao

    Abstract: Dataset Distillation (DD) aims to distill knowledge from extensive datasets into more compact ones while preserving performance on the test set, thereby reducing storage costs and training expenses. However, existing methods often suffer from computational intensity, particularly exhibiting suboptimal performance with large dataset sizes due to the lack of a robust theoretical framework for analyz…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 10 figures

  20. arXiv:2406.00985  [pdf, other]

    cs.CV

    MultiEdits: Simultaneous Multi-Aspect Editing with Text-to-Image Diffusion Models

    Authors: Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Suresh Lokhande, Siwei Lyu

    Abstract: Text-driven image synthesis has made significant advancements with the development of diffusion models, transforming how visual content is generated from text prompts. Despite these advances, text-driven image editing, a key area in computer graphics, faces unique challenges. A major challenge is making simultaneous edits across multiple objects or attributes. Applying these methods sequentially f…

    Submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2405.18320  [pdf, other]

    cs.CV cs.AI cs.CL

    Self-Supervised Learning Based Handwriting Verification

    Authors: Mihir Chauhan, Mohammad Abuzar Hashemi, Abhishek Satbhai, Mir Basheer Ali, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

    Abstract: We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND data…

    Submitted 1 August, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 8 pages, 2 figures, 2 tables, Accepted at Irish Machine Vision and Image Processing Conference 2024

  22. arXiv:2405.17837  [pdf, other]

    cs.HC

    Enabling Generative Design Tools with LLM Agents for Building Novel Devices: A Case Study on Fluidic Computation Interfaces

    Authors: Qiuyu Lu, Jiawei Fang, Zhihao Yao, Yue Yang, Shiqing Lyu, Haipeng Mi, Lining Yao

    Abstract: In the field of Human-Computer Interaction (HCI), the development of interactive devices represents a significant area of focus. The advent of novel hardware and advanced fabrication techniques has underscored the demand for specialized design tools that democratize the prototyping process for such cutting-edge devices. While these tools simplify the process through parametric design and simulatio…

    Submitted 22 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 25 pages, 12 figures

  23. arXiv:2405.11326  [pdf, other]

    cs.LG cs.CV

    On the Trajectory Regularity of ODE-based Diffusion Sampling

    Authors: Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, Siwei Lyu

    Abstract: Diffusion-based generative models use stochastic differential equations (SDEs) and their equivalent ordinary differential equations (ODEs) to establish a smooth connection between a complex data distribution and a tractable prior distribution. In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models. We characterize an implicit denoi…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: ICML 2024, 30 pages. arXiv admin note: text overlap with arXiv:2305.19947

  24. arXiv:2405.08487  [pdf, other]

    cs.CV cs.CR

    Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method

    Authors: Mian Zou, Baosheng Yu, Yibing Zhan, Siwei Lyu, Kede Ma

    Abstract: In recent years, deep learning has greatly streamlined the process of generating realistic fake face images. Aware of the dangers, researchers have developed various tools to spot these counterfeits. Yet none asked the fundamental question: What digital manipulations make a real photographic face image fake, while others do not? In this paper, we put face forgery in a semantic context and define t…

    Submitted 14 May, 2024; originally announced May 2024.

  25. arXiv:2405.04051  [pdf, ps, other]

    cs.IT

    On the quantization goodness of polar lattices

    Authors: Ling Liu, Shanxiang Lyu, Cong Ling, Baoming Bai

    Abstract: In this work, we prove that polar lattices, when tailored for lossy compression, are quantization-good in the sense that their normalized second moments approach $\frac{1}{2\pi e}$ as the dimension of lattices increases. It has been predicted by Zamir et al. \cite{ZamirQZ96} that the Entropy Coded Dithered Quantization (ECDQ) system using quantization-good lattices can achieve the rate-distortion bou…

    Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 12 pages, 5 figures, submitted to IEEE for possible publication

  26. arXiv:2405.00135  [pdf, other]

    cs.IT eess.SP

    Improving Channel Resilience for Task-Oriented Semantic Communications: A Unified Information Bottleneck Approach

    Authors: Shuai Lyu, Yao Sun, Linke Guo, Xiaoyong Yuan, Fang Fang, Lan Zhang, Xianbin Wang

    Abstract: Task-oriented semantic communications (TSC) enhance radio resource efficiency by transmitting task-relevant semantic information. However, current research often overlooks the inherent semantic distinctions among encoded features. Due to unavoidable channel variations from time and frequency-selective fading, semantically sensitive feature units could be more susceptible to erroneous inference if…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE Communications Letters

  27. arXiv:2404.19171  [pdf, other]

    cs.CV cs.AI

    Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection

    Authors: Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

    Abstract: With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: accepted by ICME 2024

  28. arXiv:2404.18033  [pdf, other]

    cs.CV

    Exposing Text-Image Inconsistency Using Diffusion Models

    Authors: Mingzhen Huang, Shan Jia, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu

    Abstract: In the battle against widespread online misinformation, a growing problem is text-image inconsistency, where images are misleadingly paired with texts with different intent or meaning. Existing classification-based methods for text-image inconsistency can identify contextual inconsistencies but fail to provide explainable justifications for their decisions that humans can understand. Although more…

    Submitted 27 April, 2024; originally announced April 2024.

  29. arXiv:2404.13146  [pdf, other]

    cs.CR cs.CV

    DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

    Authors: Yan Ju, Chengzhe Sun, Shan Jia, Shuwei Hou, Zhaofeng Si, Soumyya Kanti Datta, Lipeng Ke, Riky Zhou, Anita Nikolich, Siwei Lyu

    Abstract: Deepfakes, as AI-generated media, have increasingly threatened media integrity and personal privacy with realistic yet fake digital content. In this work, we introduce an open-source and user-friendly online platform, DeepFake-O-Meter v2.0, that integrates state-of-the-art methods for detecting Deepfake images, videos, and audio. Built upon DeepFake-O-Meter v1.0, we have made significant upgrades…

    Submitted 27 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  30. arXiv:2403.14077  [pdf, other]

    cs.AI cs.CR

    Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

    Authors: Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu

    Abstract: DeepFakes, which refer to AI-generated media content, have become an increasing concern due to their use as a means for disinformation. Detecting DeepFakes is currently solved with programmed machine learning algorithms. In this work, we investigate the capabilities of multimodal large language models (LLMs) in DeepFake detection. We conducted qualitative and quantitative experiments to demonstrat…

    Submitted 11 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  31. arXiv:2403.13358  [pdf, other]

    cs.RO cs.CV cs.LG

    GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot

    Authors: Wenxuan Song, Han Zhao, Pengxiang Ding, Can Cui, Shangke Lyu, Yaning Fan, Donglin Wang

    Abstract: Multi-task robot learning holds significant importance in tackling diverse and complex scenarios. However, current approaches are hindered by performance issues and difficulties in collecting training datasets. In this paper, we propose GeRM (Generalist Robotic Model). We utilize offline reinforcement learning to optimize data utilization strategies to learn from both demonstrations and sub-optima…

    Submitted 9 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  32. arXiv:2403.12631  [pdf]

    cs.RO cs.AI

    PointGrasp: Point Cloud-based Grasping for Tendon-driven Soft Robotic Glove Applications

    Authors: Chen Hu, Shirui Lyu, Eojin Rho, Daekyum Kim, Shan Luo, Letizia Gionfrida

    Abstract: Controlling hand exoskeletons to assist individuals with grasping tasks poses a challenge due to the difficulty in understanding user intentions. We propose that most daily grasping tasks during activities of daily living (ADL) can be deduced by analyzing object geometries (simple and complex) from 3D point clouds. The study introduces PointGrasp, a real-time system designed for identifying househ…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 6 pages, 8 figures, conference

    ACM Class: I.2; I.4

  33. arXiv:2403.03101  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

    Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories durin…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Work in progress. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

  34. arXiv:2402.06749  [pdf]

    physics.optics physics.bio-ph physics.med-ph

    Copper phosphate micro-flowers coated with indocyanine green and iron oxide nanoparticles for in vivo localization optoacoustic tomography and magnetic actuation

    Authors: Daniil Nozdriukhin, Shuxin Lyu, Jerome Bonvin, Michael Reiss, Daniel Razansky, Xose Luis Dean-Ben

    Abstract: Efficient drug delivery is a major challenge in modern medicine and pharmaceutical research. Micrometer-scale robots have recently been proposed as a promising venue to amplify precision of drug administration. Remotely controlled microrobots sufficiently small to navigate through microvascular networks can reach any part of the human body, yet real-time tracking is crucial for providing precise g…

    Submitted 9 February, 2024; originally announced February 2024.

  35. arXiv:2402.01154  [pdf, other]

    cs.CR

    Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients

    Authors: Guangfeng Yan, Shanxiang Lyu, Hanxu Hou, Zhiyong Zheng, Linqi Song

    Abstract: This paper introduces a privacy-preserving distributed learning framework via private-key homomorphic encryption. Thanks to the randomness of the quantization of gradients, our learning with error (LWE) based encryption can eliminate the error terms, thus avoiding the issue of error expansion in conventional LWE-based homomorphic encryption. The proposed system allows a large number of learning pa…

    Submitted 2 February, 2024; originally announced February 2024.

  36. arXiv:2401.17255  [pdf, other]

    quant-ph cond-mat.str-el physics.chem-ph

    Towards Quantum Simulation of Non-Markovian Open Quantum Dynamics: A Universal and Compact Theory

    Authors: Xiang Li, Su-Xiang Lyu, Yao Wang, Rui-Xue Xu, Xiao Zheng, YiJing Yan

    Abstract: Non-Markovianity, the intricate dependence of an open quantum system on its temporal evolution history, holds tremendous implications across various scientific disciplines. However, accurately characterizing the complex non-Markovian effects has posed a formidable challenge for numerical simulations. While quantum computing technologies show promise, a universal theory enabling practical quantum a…

    Submitted 2 September, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 figures

  37. arXiv:2401.10113  [pdf, ps, other]

    cs.CV

    Exposing Lip-syncing Deepfakes from Mouth Inconsistencies

    Authors: Soumyya Kanti Datta, Shan Jia, Siwei Lyu

    Abstract: A lip-syncing deepfake is a digitally manipulated video in which a person's lip movements are created convincingly using AI models to match altered or entirely new audio. Lip-syncing deepfakes are a dangerous type of deepfakes as the artifacts are limited to the lip region and more difficult to discern. In this paper, we describe a novel approach, LIP-syncing detection based on mouth INConsistency…

    Submitted 3 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  38. arXiv:2312.17431  [pdf, other]

    cs.CR cs.CV

    MVPatch: More Vivid Patch for Adversarial Camouflaged Attacks on Object Detectors in the Physical World

    Authors: Zheng Zhou, Hongbo Zhao, Ju Liu, Qiaosheng Zhang, Liwei Geng, Shuchang Lyu, Wenquan Feng

    Abstract: Recent studies have shown that Adversarial Patches (APs) can effectively manipulate object detection models. However, the conspicuous patterns often associated with these patches tend to attract human attention, posing a significant challenge. Existing research has primarily focused on enhancing attack efficacy in the physical domain while often neglecting the optimization of stealthiness and tran…

    Submitted 19 July, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 16 pages, 8 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  39. arXiv:2312.09785  [pdf, other]

    cs.CL

    RJUA-QA: A Comprehensive QA Dataset for Urology

    Authors: Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

    Abstract: We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, helping to bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to facilitate LLMs in generating reliable diagnoses and advice. The dataset contains 2,132 curated Question-Co…

    Submitted 7 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: An initial version

  40. Multiple Instance Learning for Uplift Modeling

    Authors: Yao Zhao, Haipeng Zhang, Shiwei Lyu, Ruiying Jiang, Jinjie Gu, Guannan Zhang

    Abstract: Uplift modeling is widely used in performance marketing to estimate effects of promotion campaigns (e.g., increase of customer retention rate). Since it is impossible to observe outcomes of a recipient in treatment (e.g., receiving a certain promotion) and control (e.g., without promotion) groups simultaneously (i.e., counter-factual), uplift models are mainly trained on instances of treatment and…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: short paper of CIKM22(full version)

    Journal ref: Proceedings of the 31st ACM International Conference on Information and Knowledge Management (2022) 4727-4731

  41. arXiv:2312.05738  [pdf, other]

    cs.CR cs.AI

    FedReverse: Multiparty Reversible Deep Neural Network Watermarking

    Authors: Junlong Mao, Huiyi Tang, Yi Zhang, Fengxia Liu, Zhiyong Zheng, Shanxiang Lyu

    Abstract: The proliferation of Deep Neural Networks (DNN) in commercial applications is expanding rapidly. Simultaneously, the increasing complexity and cost of training DNN models have intensified the urgency surrounding the protection of intellectual property associated with these trained models. In this regard, DNN watermarking has emerged as a crucial safeguarding technique. This paper presents FedRever…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 13 pages

  42. arXiv:2311.18196  [pdf, ps, other]

    math.AC

    Formal lifting of dualizing complexes and consequences

    Authors: Shiji Lyu

    Abstract: We show that for a Noetherian ring $A$ that is $I$-adically complete for an ideal $I$, if $A/I$ admits a dualizing complex, so does $A$. This gives an alternative proof of the fact that a Noetherian complete local ring admits a dualizing complex. We discuss several consequences of this result. We also consider a generalization of the notion of dualizing complexes to infinite-dimensional rings and…

    Submitted 5 September, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: 21 pages. Added a section on quotient of Cohen-Macaulay rings

  43. arXiv:2311.11278  [pdf, other]

    cs.CV

    Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection

    Authors: Zhiyuan Yan, Yuhao Luo, Siwei Lyu, Qingshan Liu, Baoyuan Wu

    Abstract: Deepfake detection faces a critical generalization hurdle, with performance deteriorating when there is a mismatch between the distributions of training and testing data. A broadly received explanation is the tendency of these detectors to be overfitted to forgery-specific artifacts, rather than learning features that are widely applicable across various forgeries. To address this issue, we propos…

    Submitted 28 March, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  44. arXiv:2311.06712  [pdf, other]

    eess.IV

    PuzzleTuning: Explicitly Bridge Pathological and Natural Image with Puzzles

    Authors: Tianyi Zhang, Shangqing Lyu, Yanli Lei, Sicheng Chen, Nan Ying, Yufang He, Yu Zhao, Yunlu Feng, Hwee Kuan Lee, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer vision. Due to the annotation scarcity in the pathological field, pre-training with self-supervised learning (SSL) is widely applied to learn on unlabeled images. However, the current SSL-based pathological pre-training: (1) does not explicitly explore the essential focuses of the pathological field, and (2) does not effectively bridge wit…

    Submitted 22 April, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figures, 8 tables

  45. arXiv:2311.06015  [pdf]

    cs.RO cs.AI

    RSG: Fast Learning Adaptive Skills for Quadruped Robots by Skill Graph

    Authors: Hongyin Zhang, Diyuan Shi, Zifeng Zhuang, Han Zhao, Zhenyu Wei, Feng Zhao, Sibo Gai, Shangke Lyu, Donglin Wang

    Abstract: Developing robotic intelligent systems that can adapt quickly to unseen wild situations is one of the critical challenges in pursuing autonomous robotics. Although some impressive progress has been made in walking stability and skill learning in the field of legged robots, their ability to fast adaptation is still inferior to that of animals in nature. Animals are born with massive skills needed t…

    Submitted 10 November, 2023; originally announced November 2023.

  46. arXiv:2311.05836  [pdf, other]

    eess.IV cs.CV cs.LG

    UMedNeRF: Uncertainty-aware Single View Volumetric Rendering for Medical Neural Radiance Fields

    Authors: Jing Hu, Qinrui Fan, Shu Hu, Siwei Lyu, Xi Wu, Xin Wang

    Abstract: In the field of clinical medicine, computed tomography (CT) is an effective medical imaging modality for the diagnosis of various pathologies. Compared with X-ray images, CT images can provide more information, including multi-planar slices and three-dimensional structures for clinical diagnosis. However, CT imaging requires patients to be exposed to large doses of ionizing radiation for a long ti…

    Submitted 1 March, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  47. arXiv:2311.04732  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    General-purpose machine-learned potential for 16 elemental metals and their alloys

    Authors: Keke Song, Rui Zhao, Jiahui Liu, Yanzhou Wang, Eric Lindgren, Yong Wang, Shunda Chen, Ke Xu, Ting Liang, Penghua Ying, Nan Xu, Zhiqiang Zhao, Jiuyang Shi, Junjie Wang, Shuang Lyu, Zezhu Zeng, Shirong Liang, Haikuan Dong, Ligang Sun, Yue Chen, Zhuhua Zhang, Wanlin Guo, Ping Qian, Jian Sun, Paul Erhart, et al. (3 additional authors not shown)

    Abstract: Machine-learned potentials (MLPs) have exhibited remarkable accuracy, yet the lack of general-purpose MLPs for a broad spectrum of elements and their alloys limits their applicability. Here, we present a feasible approach for constructing a unified general-purpose MLP for numerous elements, demonstrated through a model (UNEP-v1) for 16 elemental metals and their alloys. To achieve a complete repre…

    Submitted 12 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Main text with 17 pages and 8 figures; supplementary with 26 figures and 4 tables; source code and training/test data available

  48. arXiv:2311.02926  [pdf, other]

    cs.CV cs.AI

    Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things

    Authors: Li Ping Qian, Yi Zhang, Sikai Lyu, Huijie Zhu, Yuan Wu, Xuemin Sherman Shen, Xiaoniu Yang

    Abstract: With the rapid development of Artificial Intelligent Internet of Things (AIoT), the image data from AIoT devices has been witnessing the explosive increasing. In this paper, a novel deep image semantic communication model is proposed for the efficient image communication in AIoT. Particularly, at the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract th…

    Submitted 8 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  49. arXiv:2310.17902  [pdf]

    eess.IV

    CPIA Dataset: A Comprehensive Pathological Image Analysis Dataset for Self-supervised Learning Pre-training

    Authors: Nan Ying, Yanli Lei, Tianyi Zhang, Shangqing Lyu, Chunhui Li, Sicheng Chen, Zeyu Liu, Yu Zhao, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer-aided diagnosis, where deep learning is widely applied. Transfer learning using pre-trained models initialized on natural images has effectively improved the downstream pathological performance. However, the lack of sophisticated domain-specific pathological initialization hinders their potential. Self-supervised learning (SSL) enables pre…

    Submitted 27 October, 2023; originally announced October 2023.

  50. arXiv:2310.14374  [pdf, other]

    cs.CV

    OV-VG: A Benchmark for Open-Vocabulary Visual Grounding

    Authors: Chunlei Wang, Wenquan Feng, Xiangtai Li, Guangliang Cheng, Shuchang Lyu, Binghao Liu, Lijiang Chen, Qi Zhao

    Abstract: Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light of the widespread adoption of vision-based foundational models. Its primary objective is to comprehend novel concepts that are not encompassed within a predefined vocabulary. One key facet of this endeavor is Visual Grounding, which entails locating a specific region within an image based on a corresponding…

    Submitted 22 October, 2023; originally announced October 2023.