Showing 1–50 of 2,135 results for author: Liu, R

  1. arXiv:2410.16128  [pdf, other]

    cs.AI cs.LG

    SMART: Self-learning Meta-strategy Agent for Reasoning Tasks

    Authors: Rongxing Liu, Kumar Shridhar, Manish Prajapat, Patrick Xia, Mrinmaya Sachan

    Abstract: Tasks requiring deductive reasoning, especially those involving multiple steps, often demand adaptive strategies such as intermediate generation of rationales or programs, as no single approach is universally optimal. While Language Models (LMs) can enhance their outputs through iterative self-refinement and strategy adjustments, they frequently fail to apply the most effective strategy in their f… ▽ More

    Submitted 21 October, 2024; originally announced October 2024.

  2. arXiv:2410.16049  [pdf, other]

    cs.SE cs.CR

    Dirty-Waters: Detecting Software Supply Chain Smells

    Authors: Raphina Liu, Sofia Bobadilla, Benoit Baudry, Martin Monperrus

    Abstract: Using open-source dependencies is essential in modern software development. However, this practice implies significant trust in third-party code, while there is little support for developers to assess this trust. As a consequence, attacks have been increasingly occurring through third-party dependencies. These are called software supply chain attacks. In this paper, we target the problem of projec… ▽ More

    Submitted 21 October, 2024; originally announced October 2024.

  3. arXiv:2410.14660  [pdf, other]

    cs.LG

    A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning

    Authors: Shengjie Sun, Runze Liu, Jiafei Lyu, Jing-Wen Yang, Liangpeng Zhang, Xiu Li

    Abstract: Large Language Models (LLMs) have shown significant potential in designing reward functions for Reinforcement Learning (RL) tasks. However, obtaining high-quality reward code often involves human intervention, numerous LLM queries, or repetitive RL training. To address these issues, we propose CARD, a LLM-driven Reward Design framework that iteratively generates and improves reward function code.… ▽ More

    Submitted 18 October, 2024; originally announced October 2024.

  4. arXiv:2410.14642  [pdf, ps, other]

    eess.SP

    Joint Space-Time Adaptive Processing and Beamforming Design for Cell-Free ISAC Systems

    Authors: Rang Liu, Ming Li, Qian Liu

    Abstract: In this paper, we explore cooperative sensing and communication within cell-free integrated sensing and communication (ISAC) systems. Specifically, multiple transmit access points (APs) collaboratively serve multiple communication users while simultaneously illuminating a potential target, with a separate sensing AP dedicated to collecting echo signals for target detection. To improve the performa… ▽ More

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: 5 pages, 2 figures, submitted to IEEE conference

  5. arXiv:2410.14250  [pdf, other]

    cs.CV

    Vision-Language Navigation with Energy-Based Policy

    Authors: Rui Liu, Wenguan Wang, Yi Yang

    Abstract: Vision-language navigation (VLN) requires an agent to execute actions following human instructions. Existing VLN models are optimized through expert demonstrations by supervised behavioural cloning or incorporating manual reward engineering. While straightforward, these efforts overlook the accumulation of errors in the Markov decision process, and struggle to match the distribution of the expert… ▽ More

    Submitted 18 October, 2024; originally announced October 2024.

  6. arXiv:2410.14101  [pdf, other]

    cs.SD cs.AI eess.AS

    Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

    Authors: Shuwei He, Rui Liu, Haizhou Li

    Abstract: Visual Text-to-Speech (VTTS) aims to take the spatial environmental image as the prompt to synthesize the reverberation speech for the spoken content. Previous research focused on the RGB modality for global environmental modeling, overlooking the potential of multi-source spatial knowledge like depth, speaker position, and environmental semantics. To address the issues, we propose a novel multi-s… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 5 pages, 1 figure

  7. arXiv:2410.13981  [pdf, other]

    cs.LG cs.AI

    On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery

    Authors: Renpu Liu, Ruida Zhou, Cong Shen, Jing Yang

    Abstract: An intriguing property of the Transformer is its ability to perform in-context learning (ICL), where the Transformer can solve different inference tasks without parameter updating based on the contextual information provided by the corresponding input-output demonstration pairs. It has been theoretically proved that ICL is enabled by the capability of Transformers to perform gradient-descent algor… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

  8. arXiv:2410.13851  [pdf, other]

    cs.RO cs.CV cs.GR

    Differentiable Robot Rendering

    Authors: Ruoshi Liu, Alper Canberk, Shuran Song, Carl Vondrick

    Abstract: Vision foundation models trained on massive amounts of visual data have shown unprecedented reasoning and planning skills in open-world settings. A key challenge in applying them to robotic tasks is the modality gap between visual data and action data. We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Project Page: https://drrobot.cs.columbia.edu/

  9. arXiv:2410.13372  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    High-temperature ferromagnetism and ferroelasticity in ultraflexible atomically thin square-shaped lattices

    Authors: Xinyuan Huang, Yueqiao Qu, Yu Liao, Qian Zheng, Ran Liu, Yu Chen, Liang Liu, Junzhong Wang, Gang Yao

    Abstract: The coexistence of high-temperature intrinsic ferromagnetic ordering, large magnetic anisotropy, along with novel mechanical properties such as ferroelasticity and flexibility, in experimentally feasible two-dimensional (2D) crystals is greatly appealing for nanoscale spintronics. However, the progress in identifying such materials is limited. Here, by first-principles calculations, we report the fi… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 16 pages, 5 figures

  10. arXiv:2410.13327  [pdf, other]

    cond-mat.supr-con cond-mat.str-el

    Cryogenic Digital Image Correlation as a Probe of Strain in Iron-Based Superconductors

    Authors: Ziye Mo, Chunyi Li, Wenting Zhang, Chang Liu, Yongxin Sun, Ruixian Liu, Xingye Lu

    Abstract: Uniaxial strain is a powerful tuning parameter that can control symmetry and anisotropic electronic properties in iron-based superconductors. However, accurately characterizing anisotropic strain can be challenging and complex. Here, we utilize a cryogenic optical system equipped with a high-spatial-resolution microscope to characterize surface strains in iron-based superconductors using the digit… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 6 pages, 4 figures. Published online in Chinese Physics Letters. DOI 10.1088/0256-307X/41/10/107102

  11. arXiv:2410.13319  [pdf, other]

    cond-mat.supr-con cond-mat.str-el

    Evolution of pairing symmetry in FeSe$_{1-x}$S$_x$ as probed by uniaxial-strain tuning of $T_c$

    Authors: Ruixian Liu, Qi Tang, Chang Liu, Chunyi Li, Kaijuan Zhou, Qiaoyu Wang, Xingye Lu

    Abstract: In iron-based superconductors (FeSCs), the interplay between electronic nematicity and superconductivity is essential for understanding the exotic superconducting ground state. In the nematic regime, uniaxial-strain ($\varepsilon$) tuning of the superconducting transition temperature $T_c$ [$\Delta T_c(\varepsilon)=\alpha\varepsilon+\beta\varepsilon^2$] offers a unique approach to investigating the evolution of… ▽ More

    Submitted 18 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 6 pages, 4 figures. Supplementary is available upon reasonable request

  12. arXiv:2410.12923  [pdf, ps, other]

    eess.SP

    DOA Estimation-Oriented Joint Array Partitioning and Beamforming Designs for ISAC Systems

    Authors: Rang Liu, Ming Li, Qian Liu, A. Lee Swindlehurst

    Abstract: Integrated sensing and communication has been identified as an enabling technology for forthcoming wireless networks. In an effort to achieve an improved performance trade-off between multiuser communications and radar sensing, this paper considers a dynamically-partitioned antenna array architecture for monostatic ISAC systems, in which each element of the array at the base station can function a… ▽ More

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 14 pages, 9 figures, submitted to IEEE journal

  13. arXiv:2410.12830  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Incorporating Metabolic Information into LLMs for Anomaly Detection in Clinical Time-Series

    Authors: Maxx Richard Rahman, Ruoxuan Liu, Wolfgang Maass

    Abstract: Anomaly detection in clinical time-series holds significant potential in identifying suspicious patterns in different biological parameters. In this paper, we propose a targeted method that incorporates the clinical domain knowledge into LLMs to improve their ability to detect anomalies. We introduce the Metabolism Pathway-driven Prompting (MPP) method, which integrates the information about metab… ▽ More

    Submitted 19 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

    Journal ref: NeurIPS 2024 Workshop on Time Series in the Age of Large Models

  14. arXiv:2410.11522  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction

    Authors: Renhang Liu, Abhinaba Roy, Dorien Herremans

    Abstract: In this work, we present a novel method for music emotion recognition that leverages Large Language Model (LLM) embeddings for label alignment across multiple datasets and zero-shot prediction on novel categories. First, we compute LLM embeddings for emotion labels and apply non-parametric clustering to group similar labels, across multiple datasets containing disjoint labels. We use these cluster… ▽ More

    Submitted 17 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

  15. arXiv:2410.10737  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Online Statistical Inference for Time-varying Sample-averaged Q-learning

    Authors: Saunak Kumar Panda, Ruiqi Liu, Yisha Xiang

    Abstract: Reinforcement learning (RL) has emerged as a key approach for training agents in complex and uncertain environments. Incorporating statistical inference in RL algorithms is essential for understanding and managing uncertainty in model performance. This paper introduces a time-varying batch-averaged Q-learning algorithm, termed sample-averaged Q-learning, which improves upon traditional single-sampl… ▽ More

    Submitted 14 October, 2024; originally announced October 2024.

  16. arXiv:2410.10544  [pdf]

    physics.chem-ph physics.comp-ph

    Dual-Path Mechanism of Amino Acid Racemization Mediated by Quantum Mechanical Tunneling

    Authors: Xinrui Yang, Rui Liu, Ruiqi Xu, Zhaohua Cui, Zhigang Wang

    Abstract: The racemization of amino acids constitutes one of the most elemental and critical reactions, holding primitive significance for understanding the life's origin and maintenance. Nevertheless, its mechanism at the atomic level has been persistently misunderstood for more than a century. In this work, we demonstrate that the racemization of amino acid molecules in aqueous environments can occur simu… ▽ More

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 15 pages, 4 figures

  17. arXiv:2410.10432  [pdf, other]

    quant-ph

    Individual solid-state nuclear spin qubits with coherence exceeding seconds

    Authors: James O'Sullivan, Jaime Travesedo, Louis Pallegoix, Zhiyuan W. Huang, Alexande May, Boris Yavkin, Patrick Hogan, Sen Lin, Renbao Liu, Thierry Chaneliere, Sylvain Bertaina, Philippe Goldner, Daniel Esteve, Denis Vion, Patrick Abgrall, Patrice Bertet, Emmanuel Flurin

    Abstract: The ability to coherently control and read out qubits with long coherence times in a scalable system is a crucial requirement for any quantum processor. Nuclear spins in the solid state have shown great promise as long-lived qubits. Control and readout of individual nuclear spin qubit registers has made major progress in the recent years using individual electron spin ancilla addressed either elec… ▽ More

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: 14 pages, 4 main figures, 7 supplementary figures

  18. Eliminating the Language Bias for Visual Question Answering with fine-grained Causal Intervention

    Authors: Ying Liu, Ge Bai, Chenji Lu, Shilong Li, Zhang Zhang, Ruifang Liu, Wenbin Guo

    Abstract: Despite the remarkable advancements in Visual Question Answering (VQA), the challenge of mitigating the language bias introduced by textual information remains unresolved. Previous approaches capture language bias from a coarse-grained perspective. However, the finer-grained information within a sentence, such as context and keywords, can result in different biases. Due to the ignorance of fine-gr… ▽ More

    Submitted 14 October, 2024; originally announced October 2024.

    Journal ref: 2024 IEEE International Conference on Multimedia and Expo (ICME), Niagara Falls, ON, Canada, 2024, pp. 1-6

  19. arXiv:2410.09524  [pdf, other]

    cs.CL cs.SD eess.AS

    Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling

    Authors: Rui Liu, Zhenqi Jia, Jie Yang, Yifan Hu, Haizhou Li

    Abstract: Conversational Text-to-Speech (CTTS) aims to accurately express an utterance with the appropriate style within a conversational setting, which attracts more attention nowadays. While recognizing the significance of the CTTS task, prior studies have not thoroughly investigated speech emphasis expression, which is essential for conveying the underlying intention and attitude in human-machine interac… ▽ More

    Submitted 12 October, 2024; originally announced October 2024.

    Comments: submitted to IEEE Transaction

  20. arXiv:2410.09426  [pdf, other]

    cs.CL cs.LG

    FlatQuant: Flatness Matters for LLM Quantization

    Authors: Yuxuan Sun, Ruikang Liu, Haoli Bai, Han Bao, Kang Zhao, Yuening Li, Jiaxin Hu, Xianzhi Yu, Lu Hou, Chun Yuan, Xin Jiang, Wulong Liu, Jun Yao

    Abstract: Recently, quantization has been widely used for the compression and acceleration of large language models (LLMs). Due to the outliers in LLMs, it is crucial to flatten weights and activations to minimize quantization error with the equally spaced quantization points. Prior research explores various pre-quantization transformations to suppress outliers, such as per-channel scaling and Hadamard tran… ▽ More

    Submitted 12 October, 2024; originally announced October 2024.

    Comments: 23 pages

  21. arXiv:2410.09132  [pdf, other]

    cs.LG cs.AI cs.CV

    When Graph meets Multimodal: Benchmarking on Multimodal Attributed Graphs Learning

    Authors: Hao Yan, Chaozhuo Li, Zhigang Yu, Jun Yin, Ruochen Liu, Peiyan Zhang, Weihao Han, Mingzheng Li, Zhengxin Zeng, Hao Sun, Weiwei Deng, Feng Sun, Qi Zhang, Senzhang Wang

    Abstract: Multimodal attributed graphs (MAGs) are prevalent in various real-world scenarios and generally contain two kinds of knowledge: (a) Attribute knowledge is mainly supported by the attributes of different modalities contained in nodes (entities) themselves, such as texts and images. (b) Topology knowledge, on the other hand, is provided by the complex interactions posed between nodes. The cornerston… ▽ More

    Submitted 11 October, 2024; originally announced October 2024.

  22. arXiv:2410.09123  [pdf, other]

    cs.LG cs.AI

    Context-Aware Adapter Tuning for Few-Shot Relation Learning in Knowledge Graphs

    Authors: Ran Liu, Zhongzhou Liu, Xiaoli Li, Yuan Fang

    Abstract: Knowledge graphs (KGs) are instrumental in various real-world applications, yet they often suffer from incompleteness due to missing relations. To predict instances for novel relations with limited training examples, few-shot relation learning approaches have emerged, utilizing techniques such as meta-learning. However, the assumption is that novel relations in meta-testing and base relations in m… ▽ More

    Submitted 17 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: Accepted by EMNLP 2024

  23. arXiv:2410.08477  [pdf, other]

    q-fin.MF

    Cross-Currency Basis Swaps Referencing Backward-Looking Rates

    Authors: Yining Ding, Ruyi Liu, Marek Rutkowski

    Abstract: The financial industry has undergone a significant transition from the London Interbank Offered Rate (LIBOR) to Risk Free Rates (RFR) such as, e.g., the Secured Overnight Financing Rate (SOFR) in the U.S. and the AUD Overnight Index Average (AONIA) in Australia, as the primary benchmark rate for borrowing costs. The paper examines the pricing and hedging method for SOFR-related financial products… ▽ More

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 46 pages, 6 figures

    MSC Class: 0H10; 60H30; 91G30; 91G40

  24. arXiv:2410.08471  [pdf, other]

    cs.FL eess.SY

    Opacity Enforcement by Edit Functions Under Incomparable Observations

    Authors: Wei Duan, Ruotian Liu, Maria Pia Fanti, Christoforos N. Hadjicostis, Zhiwu Li

    Abstract: As an information-flow privacy property, opacity characterizes whether a malicious external observer (referred to as an intruder) is able to infer the secret behavior of a system. This paper addresses the problem of opacity enforcement using edit functions in discrete event systems modeled by partially observed deterministic finite automata. A defender uses the edit function as an interface at the… ▽ More

    Submitted 10 October, 2024; originally announced October 2024.

  25. arXiv:2410.08421  [pdf, other]

    cs.LG

    Generalizable autoregressive modeling of time series through functional narratives

    Authors: Ran Liu, Wenrui Ma, Ellen Zippi, Hadi Pouransari, Jingyun Xiao, Chris Sandino, Behrooz Mahasseni, Juri Minxha, Erdrin Azemi, Eva L. Dyer, Ali Moin

    Abstract: Time series data are inherently functions of time, yet current transformers often learn time series by modeling them as mere concatenations of time periods, overlooking their functional properties. In this work, we propose a novel objective for transformers that learn time series by re-interpreting them as temporal functions. We build an alternative sequence of time series by constructing degradat… ▽ More

    Submitted 10 October, 2024; originally announced October 2024.

  26. arXiv:2410.07592  [pdf, other]

    cs.AI

    Diversified and Adaptive Negative Sampling on Knowledge Graphs

    Authors: Ran Liu, Zhongzhou Liu, Xiaoli Li, Hao Wu, Yuan Fang

    Abstract: In knowledge graph embedding, aside from positive triplets (i.e., facts in the knowledge graph), the negative triplets used for training also have a direct influence on the model performance. In reality, since knowledge graphs are sparse and incomplete, negative triplets often lack explicit labels, and thus they are often obtained from various sampling strategies (e.g., randomly replacing an entity in… ▽ More

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: 30 pages, 7 figures, Journal

  27. arXiv:2410.06774  [pdf, other]

    stat.AP stat.ME

    Retrieved dropout imputation considering administrative study withdrawal

    Authors: Rong Liu, Yongming Qu

    Abstract: The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) E9 (R1) Addendum provides a framework for defining estimands in clinical trials. Treatment policy strategy is the most commonly used approach to handle intercurrent events in defining estimands. Imputing missing values for potential outcomes under the treatment policy strategy has been discussed… ▽ More

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: 16 pages, 5 tables, and 2 figures

  28. arXiv:2410.06645  [pdf, other]

    cs.CV

    Continual Learning in the Frequency Domain

    Authors: Ruiqi Liu, Boyu Diao, Libo Huang, Zijia An, Zhulin An, Yongjun Xu

    Abstract: Continual learning (CL) is designed to learn new tasks while preserving existing knowledge. Replaying samples from earlier tasks has proven to be an effective method to mitigate the forgetting of previously acquired knowledge. However, the current research on the training efficiency of rehearsal-based methods is insufficient, which limits the practical application of CL systems in resource-limited… ▽ More

    Submitted 10 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS 2024

  29. arXiv:2410.06326  [pdf, other]

    stat.ME stat.ML

    A convex formulation of covariate-adjusted Gaussian graphical models via natural parametrization

    Authors: Ruobin Liu, Guo Yu

    Abstract: Gaussian graphical models (GGMs) are widely used for recovering the conditional independence structure among random variables. Recently, several key advances have been made to exploit an additional set of variables for better estimating the GGMs of the variables of interest. For example, in co-expression quantitative trait locus (eQTL) studies, both the mean expression level of genes as well as th… ▽ More

    Submitted 8 October, 2024; originally announced October 2024.

  30. arXiv:2410.06234  [pdf, other]

    cs.CV cs.AI cs.LG

    TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data

    Authors: Jeremy Andrew Irvin, Emily Ruoyu Liu, Joyce Chuyi Chen, Ines Dormoy, Jinyoung Kim, Samar Khanna, Zhuo Zheng, Stefano Ermon

    Abstract: Large vision and language assistants have enabled new capabilities for interpreting natural images. These approaches have recently been adapted to earth observation data, but they are only able to handle single image inputs, limiting their use for many real-world tasks. In this work, we develop a new vision and language assistant called TEOChat that can engage in conversations about temporal seque… ▽ More

    Submitted 8 October, 2024; originally announced October 2024.

  31. arXiv:2410.06138  [pdf, other]

    astro-ph.HE

    On the External Inverse Compton Scattering off the Prompt Emission in GRB 221009A

    Authors: Cui-Yuan Dai, Jian-He Zheng, Xiao-Hong Zhao, Ruo-Yu Liu, Xiang-Yu Wang

    Abstract: The light curve of the TeV emission in GRB 221009A displays a smooth transition from an initial rapid rise to a slower rise and eventually a decay phase. The smooth temporal profile of the TeV emission suggests that it mainly results from an external shock. The temporal overlap between the prompt KeV-MeV emission and the early TeV afterglow indicates that external inverse Compton scattering (EIC)… ▽ More

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 25 pages, 11 figures, 2 tables, comments are welcome

  32. arXiv:2410.04936  [pdf, other]

    cs.AI

    Training Interactive Agent in Large FPS Game Map with Rule-enhanced Reinforcement Learning

    Authors: Chen Zhang, Huan Hu, Yuan Zhou, Qiyang Cao, Ruochen Liu, Wenya Wei, Elvis S. Liu

    Abstract: In the realm of competitive gaming, 3D first-person shooter (FPS) games have gained immense popularity, prompting the development of game AI systems to enhance gameplay. However, deploying game AI in practical scenarios still poses challenges, particularly in large-scale and complex FPS games. In this paper, we focus on the practical deployment of game AI in the online multiplayer competitive 3D F… ▽ More

    Submitted 7 October, 2024; originally announced October 2024.

  33. arXiv:2410.04425  [pdf, other]

    astro-ph.HE

    LHAASO detection of very-high-energy gamma-ray emission surrounding PSR J0248+6021

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: We report the detection of an extended very-high-energy (VHE) gamma-ray source coincident with the locations of middle-aged (62.4 kyr) pulsar PSR J0248+6021, by using the LHAASO-WCDA data of live 796 days and LHAASO-KM2A data of live 1216 days. A significant excess of $\gamma$-ray induced showers is observed both by WCDA in energy bands of 1-25 TeV and KM2A in energy bands of > 25 TeV with… ▽ More

    Submitted 6 October, 2024; originally announced October 2024.

    Comments: 12 pages, 10 figures, Accepted by Sci. China-Phys. Mech. Astron

  34. arXiv:2410.03719  [pdf, other]

    cs.CL cs.SD eess.AS

    FluentEditor+: Text-based Speech Editing by Modeling Local Hierarchical Acoustic Smoothness and Global Prosody Consistency

    Authors: Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

    Abstract: Text-based speech editing (TSE) allows users to modify speech by editing the corresponding text and performing operations such as cutting, copying, and pasting to generate updated audio without altering the original recording directly. … ▽ More

    Submitted 28 September, 2024; originally announced October 2024.

    Comments: Work in progress

  35. arXiv:2410.03113  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Self-Assembly of a halogenated organic molecule on the Si(111) $\sqrt{3}\times\sqrt{3}$-Ag surface

    Authors: R Liu, D. Marchese, R. C. Mawhinney, M. C. Gallagher

    Abstract: We study the self-assembly of halogen-based organic molecules on a passivated silicon surface. The room temperature adsorption of 2,4,6-tris(4-iodophenyl)-1,3,5-triazine (TIPT) on the Si(111)-$\sqrt{3}\times\sqrt{3}$-Ag surface is described. The adsorption is investigated primarily by room-temperature scanning tunneling microscopy (STM) and density-functional theoretical (DFT) calculations. The ex… ▽ More

    Submitted 3 October, 2024; originally announced October 2024.

  36. arXiv:2410.02804  [pdf, other]

    cs.CV cs.AI

    Leveraging Retrieval Augment Approach for Multimodal Emotion Recognition Under Missing Modalities

    Authors: Qi Fan, Hongyu Yuan, Haolin Zuo, Rui Liu, Guanglai Gao

    Abstract: Multimodal emotion recognition utilizes complete multimodal information and robust multimodal joint representation to gain high performance. However, the ideal condition of full modality integrity is often not applicable in reality and there always appears the situation that some modalities are missing. For example, video, audio, or text data is missing due to sensor failure or network bandwidth p… ▽ More

    Submitted 18 September, 2024; originally announced October 2024.

    Comments: Under review

  37. arXiv:2410.02236  [pdf, other]

    cs.LG eess.SY

    C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front

    Authors: Ruohong Liu, Yuxin Pan, Linjie Xu, Lei Song, Pengcheng You, Yize Chen, Jiang Bian

    Abstract: Multi-objective reinforcement learning (MORL) excels at handling rapidly changing preferences in tasks that involve multiple criteria, even for unseen preferences. However, previous dominating MORL methods typically generate a fixed policy set or preference-conditioned policy through multiple training iterations exclusively for sampled preference vectors, and cannot ensure the efficient discovery… ▽ More

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 27 pages, 8 figures. In submission to a conference

  38. arXiv:2410.02128  [pdf, other]

    cs.LG

    Breaking the mold: The challenge of large scale MARL specialization

    Authors: Stefan Juang, Hugh Cao, Arielle Zhou, Ruochen Liu, Nevin L. Zhang, Elvis Liu

    Abstract: In multi-agent learning, the predominant approach focuses on generalization, often neglecting the optimization of individual agents. This emphasis on generalization limits the ability of agents to utilize their unique strengths, resulting in inefficiencies. This paper introduces Comparative Advantage Maximization (CAM), a method designed to enhance individual agent specialization in multiagent sys… ▽ More

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 19 pages

  39. arXiv:2410.01768  [pdf, other]

    cs.CV

    SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images

    Authors: Kaiyu Li, Ruixun Liu, Xiangyong Cao, Deyu Meng, Zhi Wang

    Abstract: Remote sensing image plays an irreplaceable role in fields such as agriculture, water resources, military, and disaster relief. Pixel-level interpretation is a critical aspect of remote sensing image applications; however, a prevalent limitation remains the need for extensive manual annotation. For this, we try to introduce open-vocabulary semantic segmentation (OVSS) into the remote sensing conte… ▽ More

    Submitted 2 October, 2024; originally announced October 2024.

  40. arXiv:2410.01495  [pdf, other]

    cs.HC

    Open-vocabulary Multimodal Emotion Recognition: Dataset, Metric, and Benchmark

    Authors: Zheng Lian, Haiyang Sun, Licai Sun, Lan Chen, Haoyu Chen, Hao Gu, Zhuofan Wen, Shun Chen, Siyuan Zhang, Hailiang Yao, Mingyu Xu, Kang Chen, Bin Liu, Rui Liu, Shan Liang, Ya Li, Jiangyan Yi, Jianhua Tao

    Abstract: Multimodal Emotion Recognition (MER) is an important research topic. This paper advocates for a transformative paradigm in MER. The rationale behind our work is that current approaches often rely on a limited set of basic emotion labels, which do not adequately represent the rich spectrum of human emotions. These traditional and overly simplistic emotion categories fail to capture the inherent com… ▽ More

    Submitted 2 October, 2024; originally announced October 2024.

  41. arXiv:2410.00361  [pdf, other]

    cs.CL

    PclGPT: A Large Language Model for Patronizing and Condescending Language Detection

    Authors: Hongbo Wang, Mingda Li, Junyu Lu, Hebin Xia, Liang Yang, Bo Xu, Ruizhu Liu, Hongfei Lin

    Abstract: Disclaimer: Samples in this paper may be harmful and cause discomfort! Patronizing and condescending language (PCL) is a form of speech directed at vulnerable groups. As an essential branch of toxic language, this type of language exacerbates conflicts and confrontations among Internet communities and detrimentally impacts disadvantaged groups. Traditional pre-trained language models (PLMs) perf… ▽ More

    Submitted 30 September, 2024; originally announced October 2024.

    Comments: Accepted for EMNLP2024 (Findings)

  42. arXiv:2410.00313  [pdf, ps, other]

    cs.IT eess.SP

    Pre-Chirp-Domain Index Modulation for Full-Diversity Affine Frequency Division Multiplexing towards 6G

    Authors: Guangyao Liu, Tianqi Mao, Zhenyu Xiao, Ruiqi Liu, Miaowen Wen

    Abstract: Affine frequency division multiplexing (AFDM), tailored as a superior multicarrier technique utilizing chirp signals for high-mobility communications, is envisioned as a promising candidate for the sixth-generation (6G) wireless network. AFDM is based on the discrete affine Fourier transform (DAFT) with two adjustable parameters of the chirp signals, termed as the pre-chirp and post-chirp paramete… ▽ More

    Submitted 17 October, 2024; v1 submitted 30 September, 2024; originally announced October 2024.

  43. arXiv:2409.17277  [pdf, other]

    cs.RO cs.LG

    Building Real-time Awareness of Out-of-distribution in Trajectory Prediction for Autonomous Vehicles

    Authors: Tongfei Guo, Taposh Banerjee, Rui Liu, Lili Su

    Abstract: Trajectory prediction describes the motions of surrounding moving obstacles for an autonomous vehicle; it plays a crucial role in enabling timely decision-making, such as collision avoidance and trajectory replanning. Accurate trajectory planning is the key to reliable vehicle deployments in open-world environments, where unstructured obstacles bring in uncertainties that are impossible to fully ca…

    Submitted 25 September, 2024; originally announced September 2024.

  44. arXiv:2409.16973  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Adaptive Self-Supervised Learning Strategies for Dynamic On-Device LLM Personalization

    Authors: Rafael Mendoza, Isabella Cruz, Richard Liu, Aarav Deshmukh, David Williams, Jesscia Peng, Rohan Iyer

    Abstract: Large language models (LLMs) have revolutionized how we interact with technology, but their personalization to individual user preferences remains a significant challenge, particularly in on-device applications. Traditional methods often depend heavily on labeled datasets and can be resource-intensive. To address these issues, we present Adaptive Self-Supervised Learning Strategies (ASLS), which u…

    Submitted 25 September, 2024; originally announced September 2024.

    Comments: First ASLS

  45. arXiv:2409.16577  [pdf, other]

    cs.RO cs.AI

    Reactive Multi-Robot Navigation in Outdoor Environments Through Uncertainty-Aware Active Learning of Human Preference Landscape

    Authors: Chao Huang, Wenshuo Zang, Carlo Pinciroli, Zhi Jane Li, Taposh Banerjee, Lili Su, Rui Liu

    Abstract: Compared with single robots, Multi-Robot Systems (MRS) can perform missions more efficiently due to the presence of multiple members with diverse capabilities. However, deploying an MRS in wide real-world environments is still challenging due to uncertain and various obstacles (e.g., building clusters and trees). With a limited understanding of environmental uncertainty on performance, an MRS cann…

    Submitted 24 September, 2024; originally announced September 2024.

  46. arXiv:2409.15663  [pdf]

    stat.ME

    BARD: A seamless two-stage dose optimization design integrating backfill and adaptive randomization

    Authors: Yixuan Zhao, Rachael Liu, Jianchang Lin, Ying Yuan

    Abstract: One common approach for dose optimization is a two-stage design, which initially conducts dose escalation to identify the maximum tolerated dose (MTD), followed by a randomization stage where patients are assigned to two or more doses to further assess and compare their risk-benefit profiles to identify the optimal dose. A limitation of this approach is its requirement for a relatively large sampl…

    Submitted 23 September, 2024; originally announced September 2024.

  47. arXiv:2409.14215  [pdf, other]

    cs.CV

    @Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology

    Authors: Xin Jiang, Junwei Zheng, Ruiping Liu, Jiahang Li, Jiaming Zhang, Sven Matthiesen, Rainer Stiefelhagen

    Abstract: As Vision-Language Models (VLMs) advance, human-centered Assistive Technologies (ATs) for helping People with Visual Impairments (PVIs) are evolving into generalists, capable of performing multiple tasks simultaneously. However, benchmarking VLMs for ATs remains under-explored. To bridge this gap, we first create a novel AT benchmark (@Bench). Guided by a pre-design user study with PVIs, our bench…

    Submitted 21 September, 2024; originally announced September 2024.

    Comments: Accepted by WACV 2025, project page: https://junweizheng93.github.io/publications/ATBench/ATBench.html

  48. arXiv:2409.13988  [pdf, other]

    cs.CV

    GAInS: Gradient Anomaly-aware Biomedical Instance Segmentation

    Authors: Runsheng Liu, Hao Jiang, Yanning Zhou, Huangjing Lin, Liansheng Wang, Hao Chen

    Abstract: Instance segmentation plays a vital role in the morphological quantification of biomedical entities such as tissues and cells, enabling precise identification and delineation of different structures. Current methods often address the challenges of touching, overlapping or crossing instances through individual modeling, while neglecting the intrinsic interrelation between these conditions. In this…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: Accepted by BIBM2024

  49. arXiv:2409.13987  [pdf, other]

    cs.CV

    Holistic and Historical Instance Comparison for Cervical Cell Detection

    Authors: Hao Jiang, Runsheng Liu, Yanning Zhou, Huangjing Lin, Hao Chen

    Abstract: Cytology screening from Papanicolaou (Pap) smears is a common and effective tool for the preventive clinical management of cervical cancer, where abnormal cell detection from whole slide images serves as the foundation for reporting cervical cytology. However, cervical cell detection remains challenging due to 1) hazily-defined cell types (e.g., ASC-US) with subtle morphological discrepancies caus…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: Accepted by BIBM2024

  50. arXiv:2409.13912  [pdf, other]

    cs.CV

    OneBEV: Using One Panoramic Image for Bird's-Eye-View Semantic Mapping

    Authors: Jiale Wei, Junwei Zheng, Ruiping Liu, Jie Hu, Jiaming Zhang, Rainer Stiefelhagen

    Abstract: In the field of autonomous driving, Bird's-Eye-View (BEV) perception has attracted increasing attention in the community since it provides more comprehensive information compared with pinhole front-view images and panoramas. Traditional BEV methods, which rely on multiple narrow-field cameras and complex pose estimations, often face calibration and synchronization issues. To break the wall of the…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: Accepted by ACCV 2024. Project code at: https://github.com/JialeWei/OneBEV