
Showing 1–50 of 467 results for author: Wu, N

  1. arXiv:2410.16165  [pdf, other]

    cs.CL cs.DB

    From Tokens to Materials: Leveraging Language Models for Scientific Discovery

    Authors: Yuwei Wan, Tong Xie, Nan Wu, Wenjie Zhang, Chunyu Kit, Bram Hoex

    Abstract: Exploring the predictive capabilities of language models in material science is an ongoing interest. This study investigates the application of language model embeddings to enhance material property prediction in materials science. By evaluating various contextual embedding methods and pre-trained models, including Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-t…

    Submitted 21 October, 2024; originally announced October 2024.

  2. arXiv:2410.11359  [pdf, other]

    cs.LG cs.RO stat.ML

    DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting

    Authors: Eric Hanchen Jiang, Zhi Zhang, Dinghuai Zhang, Andrew Lizarraga, Chenheng Xu, Yasi Zhang, Siyan Zhao, Zhengjie Xu, Peiyu Yu, Yuer Tang, Deqian Kong, Ying Nian Wu

    Abstract: Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strength…

    Submitted 15 October, 2024; originally announced October 2024.

  3. arXiv:2410.07140  [pdf, other]

    cs.IR cs.DB cs.GR

    DSparsE: Dynamic Sparse Embedding for Knowledge Graph Completion

    Authors: Chuhong Yang, Bin Li, Nan Wu

    Abstract: Addressing the incompleteness problem in knowledge graph remains a significant challenge. Current knowledge graph completion methods have their limitations. For example, ComDensE is prone to overfitting and suffers from the degradation with the increase of network depth while InteractE has the limitations in feature interaction and interpretability. To this end, we propose a new method called dyna…

    Submitted 22 September, 2024; originally announced October 2024.

    Comments: 15 pages, 5 figures, camera ready for ICPR

  4. arXiv:2410.06460  [pdf, other]

    cs.LG

    A Benchmark on Directed Graph Representation Learning in Hardware Designs

    Authors: Haoyu Wang, Yinan Huang, Nan Wu, Pan Li

    Abstract: To keep pace with the rapid advancements in design complexity within modern computing systems, directed graph representation learning (DGRL) has become crucial, particularly for encoding circuit netlists, computational graphs, and developing surrogate models for hardware performance prediction. However, DGRL remains relatively unexplored, especially in the hardware domain, mainly due to the lack o…

    Submitted 8 October, 2024; originally announced October 2024.

  5. arXiv:2410.01858  [pdf, other]

    q-bio.CB cs.LG q-bio.GN

    Long-range gene expression prediction with token alignment of large language model

    Authors: Edouardo Honig, Huixin Zhan, Ying Nian Wu, Zijun Frank Zhang

    Abstract: Gene expression is a cellular process that plays a fundamental role in human phenotypical variations and diseases. Despite advances of deep learning models for gene expression prediction, recent benchmarks have revealed their inability to learn distal regulatory grammar. Here, we address this challenge by leveraging a pretrained large language model to enhance gene expression prediction. We introd…

    Submitted 1 October, 2024; originally announced October 2024.

    Comments: 14 pages, 10 figures

  6. arXiv:2409.20014  [pdf, other]

    cond-mat.mtrl-sci

    Autonomous tip-induced chemical reactions in scanning probe microscopy

    Authors: Nian Wu, Markus Aapro, Joakim S. Jestilä, Robert Drost, Miguel Martínez García, Tomas Torres, Feifei Xiang, Nan Cao, Zhijie He, Giovanni Bottari, Peter Liljeroth, Adam S. Foster

    Abstract: Scanning Probe Microscopy (SPM) techniques have shown great potential in fabricating nanoscale structures endowed with exotic quantum properties achieved through various manipulations of atoms and molecules. However, the selection of proper manipulation parameters requires extensive domain knowledge, which is not necessarily transferable to new systems. Therefore, efficient and autonomous SPM tech…

    Submitted 30 September, 2024; originally announced September 2024.

  7. arXiv:2409.15376  [pdf, other]

    cs.LG cs.AI cs.CL

    ControlMath: Controllable Data Generation Promotes Math Generalist Models

    Authors: Nuo Chen, Ning Wu, Jianhui Chang, Jia Li

    Abstract: Utilizing large language models (LLMs) for data augmentation has yielded encouraging results in mathematical reasoning. However, these approaches face constraints in problem diversity, potentially restricting them to in-domain/distribution data generation. To this end, we propose ControlMath, an iterative method involving an equation-generator module and two LLM-based agents. The module creates di…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: 17 pages

    Report number: EMNLP 2024 Main

  8. LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space

    Authors: Michael Gilbert, Yannan Nellie Wu, Joel S. Emer, Vivienne Sze

    Abstract: Latency and energy consumption are key metrics in the performance of deep neural network (DNN) accelerators. A significant factor contributing to latency and energy is data transfers. One method to reduce transfers or data is reusing data when multiple operations use the same data. Fused-layer accelerators reuse data across operations in different layers by retaining intermediate data in on-chip b…

    Submitted 14 October, 2024; v1 submitted 20 September, 2024; originally announced September 2024.

    Comments: To be published in IEEE Transactions on Circuits and Systems for Artificial Intelligence

  9. arXiv:2409.09238  [pdf, other]

    physics.plasm-ph

    High-Fidelity Data-Driven Dynamics Model for Reinforcement Learning-based Magnetic Control in HL-3 Tokamak

    Authors: Niannian Wu, Zongyu Yang, Rongpeng Li, Ning Wei, Yihang Chen, Qianyun Dong, Jiyuan Li, Guohui Zheng, Xinwen Gong, Feng Gao, Bo Li, Min Xu, Zhifeng Zhao, Wulyu Zhong

    Abstract: The drive to control tokamaks, a prominent technology in nuclear fusion, is essential due to its potential to provide a virtually unlimited source of clean energy. Reinforcement learning (RL) promises improved flexibility to manage the intricate and non-linear dynamics of the plasma encapsulated in a tokamak. However, RL typically requires substantial interaction with a simulator capable of accura…

    Submitted 13 September, 2024; originally announced September 2024.

  10. arXiv:2409.08551  [pdf, other]

    stat.ML cs.LG

    Think Twice Before You Act: Improving Inverse Problem Solving With MCMC

    Authors: Yaxuan Zhu, Zehao Dou, Haoxin Zheng, Yasi Zhang, Ying Nian Wu, Ruiqi Gao

    Abstract: Recent studies demonstrate that diffusion models can serve as a strong prior for solving inverse problems. A prominent example is Diffusion Posterior Sampling (DPS), which approximates the posterior distribution of data given the measure using Tweedie's formula. Despite the merits of being versatile in solving various inverse problems without re-training, the performance of DPS is hindered by the…

    Submitted 13 September, 2024; originally announced September 2024.

  11. arXiv:2409.04421  [pdf, other]

    cs.CL cs.AI cs.LG

    RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs

    Authors: Jiaxing Wu, Lin Ning, Luyang Liu, Harrison Lee, Neo Wu, Chao Wang, Sushant Prakash, Shawn O'Banion, Bradley Green, Jun Xie

    Abstract: LLM-powered personalization agent systems employ Large Language Models (LLMs) to predict users' behavior from their past activities. However, their effectiveness often hinges on the ability to effectively leverage extensive, long user historical data due to its inherent noise and length of such data. Existing pretrained LLMs may generate summaries that are concise but lack the necessary context fo…

    Submitted 6 September, 2024; originally announced September 2024.

  12. arXiv:2409.03845  [pdf, other]

    cs.LG stat.ML

    Latent Space Energy-based Neural ODEs

    Authors: Sheng Cheng, Deqian Kong, Jianwen Xie, Kookjin Lee, Ying Nian Wu, Yezhou Yang

    Abstract: This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data. This family of models generates each data point in the time series by a neural emission model, which is a non-linear transformation of a latent state vector. The trajectory of the latent states is implicitly described by a neural ordinary differential equation (ODE), with the initial…

    Submitted 5 September, 2024; originally announced September 2024.

  13. arXiv:2408.16966  [pdf, other]

    cs.LG cs.AI cs.CL

    UserSumBench: A Benchmark Framework for Evaluating User Summarization Approaches

    Authors: Chao Wang, Neo Wu, Lin Ning, Jiaxing Wu, Luyang Liu, Jun Xie, Shawn O'Banion, Bradley Green

    Abstract: Large language models (LLMs) have shown remarkable capabilities in generating user summaries from a long list of raw user activity data. These summaries capture essential user information such as preferences and interests, and therefore are invaluable for LLM-based personalization applications, such as explainable recommender systems. However, the development of new summarization techniques is hin…

    Submitted 5 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  14. arXiv:2408.16526  [pdf, other]

    cond-mat.str-el quant-ph

    Evolution of two-magnon bound states in a higher-spin ferromagnetic chain with single-ion anisotropy: A complete solution

    Authors: Xinlan Lou, Jiawei Li, Ning Wu

    Abstract: Few-magnon bound states in quantum spin chains have been long studied and attracted much recent attentions. For a higher-spin ferromagnetic XXZ chain with single-ion anisotropy, several features regarding the evolution of the low-lying two-magnon bound states with varying wave number were observed in the literature. However, most of these observations are only qualitatively understood due to the l…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 6 pages, 4 figures, accepted for publication as a Letter in Physical Review B

    Journal ref: Phys. Rev. B 110, L100404 (2024)

  15. arXiv:2408.11671  [pdf, other]

    quant-ph

    In situ mixer calibration for superconducting quantum circuits

    Authors: Nan Wu, Jing Lin, Changrong Xie, Zechen Guo, Wenhui Huang, Libo Zhang, Yuxuan Zhou, Xuandong Sun, Jiawei Zhang, Weijie Guo, Xiayu Linpeng, Song Liu, Yang Liu, Wenhui Ren, Ziyu Tao, Ji Jiang, Ji Chu, Jingjing Niu, Youpeng Zhong, Dapeng Yu

    Abstract: Mixers play a crucial role in superconducting quantum computing, primarily by facilitating frequency conversion of signals to enable precise control and readout of quantum states. However, imperfections, particularly carrier leakage and unwanted sideband signal, can significantly compromise control fidelity. To mitigate these defects, regular and precise mixer calibrations are indispensable, yet t…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 9 pages, 7 figures

  16. arXiv:2408.09965  [pdf, ps, other]

    quant-ph cond-mat.stat-mech

    Relaxing towards generalized one-body Boltzmann states

    Authors: Sheng-Wen Li, Ning Wu

    Abstract: Isolated quantum systems follow the reversible unitary evolution; if we focus on the dynamics of local states and observables, they exhibit the irreversible relaxation behaviors. Here we study the local relaxation process in an isolated chain consisting of N three level systems. Though the entropy of the full many body state keeps a constant, it turns out the total correlation of this syste…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 7 pages, 4 figures, comments are welcome

  17. arXiv:2408.08862  [pdf, other]

    cs.LG

    Visual Agents as Fast and Slow Thinkers

    Authors: Guangyan Sun, Mingyu Jin, Zhenting Wang, Cheng-Long Wang, Siqi Ma, Qifan Wang, Ying Nian Wu, Yongfeng Zhang, Dongfang Liu

    Abstract: Achieving human-level intelligence requires refining cognitive distinctions between System 1 and System 2 thinking. While contemporary AI, driven by large language models, demonstrates human-like traits, it falls short of genuine cognition. Transitioning from structured benchmarks to real-world scenarios presents challenges for visual agents, often leading to inaccurate and overly confident respon…

    Submitted 6 September, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

  18. arXiv:2408.07971  [pdf, other]

    cs.CL

    Predicting Lung Cancer Patient Prognosis with Large Language Models

    Authors: Danqing Hu, Bing Liu, Xiang Li, Xiaofeng Zhu, Nan Wu

    Abstract: Prognosis prediction is crucial for determining optimal treatment plans for lung cancer patients. Traditionally, such predictions relied on models developed from retrospective patient data. Recently, large language models (LLMs) have gained attention for their ability to process and generate text based on extensive learned knowledge. In this study, we evaluate the potential of GPT-4o mini and GPT-…

    Submitted 15 August, 2024; originally announced August 2024.

  19. arXiv:2408.02693  [pdf, other]

    physics.comp-ph cs.AI

    Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models

    Authors: Chuan Liu, Chunshu Wu, Shihui Cao, Mingkai Chen, James Chenhao Liang, Ang Li, Michael Huang, Chuang Ren, Dongfang Liu, Ying Nian Wu, Tong Geng

    Abstract: The rapid development of AI highlights the pressing need for sustainable energy, a critical global challenge for decades. Nuclear fusion, generally seen as an ultimate solution, has been the focus of intensive research for nearly a century, with investments reaching hundreds of billions of dollars. Recent advancements in Inertial Confinement Fusion have drawn significant attention to fusion resear…

    Submitted 5 October, 2024; v1 submitted 3 August, 2024; originally announced August 2024.

  20. arXiv:2407.17900  [pdf, other]

    cs.CL cs.LG

    The Power of Combining Data and Knowledge: GPT-4o is an Effective Interpreter of Machine Learning Models in Predicting Lymph Node Metastasis of Lung Cancer

    Authors: Danqing Hu, Bing Liu, Xiaofeng Zhu, Nan Wu

    Abstract: Lymph node metastasis (LNM) is a crucial factor in determining the initial treatment for patients with lung cancer, yet accurate preoperative diagnosis of LNM remains challenging. Recently, large language models (LLMs) have garnered significant attention due to their remarkable text generation capabilities. Leveraging the extensive medical knowledge learned from vast corpora, LLMs can estimate pro…

    Submitted 14 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  21. arXiv:2407.11219  [pdf, other]

    cs.CV eess.IV

    TLRN: Temporal Latent Residual Networks For Large Deformation Image Registration

    Authors: Nian Wu, Jiarui Xing, Miaomiao Zhang

    Abstract: This paper presents a novel approach, termed Temporal Latent Residual Network (TLRN), to predict a sequence of deformation fields in time-series image registration. The challenge of registering time-series images often lies in the occurrence of large motions, especially when images differ significantly from a reference (e.g., the start of a cardiac cycle compared to the peak stretching phase…

    Submitted 23 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 10 pages. Accepted by MICCAI 2024

  22. arXiv:2407.11098  [pdf, other]

    cs.LG cs.AI

    Inertial Confinement Fusion Forecasting via Large Language Models

    Authors: Mingkai Chen, Taowen Wang, Shihui Cao, James Chenhao Liang, Chuan Liu, Chunshu Wu, Qifan Wang, Ying Nian Wu, Michael Huang, Chuang Ren, Ang Li, Tong Geng, Dongfang Liu

    Abstract: Controlled fusion energy is deemed pivotal for the advancement of human civilization. In this study, we introduce LPI-LLM, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms tailored to address a critical challenge, Laser-Plasma Instabilities (LPI), in Inertial Confinement Fusion (ICF). Our approach offers several key c…

    Submitted 14 October, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  23. arXiv:2407.10981  [pdf, other]

    cs.NI cs.CR

    Systematic Literature Review of AI-enabled Spectrum Management in 6G and Future Networks

    Authors: Bushra Sabir, Shuiqiao Yang, David Nguyen, Nan Wu, Alsharif Abuadbba, Hajime Suzuki, Shangqi Lai, Wei Ni, Ding Ming, Surya Nepal

    Abstract: Artificial Intelligence (AI) has advanced significantly in various domains like healthcare, finance, and cybersecurity, with successes such as DeepMind's medical imaging and Tesla's autonomous vehicles. As telecommunications transition from 5G to 6G, integrating AI is crucial for complex demands like data processing, network optimization, and security. Despite ongoing research, there's a gap in co…

    Submitted 12 June, 2024; originally announced July 2024.

    Comments: 35 pages

  24. arXiv:2407.09286  [pdf, other]

    math.ST

    Adaptive Bayesian Regression on Data with Low Intrinsic Dimensionality

    Authors: Tao Tang, Nan Wu, Xiuyuan Cheng, David Dunson

    Abstract: We study how the posterior contraction rate under a Gaussian process (GP) prior depends on the intrinsic dimension of the predictors and smoothness of the regression function. An open question is whether a generic GP prior that does not incorporate knowledge of the intrinsic lower-dimensional structure of the predictors can attain an adaptive rate for a broad class of such structures. We show that…

    Submitted 5 September, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  25. arXiv:2407.04191  [pdf, other]

    cs.CV cs.AI cs.GR

    GazeFusion: Saliency-guided Image Generation

    Authors: Yunxiang Zhang, Nan Wu, Connor Z. Lin, Gordon Wetzstein, Qi Sun

    Abstract: Diffusion models offer unprecedented image generation capabilities given just a text prompt. While emerging control mechanisms have enabled users to specify the desired spatial arrangements of the generated content, they cannot predict or control where viewers will pay more attention due to the complexity of human vision. Recognizing the critical necessity of attention-controllable image generatio…

    Submitted 16 March, 2024; originally announced July 2024.

  26. arXiv:2407.02280  [pdf, other]

    cs.CV cs.AI

    FedIA: Federated Medical Image Segmentation with Heterogeneous Annotation Completeness

    Authors: Yangyang Xiang, Nannan Wu, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Federated learning has emerged as a compelling paradigm for medical image segmentation, particularly in light of increasing privacy concerns. However, most of the existing research relies on relatively stringent assumptions regarding the uniformity and completeness of annotations across clients. Contrary to this, this paper highlights a prevalent challenge in medical practice: incomplete annotatio…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  27. arXiv:2407.02229  [pdf, other]

    cs.CV

    LaMoD: Latent Motion Diffusion Model For Myocardial Strain Generation

    Authors: Jiarui Xing, Nivetha Jayakumar, Nian Wu, Yu Wang, Frederick H. Epstein, Miaomiao Zhang

    Abstract: Motion and deformation analysis of cardiac magnetic resonance (CMR) imaging videos is crucial for assessing myocardial strain of patients with abnormal heart functions. Recent advances in deep learning-based image registration algorithms have shown promising results in predicting motion fields from routinely acquired CMR sequences. However, their accuracy often diminishes in regions with subtle ap…

    Submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2406.18995  [pdf, other]

    cs.LG cs.AI

    FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity

    Authors: Zhaobin Sun, Nannan Wu, Junjie Shi, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Cross-silo federated learning (FL) enables decentralized organizations to collaboratively train models while preserving data privacy and has made significant progress in medical image classification. One common assumption is task homogeneity where each client has access to all classes during training. However, in clinical practice, given a multi-label classification task, constrained by the level…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI 2024

  29. arXiv:2406.18456  [pdf, other]

    stat.ML math.DG

    Boundary Detection Algorithm Inspired by Locally Linear Embedding

    Authors: Pei-Cheng Kuo, Nan Wu

    Abstract: In the study of high-dimensional data, it is often assumed that the data set possesses an underlying lower-dimensional structure. A practical model for this structure is an embedded compact manifold with boundary. Since the underlying manifold structure is typically unknown, identifying boundary points from the data distributed on the manifold is crucial for various applications. In this work, we…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 31 Pages, 5 figures

    MSC Class: 53-08; 53Z50

  30. arXiv:2406.15658  [pdf, other]

    cs.CV cs.AI

    TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning

    Authors: Nemin Wu, Qian Cao, Zhangyu Wang, Zeping Liu, Yanlin Qi, Jielu Zhang, Joshua Ni, Xiaobai Yao, Hongxu Ma, Lan Mu, Stefano Ermon, Tanuja Ganu, Akshay Nambi, Ni Lao, Gengchen Mai

    Abstract: Spatial representation learning (SRL) aims at learning general-purpose neural network representations from various types of spatial data (e.g., points, polylines, polygons, networks, images, etc.) in their native formats. Learning good spatial representations is a fundamental problem for various downstream applications such as species distribution modeling, weather forecasting, trajectory generati…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures. Submitted to NeurIPS 2024 Datasets and Benchmarks Track. Under review

  31. arXiv:2406.14434  [pdf, other]

    cs.CL

    Towards Truthful Multilingual Large Language Models: Benchmarking and Alignment Strategies

    Authors: Weihao Liu, Ning Wu, Wenbiao Ding, Shining Liang, Ming Gong, Dongmei Zhang

    Abstract: In the era of large language models (LLMs), building multilingual large language models (MLLMs) that can serve users worldwide holds great significance. However, existing research seldom focuses on the truthfulness of MLLMs. Meanwhile, contemporary multilingual aligning technologies struggle to balance massive languages and often exhibit serious truthfulness gaps across different languages, especi…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages

  32. arXiv:2406.07219  [pdf, other]

    math.OA math.FA

    On the Bures metric, C*-norm, and the quantum metric

    Authors: Konrad Aguilar, Karina Behera, Tron Omland, Nicole Wu

    Abstract: We prove that the topology on the density space with respect to a unital C*-algebra and a faithful induced by the C*-norm is finer than the Bures metric topology. We also provide an example when this containment is strict. Next, we provide a metric on the density space induced by a quantum metric in the sense of Rieffel and prove that the induced topology is the same as the topology induced by the…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 13 pages, 1 figure

    MSC Class: 46L89; 46L30; 58B34

  33. arXiv:2406.05301  [pdf, other]

    eess.SY

    Active Islanding Detection Using Pulse Compression Probing

    Authors: Nicholas Piaquadio, N. Eva Wu, Morteza Sarailoo

    Abstract: An islanding detection scheme is developed using pulse compression probing (PCP). A state space system realization is taken from the probing output. The nu-gap metric is applied to compare the measured system to fully intact system and classify it as islanded, or grid-connected. The designed detector displays fast operation, accurate islanding detection results under varying grid condition, and is…

    Submitted 18 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Pending Publication at 2024 IEEE PESGM

  34. arXiv:2405.19758  [pdf, other]

    cs.RO

    InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning

    Authors: Muzhi Han, Yifeng Zhu, Song-Chun Zhu, Ying Nian Wu, Yuke Zhu

    Abstract: Learning abstract state representations and knowledge is crucial for long-horizon robot planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic predicates from language feedback of human non-experts during embodied interaction. The learned predicates provide relational abstractions of the environment state, facilitating the learning of symbolic operators that capture…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: RSS 2024; https://interpret-robot.github.io

  35. arXiv:2405.18816  [pdf, other]

    cs.CV cs.LG

    Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

    Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

    Abstract: Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis. By leveraging the instantaneous change-of-variables formula, one can directly compute image likelihoods from a learned flow, making them enticing candidates as priors for downstream tasks such as inverse problems. In particular, a natural a…

    Submitted 30 September, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to NeurIPS 2024

  36. arXiv:2405.18515  [pdf, other]

    cs.LG

    Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

    Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

    Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embod…

    Submitted 28 May, 2024; originally announced May 2024.

  37. arXiv:2405.16865  [pdf, other]

    q-bio.NC cs.LG stat.ML

    An Investigation of Conformal Isometry Hypothesis for Grid Cells

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: This paper investigates the conformal isometry hypothesis as a potential explanation for hexagonal periodic patterns in grid cell response maps. The hypothesis posits that grid cell activity forms a high-dimensional vector in neural space, encoding the agent's position in 2D physical space. As the agent moves, this vector rotates within a 2D manifold in the neural space, driven by a recurrent neur…

    Submitted 10 October, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.19192

  38. arXiv:2405.16852  [pdf, other]

    cs.LG cs.AI stat.ML

    EM Distillation for One-step Diffusion Models

    Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

    Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Disti…

    Submitted 27 May, 2024; originally announced May 2024.

  39. arXiv:2405.16730  [pdf, other]

    cs.LG cs.AI stat.AP

    Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

    Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

    Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of black-box function pose inherent challenges for most existing methods that model and operate directly upon input designs. These issues inclu…

    Submitted 26 May, 2024; originally announced May 2024.

  40. arXiv:2405.16127  [pdf, other]

    cs.IR

    Finetuning Large Language Model for Personalized Ranking

    Authors: Zhuoxi Bai, Ning Wu, Fengyu Cai, Xinyi Zhu, Yun Xiong

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across various domains, motivating researchers to investigate their potential use in recommendation systems. However, directly applying LLMs to recommendation tasks has proven challenging due to the significant disparity between the data used for pre-training LLMs and the specific requirements of recommendation tasks. In this st…

    Submitted 20 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  41. arXiv:2405.14018  [pdf, other]

    cs.CR cs.LG stat.AP

    Watermarking Generative Tabular Data

    Authors: Hengzhi He, Peiyu Yu, Junpeng Ren, Ying Nian Wu, Guang Cheng

    Abstract: In this paper, we introduce a simple yet effective tabular data watermarking mechanism with statistical guarantees. We show theoretically that the proposed watermark can be effectively detected, while faithfully preserving the data fidelity, and also demonstrates appealing robustness against additive noise attack. The general idea is to achieve the watermarking through a strategic embedding based…

    Submitted 22 May, 2024; originally announced May 2024.

  42. arXiv:2405.10570  [pdf]

    eess.IV cs.AI

    Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

    Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang Jin, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

    Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 8 figures, 6 tables

  43. A First Look at Immersive Telepresence on Apple Vision Pro

    Authors: Ruizhi Cheng, Nan Wu, Matteo Varvello, Eugene Chai, Songqing Chen, Bo Han

    Abstract: Due to the widespread adoption of "work-from-home" policies, videoconferencing applications (e.g., Zoom) have become indispensable for remote communication. However, they often lack immersiveness, leading to the so-called "Zoom fatigue" and degrading communication efficiency. The recent debut of Apple Vision Pro, a mobile headset that supports "spatial persona", aims to offer an immersive telepres…

    Submitted 11 September, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: Published in ACM IMC 2024

  44. arXiv:2404.17805  [pdf, other]

    cs.LG cs.CV

    From Optimization to Generalization: Fair Federated Learning against Quality Shift via Inter-Client Sharpness Matching

    Authors: Nannan Wu, Zhuo Kuang, Zengqiang Yan, Li Yu

    Abstract: Due to escalating privacy concerns, federated learning has been recognized as a vital approach for training deep neural networks with decentralized medical data. In practice, it is challenging to ensure consistent imaging quality across various institutions, often attributed to equipment malfunctions affecting a minority of clients. This imbalance in image quality can cause the federated model to…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: This paper is accepted at IJCAI'24 (Main Track)

  45. arXiv:2404.15531  [pdf, other]

    econ.TH

    Maximal Procurement under a Budget

    Authors: Nicole Immorlica, Nicholas Wu, Brendan Lucier

    Abstract: We study the problem of a principal who wants to influence an agent's observable action, subject to an ex-post budget. The agent has a private type determining their cost function. This paper endogenizes the value of the resource driving incentives, which holds no inherent value but is restricted by finite availability. We characterize the optimal mechanism, showing the emergence of a pooling regi…

    Submitted 23 April, 2024; originally announced April 2024.

  46. arXiv:2404.07389  [pdf, other]

    cs.CV

    Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models

    Authors: Yasi Zhang, Peiyu Yu, Ying Nian Wu

    Abstract: Text-to-image diffusion models have shown great success in generating high-quality text-guided images. Yet, these models may still fail to semantically align generated images with the provided text prompts, leading to problems like incorrect attribute binding and/or catastrophic object neglect. Given the pervasive object-oriented structure underlying text prompts, we introduce a novel object-condi…

    Submitted 1 October, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted to ECCV 2024

  47. Machine-learning-inspired quantum control in many-body dynamics

    Authors: Meng-Yun Mao, Zheng Cheng, Liangsheng Li, Ning Wu, Wen-Long You

    Abstract: Achieving precise preparation of quantum many-body states is crucial for the practical implementation of quantum computation and quantum simulation. However, the inherent challenges posed by unavoidable excitations at critical points during quench processes necessitate careful design of control fields. In this work, we introduce a promising and versatile dynamic control neural network tailored to…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 14 pages, 9 figures; Accepted in Phys. Rev. A

    Journal ref: Physical Review A 109, 042428 (2024)

  48. arXiv:2403.17448  [pdf, other]

    cs.RO

    Adaptive Line-Of-Sight guidance law based on vector fields path following for underactuated unmanned surface vehicle

    Authors: Jie Qi, Ronghua Wanga, Nailong Wu

    Abstract: The focus of this paper is to develop a methodology that enables an unmanned surface vehicle (USV) to efficiently track a planned path. The introduction of a vector field-based adaptive line-of-sight guidance law (VFALOS) for accurate trajectory tracking and minimizing the overshoot response time during USV tracking of curved paths improves the overall line-of-sight (LOS) guidance method. These im…

    Submitted 5 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  49. arXiv:2403.13855  [pdf, other]

    math.CO

    A Non-Terminating Game of Beggar-My-Neighbor

    Authors: Brayden Casella, Philip M. Anderson, Michael Kleber, Richard P. Mann, Reed Nessler, William Rucklidge, Samuel G. Williams, Nicolas Wu

    Abstract: We demonstrate the existence of a non-terminating game of Beggar-My-Neighbor, discovered by lead author Brayden Casella. We detail the method for constructing this game and identify a cyclical structure of 62 tricks that is reached by 30 distinct starting hands. We further present a short history of the search for this solution since the problem was posed, and a record of previously found longest…

    Submitted 19 March, 2024; originally announced March 2024.

  50. arXiv:2403.11552  [pdf, other]

    cs.RO cs.AI

    LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

    Authors: Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu

    Abstract: Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Sp…

    Submitted 21 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: IROS 2024. Codes available: https://github.com/AssassinWS/LLM-TAMP