Showing 1–50 of 121 results for author: Wu, Y N

  1. arXiv:2410.11359  [pdf, other]

    cs.LG cs.RO stat.ML

    DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting

    Authors: Eric Hanchen Jiang, Zhi Zhang, Dinghuai Zhang, Andrew Lizarraga, Chenheng Xu, Yasi Zhang, Siyan Zhao, Zhengjie Xu, Peiyu Yu, Yuer Tang, Deqian Kong, Ying Nian Wu

    Abstract: Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strength…

    Submitted 15 October, 2024; originally announced October 2024.

  2. arXiv:2410.01858  [pdf, other]

    q-bio.CB cs.LG q-bio.GN

    Long-range gene expression prediction with token alignment of large language model

    Authors: Edouardo Honig, Huixin Zhan, Ying Nian Wu, Zijun Frank Zhang

    Abstract: Gene expression is a cellular process that plays a fundamental role in human phenotypical variations and diseases. Despite advances of deep learning models for gene expression prediction, recent benchmarks have revealed their inability to learn distal regulatory grammar. Here, we address this challenge by leveraging a pretrained large language model to enhance gene expression prediction. We introd…

    Submitted 1 October, 2024; originally announced October 2024.

    Comments: 14 pages, 10 figures

  3. LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space

    Authors: Michael Gilbert, Yannan Nellie Wu, Joel S. Emer, Vivienne Sze

    Abstract: Latency and energy consumption are key metrics in the performance of deep neural network (DNN) accelerators. A significant factor contributing to latency and energy is data transfers. One method to reduce transfers or data is reusing data when multiple operations use the same data. Fused-layer accelerators reuse data across operations in different layers by retaining intermediate data in on-chip b…

    Submitted 14 October, 2024; v1 submitted 20 September, 2024; originally announced September 2024.

    Comments: To be published in IEEE Transactions on Circuits and Systems for Artificial Intelligence

  4. arXiv:2409.08551  [pdf, other]

    stat.ML cs.LG

    Think Twice Before You Act: Improving Inverse Problem Solving With MCMC

    Authors: Yaxuan Zhu, Zehao Dou, Haoxin Zheng, Yasi Zhang, Ying Nian Wu, Ruiqi Gao

    Abstract: Recent studies demonstrate that diffusion models can serve as a strong prior for solving inverse problems. A prominent example is Diffusion Posterior Sampling (DPS), which approximates the posterior distribution of data given the measure using Tweedie's formula. Despite the merits of being versatile in solving various inverse problems without re-training, the performance of DPS is hindered by the…

    Submitted 13 September, 2024; originally announced September 2024.

  5. arXiv:2409.03845  [pdf, other]

    cs.LG stat.ML

    Latent Space Energy-based Neural ODEs

    Authors: Sheng Cheng, Deqian Kong, Jianwen Xie, Kookjin Lee, Ying Nian Wu, Yezhou Yang

    Abstract: This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data. This family of models generates each data point in the time series by a neural emission model, which is a non-linear transformation of a latent state vector. The trajectory of the latent states is implicitly described by a neural ordinary differential equation (ODE), with the initial…

    Submitted 5 September, 2024; originally announced September 2024.

  6. arXiv:2408.08862  [pdf, other]

    cs.LG

    Visual Agents as Fast and Slow Thinkers

    Authors: Guangyan Sun, Mingyu Jin, Zhenting Wang, Cheng-Long Wang, Siqi Ma, Qifan Wang, Ying Nian Wu, Yongfeng Zhang, Dongfang Liu

    Abstract: Achieving human-level intelligence requires refining cognitive distinctions between System 1 and System 2 thinking. While contemporary AI, driven by large language models, demonstrates human-like traits, it falls short of genuine cognition. Transitioning from structured benchmarks to real-world scenarios presents challenges for visual agents, often leading to inaccurate and overly confident respon…

    Submitted 6 September, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

  7. arXiv:2408.02693  [pdf, other]

    physics.comp-ph cs.AI

    Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models

    Authors: Chuan Liu, Chunshu Wu, Shihui Cao, Mingkai Chen, James Chenhao Liang, Ang Li, Michael Huang, Chuang Ren, Dongfang Liu, Ying Nian Wu, Tong Geng

    Abstract: The rapid development of AI highlights the pressing need for sustainable energy, a critical global challenge for decades. Nuclear fusion, generally seen as an ultimate solution, has been the focus of intensive research for nearly a century, with investments reaching hundreds of billions of dollars. Recent advancements in Inertial Confinement Fusion have drawn significant attention to fusion resear…

    Submitted 5 October, 2024; v1 submitted 3 August, 2024; originally announced August 2024.

  8. arXiv:2407.11098  [pdf, other]

    cs.LG cs.AI

    Inertial Confinement Fusion Forecasting via Large Language Models

    Authors: Mingkai Chen, Taowen Wang, Shihui Cao, James Chenhao Liang, Chuan Liu, Chunshu Wu, Qifan Wang, Ying Nian Wu, Michael Huang, Chuang Ren, Ang Li, Tong Geng, Dongfang Liu

    Abstract: Controlled fusion energy is deemed pivotal for the advancement of human civilization. In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms tailored to address a critical challenge, Laser-Plasma Instabilities ($\texttt{LPI}$), in Inertial Confinement Fusion ($\texttt{ICF}$). Our approach offers several key c…

    Submitted 14 October, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  9. arXiv:2405.19758  [pdf, other]

    cs.RO

    InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning

    Authors: Muzhi Han, Yifeng Zhu, Song-Chun Zhu, Ying Nian Wu, Yuke Zhu

    Abstract: Learning abstract state representations and knowledge is crucial for long-horizon robot planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic predicates from language feedback of human non-experts during embodied interaction. The learned predicates provide relational abstractions of the environment state, facilitating the learning of symbolic operators that capture…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: RSS 2024; https://interpret-robot.github.io

  10. arXiv:2405.18816  [pdf, other]

    cs.CV cs.LG

    Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

    Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

    Abstract: Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis. By leveraging the instantaneous change-of-variables formula, one can directly compute image likelihoods from a learned flow, making them enticing candidates as priors for downstream tasks such as inverse problems. In particular, a natural a…

    Submitted 30 September, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to NeurIPS 2024

  11. arXiv:2405.18515  [pdf, other]

    cs.LG

    Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

    Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

    Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embod…

    Submitted 28 May, 2024; originally announced May 2024.

  12. arXiv:2405.16865  [pdf, other]

    q-bio.NC cs.LG stat.ML

    An Investigation of Conformal Isometry Hypothesis for Grid Cells

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: This paper investigates the conformal isometry hypothesis as a potential explanation for hexagonal periodic patterns in grid cell response maps. The hypothesis posits that grid cell activity forms a high-dimensional vector in neural space, encoding the agent's position in 2D physical space. As the agent moves, this vector rotates within a 2D manifold in the neural space, driven by a recurrent neur…

    Submitted 10 October, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.19192

  13. arXiv:2405.16852  [pdf, other]

    cs.LG cs.AI stat.ML

    EM Distillation for One-step Diffusion Models

    Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

    Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Disti…

    Submitted 27 May, 2024; originally announced May 2024.

  14. arXiv:2405.16730  [pdf, other]

    cs.LG cs.AI stat.AP

    Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

    Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

    Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of black-box function pose inherent challenges for most existing methods that model and operate directly upon input designs. These issues inclu…

    Submitted 26 May, 2024; originally announced May 2024.

  15. arXiv:2405.14018  [pdf, other]

    cs.CR cs.LG stat.AP

    Watermarking Generative Tabular Data

    Authors: Hengzhi He, Peiyu Yu, Junpeng Ren, Ying Nian Wu, Guang Cheng

    Abstract: In this paper, we introduce a simple yet effective tabular data watermarking mechanism with statistical guarantees. We show theoretically that the proposed watermark can be effectively detected, while faithfully preserving the data fidelity, and also demonstrates appealing robustness against additive noise attack. The general idea is to achieve the watermarking through a strategic embedding based…

    Submitted 22 May, 2024; originally announced May 2024.

  16. arXiv:2404.07389  [pdf, other]

    cs.CV

    Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models

    Authors: Yasi Zhang, Peiyu Yu, Ying Nian Wu

    Abstract: Text-to-image diffusion models have shown great success in generating high-quality text-guided images. Yet, these models may still fail to semantically align generated images with the provided text prompts, leading to problems like incorrect attribute binding and/or catastrophic object neglect. Given the pervasive object-oriented structure underlying text prompts, we introduce a novel object-condi…

    Submitted 1 October, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted to ECCV 2024

  17. arXiv:2403.11552  [pdf, other]

    cs.RO cs.AI

    LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

    Authors: Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu

    Abstract: Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Sp…

    Submitted 21 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: IROS 2024. Codes available: https://github.com/AssassinWS/LLM-TAMP

  18. arXiv:2402.17179  [pdf, other]

    cs.LG q-bio.BM

    Dual-Space Optimization: Improved Molecule Sequence Design by Latent Prompt Transformer

    Authors: Deqian Kong, Yuhao Huang, Jianwen Xie, Edouardo Honig, Ming Xu, Shuanghong Xue, Pei Lin, Sanping Zhou, Sheng Zhong, Nanning Zheng, Ying Nian Wu

    Abstract: Designing molecules with desirable properties, such as drug-likeliness and high binding affinities towards protein targets, is a challenging problem. In this paper, we propose the Dual-Space Optimization (DSO) method that integrates latent space sampling and data space selection to solve this problem. DSO iteratively updates a latent space generative model and a synthetic dataset in an optimizatio…

    Submitted 26 February, 2024; originally announced February 2024.

  19. arXiv:2402.08075  [pdf, other]

    q-bio.GN cs.AI cs.LG

    Efficient and Scalable Fine-Tune of Language Models for Genome Understanding

    Authors: Huixin Zhan, Ying Nian Wu, Zijun Zhang

    Abstract: Although DNA foundation models have advanced the understanding of genomes, they still face significant challenges in the limited scale and diversity of genomic data. This limitation starkly contrasts with the success of natural language foundation models, which thrive on substantially larger scales. Furthermore, genome understanding involves numerous downstream genome annotation tasks with inheren…

    Submitted 12 February, 2024; originally announced February 2024.

  20. arXiv:2402.04647  [pdf, other]

    cs.LG

    Latent Plan Transformer: Planning as Latent Variable Inference

    Authors: Deqian Kong, Dehong Xu, Minglu Zhao, Bo Pang, Jianwen Xie, Andrew Lizarraga, Yuhao Huang, Sirui Xie, Ying Nian Wu

    Abstract: In tasks aiming for long-term returns, planning becomes essential. We study generative modeling for planning with datasets repurposed from offline reinforcement learning. Specifically, we identify temporal consistency in the absence of step-wise rewards as one key technical challenge. We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent space to connect a Transform…

    Submitted 28 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  21. arXiv:2401.09742  [pdf, other]

    cs.CV

    Image Translation as Diffusion Visual Programmers

    Authors: Cheng Han, James C. Liang, Qifan Wang, Majid Rabbani, Sohail Dianat, Raghuveer Rao, Ying Nian Wu, Dongfang Liu

    Abstract: We introduce the novel Diffusion Visual Programmer (DVP), a neuro-symbolic image translation framework. Our proposed DVP seamlessly embeds a condition-flexible diffusion model within the GPT architecture, orchestrating a coherent sequence of visual programs (i.e., computer vision models) for various pro-symbolic steps, which span RoI identification, style transfer, and position manipulation, facil…

    Submitted 30 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 25 pages, 20 figures

  22. arXiv:2311.06212  [pdf, other]

    stat.ML cs.LG stat.AP

    Differentiable VQ-VAE's for Robust White Matter Streamline Encodings

    Authors: Andrew Lizarraga, Brandon Taraku, Edouardo Honig, Ying Nian Wu, Shantanu H. Joshi

    Abstract: Given the complex geometry of white matter streamlines, Autoencoders have been proposed as a dimension-reduction tool to simplify the analysis streamlines in a low-dimensional latent spaces. However, despite these recent successes, the majority of encoder architectures only perform dimension reduction on single streamlines as opposed to a full bundle of streamlines. This is a severe limitation of…

    Submitted 18 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: 5 pages, 4 figures, 1 table

  23. arXiv:2310.19192  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Emergence of Grid-like Representations by Training Recurrent Networks with Conformal Normalization

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: Grid cells in the entorhinal cortex of mammalian brains exhibit striking hexagon grid firing patterns in their response maps as the animal (e.g., a rat) navigates in a 2D open environment. In this paper, we study the emergence of the hexagon grid patterns of grid cells based on a general recurrent neural network (RNN) model that captures the navigation process. The responses of grid cells collecti…

    Submitted 19 February, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

  24. arXiv:2310.09604  [pdf, other]

    cs.LG cs.CV

    Learning Hierarchical Features with Joint Latent Space Energy-Based Prior

    Authors: Jiali Cui, Ying Nian Wu, Tian Han

    Abstract: This paper studies the fundamental problem of multi-layer generator models in learning hierarchical representations. The multi-layer generator model that consists of multiple layers of latent variables organized in a top-down architecture tends to learn multiple levels of data abstraction. However, such multi-layer latent variables are typically parameterized to be Gaussian, which can be less info…

    Submitted 14 October, 2023; originally announced October 2023.

  25. arXiv:2310.03325  [pdf, other]

    cs.AI cs.CV cs.LG

    Learning Concept-Based Causal Transition and Symbolic Reasoning for Visual Planning

    Authors: Yilue Qian, Peiyu Yu, Ying Nian Wu, Yao Su, Wei Wang, Lifeng Fan

    Abstract: Visual planning simulates how humans make decisions to achieve desired goals in the form of searching for visual causal transitions between an initial visual state and a final visual goal state. It has become increasingly important in egocentric vision with its advantages in guiding agents to perform daily tasks in complex environments. In this paper, we propose an interpretable and generalizable…

    Submitted 27 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

  26. arXiv:2310.03253  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Molecule Design by Latent Prompt Transformer

    Authors: Deqian Kong, Yuhao Huang, Jianwen Xie, Ying Nian Wu

    Abstract: This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by an existing software. Our proposed model consists of three components. (1) A latent vector whose prior distribution is modeled by a Unet transformation…

    Submitted 5 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  27. arXiv:2310.03218  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Energy-Based Prior Model with Diffusion-Amortized MCMC

    Authors: Peiyu Yu, Yaxuan Zhu, Sirui Xie, Xiaojian Ma, Ruiqi Gao, Song-Chun Zhu, Ying Nian Wu

    Abstract: Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interests in the field of generative modeling due to its flexibility in the formulation and strong modeling power of the latent space. However, the common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling is hindering the model from further progres…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  28. Tailors: Accelerating Sparse Tensor Algebra by Overbooking Buffer Capacity

    Authors: Zi Yu Xue, Yannan Nellie Wu, Joel S. Emer, Vivienne Sze

    Abstract: Sparse tensor algebra is a challenging class of workloads to accelerate due to low arithmetic intensity and varying sparsity patterns. Prior sparse tensor algebra accelerators have explored tiling sparse data to increase exploitable data reuse and improve throughput, but typically allocate tile size in a given buffer for the worst-case data occupancy. This severely limits the utilization of availa…

    Submitted 26 June, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 17 pages, 13 figures, in MICRO 2023

    Journal ref: 56th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO '23), 2023

  29. arXiv:2309.09017  [pdf, other]

    cs.RO

    Triple Regression for Camera Agnostic Sim2Real Robot Grasping and Manipulation Tasks

    Authors: Yuanhong Zeng, Yizhou Zhao, Ying Nian Wu

    Abstract: Sim2Real (Simulation to Reality) techniques have gained prominence in robotic manipulation and motion planning due to their ability to enhance success rates by enabling agents to test and evaluate various policies and trajectories. In this paper, we investigate the advantages of integrating Sim2Real into robotic frameworks. We introduce the Triple Regression Sim2Real framework, which constructs a…

    Submitted 16 September, 2023; originally announced September 2023.

  30. arXiv:2307.07862  [pdf, other]

    cs.RO eess.SY

    Sim2Plan: Robot Motion Planning via Message Passing between Simulation and Reality

    Authors: Yizhou Zhao, Yuanhong Zeng, Qian Long, Ying Nian Wu, Song-Chun Zhu

    Abstract: Simulation-to-real is the task of training and developing machine learning models and deploying them in real settings with minimal additional training. This approach is becoming increasingly popular in fields such as robotics. However, there is often a gap between the simulated environment and the real world, and machine learning models trained in simulation may not perform as well in the real wor…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: Published as a conference paper at FTC 2023

  31. arXiv:2307.04047  [pdf, other]

    cs.CV

    Threshold-Consistent Margin Loss for Open-World Deep Metric Learning

    Authors: Qin Zhang, Linghan Xu, Qingming Tang, Jun Fang, Ying Nian Wu, Joe Tighe, Yifan Xing

    Abstract: Existing losses used in deep metric learning (DML) for image retrieval often lead to highly non-uniform intra-class and inter-class representation structures across test classes and data distributions. When combined with the common practice of using a fixed threshold to declare a match, this gives rise to significant performance variations in terms of false accept rate (FAR) and false reject rate…

    Submitted 12 March, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

    Comments: Accepted to ICLR'24

  32. arXiv:2306.14902  [pdf, other]

    q-bio.BM cs.LG stat.ML

    Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting

    Authors: Deqian Kong, Bo Pang, Tian Han, Ying Nian Wu

    Abstract: Generation of molecules with desired chemical and biological properties such as high drug-likeness, high binding affinity to target proteins, is critical for drug discovery. In this paper, we propose a probabilistic generative model to capture the joint distribution of molecules and their properties. Our model assumes an energy-based model (EBM) in the latent space. Conditional on the latent vecto…

    Submitted 8 June, 2023; originally announced June 2023.

    Journal ref: 39th Conference on Uncertainty in Artificial Intelligence 2023

  33. arXiv:2306.06323  [pdf, other]

    cs.CV cs.LG

    Learning Joint Latent Space EBM Prior Model for Multi-layer Generator

    Authors: Jiali Cui, Ying Nian Wu, Tian Han

    Abstract: This paper studies the fundamental problem of learning multi-layer generator models. The multi-layer generator model builds multiple layers of latent variables as a prior model on top of the generator, which benefits learning complex data distribution and hierarchical representations. However, such a prior model usually focuses on modeling inter-layer relations between latent variables by assuming…

    Submitted 11 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  34. arXiv:2306.01153  [pdf, other]

    cs.CL

    Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

    Authors: Yan Xu, Deqian Kong, Dehong Xu, Ziwei Ji, Bo Pang, Pascale Fung, Ying Nian Wu

    Abstract: The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge selection and response generation separately, and may overlook the inherent correlation between these two tasks, or leverage conditional variational method to j…

    Submitted 5 August, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to ICML 2023

  35. HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

    Authors: Yannan Nellie Wu, Po-An Tsai, Saurav Muralidharan, Angshuman Parashar, Vivienne Sze, Joel S. Emer

    Abstract: Due to complex interactions among various deep neural network (DNN) optimization techniques, modern DNNs can have weights and activations that are dense or sparse with diverse sparsity degrees. To offer a good trade-off between accuracy and hardware performance, an ideal DNN accelerator should have high flexibility to efficiently translate DNN sparsity into reductions in energy and/or latency with…

    Submitted 1 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to MICRO23

  36. arXiv:2305.12039  [pdf, other]

    cs.CV

    Learning for Transductive Threshold Calibration in Open-World Recognition

    Authors: Qin Zhang, Dongsheng An, Tianjun Xiao, Tong He, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing, Stefano Soatto

    Abstract: In deep metric learning for visual recognition, the calibration of distance thresholds is crucial for achieving desired model performance in the true positive rates (TPR) or true negative rates (TNR). However, calibrating this threshold presents challenges in open-world scenarios, where the test classes can be entirely disjoint from those encountered during training. We define the problem of findi…

    Submitted 22 March, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

  37. arXiv:2304.09842  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

    Authors: Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao

    Abstract: Large language models (LLMs) have achieved remarkable progress in solving various natural language processing tasks due to emergent reasoning abilities. However, LLMs have inherent limitations as they are incapable of accessing up-to-date information (stored on the Web or in task-specific knowledge bases), using external tools, and performing precise mathematical and logical reasoning. In this pap…

    Submitted 31 October, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 32 pages, 10 figures, 24 tables. Accepted to NeurIPS 2023

  38. arXiv:2211.11033  [pdf, other]

    cs.AI cs.CV

    On the Complexity of Bayesian Generalization

    Authors: Yu-Zhe Shi, Manjie Xu, John E. Hopcroft, Kun He, Joshua B. Tenenbaum, Song-Chun Zhu, Ying Nian Wu, Wenjuan Han, Yixin Zhu

    Abstract: We consider concept generalization at a large scale in the diverse and natural visual spectrum. Established computational modes (i.e., rule-based or similarity-based) are primarily studied isolated and focus on confined and abstract problem spaces. In this work, we study these two modes when the problem space scales up, and the $complexity$ of concepts becomes diverse. Specifically, at the…

    Submitted 25 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

  39. arXiv:2210.12515  [pdf, other]

    cs.LG cs.AI

    SpectraNet: Multivariate Forecasting and Imputation under Distribution Shifts and Missing Data

    Authors: Cristian Challu, Peihong Jiang, Ying Nian Wu, Laurent Callot

    Abstract: In this work, we tackle two widespread challenges in real applications for time-series forecasting that have been largely understudied: distribution shifts and missing data. We propose SpectraNet, a novel multivariate time-series forecasting model that dynamically infers a latent space spectral decomposition to capture current temporal dynamics and correlations on the recent observed history. A Co…

    Submitted 25 October, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

  40. arXiv:2210.02684  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Conformal Isometry of Lie Group Representation in Recurrent Network of Grid Cells

    Authors: Dehong Xu, Ruiqi Gao, Wen-Hao Zhang, Xue-Xin Wei, Ying Nian Wu

    Abstract: The activity of the grid cell population in the medial entorhinal cortex (MEC) of the mammalian brain forms a vector representation of the self-position of the animal. Recurrent neural networks have been proposed to explain the properties of the grid cells by updating the neural activity vector based on the velocity input of the animal. In doing so, the grid cell system effectively performs path i…

    Submitted 7 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  41. arXiv:2210.01603  [pdf, other]

    cs.LG cs.CL cs.CV

    Neural-Symbolic Recursive Machine for Systematic Generalization

    Authors: Qing Li, Yixin Zhu, Yitao Liang, Ying Nian Wu, Song-Chun Zhu, Siyuan Huang

    Abstract: Current learning models often struggle with human-like systematic generalization, particularly in learning compositional rules from limited data and extrapolating them to novel combinations. We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS), allowing for the emergence of combinatorial syntax and semantics directly from training data. The NSR emp…

    Submitted 29 April, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: ICLR 2024. Project website: https://liqing-ustc.github.io/NSR/

  42. arXiv:2209.14610  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

    Authors: Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

    Abstract: Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that in…

    Submitted 2 March, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: ICLR 2023. 26 pages and 18 figures. The data and code are available at https://promptpg.github.io

  43. arXiv:2206.05895  [pdf, other]

    cs.LG cs.CL

    Latent Diffusion Energy-Based Model for Interpretable Text Modeling

    Authors: Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Song-Chun Zhu, Ying Nian Wu

    Abstract: Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interests in generative modeling. Fueled by its flexibility in the formulation and strong modeling power of the latent space, recent works built upon it have made interesting attempts aiming at the interpretability of text modeling. However, latent space EBMs also inherit some flaws from EBMs in data spa…

    Submitted 4 October, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  44. arXiv:2205.05826  [pdf, other]

    cs.AR cs.CV cs.DC

    Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

    Authors: Yannan Nellie Wu, Po-An Tsai, Angshuman Parashar, Vivienne Sze, Joel S. Emer

    Abstract: In recent years, many accelerators have been proposed to efficiently process sparse tensor algebra applications (e.g., sparse neural networks). However, these proposals are single points in a large and diverse design space. The lack of systematic description and modeling support for these sparse tensor accelerators impedes hardware designers from efficient and effective design space exploration. T…

    Submitted 9 January, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: Update website link, update UOP format description

  45. arXiv:2202.07586  [pdf, other]

    cs.LG

    Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection

    Authors: Cristian Challu, Peihong Jiang, Ying Nian Wu, Laurent Callot

    Abstract: Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detec…

    Submitted 25 February, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: accepted at AISTATS 2022

  46. arXiv:2201.05299  [pdf, other]

    cs.CV cs.CL cs.IR

    A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering

    Authors: Feng Gao, Qing Ping, Govind Thattai, Aishwarya Reganti, Ying Nian Wu, Prem Natarajan

    Abstract: Outside-knowledge visual question answering (OK-VQA) requires the agent to comprehend the image, make use of relevant knowledge from the entire web, and digest all the information to answer the question. Most previous works address the problem by first fusing the image and question in the multi-modal space, which is inflexible for further fusion with a vast amount of external knowledge. In this pa…

    Submitted 13 January, 2022; originally announced January 2022.

  47. arXiv:2111.12990  [pdf, other]

    cs.AI cs.CV cs.LG

    Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning

    Authors: Chi Zhang, Sirui Xie, Baoxiong Jia, Ying Nian Wu, Song-Chun Zhu, Yixin Zhu

    Abstract: Is intelligence realized by connectionist or classicist? While connectionist approaches have achieved superhuman performance, there has been growing evidence that such task-specific superiority is particularly fragile in systematic generalization. This observation lies in the central debate between connectionist and classicist, wherein the latter continually advocates an algebraic treatment in cog…

    Submitted 20 July, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: ECCV 2022 paper. Supplementary: http://wellyzhang.github.io/attach/eccv22zhang_alans_supp.pdf Project: http://wellyzhang.github.io/project/alans.html

  48. arXiv:2110.15497  [pdf, other]

    cs.CV cs.LG

    Unsupervised Foreground Extraction via Deep Region Competition

    Authors: Peiyu Yu, Sirui Xie, Xiaojian Ma, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu

    Abstract: We present Deep Region Competition (DRC), an algorithm designed to extract foreground objects from images in a fully unsupervised manner. Foreground extraction can be viewed as a special case of generic image segmentation that focuses on identifying and disentangling objects from the background. In this work, we rethink the foreground extraction by reconciling energy-based prior with generative im…

    Submitted 4 October, 2023; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021

  49. arXiv:2110.00137  [pdf, other]

    cs.LG cs.AI cs.HC

    Iterative Teacher-Aware Learning

    Authors: Luyao Yuan, Dongruo Zhou, Junhong Shen, Jingdong Gao, Jeffrey L. Chen, Quanquan Gu, Ying Nian Wu, Song-Chun Zhu

    Abstract: In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. The teacher adjusts her teaching method for different students, and the student, after getting familiar with the teacher's instruction mechanism, can infer the teacher's intention to learn faster. Recently, the benefits of integrating this cooperative pedagogy into machine concept learning in dis…

    Submitted 26 October, 2021; v1 submitted 30 September, 2021; originally announced October 2021.

    Journal ref: Advances in Neural Information Processing Systems (2021)

  50. arXiv:2108.11556  [pdf, other]

    cs.LG

    Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification

    Authors: Bo Pang, Ying Nian Wu

    Abstract: We propose a latent space energy-based prior model for text generation and classification. The model stands on a generator network that generates the text sequence based on a continuous latent vector. The energy term of the prior model couples a continuous latent vector and a symbolic one-hot vector, so that discrete category can be inferred from the observed example based on the continuous latent…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 8 pages