-
A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models
Authors:
Shivam Kumar,
Yun Yang,
Lizhen Lin
Abstract:
In this work, we explore the theoretical properties of conditional deep generative models under the statistical framework of distribution regression where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. More specifically, we study the large-sample properties of a likelihood-based approach for estimating these models. Our results lead to the convergence rate of a sieve maximum likelihood estimator (MLE) for estimating the conditional distribution (and its devolved counterpart) of the response given predictors in the Hellinger (Wasserstein) metric. Our rates depend solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings provide an explanation of why conditional deep generative models can circumvent the curse of dimensionality from the perspective of statistical foundations and demonstrate that they can learn a broader class of nearly singular conditional distributions. Our analysis also emphasizes the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. Finally, in our numerical studies, we demonstrate the effective implementation of the proposed approach using both synthetic and real-world datasets, which also provide complementary validation to our theoretical findings.
Submitted 2 October, 2024;
originally announced October 2024.
-
Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation
Authors:
Rong Tang,
Lizhen Lin,
Yun Yang
Abstract:
We consider a class of conditional forward-backward diffusion models for conditional generative modeling, that is, generating new data given a covariate (or control variable). To formally study the theoretical properties of these conditional generative models, we adopt a statistical framework of distribution regression to characterize the large sample properties of the conditional distribution estimators induced by these conditional forward-backward diffusion models. Here, the conditional distribution of data is assumed to smoothly change over the covariate. In particular, our derived convergence rate is minimax-optimal under the total variation metric within the regimes covered by the existing literature. Additionally, we extend our theory by allowing both the data and the covariate variable to potentially admit a low-dimensional manifold structure. In this scenario, we demonstrate that the conditional forward-backward diffusion model can adapt to both manifold structures, meaning that the derived estimation error bound (under the Wasserstein metric) depends only on the intrinsic dimensionalities of the data and the covariate.
Submitted 30 September, 2024;
originally announced September 2024.
-
Data-Driven Output Regulation via Internal Model Principle
Authors:
Liquan Lin,
Jie Huang
Abstract:
Data-driven techniques have been developed to address the output regulation problem of unknown linear systems through various approaches. In this paper, we first extend an existing algorithm from single-input single-output linear systems to multi-input multi-output linear systems. Then, by separating the dynamics used in the learning phase and the control phase, we further propose an improved algorithm that significantly reduces the computational cost and weakens the solvability conditions compared with the first algorithm.
Submitted 14 September, 2024;
originally announced September 2024.
-
Towards Automatic Linearization via SMT Solving
Authors:
Jian Cao,
Liyong Lin,
Lele Li
Abstract:
Mathematical optimization is ubiquitous in modern applications. However, in practice, we often need to use nonlinear optimization models, for which the existing optimization tools such as Cplex or Gurobi may not be directly applicable and an (error-prone) manual transformation often has to be done. Thus, to address this issue, in this paper we investigate the problem of automatically verifying and synthesizing reductions, the solution of which may allow an automatic linearization of nonlinear models. We show that the synthesis of reductions can be formulated as an $\exists^* \forall^*$ synthesis problem, which can be solved by an SMT solver via the counter-example guided inductive synthesis approach (CEGIS).
Submitted 24 August, 2024;
originally announced August 2024.
-
Non-iterative complex variable solution on sequential shallow tunnelling in gravitational geomaterial with reasonable far-field displacement
Authors:
Luo-bin Lin,
Fu-quan Chen,
Change-jie Zheng,
Shang-shun Lin
Abstract:
Sequential excavation is common in shallow tunnel engineering, especially for large-span tunnels. However, existing complex variable solutions cannot handle sequential shallow tunnelling effectively. This paper proposes a new complex variable solution on sequential shallow tunnelling in gravitational geomaterial with reasonable far-field displacement in a non-iterative manner by incorporating a bidirectional stepwise conformal mapping combining Charge Simulation Method and Complex Dipole Simulation Method. The non-iterative manner ensures that the mechanical models of sequential excavation stages share a similar mathematical formulation with non-successive mixed boundary conditions, which are respectively transformed into corresponding homogeneous Riemann-Hilbert problems, which are solved to obtain stress and displacement fields of sequential shallow tunnelling. The proposed solution is subsequently validated by comparisons with an equivalent finite element solution, with good agreement. The comparisons also suggest that the proposed solution should be more accurate than the finite element one. A parametric investigation is finally conducted to illustrate possible practical applications of the proposed solution with several engineering recommendations. Additionally, the theoretical improvements and defects of the proposed solution are discussed for objectivity.
Submitted 28 July, 2024;
originally announced July 2024.
-
Infinite quantum signal processing for arbitrary Szegő functions
Authors:
Michel Alexis,
Lin Lin,
Gevorg Mnatsakanyan,
Christoph Thiele,
Jiasu Wang
Abstract:
We provide a complete solution to the problem of infinite quantum signal processing for the class of Szegő functions, which are functions that satisfy a logarithmic integrability condition and include almost any function that allows for a quantum signal processing representation. We do so by introducing a new algorithm called the Riemann-Hilbert-Weiss algorithm, which can compute any individual phase factor independent of all other phase factors. Our algorithm is also the first provably stable numerical algorithm for computing phase factors of any arbitrary Szegő function. The proof of stability involves solving a Riemann-Hilbert factorization problem in nonlinear Fourier analysis using elements of spectral theory.
Submitted 10 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
Complex variable solution on noncircular and asymmetrical tunnelling embedded by bidirectional conformal mapping incorporating Charge Simulation Method
Authors:
Luobin Lin,
Fuquan Chen,
Changjie Zheng,
Shangshun Lin
Abstract:
Mechanical issues of noncircular and asymmetrical tunnelling can be estimated using the complex variable method with a suitable conformal mapping. Existing solution schemes of conformal mapping for noncircular tunnels generally need an iteration or optimization strategy, and are thereby mathematically complicated. This paper proposes a new bidirectional conformal mapping for deep and shallow tunnels of noncircular and asymmetrical shapes by incorporating Charge Simulation Method. The solution scheme of this new bidirectional conformal mapping only involves a pair of linear systems, and is therefore logically straightforward, computationally efficient, and practically easy to code. New numerical strategies are developed to deal with possible sharp corners of the cavity by small arc simulation and densified collocation points. Several numerical examples are presented to illustrate the geometrical usage of the new bidirectional conformal mapping. Furthermore, the new bidirectional conformal mapping is embedded into two complex variable solutions of noncircular and asymmetrical shallow tunnelling in gravitational geomaterial with reasonable far-field displacement. The respective result comparisons with the finite element solution and the existing analytical solution show good agreement, indicating the feasible mechanical usage of the new bidirectional conformal mapping.
Submitted 17 June, 2024;
originally announced June 2024.
-
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Authors:
Licong Lin,
Jingfeng Wu,
Sham M. Kakade,
Peter L. Bartlett,
Jason D. Lee
Abstract:
Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation, bias, and variance errors, where the variance error increases with model size. This disagrees with the general form of neural scaling laws, which predict that increasing model size monotonically improves performance.
We study the theory of scaling laws in an infinite dimensional linear regression setup. Specifically, we consider a model with $M$ parameters as a linear function of sketched covariates. The model is trained by one-pass stochastic gradient descent (SGD) using $N$ data. Assuming the optimal parameter satisfies a Gaussian prior and the data covariance matrix has a power-law spectrum of degree $a>1$, we show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$. The variance error, which increases with $M$, is dominated by the other errors due to the implicit regularization of SGD, thus disappearing from the bound. Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
Submitted 12 June, 2024;
originally announced June 2024.
-
Multi-level quantum signal processing with applications to ground state preparation using fast-forwarded Hamiltonian evolution
Authors:
Yulong Dong,
Lin Lin
Abstract:
The preparation of the ground state of a Hamiltonian $H$ with a large spectral radius has applications in many areas such as electronic structure theory and quantum field theory. Given an initial state with a constant overlap with the ground state, and assuming that the Hamiltonian $H$ can be efficiently simulated with an ideal fast-forwarding protocol, we first demonstrate that employing a linear combination of unitaries (LCU) approach can prepare the ground state at a cost of $\mathcal{O}(\log^2(\|H\| \Delta^{-1}))$ queries to controlled Hamiltonian evolution. Here $\|H\|$ is the spectral radius of $H$ and $\Delta$ the spectral gap. However, traditional Quantum Signal Processing (QSP)-based methods fail to capitalize on this efficient protocol, and their cost scales as $\mathcal{O}(\|H\| \Delta^{-1})$. To bridge this gap, we develop a multi-level QSP-based algorithm that exploits the fast-forwarding feature. This novel algorithm not only matches the efficiency of the LCU approach when an ideal fast-forwarding protocol is available, but also exceeds it with a reduced cost that scales as $\mathcal{O}(\log(\|H\| \Delta^{-1}))$. Additionally, our multi-level QSP method requires only $\mathcal{O}(\log(\|H\| \Delta^{-1}))$ coefficients for implementing single qubit rotations. This eliminates the need for constructing the PREPARE oracle in LCU, which prepares a state encoding $\mathcal{O}(\|H\| \Delta^{-1})$ coefficients regardless of whether the Hamiltonian can be fast-forwarded.
Submitted 4 June, 2024;
originally announced June 2024.
-
Time-dependent complex variable solution on quasi three-dimensional shallow tunnelling in gravitational geomaterial with reasonable far-field displacement
Authors:
Luobin Lin,
Fuquan Chen,
Changjie Zheng
Abstract:
The three-dimensional effect of the tunnel face and gravitational excavation generally occur in shallow tunnelling, but are not adequately considered in present complex variable solutions. In this paper, a new time-dependent complex variable solution on quasi three-dimensional shallow tunnelling in gravitational geomaterial is derived, and the far-field displacement singularity is eliminated by fixing the far-field ground surface over the whole excavation time span. With an equivalent coefficient of three-dimensional effect, the quasi three-dimensional shallow tunnelling is transformed into a plane strain problem with time-dependent virtual traction along the tunnel periphery. The mixed boundaries of the fixed far-field ground surface and the nearby free segment form a homogeneous Riemann-Hilbert problem with extra constraints of the virtual traction along the tunnel periphery, which is simultaneously solved using an iterative linear system with good numerical stability. The mixed boundary conditions along the ground surface over the whole excavation time span are well satisfied in a numerical case, which is further examined by comparing with the corresponding finite element solution. The results are in good agreement, and the proposed solution shows high efficiency. Further discussions are made on excavation rate, viscosity, and solution convergence. A latent paradox is disclosed for objectivity.
Submitted 26 May, 2024;
originally announced May 2024.
-
Complex variable solution on over-/under-break shallow tunnelling in gravitational geomaterial with reasonable far-field displacement
Authors:
Luo-bin Lin,
Fu-quan Chen,
Jin-ping Zhuang
Abstract:
Over-/under-break excavation is a common phenomenon in shallow tunnelling, which is nonetheless not generally considered in existing complex variable solutions. In this paper, a new equilibrium mechanical model on over-/under-break shallow tunnelling in gravitational geomaterial is established by fixing the far-field ground surface to form a corresponding mixed boundary problem. With integration of a newly proposed bidirectional composite conformal mapping using Charge Simulation Method, a complex variable solution of infinite complex potential series is subsequently derived using analytic continuation to transform the mixed boundaries into a homogeneous Riemann-Hilbert problem, which is iteratively solved to obtain the stress and displacement in the geomaterial. The infinite complex potential series of the complex variable solution are truncated to obtain numerical results, which are rectified by Lanczos filtering to reduce the oscillation of the Gibbs phenomenon. The bidirectional conformal mapping is discussed and validated via several numerical cases, and the subsequent complex variable solution is verified by examining the Lanczos filtering and solution convergence, and by comparing with the corresponding finite element solution and an existing analytical solution. Further discussions are made to disclose possible defects of the proposed solution for objectivity.
Submitted 12 April, 2024;
originally announced April 2024.
-
The ESPRIT algorithm under high noise: Optimal error scaling and noisy super-resolution
Authors:
Zhiyan Ding,
Ethan N. Epperly,
Lin Lin,
Ruizhe Zhang
Abstract:
Subspace-based signal processing techniques, such as the Estimation of Signal Parameters via Rotational Invariant Techniques (ESPRIT) algorithm, are popular methods for spectral estimation. These algorithms can achieve the so-called super-resolution scaling under low noise conditions, surpassing the well-known Nyquist limit. However, the performance of these algorithms under high-noise conditions is not as well understood. Existing state-of-the-art analysis indicates that ESPRIT and related algorithms can be resilient even for signals where each observation is corrupted by statistically independent, mean-zero noise of size $\mathcal{O}(1)$, but these analyses only show that the error $\varepsilon$ decays at a slow rate $\varepsilon=\tilde{\mathcal{O}}(n^{-1/2})$ with respect to the cutoff frequency $n$. In this work, we prove that under certain assumptions of bias and high noise, the ESPRIT algorithm can attain a significantly improved error scaling $\varepsilon=\tilde{\mathcal{O}}(n^{-3/2})$, exhibiting noisy super-resolution scaling beyond the Nyquist limit. We further establish a theoretical lower bound and show that this scaling is optimal. Our analysis introduces novel matrix perturbation results, which could be of independent interest.
Submitted 22 April, 2024; v1 submitted 5 April, 2024;
originally announced April 2024.
-
An Efficient Quantum Circuit for Block Encoding a Pairing Hamiltonian
Authors:
Diyi Liu,
Weijie Du,
Lin Lin,
James P. Vary,
Chao Yang
Abstract:
We present an efficient quantum circuit for block encoding a pairing Hamiltonian of the kind often studied in nuclear physics. Our block encoding scheme does not require mapping the creation and annihilation operators to the Pauli operators and representing the Hamiltonian as a linear combination of unitaries. Instead, we show how to encode the Hamiltonian directly using controlled swap operations. We analyze the gate complexity of the block encoding circuit and show that it scales polynomially with respect to the number of qubits required to represent a quantum state associated with the pairing Hamiltonian. We also show how the block encoding circuit can be combined with the quantum singular value transformation to construct an efficient quantum circuit for approximating the density of states of a pairing Hamiltonian. The techniques presented can be extended to encode more general second-quantized Hamiltonians.
Submitted 21 February, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
Quantum algorithm for linear non-unitary dynamics with near-optimal dependence on all parameters
Authors:
Dong An,
Andrew M. Childs,
Lin Lin
Abstract:
We introduce a family of identities that express general linear non-unitary evolution operators as a linear combination of unitary evolution operators, each solving a Hamiltonian simulation problem. This formulation can exponentially enhance the accuracy of the recently introduced linear combination of Hamiltonian simulation (LCHS) method [An, Liu, and Lin, Physical Review Letters, 2023]. For the first time, this approach enables quantum algorithms to solve linear differential equations with both optimal state preparation cost and near-optimal scaling in matrix queries on all parameters.
Submitted 6 December, 2023;
originally announced December 2023.
-
Mean-field variational inference with the TAP free energy: Geometric and statistical properties in linear models
Authors:
Michael Celentano,
Zhou Fan,
Licong Lin,
Song Mei
Abstract:
We study mean-field variational inference in a Bayesian linear model when the sample size n is comparable to the dimension p. In high dimensions, the common approach of minimizing a Kullback-Leibler divergence from the posterior distribution, or maximizing an evidence lower bound, may deviate from the true posterior mean and underestimate posterior uncertainty. We study instead minimization of the TAP free energy, showing in a high-dimensional asymptotic framework that it has a local minimizer which provides a consistent estimate of the posterior marginals and may be used for correctly calibrated posterior inference. Geometrically, we show that the landscape of the TAP free energy is strongly convex in an extensive neighborhood of this local minimizer, which under certain general conditions can be found by an Approximate Message Passing (AMP) algorithm. We then exhibit an efficient algorithm that linearly converges to the minimizer within this local neighborhood. In settings where it is conjectured that no efficient algorithm can find this local neighborhood, we prove analogous geometric properties for a local minimizer of the TAP free energy reachable by AMP, and show that posterior inference based on this minimizer remains correctly calibrated.
Submitted 14 November, 2023;
originally announced November 2023.
-
Modified mean curvature flow and CMC foliation conjecture in almost Fuchsian manifolds
Authors:
Zheng Huang,
Longzhi Lin,
Zhou Zhang
Abstract:
There has been a conjecture, often attributed to Thurston, which asserts that every almost Fuchsian manifold is foliated by closed incompressible constant mean curvature (CMC) surfaces. In this paper, for a certain class of almost Fuchsian manifolds, we prove the long-time existence and convergence of the modified mean curvature flow $$\frac{\partial F}{\partial t}=-(H-c)\vec{\nu},$$ which was first introduced by Xiao and the second named author in \cite{LX12}. As an application, we confirm Thurston's CMC foliation conjecture for such a subclass of almost Fuchsian manifolds.
Submitted 7 November, 2023;
originally announced November 2023.
-
Random coordinate descent: a simple alternative for optimizing parameterized quantum circuits
Authors:
Zhiyan Ding,
Taehee Ko,
Jiahao Yao,
Lin Lin,
Xiantao Li
Abstract:
Variational quantum algorithms rely on the optimization of parameterized quantum circuits in noisy settings. The commonly used back-propagation procedure in classical machine learning is not directly applicable in this setting due to the collapse of quantum states after measurements. Thus, gradient estimations constitute a significant overhead in a gradient-based optimization of such quantum circuits. This paper introduces a random coordinate descent algorithm as a practical and easy-to-implement alternative to the full gradient descent algorithm. This algorithm only requires one partial derivative at each iteration. Motivated by the behavior of measurement noise in the practical optimization of parameterized quantum circuits, this paper presents an optimization problem setting that is amenable to analysis. Under this setting, the random coordinate descent algorithm exhibits the same level of stochastic stability as the full gradient approach, making it as resilient to noise. The complexity of the random coordinate descent method is generally no worse than that of the gradient descent and can be much better for various quantum optimization problems with anisotropic Lipschitz constants. Theoretical analysis and extensive numerical experiments validate our findings.
Submitted 28 June, 2024; v1 submitted 31 October, 2023;
originally announced November 2023.
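Illustrative note (not from the paper): the abstract above describes an update that uses a single partial derivative per iteration. A minimal sketch of that idea on a classical toy problem follows. The quadratic objective, the curvature values in `CURV`, the Gaussian noise model, and all function names are assumptions made for illustration only; they are not the authors' setting, circuits, or code.

```python
import random

# Hypothetical toy objective: f(x) = 0.5 * sum_i CURV[i] * x[i]**2.
# The very different curvatures stand in for anisotropic Lipschitz constants.
CURV = [1.0, 0.1, 0.01]


def loss(x):
    """Evaluate the toy quadratic objective."""
    return 0.5 * sum(c * v * v for c, v in zip(CURV, x))


def make_noisy_partial(rng, sigma):
    """Return a function giving a noisy estimate of the j-th partial derivative.

    The additive Gaussian noise is an assumed stand-in for measurement noise.
    """
    def partial(x, j):
        return CURV[j] * x[j] + rng.gauss(0.0, sigma)
    return partial


def random_coordinate_descent(partial, x0, step, iters, rng):
    """Update one uniformly chosen coordinate per iteration."""
    x = list(x0)  # copy so the caller's starting point is not modified
    for _ in range(iters):
        j = rng.randrange(len(x))      # pick a coordinate at random
        x[j] -= step * partial(x, j)   # one partial derivative per step
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [1.0, 1.0, 1.0]
    x = random_coordinate_descent(make_noisy_partial(rng, 0.01), x0, 0.5, 300, rng)
    print("initial loss:", loss(x0), "final loss:", loss(x))
```

Each iteration here costs one partial-derivative estimate, whereas a full gradient step on the same toy problem would cost one estimate per coordinate.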
-
A new complex variable solution on noncircular shallow tunnelling with reasonable far-field displacement
Authors:
Luo-bin Lin,
Fu-quan Chen,
Shang-shun Lin
Abstract:
A new mechanical model on noncircular shallow tunnelling considering the initial stress field is proposed in this paper by constraining the far-field ground surface to eliminate displacement singularity at infinity, and the originally unbalanced tunnel excavation problem in existing solutions is turned into an equilibrium one of mixed boundaries. By applying analytic continuation, the mixed boundaries are transformed into a homogeneous Riemann-Hilbert problem, which is subsequently solved via an efficient and accurate iterative method with boundary conditions of static equilibrium, displacement single-valuedness, and traction along the tunnel periphery. The Lanczos filtering technique is used in the final stress and displacement solution to reduce the Gibbs phenomenon caused by the constrained far-field ground surface for more accurate results. Several numerical cases are conducted to verify the proposed solution by examining boundary conditions and comparing with existing solutions, and all the results are in good agreement. Further numerical cases are then conducted to investigate the stress and deformation distribution along the ground surface and tunnel periphery, and several engineering recommendations are given. Further discussions on the defects of the proposed solution are also conducted for objectivity.
Submitted 20 October, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
Authors:
Licong Lin,
Yu Bai,
Song Mei
Abstract:
Large transformer models pretrained on offline reinforcement learning datasets have demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they can make good decisions when prompted with interaction trajectories from unseen environments. However, when and how transformers can be trained to perform ICRL have not been theoretically well-understood. In particular, it is unclear which reinforcement-learning algorithms transformers can perform in context, and how distribution mismatch in offline training data affects the learned algorithms. This paper provides a theoretical framework that analyzes supervised pretraining for ICRL. This includes two recently proposed training methods -- algorithm distillation and decision-pretrained transformers. First, assuming model realizability, we prove the supervised-pretrained transformer will imitate the conditional expectation of the expert algorithm given the observed trajectory. The generalization error will scale with model capacity and a distribution divergence factor between the expert and offline algorithms. Second, we show transformers with ReLU attention can efficiently approximate near-optimal online reinforcement learning algorithms like LinUCB and Thompson sampling for stochastic linear bandits, and UCB-VI for tabular Markov decision processes. This provides the first quantitative analysis of the ICRL capabilities of transformers pretrained from offline trajectories.
Submitted 26 May, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Updatable Estimation in Generalized Linear Models with Missing Response
Authors:
Xianhua Zhang,
Lu Lin,
Qihua Wang
Abstract:
This paper develops an updatable inverse probability weighting (UIPW) estimation for the generalized linear models with response missing at random in streaming data sets. A two-step online updating algorithm is provided for the proposed method. In the first step we construct an updatable estimator for the parameter in propensity function and hence obtain an updatable estimator of the propensity function; in the second step we propose an UIPW estimator with the inverse of the updating propensity function value at each observation as the weight for estimating the parameter of interest. The UIPW estimation is universally applicable due to its relaxation on the constraint on the number of data batches. It is shown that the proposed estimator is consistent and asymptotically normal with the same asymptotic variance as that of the oracle estimator, and hence the oracle property is obtained. The finite sample performance of the proposed estimator is illustrated by the simulation and real data analysis. All numerical studies confirm that the UIPW estimator performs as well as the batch learner.
Submitted 7 October, 2023;
originally announced October 2023.
-
Statistical Limits of Adaptive Linear Models: Low-Dimensional Estimation and Inference
Authors:
Licong Lin,
Mufang Ying,
Suvrojit Ghosh,
Koulik Khamaru,
Cun-Hui Zhang
Abstract:
Estimation and inference in statistics pose significant challenges when data are collected adaptively. Even in linear models, the Ordinary Least Squares (OLS) estimator may fail to exhibit asymptotic normality for single coordinate estimation and have inflated error. This issue is highlighted by a recent minimax lower bound, which shows that the error of estimating a single coordinate can be enlarged by a multiple of $\sqrt{d}$ when data are allowed to be arbitrarily adaptive, compared with the case when they are i.i.d. Our work explores this striking difference in estimation performance between utilizing i.i.d. and adaptive data. We investigate how the degree of adaptivity in data collection impacts the performance of estimating a low-dimensional parameter component in high-dimensional linear models. We identify conditions on the data collection mechanism under which the estimation error for a low-dimensional parameter component matches its counterpart in the i.i.d. setting, up to a factor that depends on the degree of adaptivity. We show that OLS or OLS on centered data can achieve this matching error. In addition, we propose a novel estimator for single coordinate inference via solving a Two-stage Adaptive Linear Estimating equation (TALE). Under a weaker form of adaptivity in data collection, we establish an asymptotic normality property of the proposed estimator.
Submitted 28 October, 2023; v1 submitted 30 September, 2023;
originally announced October 2023.
-
A Refined Algorithm for the Adaptive Optimal Output Regulation Problem
Authors:
Liquan Lin,
Jie Huang
Abstract:
Given an unknown linear system with $m$ inputs, $p$ outputs, an $n$-dimensional state vector, and a $q$-dimensional exosystem, the problem of the adaptive optimal output regulation of this system boils down to iteratively solving a set of linear equations, each of which contains $\frac{n (n+1)}{2} + (m+q)n$ unknown variables. In this paper, we refine the existing algorithm by decoupling each of these linear equations into two lower-dimensional linear equations. The first one contains $nq$ unknown variables, and the second one contains $\frac{n (n+1)}{2} + mn$ unknown variables. As a result, the solvability conditions for these equations are also significantly weakened.
Submitted 27 September, 2023;
originally announced September 2023.
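
The unknown counts quoted in the abstract above are easy to check numerically. The sketch below only evaluates the three formulas from the abstract; the function names and the sample dimensions (n = 4, m = 2, q = 2) are hypothetical and chosen purely for illustration.

```python
def unknowns_original(n, m, q):
    """Unknowns per equation in the existing algorithm: n(n+1)/2 + (m+q)n."""
    return n * (n + 1) // 2 + (m + q) * n


def unknowns_decoupled(n, m, q):
    """Unknowns in the two decoupled equations: nq and n(n+1)/2 + mn."""
    return n * q, n * (n + 1) // 2 + m * n


# Hypothetical dimensions, not taken from the paper.
n, m, q = 4, 2, 2
original = unknowns_original(n, m, q)
first, second = unknowns_decoupled(n, m, q)
print(f"original: {original}, decoupled: {first} and {second}")
```

For these dimensions the single equation has 26 unknowns, while the decoupled equations have 8 and 18. The total is unchanged, but each equation is smaller, which is consistent with the abstract's remark that the solvability conditions are weakened.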
-
Reasonable mechanical model on shallow tunnel excavation to eliminate displacement singularity caused by unbalanced resultant
Authors:
Luobin Lin,
Fuquan Chen,
Xianhai Huang
Abstract:
When the initial stress field in the geomaterial is considered, shallow tunnel excavation produces a nonzero resultant, which introduces logarithmic terms in the complex potentials and leads to a displacement singularity at infinity that contradicts real-world geo-engineering observations. The mechanical and mathematical reasons for this displacement singularity in the existing mechanical models are elaborated, and a new mechanical model is subsequently proposed to eliminate the singularity by constraining the far-field ground surface displacement, so that the original unbalanced resultant problem is converted into an equilibrium one with mixed boundary conditions. To solve for stress and displacement in the new model, analytic continuation is applied to transform the mixed boundary conditions into a homogeneous Riemann-Hilbert problem with extra constraints, which is then solved using an approximate iterative method with good numerical stability. Lanczos filtering is applied to the stress and displacement solution to reduce the Gibbs phenomena caused by the abrupt change of boundary conditions along the ground surface. Several numerical cases are conducted to verify the proposed mechanical model, and the results confirm that it eliminates the displacement singularity caused by the unbalanced resultant and yields stress and displacement for shallow tunnel excavation with good convergence and accuracy. A parametric investigation is subsequently conducted to study the influence of tunnel depth, lateral coefficient, and free surface range on the stress and displacement distribution in the geomaterial.
Submitted 7 August, 2023;
originally announced August 2023.
-
Dense outputs from quantum simulations
Authors:
Jin-Peng Liu,
Lin Lin
Abstract:
The quantum dense output problem is the process of evaluating time-accumulated observables from time-dependent quantum dynamics using quantum computers. This problem arises frequently in applications such as quantum control and spectroscopic computation. We present a range of algorithms designed to operate on both early and fully fault-tolerant quantum platforms. These methodologies draw upon techniques like amplitude estimation, Hamiltonian simulation, quantum linear Ordinary Differential Equation (ODE) solvers, and quantum Carleman linearization. We provide a comprehensive complexity analysis with respect to the evolution time $T$ and error tolerance $ε$. Our results demonstrate that the linearization approach can nearly achieve optimal complexity $\mathcal{O}(T/ε)$ for a certain type of low-rank dense outputs. Moreover, we provide a linearization of the dense output problem that yields an exact and finite-dimensional closure which encompasses the original states. This formulation is related to the Koopman Invariant Subspace theory and may be of independent interest in nonlinear control and scientific machine learning.
Submitted 19 June, 2024; v1 submitted 26 July, 2023;
originally announced July 2023.
-
Robust iterative method for symmetric quantum signal processing in all parameter regimes
Authors:
Yulong Dong,
Lin Lin,
Hongkang Ni,
Jiasu Wang
Abstract:
This paper addresses the problem of solving nonlinear systems in the context of symmetric quantum signal processing (QSP), a powerful technique for implementing matrix functions on quantum computers. Symmetric QSP focuses on representing target polynomials as products of matrices in SU(2) that possess symmetry properties. We present a novel Newton's method tailored for efficiently solving the nonlinear system involved in determining the phase factors within the symmetric QSP framework. Our method demonstrates rapid and robust convergence in all parameter regimes, including the challenging scenario with ill-conditioned Jacobian matrices, using standard double precision arithmetic operations. For instance, solving symmetric QSP for a highly oscillatory target function $α\cos(1000 x)$ (polynomial degree $\approx 1433$) takes $6$ iterations to converge to machine precision when $α=0.9$, and the number of iterations only increases to $18$ when $α=1-10^{-9}$ with a highly ill-conditioned Jacobian matrix. Leveraging the matrix product state structure of symmetric QSP, the computation of the Jacobian matrix incurs a computational cost comparable to a single function evaluation. Moreover, we introduce a reformulation of symmetric QSP using real-number arithmetic, further enhancing the method's efficiency. Extensive numerical tests validate the effectiveness and robustness of our approach, which has been implemented in the QSPPACK software package.
Submitted 23 July, 2023;
originally announced July 2023.
-
Invariant correlation under marginal transforms
Authors:
Takaaki Koike,
Liyuan Lin,
Ruodu Wang
Abstract:
A useful property of independent samples is that their correlation remains the same after applying marginal transforms. This invariance property plays a fundamental role in statistical inference, but does not hold in general for dependent samples. In this paper, we study this invariance property on the Pearson correlation coefficient and its applications. A multivariate random vector is said to have an invariant correlation if its pairwise correlation coefficients remain unchanged under any common marginal transforms. For a bivariate case, we characterize all models of such a random vector via a certain combination of comonotonicity -- the strongest form of positive dependence -- and independence. In particular, we show that the class of exchangeable copulas with invariant correlation is precisely described by what we call positive Fréchet copulas. In the general multivariate case, we characterize the set of all invariant correlation matrices via the clique partition polytope. We also propose a positive regression dependent model that admits any prescribed invariant correlation matrix. Finally, we show that all our characterization results of invariant correlation, except one special case, remain the same if the common marginal transforms are confined to the set of increasing ones.
Submitted 15 August, 2024; v1 submitted 19 June, 2023;
originally announced June 2023.
-
Inverse Volume Scaling of Finite-Size Error in Periodic Coupled Cluster Theory
Authors:
Xin Xing,
Lin Lin
Abstract:
Coupled cluster theory is one of the most popular post-Hartree-Fock methods for ab initio molecular quantum chemistry. The finite-size error of the correlation energy in periodic coupled cluster calculations for three-dimensional insulating systems has been observed to satisfy the inverse volume scaling, even in the absence of any correction schemes. This is surprising, as simpler theories that utilize only a subset of the coupled cluster diagrams exhibit much slower decay of the finite-size error, which scales inversely with the length of the system. In this study, we review the current understanding of finite-size error in quantum chemistry methods for periodic systems. We introduce new tools that elucidate the mechanisms behind this phenomenon in the context of coupled cluster doubles calculations. This reconciles some seemingly paradoxical statements related to finite-size scaling. Our findings also show that singularity subtraction can be a powerful method to effectively reduce finite-size errors in practical quantum chemistry calculations for periodic systems.
Submitted 31 March, 2024; v1 submitted 6 April, 2023;
originally announced April 2023.
-
Parallel generalized solutions of mixed boundary value problem on partially fixed unit annulus subjected to arbitrary traction
Authors:
Luobin Lin,
Fuquan Chen,
Xianhai Huang
Abstract:
This paper provides two parallel solutions on the mixed boundary value problem of a unit annulus subjected to a partially fixed outer periphery and an arbitrary traction acting along the inner periphery using the complex variable method. The analytic continuation is applied to turn the mixed boundary value problem into a Riemann-Hilbert problem across the free segment along the outer periphery. Two parallel interpreting methods of the unused traction and displacement boundary condition along the outer periphery together with the traction boundary condition along the inner periphery respectively form two parallel complex linear constraint sets, which are then iteratively solved via a successive approximation method to reach the same stable stress and displacement solutions with the Lanczos filtering technique. Finally, four typical numerical cases coded by \texttt{FORTRAN} are carried out and compared to the same cases performed on \texttt{ABAQUS}. The results indicate that these two parallel solutions are both accurate, stable, robust, and fast, and validate that these two parallel solutions are numerically equivalent.
Submitted 5 April, 2023;
originally announced April 2023.
-
Anti-symmetric Barron functions and their approximation with sums of determinants
Authors:
Nilin Abrahamsen,
Lin Lin
Abstract:
A fundamental problem in quantum physics is to encode functions that are completely anti-symmetric under permutations of identical particles. The Barron space consists of high-dimensional functions that can be parameterized by infinite neural networks with one hidden layer. By explicitly encoding the anti-symmetric structure, we prove that the anti-symmetric functions which belong to the Barron space can be efficiently approximated with sums of determinants. This yields a factorial improvement in complexity compared to the standard representation in the Barron space and provides a theoretical explanation for the effectiveness of determinant-based architectures in ab-initio quantum chemistry.
Submitted 22 March, 2023;
originally announced March 2023.
-
Semi-parametric inference based on adaptively collected data
Authors:
Licong Lin,
Koulik Khamaru,
Martin J. Wainwright
Abstract:
Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the parameter vector of a generalized linear regression model contaminated by a non-parametric nuisance component. We construct suitably weighted estimating equations that account for adaptivity in data collection, and provide conditions under which the associated estimates are asymptotically normal. Our results characterize the degree of "explorability" required for asymptotic normality to hold. For the simpler problem of estimating a linear functional, we provide similar guarantees under much weaker assumptions. We illustrate our general theory with concrete consequences for various problems, including standard linear bandits and sparse generalized bandits, and compare with other methods via simulation studies.
Submitted 4 March, 2023;
originally announced March 2023.
-
Linear combination of Hamiltonian simulation for nonunitary dynamics with optimal state preparation cost
Authors:
Dong An,
Jin-Peng Liu,
Lin Lin
Abstract:
We propose a simple method for simulating a general class of non-unitary dynamics as a linear combination of Hamiltonian simulation (LCHS) problems. LCHS does not rely on converting the problem into a dilated linear system problem, or on the spectral mapping theorem. The latter is the mathematical foundation of many quantum algorithms for solving a wide variety of tasks involving non-unitary processes, such as the quantum singular value transformation (QSVT). The LCHS method can achieve optimal cost in terms of state preparation. We also demonstrate an application for open quantum dynamics simulation using the complex absorbing potential method with near-optimal dependence on all parameters.
Submitted 23 October, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
On Robust Numerical Solver for ODE via Self-Attention Mechanism
Authors:
Zhongzhan Huang,
Mingfu Liang,
Liang Lin
Abstract:
With the development of deep learning techniques, AI-enhanced numerical solvers are expected to become a new paradigm for solving differential equations due to their versatility and effectiveness in alleviating the accuracy-speed trade-off in traditional numerical solvers. However, this paradigm still inevitably requires a large amount of high-quality data, whose acquisition is often very expensive in natural science and engineering problems. Therefore, in this paper, we explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances. We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, AttSolver, which introduces an additive self-attention mechanism to the numerical solution of differential equations based on the dynamical system perspective of the residual neural network. Our results on benchmarks, ranging from high-dimensional problems to chaotic systems, demonstrate the effectiveness of AttSolver in generally improving the performance of existing traditional numerical solvers without any elaborated model crafting. Finally, we analyze the convergence, generalization, and robustness of the proposed method experimentally and theoretically.
Submitted 4 February, 2023;
originally announced February 2023.
-
Finite-size effects in periodic coupled cluster calculations
Authors:
Xin Xing,
Lin Lin
Abstract:
We provide the first rigorous study of the finite-size error in the simplest and representative coupled cluster theory, namely the coupled cluster doubles (CCD) theory, for gapped periodic systems. Assuming that the CCD equations are solved using exact Hartree-Fock orbitals and orbital energies, we prove that the convergence rate of finite-size error scales as $\mathscr{O}(N_\mathbf{k}^{-\frac13})$, where $N_{\mathbf{k}}$ is the number of discretization points in the Brillouin zone and characterizes the system size. Our analysis shows that the dominant error lies in the coupled cluster amplitude calculation, and the convergence of the finite-size error in energy calculations can be boosted to $\mathscr{O}(N_\mathbf{k}^{-1})$ with accurate amplitudes. This also provides the first proof of the scaling of the finite-size error in the third-order Møller-Plesset perturbation theory (MP3) for periodic systems.
Submitted 12 February, 2023;
originally announced February 2023.
-
Extrinsic Bayesian Optimizations on Manifolds
Authors:
Yihao Fang,
Mu Niu,
Pokman Cheung,
Lizhen Lin
Abstract:
We propose an extrinsic Bayesian optimization (eBO) framework for general optimization problems on manifolds. Bayesian optimization algorithms build a surrogate of the objective function by employing Gaussian processes and quantify the uncertainty in that surrogate by deriving an acquisition function. This acquisition function represents the probability of improvement based on the kernel of the Gaussian process, which guides the search in the optimization process. The critical challenge for designing Bayesian optimization algorithms on manifolds lies in the difficulty of constructing valid covariance kernels for Gaussian processes on general manifolds. Our approach is to employ extrinsic Gaussian processes by first embedding the manifold onto some higher dimensional Euclidean space via equivariant embeddings and then constructing a valid covariance kernel on the image manifold after the embedding. This leads to efficient and scalable algorithms for optimization over complex manifolds. Simulation study and real data analysis are carried out to demonstrate the utilities of our eBO framework by applying the eBO to various optimization problems over manifolds such as the sphere, the Grassmannian, and the manifold of positive definite matrices.
Submitted 28 December, 2022; v1 submitted 21 December, 2022;
originally announced December 2022.
-
A spectral collocation method for elliptic PDEs in irregular domains with Fourier extension
Authors:
Xianru Chen,
Li Lin
Abstract:
Based on the Fourier extension, we propose an oversampling collocation method for solving elliptic partial differential equations with variable coefficients over arbitrary irregular domains. The method uses only function values on equispaced nodes, which makes it versatile and computationally inexpensive. A variety of numerical experiments demonstrate the effectiveness of the method; they also show that the approximation error quickly reaches a plateau as the number of degrees of freedom increases, owing to the inherent ill-conditioning of frames.
Submitted 11 November, 2022;
originally announced November 2022.
-
Near-optimal multiple testing in Bayesian linear models with finite-sample FDR control
Authors:
Taejoo Ahn,
Licong Lin,
Song Mei
Abstract:
In high dimensional variable selection problems, statisticians often seek to design multiple testing procedures that control the False Discovery Rate (FDR), while concurrently identifying a greater number of relevant variables. Model-X methods, such as Knockoffs and conditional randomization tests, achieve the primary goal of finite-sample FDR control, assuming a known distribution of covariates. However, whether these methods can also achieve the secondary goal of maximizing discoveries remains uncertain. In fact, designing procedures to discover more relevant variables with finite-sample FDR control is a largely open question, even within the arguably simplest linear models.
In this paper, we develop near-optimal multiple testing procedures for high dimensional Bayesian linear models with isotropic covariates. We introduce Model-X procedures that provably control the frequentist FDR from finite samples, even when the model is misspecified, and conjecturally achieve near-optimal power when the data follow the Bayesian linear model. Our proposed procedure, PoEdCe, incorporates three key ingredients: Posterior Expectation, distilled Conditional randomization test (dCRT), and the Benjamini-Hochberg procedure with e-values (eBH). The optimality conjecture of PoEdCe is based on a heuristic calculation of its asymptotic true positive proportion (TPP) and false discovery proportion (FDP), which is supported by methods from statistical physics as well as extensive numerical simulations. Our result establishes the Bayesian linear model as a benchmark for comparing the power of various multiple testing procedures.
Submitted 21 July, 2023; v1 submitted 4 November, 2022;
originally announced November 2022.
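
Of the three ingredients named in the abstract above, the Benjamini-Hochberg procedure with e-values (eBH) is the simplest to sketch in isolation. The code below is a generic illustration of the eBH rejection rule, not the PoEdCe procedure itself: it assumes the e-values have already been computed, and the function name `ebh` and the example e-values are hypothetical.

```python
def ebh(e_values, alpha):
    """Generic e-BH step: return the sorted indices of rejected hypotheses.

    Sort the e-values in decreasing order, find the largest k such that the
    k-th largest e-value is at least m / (alpha * k), and reject the
    hypotheses with the k largest e-values.
    """
    m = len(e_values)
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        if e_values[idx] >= m / (alpha * k):
            k_star = k
    return sorted(order[:k_star])


# Hypothetical e-values for five hypotheses, target FDR level 0.1.
rejected = ebh([100.0, 50.0, 2.0, 1.0, 0.5], alpha=0.1)
print(rejected)
```

With five hypotheses and level 0.1 the thresholds are 50, 25, and so on; only the two largest e-values clear theirs, so hypotheses 0 and 1 are rejected.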
-
Infinite quantum signal processing
Authors:
Yulong Dong,
Lin Lin,
Hongkang Ni,
Jiasu Wang
Abstract:
Quantum signal processing (QSP) represents a real scalar polynomial of degree $d$ using a product of unitary matrices of size $2\times 2$, parameterized by $(d+1)$ real numbers called the phase factors. This innovative representation of polynomials has a wide range of applications in quantum computation. When the polynomial of interest is obtained by truncating an infinite polynomial series, a natural question is whether the phase factors have a well defined limit as the degree $d\to \infty$. While the phase factors are generally not unique, we find that there exists a consistent choice of parameterization so that the limit is well defined in the $\ell^1$ space. This generalization of QSP, called the infinite quantum signal processing, can be used to represent a large class of non-polynomial functions. Our analysis reveals a surprising connection between the regularity of the target function and the decay properties of the phase factors. Our analysis also inspires a very simple and efficient algorithm to approximately compute the phase factors in the $\ell^1$ space. The algorithm uses only double precision arithmetic operations, and provably converges when the $\ell^1$ norm of the Chebyshev coefficients of the target function is upper bounded by a constant that is independent of $d$. This is also the first numerically stable algorithm for finding phase factors with provable performance guarantees in the limit $d\to \infty$.
Submitted 21 September, 2022;
originally announced September 2022.
-
On solution of conformal mapping for a lower half plane containing a symmetrical noncircular cavity
Authors:
Luobin Lin,
Fuquan Chen,
Xianhai Huang
Abstract:
In this paper, we provide a candidate solution for obtaining the coefficients of the conformal mapping for a lower half plane containing a symmetrical noncircular cavity, using a penalty function method and a modified Particle Swarm Method. The nonconvexity of the penalty function is proven via the concept of a convex function and proof by contradiction. The solution procedure is presented in detail in pseudocode to ensure that the solution can be fully reproduced and further improved. The solution accuracy and efficiency are also discussed.
Submitted 13 September, 2022;
originally announced September 2022.
-
Time-marching based quantum solvers for time-dependent linear differential equations
Authors:
Di Fang,
Lin Lin,
Yu Tong
Abstract:
The time-marching strategy, which propagates the solution from one time step to the next, is a natural strategy for solving time-dependent differential equations on classical computers, as well as for solving the Hamiltonian simulation problem on quantum computers. For more general linear differential equations, a time-marching based quantum solver can suffer from exponentially vanishing success probability with respect to the number of time steps and is thus considered impractical. We solve this problem by repeatedly invoking a technique called the uniform singular value amplification, and the overall success probability can be lower bounded by a quantity that is independent of the number of time steps. The success probability can be further improved using a compression gadget lemma. This provides a path of designing quantum differential equation solvers that is alternative to those based on quantum linear systems algorithms (QLSA). We demonstrate the performance of the time-marching strategy with a high-order integrator based on the truncated Dyson series. The complexity of the algorithm depends linearly on the amplification ratio, which quantifies the deviation from a unitary dynamics. We prove that the linear dependence on the amplification ratio attains the query complexity lower bound and thus cannot be improved in the worst case. This algorithm also surpasses existing QLSA based solvers in three aspects: (1) the coefficient matrix $A(t)$ does not need to be diagonalizable. (2) $A(t)$ can be non-smooth, and is only of bounded variation. (3) It can use fewer queries to the initial state. Finally, we demonstrate the time-marching strategy with a first-order truncated Magnus series, while retaining the aforementioned benefits. Our analysis also raises some open questions concerning the differences between time-marching and QLSA based methods for solving differential equations.
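The time-marching idea in this abstract can be illustrated classically. The sketch below is a first-order (forward Euler) propagation of $u'(t)=A(t)\,u(t)$ from one time step to the next. It is not the quantum algorithm of the paper; the function name `time_march` and the test matrix `A_rot` are hypothetical choices made for illustration.

```python
import numpy as np

def time_march(A, u0, T, n_steps):
    """Forward-Euler time marching for u'(t) = A(t) u(t) on [0, T].

    `A` is a callable returning the coefficient matrix at time t.
    """
    h = T / n_steps
    u = np.asarray(u0, dtype=complex)
    for k in range(n_steps):
        t_k = k * h
        # One step: u_{k+1} = (I + h A(t_k)) u_k
        u = u + h * (A(t_k) @ u)
    return u

def A_rot(t):
    # Constant generator of a rotation; exact solution is (cos t, -sin t).
    return np.array([[0.0, 1.0], [-1.0, 0.0]])

u_T = time_march(A_rot, [1.0, 0.0], T=1.0, n_steps=2000)
exact = np.array([np.cos(1.0), -np.sin(1.0)])
print(np.linalg.norm(u_T - exact))
```

Because `A` is passed as a callable, the same loop handles time-dependent coefficient matrices, including ones that are not diagonalizable.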
Submitted 15 March, 2023; v1 submitted 14 August, 2022;
originally announced August 2022.
-
On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver
Authors:
Zhongzhan Huang,
Senwei Liang,
Hong Zhang,
Haizhao Yang,
Liang Lin
Abstract:
The large-scale simulation of dynamical systems is critical in numerous scientific and engineering disciplines. However, traditional numerical solvers are limited by the choice of step sizes when estimating integration, resulting in a trade-off between accuracy and computational efficiency. To address this challenge, we introduce a deep learning-based corrector called Neural Vector (NeurVec), which can compensate for integration errors and enable larger time step sizes in simulations. Our extensive experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability on a continuous phase space, even when trained using limited and discrete data. NeurVec significantly accelerates traditional solvers, achieving speeds tens to hundreds of times faster while maintaining high levels of accuracy and stability. Moreover, NeurVec's simple-yet-effective design, combined with its ease of implementation, has the potential to establish a new paradigm for fast-solving differential equations based on deep learning.
Submitted 19 September, 2023; v1 submitted 7 August, 2022;
originally announced August 2022.
-
Uniqueness of conformal-harmonic maps on locally conformally flat 4-manifolds
Authors:
Longzhi Lin,
Jingyong Zhu
Abstract:
Motivated by the theory of harmonic maps on Riemannian surfaces, conformal-harmonic maps between two Riemannian manifolds $M$ and $N$ were introduced in search of a natural notion of harmonicity for maps defined on a general even dimensional Riemannian manifold $M$. They are critical points of a conformally invariant energy functional and reassemble the GJMS operators when the target is the set of real or complex numbers. On a four dimensional manifold, conformal-harmonic maps are the conformally invariant counterparts of the intrinsic bi-harmonic maps and a mapping version of the conformally invariant Paneitz operator for functions.
In this paper, we consider conformal-harmonic maps from certain locally conformally flat 4-manifolds into spheres. We prove a quantitative uniqueness result for such conformal-harmonic maps as an immediate consequence of convexity for the conformally-invariant energy functional. To this end, we are led to prove a version of second order Hardy inequality on manifolds, which may be of independent interest.
Submitted 20 February, 2023; v1 submitted 12 June, 2022;
originally announced June 2022.
-
Efficient anti-symmetrization of a neural network layer by taming the sign problem
Authors:
Nilin Abrahamsen,
Lin Lin
Abstract:
Explicit antisymmetrization of a neural network is a potential candidate for a universal function approximator for generic antisymmetric functions, which are ubiquitous in quantum physics. However, this procedure is a priori factorially costly to implement, making it impractical for large numbers of particles. The strategy also suffers from a sign problem. Namely, due to near-exact cancellation of positive and negative contributions, the magnitude of the antisymmetrized function may be significantly smaller than before anti-symmetrization. We show that the anti-symmetric projection of a two-layer neural network can be evaluated efficiently, opening the door to using a generic antisymmetric layer as a building block in anti-symmetric neural network Ansatzes. This approximation is effective when the sign problem is controlled, and we show that this property depends crucially on the choice of activation function under standard Xavier/He initialization methods. As a consequence, using a smooth activation function requires re-scaling of the neural network weights compared to standard initializations.
Submitted 6 September, 2023; v1 submitted 24 May, 2022;
originally announced May 2022.
-
Communication-Efficient Adaptive Federated Learning
Authors:
Yujia Wang,
Lu Lin,
Jinghui Chen
Abstract:
Federated learning is a machine learning training paradigm that enables clients to jointly train models without sharing their own localized data. However, the implementation of federated learning in practice still faces numerous challenges, such as the large communication overhead due to the repetitive server-client synchronization and the lack of adaptivity in SGD-based model updates. Although various methods have been proposed for reducing the communication cost by gradient compression or quantization, and federated versions of adaptive optimizers such as FedAdam have been proposed to add more adaptivity, the current federated learning framework still cannot solve the aforementioned challenges all at once. In this paper, we propose a novel communication-efficient adaptive federated learning method (FedCAMS) with theoretical convergence guarantees. We show that in the nonconvex stochastic optimization setting, our proposed FedCAMS achieves the same convergence rate of $O(\frac{1}{\sqrt{TKm}})$ as its non-compressed counterparts. Extensive experiments on various benchmarks verify our theoretical analysis.
Submitted 19 April, 2023; v1 submitted 5 May, 2022;
originally announced May 2022.
-
Joint mixability and notions of negative dependence
Authors:
Takaaki Koike,
Liyuan Lin,
Ruodu Wang
Abstract:
A joint mix is a random vector with a constant component-wise sum. The dependence structure of a joint mix minimizes some common objectives such as the variance of the component-wise sum, and it is regarded as a concept of extremal negative dependence. In this paper, we explore the connection between the joint mix structure and popular notions of negative dependence in statistics, such as negative correlation dependence, negative orthant dependence and negative association. A joint mix is not always negatively dependent in any of the above senses, but some natural classes of joint mixes are. We derive various necessary and sufficient conditions for a joint mix to be negatively dependent, and study the compatibility of these notions. For identical marginal distributions, we show that a negatively dependent joint mix solves a multi-marginal optimal transport problem for quadratic cost under a novel setting of uncertainty. Analysis of this optimal transport problem with heterogeneous marginals reveals a trade-off between negative dependence and the joint mix structure.
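A small numerical illustration of the definition in this abstract: for $U\sim\mathrm{Uniform}(0,1)$, the pair $(U,\,1-U)$ has a constant component-wise sum and is therefore a joint mix with identical uniform marginals. This two-dimensional example and the variable names are chosen here for illustration only and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=10_000)

# Each row is one draw of the random vector (U, 1 - U).
X = np.column_stack([u, 1.0 - u])

# The component-wise sum is constant, so its variance is (numerically) zero.
sums = X.sum(axis=1)

# The two components are perfectly negatively correlated.
corr = np.corrcoef(X[:, 0], X[:, 1])[0, 1]
print(sums.var(), corr)
```

In this example the joint mix is also negatively dependent in the correlation sense; the abstract notes that this need not hold for every joint mix.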
Submitted 2 January, 2024; v1 submitted 25 April, 2022;
originally announced April 2022.
-
Ground state preparation and energy estimation on early fault-tolerant quantum computers via quantum eigenvalue transformation of unitary matrices
Authors:
Yulong Dong,
Lin Lin,
Yu Tong
Abstract:
Under suitable assumptions, the algorithms in [Lin, Tong, Quantum 2020] can estimate the ground state energy and prepare the ground state of a quantum Hamiltonian with near-optimal query complexities. However, this is based on a block encoding input model of the Hamiltonian, whose implementation is known to require a large resource overhead. We develop a tool called quantum eigenvalue transformation of unitary matrices with real polynomials (QET-U), which uses a controlled Hamiltonian evolution as the input model, a single ancilla qubit and no multi-qubit control operations, and is thus suitable for early fault-tolerant quantum devices. This leads to a simple quantum algorithm that outperforms all previous algorithms with a comparable circuit structure for estimating the ground state energy. For a class of quantum spin Hamiltonians, we propose a new method that exploits certain anti-commutation relations and further removes the need to implement the controlled Hamiltonian evolution. Coupled with Trotter based approximation of the Hamiltonian evolution, the resulting algorithm can be very suitable for early fault-tolerant quantum devices. We demonstrate the performance of the algorithm using IBM Qiskit for the transverse field Ising model. If we are further allowed to use multi-qubit Toffoli gates, we can then implement amplitude amplification and a new binary amplitude estimation algorithm, which increases the circuit depth but decreases the total query complexity. The resulting algorithm saturates the near-optimal complexity for ground state preparation and energy estimation using a constant number of ancilla qubits (no more than 3).
Submitted 18 October, 2022; v1 submitted 12 April, 2022;
originally announced April 2022.
-
Explicit Quantum Circuits for Block Encodings of Certain Sparse Matrices
Authors:
Daan Camps,
Lin Lin,
Roel Van Beeumen,
Chao Yang
Abstract:
Many standard linear algebra problems can be solved on a quantum computer by using recently developed quantum linear algebra algorithms that make use of block encodings and quantum eigenvalue/singular value transformations. A block encoding embeds a properly scaled matrix of interest A in a larger unitary transformation U that can be decomposed into a product of simpler unitaries and implemented efficiently on a quantum computer. Although quantum algorithms can potentially achieve exponential speedup in solving linear algebra problems compared to the best classical algorithm, such gain in efficiency ultimately hinges on our ability to construct an efficient quantum circuit for the block encoding of A, which is difficult in general, and not trivial even for well-structured sparse matrices. In this paper, we give a few examples on how efficient quantum circuits can be explicitly constructed for some well-structured sparse matrices, and discuss a few strategies used in these constructions. We also provide implementations of these quantum circuits in MATLAB.
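The notion of a block encoding in this abstract can be checked numerically. The sketch below embeds a scaled square matrix $A/\alpha$ in the top-left block of a unitary of twice the size via the standard unitary dilation. It only verifies the definition with dense linear algebra and does not construct a quantum circuit; the function name `block_encode` and the scaling $\alpha = 2\|A\|_2$ are assumptions for illustration, and the paper's own implementations are in MATLAB.

```python
import numpy as np

def block_encode(A):
    """Return (U, alpha) with U unitary and U[:n, :n] == A / alpha.

    Assumes A is a nonzero square matrix. Uses the dilation
    U = [[B, sqrt(I - B B^H)], [sqrt(I - B^H B), -B^H]] with B = A / alpha.
    """
    A = np.asarray(A, dtype=complex)
    alpha = 2.0 * np.linalg.norm(A, 2)   # assumed scaling so that ||B||_2 <= 1
    B = A / alpha
    W, s, Vh = np.linalg.svd(B)          # B = W diag(s) Vh
    c = np.sqrt(np.clip(1.0 - s**2, 0.0, None))
    top_right = (W * c) @ W.conj().T     # sqrt(I - B B^H)
    bottom_left = (Vh.conj().T * c) @ Vh # sqrt(I - B^H B)
    U = np.block([[B, top_right], [bottom_left, -B.conj().T]])
    return U, alpha

A = np.array([[1.0, 2.0], [0.0, 3.0]])
U, alpha = block_encode(A)
print(np.allclose(U.conj().T @ U, np.eye(4)), np.allclose(alpha * U[:2, :2], A))
```

This dense construction costs a full singular value decomposition, which is why the abstract stresses explicit, efficient circuits for structured sparse matrices instead.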
Submitted 22 May, 2023; v1 submitted 18 March, 2022;
originally announced March 2022.
-
Global Bias-Corrected Divide-and-Conquer by Quantile-Matched Composite for General Nonparametric Regressions
Authors:
Yan Chen,
Lu Lin
Abstract:
The issues of bias-correction and robustness are crucial in the strategy of divide-and-conquer (DC), especially for asymmetric nonparametric models with massive data. It is known that quantile-based methods can achieve the robustness, but the quantile estimation for nonparametric regression has non-ignorable bias when the error distribution is asymmetric. This paper explores a global bias-corrected DC by quantile-matched composite for nonparametric regressions with general error distributions. The proposed strategies can achieve the bias-correction and robustness, simultaneously. Unlike common DC quantile estimations that use an identical quantile level to construct a local estimator by each local machine, in the new methodologies, the local estimators are obtained at various quantile levels for different data batches, and then the global estimator is elaborately constructed as a weighted sum of the local estimators. In the weighted sum, the weights and quantile levels are well-matched such that the bias of the global estimator is corrected significantly, especially for the case where the error distribution is asymmetric. Based on the asymptotic properties of the global estimator, the optimal weights are attained, and the corresponding algorithms are then suggested. The behaviors of the new methods are further illustrated by various numerical examples from simulation experiments and real data analyses. Compared with the competitors, the new methods have the favorable features of estimation accuracy, robustness, applicability and computational efficiency.
Submitted 29 January, 2022;
originally announced January 2022.
-
Lecture Notes on Quantum Algorithms for Scientific Computation
Authors:
Lin Lin
Abstract:
This is a set of lecture notes used in a graduate topic class in applied mathematics called "Quantum Algorithms for Scientific Computation" at the Department of Mathematics, UC Berkeley during the fall semester of 2021. These lecture notes focus only on quantum algorithms closely related to scientific computation, and in particular, matrix computation. The main purpose of the lecture notes is to introduce quantum phase estimation (QPE) and "post-QPE" methods such as block encoding, quantum signal processing, and quantum singular value transformation, and to demonstrate their applications in solving eigenvalue problems, linear systems of equations, and differential equations. The intended audience is the broad computational science and engineering (CSE) community interested in using fault-tolerant quantum computers to solve challenging scientific computing problems.
Submitted 20 January, 2022;
originally announced January 2022.
-
A new locally linear embedding scheme in light of Hessian eigenmap
Authors:
Liren Lin,
Chih-Wei Chen
Abstract:
We provide a new interpretation of Hessian locally linear embedding (HLLE), revealing that it is essentially a variant way to implement the same idea of locally linear embedding (LLE). Based on the new interpretation, a substantial simplification can be made, in which the idea of "Hessian" is replaced by rather arbitrary weights. Moreover, we show by numerical examples that HLLE may produce projection-like results when the dimension of the target space is larger than that of the data manifold, and hence one further modification concerning the manifold dimension is suggested. Combining all the observations, we finally achieve a new LLE-type method, which is called tangential LLE (TLLE). It is simpler and more robust than HLLE.
Submitted 16 December, 2021;
originally announced December 2021.
-
Bayesian Optimal Two-sample Tests in High-dimension
Authors:
Kyoungjae Lee,
Kisung You,
Lizhen Lin
Abstract:
We propose optimal Bayesian two-sample tests for testing equality of high-dimensional mean vectors and covariance matrices between two populations. In many applications including genomics and medical imaging, it is natural to assume that only a few entries of two mean vectors or covariance matrices are different. Many existing tests that rely on aggregating the difference between empirical means or covariance matrices are not optimal or yield low power under such setups. Motivated by this, we develop Bayesian two-sample tests employing a divide-and-conquer idea, which is powerful especially when the difference between two populations is sparse but large. The proposed two-sample tests manifest closed forms of Bayes factors and allow scalable computations even in high-dimensions. We prove that the proposed tests are consistent under relatively mild conditions compared to existing tests in the literature. Furthermore, the testable regions from the proposed tests turn out to be optimal in terms of rates. Simulation studies show clear advantages of the proposed tests over other state-of-the-art methods in various scenarios. Our tests are also applied to the analysis of the gene expression data of two cancer data sets.
Submitted 5 December, 2021;
originally announced December 2021.