-
Embrace rejection: Kernel matrix approximation by accelerated randomly pivoted Cholesky
Authors:
Ethan N. Epperly,
Joel A. Tropp,
Robert J. Webber
Abstract:
Randomly pivoted Cholesky (RPCholesky) is an algorithm for constructing a low-rank approximation of a positive-semidefinite matrix using a small number of columns. This paper develops an accelerated version of RPCholesky that employs block matrix computations and rejection sampling to efficiently simulate the execution of the original algorithm. For the task of approximating a kernel matrix, the a…
▽ More
Randomly pivoted Cholesky (RPCholesky) is an algorithm for constructing a low-rank approximation of a positive-semidefinite matrix using a small number of columns. This paper develops an accelerated version of RPCholesky that employs block matrix computations and rejection sampling to efficiently simulate the execution of the original algorithm. For the task of approximating a kernel matrix, the accelerated algorithm can run over $40\times$ faster. The paper contains implementation details, theoretical guarantees, experiments on benchmark data sets, and an application to computational chemistry.
△ Less
Submitted 4 October, 2024;
originally announced October 2024.
-
A new approach to strong convergence
Authors:
Chi-Fang Chen,
Jorge Garza-Vargas,
Joel A. Tropp,
Ramon van Handel
Abstract:
A family of random matrices $\boldsymbol{X}^N=(X_1^N,\ldots,X_d^N)$ converges strongly to a family $\boldsymbol{x}=(x_1,\ldots,x_d)$ in a $C^*$-algebra if $\|P(\boldsymbol{X}^N)\|\to\|P(\boldsymbol{x})\|$ for every noncommutative polynomial $P$. This phenomenon plays a central role in several recent breakthrough results on random graphs, geometry, and operator algebras. However, strong convergence…
▽ More
A family of random matrices $\boldsymbol{X}^N=(X_1^N,\ldots,X_d^N)$ converges strongly to a family $\boldsymbol{x}=(x_1,\ldots,x_d)$ in a $C^*$-algebra if $\|P(\boldsymbol{X}^N)\|\to\|P(\boldsymbol{x})\|$ for every noncommutative polynomial $P$. This phenomenon plays a central role in several recent breakthrough results on random graphs, geometry, and operator algebras. However, strong convergence is notoriously difficult to prove and has generally required delicate problem-specific methods.
In this paper, we develop a new approach to strong convergence that uses only soft arguments. Our method exploits the fact that for many natural models, the expected trace of $P(\boldsymbol{X}^N)$ is a rational function of $\frac{1}{N}$ whose lowest order asymptotics are easily understood. We develop a general technique to deduce strong convergence directly from these inputs using the inequality of A. and V. Markov for univariate polynomials and elementary Fourier analysis. To illustrate the method, we develop the following applications.
1. We give a short proof of the result of Friedman that random regular graphs have a near-optimal spectral gap, and obtain a sharp understanding of the large deviations probabilities of the second eigenvalue.
2. We prove a strong quantitative form of the strong convergence property of random permutation matrices due to Bordenave and Collins.
3. We extend the above to any stable representation of the symmetric group, providing many new examples of the strong convergence phenomenon.
△ Less
Submitted 26 June, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Randomized matrix computations: Themes and variations
Authors:
Anastasia Kireeva,
Joel A. Tropp
Abstract:
This short course offers a new perspective on randomized algorithms for matrix computations. It explores the distinct ways in which probability can be used to design algorithms for numerical linear algebra. Each design template is illustrated by its application to several computational problems. This treatment establishes conceptual foundations for randomized numerical linear algebra, and it forge…
▽ More
This short course offers a new perspective on randomized algorithms for matrix computations. It explores the distinct ways in which probability can be used to design algorithms for numerical linear algebra. Each design template is illustrated by its application to several computational problems. This treatment establishes conceptual foundations for randomized numerical linear algebra, and it forges links between algorithms that may initially seem unrelated.
△ Less
Submitted 8 April, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Randomized algorithms for low-rank matrix approximation: Design, analysis, and applications
Authors:
Joel A. Tropp,
Robert J. Webber
Abstract:
This survey explores modern approaches for computing low-rank approximations of high-dimensional matrices by means of the randomized SVD, randomized subspace iteration, and randomized block Krylov iteration. The paper compares the procedures via theoretical analyses and numerical studies to highlight how the best choice of algorithm depends on spectral properties of the matrix and the computationa…
▽ More
This survey explores modern approaches for computing low-rank approximations of high-dimensional matrices by means of the randomized SVD, randomized subspace iteration, and randomized block Krylov iteration. The paper compares the procedures via theoretical analyses and numerical studies to highlight how the best choice of algorithm depends on spectral properties of the matrix and the computational resources available.
Despite superior performance for many problems, randomized block Krylov iteration has not been widely adopted in computational science. The paper strengthens the case for this method in three ways. First, it presents new pseudocode that can significantly reduce computational costs. Second, it provides a new analysis that yields simple, precise, and informative error bounds. Last, it showcases applications to challenging scientific problems, including principal component analysis for genetic data and spectral clustering for molecular dynamics data.
△ Less
Submitted 21 September, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Robust, randomized preconditioning for kernel ridge regression
Authors:
Mateo Díaz,
Ethan N. Epperly,
Zachary Frangella,
Joel A. Tropp,
Robert J. Webber
Abstract:
This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficie…
▽ More
This paper investigates two randomized preconditioning techniques for solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$), and it introduces two new methods with state-of-the-art performance. The first method, RPCholesky preconditioning, accurately solves the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems, making them ideal for practical applications.
△ Less
Submitted 10 July, 2024; v1 submitted 24 April, 2023;
originally announced April 2023.
-
Sparse random Hamiltonians are quantumly easy
Authors:
Chi-Fang,
Chen,
Alexander M. Dalzell,
Mario Berta,
Fernando G. S. L. Brandão,
Joel A. Tropp
Abstract:
A candidate application for quantum computers is to simulate the low-temperature properties of quantum systems. For this task, there is a well-studied quantum algorithm that performs quantum phase estimation on an initial trial state that has a nonnegligible overlap with a low-energy state. However, it is notoriously hard to give theoretical guarantees that such a trial state can be prepared effic…
▽ More
A candidate application for quantum computers is to simulate the low-temperature properties of quantum systems. For this task, there is a well-studied quantum algorithm that performs quantum phase estimation on an initial trial state that has a nonnegligible overlap with a low-energy state. However, it is notoriously hard to give theoretical guarantees that such a trial state can be prepared efficiently. Moreover, the heuristic proposals that are currently available, such as with adiabatic state preparation, appear insufficient in practical cases. This paper shows that, for most random sparse Hamiltonians, the maximally mixed state is a sufficiently good trial state, and phase estimation efficiently prepares states with energy arbitrarily close to the ground energy. Furthermore, any low-energy state must have nonnegligible quantum circuit complexity, suggesting that low-energy states are classically nontrivial and phase estimation is the optimal method for preparing such states (up to polynomial factors). These statements hold for two models of random Hamiltonians: (i) a sum of random signed Pauli strings and (ii) a random signed $d$-sparse Hamiltonian. The main technical argument is based on some new results in nonasymptotic random matrix theory. In particular, a refined concentration bound for the spectral density is required to obtain complexity guarantees for these random Hamiltonians.
△ Less
Submitted 7 February, 2023;
originally announced February 2023.
-
XTrace: Making the most of every sample in stochastic trace estimation
Authors:
Ethan N. Epperly,
Joel A. Tropp,
Robert J. Webber
Abstract:
The implicit trace estimation problem asks for an approximation of the trace of a square matrix, accessed via matrix-vector products (matvecs). This paper designs new randomized algorithms, XTrace and XNysTrace, for the trace estimation problem by exploiting both variance reduction and the exchangeability principle. For a fixed budget of matvecs, numerical experiments show that the new methods can…
▽ More
The implicit trace estimation problem asks for an approximation of the trace of a square matrix, accessed via matrix-vector products (matvecs). This paper designs new randomized algorithms, XTrace and XNysTrace, for the trace estimation problem by exploiting both variance reduction and the exchangeability principle. For a fixed budget of matvecs, numerical experiments show that the new methods can achieve errors that are orders of magnitude smaller than existing algorithms, such as the Girard-Hutchinson estimator or the Hutch++ estimator. A theoretical analysis confirms the benefits by offering a precise description of the performance of these algorithms as a function of the spectrum of the input matrix. The paper also develops an exchangeable estimator, XDiag, for approximating the diagonal of a square matrix using matvecs.
△ Less
Submitted 5 January, 2024; v1 submitted 18 January, 2023;
originally announced January 2023.
-
Sharp phase transitions in Euclidean integral geometry
Authors:
Martin Lotz,
Joel A. Tropp
Abstract:
The intrinsic volumes of a convex body are fundamental invariants that capture information about the average volume of the projection of the convex body onto a random subspace of fixed dimension. The intrinsic volumes also play a central role in integral geometry formulas that describe how moving convex bodies interact. Recent work has demonstrated that the sequence of intrinsic volumes concentrat…
▽ More
The intrinsic volumes of a convex body are fundamental invariants that capture information about the average volume of the projection of the convex body onto a random subspace of fixed dimension. The intrinsic volumes also play a central role in integral geometry formulas that describe how moving convex bodies interact. Recent work has demonstrated that the sequence of intrinsic volumes concentrates sharply around its centroid, which is called the central intrinsic volume. The purpose of this paper is to derive finer concentration inequalities for the intrinsic volumes and related sequences. These concentration results have striking implications for high-dimensional integral geometry. In particular, they uncover new phase transitions in formulas for random projections, rotation means, random slicing, and the kinematic formula. In each case, the location of the phase transition is determined by reducing each convex body to a single summary parameter.
△ Less
Submitted 29 August, 2022;
originally announced August 2022.
-
Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations
Authors:
Yifan Chen,
Ethan N. Epperly,
Joel A. Tropp,
Robert J. Webber
Abstract:
The randomly pivoted partial Cholesky algorithm (RPCholesky) computes a factorized rank-k approximation of an N x N positive-semidefinite (psd) matrix. RPCholesky requires only (k + 1) N entry evaluations and O(k^2 N) additional arithmetic operations, and it can be implemented with just a few lines of code. The method is particularly useful for approximating a kernel matrix.
This paper offers a…
▽ More
The randomly pivoted partial Cholesky algorithm (RPCholesky) computes a factorized rank-k approximation of an N x N positive-semidefinite (psd) matrix. RPCholesky requires only (k + 1) N entry evaluations and O(k^2 N) additional arithmetic operations, and it can be implemented with just a few lines of code. The method is particularly useful for approximating a kernel matrix.
This paper offers a thorough new investigation of the empirical and theoretical behavior of this fundamental algorithm. For matrix approximation problems that arise in scientific machine learning, experiments show that RPCholesky matches or beats the performance of alternative algorithms. Moreover, RPCholesky provably returns low-rank approximations that are nearly optimal. The simplicity, effectiveness, and robustness of RPCholesky strongly support its use in scientific computing and machine learning applications.
△ Less
Submitted 21 October, 2024; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Efficient error and variance estimation for randomized matrix computations
Authors:
Ethan N. Epperly,
Joel A. Tropp
Abstract:
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling meth…
▽ More
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix computation. Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and randomized Nyström approximation, and they provide useful information that can be used to assess the quality of the computed output and guide algorithmic parameter choices.
△ Less
Submitted 1 October, 2024; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Fast & Accurate Randomized Algorithms for Linear Systems and Eigenvalue Problems
Authors:
Yuji Nakatsukasa,
Joel A. Tropp
Abstract:
This paper develops a new class of algorithms for general linear systems and eigenvalue problems. These algorithms apply fast randomized sketching to accelerate subspace projection methods, such as GMRES and Rayleigh--Ritz. This approach offers great flexibility in designing the basis for the approximation subspace, which can improve scalability in many computational environments. The resulting al…
▽ More
This paper develops a new class of algorithms for general linear systems and eigenvalue problems. These algorithms apply fast randomized sketching to accelerate subspace projection methods, such as GMRES and Rayleigh--Ritz. This approach offers great flexibility in designing the basis for the approximation subspace, which can improve scalability in many computational environments. The resulting algorithms outperform the classic methods with minimal loss of accuracy. For model problems, numerical experiments show large advantages over MATLAB's optimized routines, including a $100 \times$ speedup over gmres and a $10 \times$ speedup over eigs.
△ Less
Submitted 15 February, 2022; v1 submitted 29 October, 2021;
originally announced November 2021.
-
Randomized Nyström Preconditioning
Authors:
Zachary Frangella,
Joel A. Tropp,
Madeleine Udell
Abstract:
This paper introduces the Nyström PCG algorithm for solving a symmetric positive-definite linear system. The algorithm applies the randomized Nyström method to form a low-rank approximation of the matrix, which leads to an efficient preconditioner that can be deployed with the conjugate gradient algorithm. Theoretical analysis shows that preconditioned system has constant condition number as soon…
▽ More
This paper introduces the Nyström PCG algorithm for solving a symmetric positive-definite linear system. The algorithm applies the randomized Nyström method to form a low-rank approximation of the matrix, which leads to an efficient preconditioner that can be deployed with the conjugate gradient algorithm. Theoretical analysis shows that preconditioned system has constant condition number as soon as the rank of the approximation is comparable with the number of effective degrees of freedom in the matrix. The paper also develops adaptive methods that provably achieve similar performance without knowledge of the effective dimension. Numerical tests show that Nyström PCG can rapidly solve large linear systems that arise in data analysis problems, and it surpasses several competing methods from the literature.
△ Less
Submitted 17 December, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Randomized block Krylov methods for approximating extreme eigenvalues
Authors:
Joel A. Tropp
Abstract:
Randomized block Krylov subspace methods form a powerful class of algorithms for computing the extreme eigenvalues of a symmetric matrix or the extreme singular values of a general matrix. The purpose of this paper is to develop new theoretical bounds on the performance of randomized block Krylov subspace methods for these problems. For matrices with polynomial spectral decay, the randomized block…
▽ More
Randomized block Krylov subspace methods form a powerful class of algorithms for computing the extreme eigenvalues of a symmetric matrix or the extreme singular values of a general matrix. The purpose of this paper is to develop new theoretical bounds on the performance of randomized block Krylov subspace methods for these problems. For matrices with polynomial spectral decay, the randomized block Krylov method can obtain an accurate spectral norm estimate using only a constant number of steps (that depends on the decay rate and the accuracy). Furthermore, the analysis reveals that the behavior of the algorithm depends in a delicate way on the block size. Numerical evidence confirms these predictions.
△ Less
Submitted 1 October, 2021;
originally announced October 2021.
-
Learning to Forecast Dynamical Systems from Streaming Data
Authors:
Dimitris Giannakis,
Amelia Henriksen,
Joel A. Tropp,
Rachel Ward
Abstract:
Kernel analog forecasting (KAF) is a powerful methodology for data-driven, non-parametric forecasting of dynamically generated time series data. This approach has a rigorous foundation in Koopman operator theory and it produces good forecasts in practice, but it suffers from the heavy computational costs common to kernel methods. This paper proposes a streaming algorithm for KAF that only requires…
▽ More
Kernel analog forecasting (KAF) is a powerful methodology for data-driven, non-parametric forecasting of dynamically generated time series data. This approach has a rigorous foundation in Koopman operator theory and it produces good forecasts in practice, but it suffers from the heavy computational costs common to kernel methods. This paper proposes a streaming algorithm for KAF that only requires a single pass over the training data. This algorithm dramatically reduces the costs of training and prediction without sacrificing forecasting skill. Computational experiments demonstrate that the streaming KAF method can successfully forecast several classes of dynamical systems (periodic, quasi-periodic, and chaotic) in both data-scarce and data-rich regimes. The overall methodology may have wider interest as a new template for streaming kernel regression.
△ Less
Submitted 21 September, 2021; v1 submitted 20 September, 2021;
originally announced September 2021.
-
Tensor Random Projection for Low Memory Dimension Reduction
Authors:
Yiming Sun,
Yang Guo,
Joel A. Tropp,
Madeleine Udell
Abstract:
Random projections reduce the dimension of a set of vectors while preserving structural information, such as distances between vectors in the set. This paper proposes a novel use of row-product random matrices in random projection, where we call it Tensor Random Projection (TRP). It requires substantially less memory than existing dimension reduction maps. The TRP map is formed as the Khatri-Rao p…
▽ More
Random projections reduce the dimension of a set of vectors while preserving structural information, such as distances between vectors in the set. This paper proposes a novel use of row-product random matrices in random projection, where we call it Tensor Random Projection (TRP). It requires substantially less memory than existing dimension reduction maps. The TRP map is formed as the Khatri-Rao product of several smaller random projections, and is compatible with any base random projection including sparse maps, which enable dimension reduction with very low query cost and no floating point operations. We also develop a reduced variance extension. We provide a theoretical analysis of the bias and variance of the TRP, and a non-asymptotic error analysis for a TRP composed of two smaller maps. Experiments on both synthetic and MNIST data show that our method performs as well as conventional methods with substantially less storage.
△ Less
Submitted 30 April, 2021;
originally announced May 2021.
-
Concentration for random product formulas
Authors:
Chi-Fang Chen,
Hsin-Yuan Huang,
Richard Kueng,
Joel A. Tropp
Abstract:
Quantum simulation has wide applications in quantum chemistry and physics. Recently, scientists have begun exploring the use of randomized methods for accelerating quantum simulation. Among them, a simple and powerful technique, called qDRIFT, is known to generate random product formulas for which the average quantum channel approximates the ideal evolution. qDRIFT achieves a gate count that does…
▽ More
Quantum simulation has wide applications in quantum chemistry and physics. Recently, scientists have begun exploring the use of randomized methods for accelerating quantum simulation. Among them, a simple and powerful technique, called qDRIFT, is known to generate random product formulas for which the average quantum channel approximates the ideal evolution. qDRIFT achieves a gate count that does not explicitly depend on the number of terms in the Hamiltonian, which contrasts with Suzuki formulas. This work aims to understand the origin of this speed-up by comprehensively analyzing a single realization of the random product formula produced by qDRIFT. The main results prove that a typical realization of the randomized product formula approximates the ideal unitary evolution up to a small diamond-norm error. The gate complexity is already independent of the number of terms in the Hamiltonian, but it depends on the system size and the sum of the interaction strengths in the Hamiltonian. Remarkably, the same random evolution starting from an arbitrary, but fixed, input state yields a much shorter circuit suitable for that input state. In contrast, in deterministic settings, such an improvement usually requires initial state knowledge. The proofs depend on concentration inequalities for vector and matrix martingales, and the framework is applicable to other randomized product formulas. Our bounds are saturated by certain commuting Hamiltonians.
△ Less
Submitted 2 July, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Nonlinear Matrix Concentration via Semigroup Methods
Authors:
De Huang,
Joel A. Tropp
Abstract:
Matrix concentration inequalities provide information about the probability that a random matrix is close to its expectation with respect to the $l_2$ operator norm. This paper uses semigroup methods to derive sharp nonlinear matrix inequalities. In particular, it is shown that the classic Bakry-Émery curvature criterion implies subgaussian concentration for "matrix Lipschitz" functions. This argu…
▽ More
Matrix concentration inequalities provide information about the probability that a random matrix is close to its expectation with respect to the $l_2$ operator norm. This paper uses semigroup methods to derive sharp nonlinear matrix inequalities. In particular, it is shown that the classic Bakry-Émery curvature criterion implies subgaussian concentration for "matrix Lipschitz" functions. This argument circumvents the need to develop a matrix version of the log-Sobolev inequality, a technical obstacle that has blocked previous attempts to derive matrix concentration inequalities in this setting. The approach unifies and extends much of the previous work on matrix concentration. When applied to a product measure, the theory reproduces the matrix Efron-Stein inequalities due to Paulin et al. It also handles matrix-valued functions on a Riemannian manifold with uniformly positive Ricci curvature.
△ Less
Submitted 6 January, 2021; v1 submitted 30 June, 2020;
originally announced June 2020.
-
From Poincaré Inequalities to Nonlinear Matrix Concentration
Authors:
De Huang,
Joel A. Tropp
Abstract:
This paper deduces exponential matrix concentration from a Poincaré inequality via a short, conceptual argument. Among other examples, this theory applies to matrix-valued functions of a uniformly log-concave random vector. The proof relies on the subadditivity of Poincaré inequalities and a chain rule inequality for the trace of the matrix Dirichlet form. It also uses a symmetrization technique t…
▽ More
This paper deduces exponential matrix concentration from a Poincaré inequality via a short, conceptual argument. Among other examples, this theory applies to matrix-valued functions of a uniformly log-concave random vector. The proof relies on the subadditivity of Poincaré inequalities and a chain rule inequality for the trace of the matrix Dirichlet form. It also uses a symmetrization technique to avoid difficulties associated with a direct extension of the classic scalar argument.
△ Less
Submitted 6 January, 2021; v1 submitted 30 June, 2020;
originally announced June 2020.
-
Matrix Concentration for Products
Authors:
De Huang,
Jonathan Niles-Weed,
Joel A. Tropp,
Rachel Ward
Abstract:
This paper develops nonasymptotic growth and concentration bounds for a product of independent random matrices. These results sharpen and generalize recent work of Henriksen-Ward, and they are similar in spirit to the results of Ahlswede-Winter and of Tropp for a sum of independent random matrices. The argument relies on the uniform smoothness properties of the Schatten trace classes.
This paper develops nonasymptotic growth and concentration bounds for a product of independent random matrices. These results sharpen and generalize recent work of Henriksen-Ward, and they are similar in spirit to the results of Ahlswede-Winter and of Tropp for a sum of independent random matrices. The argument relies on the uniform smoothness properties of the Schatten trace classes.
△ Less
Submitted 11 March, 2020;
originally announced March 2020.
-
Scalable Semidefinite Programming
Authors:
Alp Yurtsever,
Joel A. Tropp,
Olivier Fercoq,
Madeleine Udell,
Volkan Cevher
Abstract:
Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This paper develops a provably correct randomized algorithm for solving large, weakly constrained SDP problems by economizing on the storage and arithmetic costs. Numerical evidence shows that the method is effective for a range of applications, including relax…
▽ More
Semidefinite programming (SDP) is a powerful framework from convex optimization that has striking potential for data science applications. This paper develops a provably correct randomized algorithm for solving large, weakly constrained SDP problems by economizing on the storage and arithmetic costs. Numerical evidence shows that the method is effective for a range of applications, including relaxations of MaxCut, abstract phase retrieval, and quadratic assignment. Running on a laptop equivalent, the algorithm can handle SDP instances where the matrix variable has over $10^{14}$ entries.
△ Less
Submitted 25 March, 2021; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Binary Component Decomposition Part I: The Positive-Semidefinite Case
Authors:
Richard Kueng,
Joel A. Tropp
Abstract:
This paper studies the problem of decomposing a low-rank positive-semidefinite matrix into symmetric factors with binary entries, either $\{\pm 1\}$ or $\{0,1\}$. This research answers fundamental questions about the existence and uniqueness of these decompositions. It also leads to tractable factorization algorithms that succeed under a mild deterministic condition. A companion paper addresses th…
▽ More
This paper studies the problem of decomposing a low-rank positive-semidefinite matrix into symmetric factors with binary entries, either $\{\pm 1\}$ or $\{0,1\}$. This research answers fundamental questions about the existence and uniqueness of these decompositions. It also leads to tractable factorization algorithms that succeed under a mild deterministic condition. A companion paper addresses the related problem of decomposing a low-rank rectangular matrix into a binary factor and an unconstrained factor.
△ Less
Submitted 31 July, 2019;
originally announced July 2019.
-
Binary component decomposition Part II: The asymmetric case
Authors:
Richard Kueng,
Joel A. Tropp
Abstract:
This paper studies the problem of decomposing a low-rank matrix into a factor with binary entries, either from $\{\pm 1\}$ or from $\{0,1\}$, and an unconstrained factor. The research answers fundamental questions about the existence and uniqueness of these decompositions. It also leads to tractable factorization algorithms that succeed under a mild deterministic condition. This work builds on a c…
▽ More
This paper studies the problem of decomposing a low-rank matrix into a factor with binary entries, either from $\{\pm 1\}$ or from $\{0,1\}$, and an unconstrained factor. The research answers fundamental questions about the existence and uniqueness of these decompositions. It also leads to tractable factorization algorithms that succeed under a mild deterministic condition. This work builds on a companion paper that addresses the related problem of decomposing a low-rank positive-semidefinite matrix into symmetric binary factors.
△ Less
Submitted 31 July, 2019;
originally announced July 2019.
-
Streaming Low-Rank Matrix Approximation with an Application to Scientific Simulation
Authors:
Joel A. Tropp,
Alp Yurtsever,
Madeleine Udell,
Volkan Cevher
Abstract:
This paper argues that randomized linear sketching is a natural tool for on-the-fly compression of data matrices that arise from large-scale scientific simulations and data collection. The technical contribution consists in a new algorithm for constructing an accurate low-rank approximation of a matrix from streaming data. This method is accompanied by an a priori analysis that allows the user to…
▽ More
This paper argues that randomized linear sketching is a natural tool for on-the-fly compression of data matrices that arise from large-scale scientific simulations and data collection. The technical contribution consists in a new algorithm for constructing an accurate low-rank approximation of a matrix from streaming data. This method is accompanied by an a priori analysis that allows the user to set algorithm parameters with confidence and an a posteriori error estimator that allows the user to validate the quality of the reconstructed matrix. In comparison to previous techniques, the new method achieves smaller relative approximation errors and is less sensitive to parameter choices. As concrete applications, the paper outlines how the algorithm can be used to compress a Navier--Stokes simulation and a sea surface temperature dataset.
△ Less
Submitted 22 February, 2019;
originally announced February 2019.
-
An Optimal-Storage Approach to Semidefinite Programming using Approximate Complementarity
Authors:
Lijun Ding,
Alp Yurtsever,
Volkan Cevher,
Joel A. Tropp,
Madeleine Udell
Abstract:
This paper develops a new storage-optimal algorithm that provably solves generic semidefinite programs (SDPs) in standard form. This method is particularly effective for weakly constrained SDPs. The key idea is to formulate an approximate complementarity principle: Given an approximate solution to the dual SDP, the primal SDP has an approximate solution whose range is contained in the eigenspace w…
▽ More
This paper develops a new storage-optimal algorithm that provably solves generic semidefinite programs (SDPs) in standard form. This method is particularly effective for weakly constrained SDPs. The key idea is to formulate an approximate complementarity principle: Given an approximate solution to the dual SDP, the primal SDP has an approximate solution whose range is contained in the eigenspace with small eigenvalues of the dual slack matrix. For weakly constrained SDPs, this eigenspace has very low dimension, so this observation significantly reduces the search space for the primal solution.
This result suggests an algorithmic strategy that can be implemented with minimal storage:
(1) Solve the dual SDP approximately;
(2) compress the primal SDP to the eigenspace with small eigenvalues of the dual slack matrix;
(3) solve the compressed primal SDP.
The paper also provides numerical experiments showing that this approach is successful for a range of interesting large-scale SDPs.
△ Less
Submitted 17 June, 2020; v1 submitted 9 February, 2019;
originally announced February 2019.
-
Concentration of the Intrinsic Volumes of a Convex Body
Authors:
Martin Lotz,
Michael B. McCoy,
Ivan Nourdin,
Giovanni Peccati,
Joel A. Tropp
Abstract:
The intrinsic volumes are measures of the content of a convex body. This paper uses probabilistic and information-theoretic methods to study the sequence of intrinsic volumes of a convex body. The main result states that the intrinsic volume sequence concentrates sharply around a specific index, called the central intrinsic volume. Furthermore, among all convex bodies whose central intrinsic volum…
▽ More
The intrinsic volumes are measures of the content of a convex body. This paper uses probabilistic and information-theoretic methods to study the sequence of intrinsic volumes of a convex body. The main result states that the intrinsic volume sequence concentrates sharply around a specific index, called the central intrinsic volume. Furthermore, among all convex bodies whose central intrinsic volume is fixed, an appropriately scaled cube has the intrinsic volume sequence with maximum entropy.
△ Less
Submitted 19 March, 2019; v1 submitted 29 October, 2018;
originally announced October 2018.
-
Fast state tomography with optimal error bounds
Authors:
Madalin Guta,
Jonas Kahn,
Richard Kueng,
Joel A. Tropp
Abstract:
Projected least squares (PLS) is an intuitive and numerically cheap technique for quantum state tomography. The method first computes the least-squares estimator (or a linear inversion estimator) and then projects the initial estimate onto the space of states. The main result of this paper equips this point estimator with a rigorous, non-asymptotic confidence region expressed in terms of the trace…
▽ More
Projected least squares (PLS) is an intuitive and numerically cheap technique for quantum state tomography. The method first computes the least-squares estimator (or a linear inversion estimator) and then projects the initial estimate onto the space of states. The main result of this paper equips this point estimator with a rigorous, non-asymptotic confidence region expressed in terms of the trace distance. The analysis holds for a variety of measurements, including 2-designs and Pauli measurements. The sample complexity of the estimator is comparable to the strongest convergence guarantees available in the literature and -- in the case of measuring the uniform POVM -- saturates fundamental lower bounds.The results are derived by reinterpreting the least-squares estimator as a sum of random matrices and applying a matrix-valued concentration inequality. The theory is supported by numerical simulations for mutually unbiased bases, Pauli observables, and Pauli basis measurements.
△ Less
Submitted 28 September, 2018;
originally announced September 2018.
-
Simplicial faces of the set of correlation matrices
Authors:
Joel A. Tropp
Abstract:
This paper concerns the facial geometry of the set of $n \times n$ correlation matrices. The main result states that almost every set of $r$ vertices generates a simplicial face, provided that $r \leq \sqrt{\mathrm{c} n}$, where $\mathrm{c}$ is an absolute constant. This bound is qualitatively sharp because the set of correlation matrices has no simplicial face generated by more than $\sqrt{2n}$ v…
▽ More
This paper concerns the facial geometry of the set of $n \times n$ correlation matrices. The main result states that almost every set of $r$ vertices generates a simplicial face, provided that $r \leq \sqrt{\mathrm{c} n}$, where $\mathrm{c}$ is an absolute constant. This bound is qualitatively sharp because the set of correlation matrices has no simplicial face generated by more than $\sqrt{2n}$ vertices.
△ Less
Submitted 2 January, 2018;
originally announced January 2018.
-
Fixed-Rank Approximation of a Positive-Semidefinite Matrix from Streaming Data
Authors:
Joel A. Tropp,
Alp Yurtsever,
Madeleine Udell,
Volkan Cevher
Abstract:
Several important applications, such as streaming PCA and semidefinite programming, involve a large-scale positive-semidefinite (psd) matrix that is presented as a sequence of linear updates. Because of storage limitations, it may only be possible to retain a sketch of the psd matrix. This paper develops a new algorithm for fixed-rank psd approximation from a sketch. The approach combines the Nyst…
▽ More
Several important applications, such as streaming PCA and semidefinite programming, involve a large-scale positive-semidefinite (psd) matrix that is presented as a sequence of linear updates. Because of storage limitations, it may only be possible to retain a sketch of the psd matrix. This paper develops a new algorithm for fixed-rank psd approximation from a sketch. The approach combines the Nystrom approximation with a novel mechanism for rank truncation. Theoretical analysis establishes that the proposed method can achieve any prescribed relative error in the Schatten 1-norm and that it exploits the spectral decay of the input matrix. Computer experiments show that the proposed method dominates alternative techniques for fixed-rank psd matrix approximation across a wide range of examples.
△ Less
Submitted 18 June, 2017;
originally announced June 2017.
-
Sketchy Decisions: Convex Low-Rank Matrix Optimization with Optimal Storage
Authors:
Alp Yurtsever,
Madeleine Udell,
Joel A. Tropp,
Volkan Cevher
Abstract:
This paper concerns a fundamental class of convex matrix optimization problems. It presents the first algorithm that uses optimal storage and provably computes a low-rank approximation of a solution. In particular, when all solutions have low rank, the algorithm converges to a solution. This algorithm, SketchyCGM, modifies a standard convex optimization scheme, the conditional gradient method, to…
▽ More
This paper concerns a fundamental class of convex matrix optimization problems. It presents the first algorithm that uses optimal storage and provably computes a low-rank approximation of a solution. In particular, when all solutions have low rank, the algorithm converges to a solution. This algorithm, SketchyCGM, modifies a standard convex optimization scheme, the conditional gradient method, to store only a small randomized sketch of the matrix variable. After the optimization terminates, the algorithm extracts a low-rank approximation of the solution from the sketch. In contrast to nonconvex heuristics, the guarantees for SketchyCGM do not rely on statistical models for the problem data. Numerical work demonstrates the benefits of SketchyCGM over heuristics.
△ Less
Submitted 22 February, 2017;
originally announced February 2017.
-
Practical sketching algorithms for low-rank matrix approximation
Authors:
Joel A. Tropp,
Alp Yurtsever,
Madeleine Udell,
Volkan Cevher
Abstract:
This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably…
▽ More
This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
△ Less
Submitted 2 January, 2018; v1 submitted 31 August, 2016;
originally announced September 2016.
-
Universality laws for randomized dimension reduction, with applications
Authors:
Samet Oymak,
Joel A. Tropp
Abstract:
Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set.
The question is how large the embedding dimensi…
▽ More
Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set.
The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability.
This paper studies a natural family of randomized dimension reduction maps and a large class of data sets. It proves that there is a phase transition in the success probability of the dimension reduction map as the embedding dimension increases. For a given data set, the location of the phase transition is the same for all maps in this family. Furthermore, each map has the same stability properties, as quantified through the restricted minimum singular value. These results can be viewed as new universality laws in high-dimensional stochastic geometry.
Universality laws for randomized dimension reduction have many applications in applied mathematics, signal processing, and statistics. They yield design principles for numerical linear algebra algorithms, for compressed sensing measurement ensembles, and for random linear codes. Furthermore, these results have implications for the performance of statistical estimation methods under a large class of random experimental designs.
△ Less
Submitted 16 September, 2017; v1 submitted 30 November, 2015;
originally announced November 2015.
-
The Expected Norm of a Sum of Independent Random Matrices: An Elementary Approach
Authors:
Joel A. Tropp
Abstract:
In contemporary applied and computational mathematics, a frequent challenge is to bound the expectation of the spectral norm of a sum of independent random matrices. This quantity is controlled by the norm of the expected square of the random matrix and the expectation of the maximum squared norm achieved by one of the summands; there is also a weak dependence on the dimension of the random matrix…
▽ More
In contemporary applied and computational mathematics, a frequent challenge is to bound the expectation of the spectral norm of a sum of independent random matrices. This quantity is controlled by the norm of the expected square of the random matrix and the expectation of the maximum squared norm achieved by one of the summands; there is also a weak dependence on the dimension of the random matrix. The purpose of this paper is to give a complete, elementary proof of this important, but underappreciated, inequality.
△ Less
Submitted 16 October, 2015; v1 submitted 15 June, 2015;
originally announced June 2015.
-
Integer factorization of a positive-definite matrix
Authors:
Joel A. Tropp
Abstract:
This paper establishes that every positive-definite matrix can be written as a positive linear combination of outer products of integer-valued vectors whose entries are bounded by the geometric mean of the condition number and the dimension of the matrix.
This paper establishes that every positive-definite matrix can be written as a positive linear combination of outer products of integer-valued vectors whose entries are bounded by the geometric mean of the condition number and the dimension of the matrix.
△ Less
Submitted 3 August, 2015; v1 submitted 31 May, 2015;
originally announced June 2015.
-
Second-Order Matrix Concentration Inequalities
Authors:
Joel A. Tropp
Abstract:
Matrix concentration inequalities give bounds for the spectral-norm deviation of a random matrix from its expected value. These results have a weak dimensional dependence that is sometimes, but not always, necessary. This paper identifies one of the sources of the dimensional term and exploits this insight to develop sharper matrix concentration inequalities. In particular, this analysis delivers…
▽ More
Matrix concentration inequalities give bounds for the spectral-norm deviation of a random matrix from its expected value. These results have a weak dimensional dependence that is sometimes, but not always, necessary. This paper identifies one of the sources of the dimensional term and exploits this insight to develop sharper matrix concentration inequalities. In particular, this analysis delivers two refinements of the matrix Khintchine inequality that use information beyond the matrix variance to reduce or eliminate the dimensional dependence.
△ Less
Submitted 3 August, 2016; v1 submitted 22 April, 2015;
originally announced April 2015.
-
An Introduction to Matrix Concentration Inequalities
Authors:
Joel A. Tropp
Abstract:
In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic. The aim of this…
▽ More
In recent years, random matrices have come to play a major role in computational mathematics, but most of the classical areas of random matrix theory remain the province of experts. Over the last decade, with the advent of matrix concentration inequalities, research has advanced to the point where we can conquer many (formerly) challenging problems with a page or two of arithmetic. The aim of this monograph is to describe the most successful methods from this area along with some interesting examples that these techniques can illuminate.
△ Less
Submitted 7 January, 2015;
originally announced January 2015.
-
Solving ptychography with a convex relaxation
Authors:
Roarke Horstmeyer,
Richard Y. Chen,
Xiaoze Ou,
Brendan Ames,
Joel A. Tropp,
Changhuei Yang
Abstract:
Ptychography is a powerful computational imaging technique that transforms a collection of low-resolution images into a high-resolution sample reconstruction. Unfortunately, algorithms that are currently used to solve this reconstruction problem lack stability, robustness, and theoretical guarantees. Recently, convex optimization algorithms have improved the accuracy and reliability of several rel…
▽ More
Ptychography is a powerful computational imaging technique that transforms a collection of low-resolution images into a high-resolution sample reconstruction. Unfortunately, algorithms that are currently used to solve this reconstruction problem lack stability, robustness, and theoretical guarantees. Recently, convex optimization algorithms have improved the accuracy and reliability of several related reconstruction efforts. This paper proposes a convex formulation of the ptychography problem. This formulation has no local minima, it can be solved using a wide range of algorithms, it can incorporate appropriate noise models, and it can include multiple a priori constraints. The paper considers a specific algorithm, based on low-rank factorization, whose runtime and memory usage are near-linear in the size of the output image. Experiments demonstrate that this approach offers a 25% lower background variance on average than alternating projections, the current standard algorithm for ptychographic reconstruction.
△ Less
Submitted 3 December, 2014;
originally announced December 2014.
-
A foundation for analytical developments in the logarithmic region of turbulent channels
Authors:
Rashad Moarref,
Ati S. Sharma,
Joel A. Tropp,
Beverley J. McKeon
Abstract:
An analytical framework for studying the logarithmic region of turbulent channels is formulated. We build on recent findings (Moarref et al., J. Fluid Mech., 734, 2013) that the velocity fluctuations in the logarithmic region can be decomposed into a weighted sum of geometrically self-similar resolvent modes. The resolvent modes and the weights represent the linear amplification mechanisms and the…
▽ More
An analytical framework for studying the logarithmic region of turbulent channels is formulated. We build on recent findings (Moarref et al., J. Fluid Mech., 734, 2013) that the velocity fluctuations in the logarithmic region can be decomposed into a weighted sum of geometrically self-similar resolvent modes. The resolvent modes and the weights represent the linear amplification mechanisms and the scaling influence of the nonlinear interactions in the Navier-Stokes equations (NSE), respectively (McKeon & Sharma, J. Fluid Mech., 658, 2010). Originating from the NSE, this framework provides an analytical support for Townsend's attached-eddy model. Our main result is that self-similarity enables order reduction in modeling the logarithmic region by establishing a quantitative link between the self-similar structures and the velocity spectra. Specifically, the energy intensities, the Reynolds stresses, and the energy budget are expressed in terms of the resolvent modes with speeds corresponding to the top of the logarithmic region. The weights of the triad modes -the modes that directly interact via the quadratic nonlinearity in the NSE- are coupled via the interaction coefficients that depend solely on the resolvent modes (McKeon et al., Phys. Fluids, 25, 2013). We use the hierarchies of self-similar modes in the logarithmic region to extend the notion of triad modes to triad hierarchies. It is shown that the interaction coefficients for the triad modes that belong to a triad hierarchy follow an exponential function. The combination of these findings can be used to better understand the dynamics and interaction of flow structures in the logarithmic region. The compatibility of the proposed model with theoretical and experimental results is further discussed.
△ Less
Submitted 21 September, 2014;
originally announced September 2014.
-
Efron-Stein Inequalities for Random Matrices
Authors:
Daniel Paulin,
Lester Mackey,
Joel A. Tropp
Abstract:
This paper establishes new concentration inequalities for random matrices constructed from independent random variables. These results are analogous with the generalized Efron-Stein inequalities developed by Boucheron et al. The proofs rely on the method of exchangeable pairs.
This paper establishes new concentration inequalities for random matrices constructed from independent random variables. These results are analogous with the generalized Efron-Stein inequalities developed by Boucheron et al. The proofs rely on the method of exchangeable pairs.
△ Less
Submitted 15 August, 2014;
originally announced August 2014.
-
Convex recovery of a structured signal from independent random linear measurements
Authors:
Joel A. Tropp
Abstract:
This chapter develops a theoretical analysis of the convex programming method for recovering a structured signal from independent random linear measurements. This technique delivers bounds for the sampling complexity that are similar with recent results for standard Gaussian measurements, but the argument applies to a much wider class of measurement ensembles. To demonstrate the power of this appr…
▽ More
This chapter develops a theoretical analysis of the convex programming method for recovering a structured signal from independent random linear measurements. This technique delivers bounds for the sampling complexity that are similar with recent results for standard Gaussian measurements, but the argument applies to a much wider class of measurement ensembles. To demonstrate the power of this approach, the paper presents a short analysis of phase retrieval by trace-norm minimization. The key technical tool is a framework, due to Mendelson and coauthors, for bounding a nonnegative empirical process.
△ Less
Submitted 3 December, 2014; v1 submitted 5 May, 2014;
originally announced May 2014.
-
A low-order decomposition of turbulent channel flow via resolvent analysis and convex optimization
Authors:
R. Moarref,
M. R. Jovanovic,
J. A. Tropp,
A. S. Sharma,
B. J. McKeon
Abstract:
We combine resolvent-mode decomposition with techniques from convex optimization to optimally approximate velocity spectra in a turbulent channel. The velocity is expressed as a weighted sum of resolvent modes that are dynamically significant, non-empirical, and scalable with Reynolds number. To optimally represent DNS data at friction Reynolds number $2003$, we determine the weights of resolvent…
▽ More
We combine resolvent-mode decomposition with techniques from convex optimization to optimally approximate velocity spectra in a turbulent channel. The velocity is expressed as a weighted sum of resolvent modes that are dynamically significant, non-empirical, and scalable with Reynolds number. To optimally represent DNS data at friction Reynolds number $2003$, we determine the weights of resolvent modes as the solution of a convex optimization problem. Using only $12$ modes per wall-parallel wavenumber pair and temporal frequency, we obtain close agreement with DNS-spectra, reducing the wall-normal and temporal resolutions used in the simulation by three orders of magnitude.
△ Less
Submitted 21 May, 2014; v1 submitted 24 January, 2014;
originally announced January 2014.
-
The achievable performance of convex demixing
Authors:
Michael B. McCoy,
Joel A. Tropp
Abstract:
Demixing is the problem of identifying multiple structured signals from a superimposed, undersampled, and noisy observation. This work analyzes a general framework, based on convex optimization, for solving demixing problems. When the constituent signals follow a generic incoherence model, this analysis leads to precise recovery guarantees. These results admit an attractive interpretation: each si…
▽ More
Demixing is the problem of identifying multiple structured signals from a superimposed, undersampled, and noisy observation. This work analyzes a general framework, based on convex optimization, for solving demixing problems. When the constituent signals follow a generic incoherence model, this analysis leads to precise recovery guarantees. These results admit an attractive interpretation: each signal possesses an intrinsic degrees-of-freedom parameter, and demixing can succeed if and only if the dimension of the observation exceeds the total degrees of freedom present in the observation.
△ Less
Submitted 28 September, 2013;
originally announced September 2013.
-
From Steiner Formulas for Cones to Concentration of Intrinsic Volumes
Authors:
Michael B. McCoy,
Joel A. Tropp
Abstract:
The intrinsic volumes of a convex cone are geometric functionals that return basic structural information about the cone. Recent research has demonstrated that conic intrinsic volumes are valuable for understanding the behavior of random convex optimization problems. This paper develops a systematic technique for studying conic intrinsic volumes using methods from probability. At the heart of this…
▽ More
The intrinsic volumes of a convex cone are geometric functionals that return basic structural information about the cone. Recent research has demonstrated that conic intrinsic volumes are valuable for understanding the behavior of random convex optimization problems. This paper develops a systematic technique for studying conic intrinsic volumes using methods from probability. At the heart of this approach is a general Steiner formula for cones. This result converts questions about the intrinsic volumes into questions about the projection of a Gaussian random vector onto the cone, which can then be resolved using tools from Gaussian analysis. The approach leads to new identities and bounds for the intrinsic volumes of a cone, including a near-optimal concentration inequality.
△ Less
Submitted 12 July, 2015; v1 submitted 23 August, 2013;
originally announced August 2013.
-
Subadditivity of Matrix phi-Entropy and Concentration of Random Matrices
Authors:
Joel A. Tropp,
Richard Y. Chen
Abstract:
Matrix concentration inequalities provide a direct way to bound the typical spectral norm of a random matrix. The methods for establishing these results often parallel classical arguments, such as the Laplace transform method. This work develops a matrix extension of the entropy method, and it applies these ideas to obtain some matrix concentration inequalities.
Matrix concentration inequalities provide a direct way to bound the typical spectral norm of a random matrix. The methods for establishing these results often parallel classical arguments, such as the Laplace transform method. This work develops a matrix extension of the entropy method, and it applies these ideas to obtain some matrix concentration inequalities.
△ Less
Submitted 13 August, 2013;
originally announced August 2013.
-
Deriving Matrix Concentration Inequalities from Kernel Couplings
Authors:
Daniel Paulin,
Lester Mackey,
Joel A. Tropp
Abstract:
This paper derives exponential tail bounds and polynomial moment inequalities for the spectral norm deviation of a random matrix from its mean value. The argument depends on a matrix extension of Stein's method of exchangeable pairs for concentration of measure, as introduced by Chatterjee. Recent work of Mackey et al. uses these techniques to analyze random matrices with additive structure, while…
▽ More
This paper derives exponential tail bounds and polynomial moment inequalities for the spectral norm deviation of a random matrix from its mean value. The argument depends on a matrix extension of Stein's method of exchangeable pairs for concentration of measure, as introduced by Chatterjee. Recent work of Mackey et al. uses these techniques to analyze random matrices with additive structure, while the enhancements in this paper cover a wider class of matrix-valued random elements. In particular, these ideas lead to a bounded differences inequality that applies to random matrices constructed from weakly dependent random variables. The proofs require novel trace inequalities that may be of independent interest.
△ Less
Submitted 2 May, 2013;
originally announced May 2013.
-
Living on the edge: Phase transitions in convex programs with random data
Authors:
Dennis Amelunxen,
Martin Lotz,
Michael B. McCoy,
Joel A. Tropp
Abstract:
Recent research indicates that many convex optimization problems with random constraints exhibit a phase transition as the number of constraints increases. For example, this phenomenon emerges in the $\ell_1$ minimization method for identifying a sparse vector from random linear measurements. Indeed, the $\ell_1$ approach succeeds with high probability when the number of measurements exceeds a thr…
▽ More
Recent research indicates that many convex optimization problems with random constraints exhibit a phase transition as the number of constraints increases. For example, this phenomenon emerges in the $\ell_1$ minimization method for identifying a sparse vector from random linear measurements. Indeed, the $\ell_1$ approach succeeds with high probability when the number of measurements exceeds a threshold that depends on the sparsity level; otherwise, it fails with high probability.
This paper provides the first rigorous analysis that explains why phase transitions are ubiquitous in random convex optimization problems. It also describes tools for making reliable predictions about the quantitative aspects of the transition, including the location and the width of the transition region. These techniques apply to regularized linear inverse problems with random measurements, to demixing problems under a random incoherence model, and also to cone programs with random affine constraints.
The applied results depend on foundational research in conic geometry. This paper introduces a summary parameter, called the statistical dimension, that canonically extends the dimension of a linear subspace to the class of convex cones. The main technical result demonstrates that the sequence of intrinsic volumes of a convex cone concentrates sharply around the statistical dimension. This fact leads to accurate bounds on the probability that a randomly rotated cone shares a ray with a fixed cone.
△ Less
Submitted 25 April, 2014; v1 submitted 26 March, 2013;
originally announced March 2013.
-
Model-based scaling of the streamwise energy density in high-Reynolds number turbulent channels
Authors:
Rashad Moarref,
Ati S. Sharma,
Joel A. Tropp,
Beverley J. McKeon
Abstract:
We study the Reynolds number scaling and the geometric self-similarity of a gain-based, low-rank approximation to turbulent channel flows, determined by the resolvent formulation of McKeon & Sharma (2010), in order to obtain a description of the streamwise turbulence intensity from direct consideration of the Navier-Stokes equations. Under this formulation, the velocity field is decomposed into pr…
▽ More
We study the Reynolds number scaling and the geometric self-similarity of a gain-based, low-rank approximation to turbulent channel flows, determined by the resolvent formulation of McKeon & Sharma (2010), in order to obtain a description of the streamwise turbulence intensity from direct consideration of the Navier-Stokes equations. Under this formulation, the velocity field is decomposed into propagating waves (with single streamwise and spanwise wavelengths and wave speed) whose wall-normal shapes are determined from the principal singular function of the corresponding resolvent operator. Using the accepted scalings of the mean velocity in wall-bounded turbulent flows, we establish that the resolvent operator admits three classes of wave parameters that induce universal behavior with Reynolds number on the low-rank model, and which are consistent with scalings proposed throughout the wall turbulence literature. In addition, it was shown that a necessary condition for geometrically self-similar resolvent modes is the presence of a logarithmic turbulent mean velocity. We identify the scalings that constitute hierarchies of self-similar modes that are parameterized by the critical wall-normal location where the speed of the mode equals the local turbulent mean velocity. For the rank-1 model subject to broadband forcing, the integrated streamwise energy density takes a universal form which is consistent with the dominant near-wall turbulent motions. When the shape of the forcing is optimized to enforce matching with results from direct numerical simulations at low turbulent Reynolds numbers, further similarity appears. Representation of these weight functions using similarity laws enables prediction of the Reynolds number and wall-normal variations of the streamwise energy intensity at high Reynolds numbers (${Re}_τ\approx 10^3 - 10^{10}$).
△ Less
Submitted 3 October, 2013; v1 submitted 6 February, 2013;
originally announced February 2013.
-
Paved with Good Intentions: Analysis of a Randomized Block Kaczmarz Method
Authors:
Deanna Needell,
Joel A. Tropp
Abstract:
The block Kaczmarz method is an iterative scheme for solving overdetermined least-squares problems. At each step, the algorithm projects the current iterate onto the solution space of a subset of the constraints. This paper describes a block Kaczmarz algorithm that uses a randomized control scheme to choose the subset at each step. This algorithm is the first block Kaczmarz method with an (expecte…
▽ More
The block Kaczmarz method is an iterative scheme for solving overdetermined least-squares problems. At each step, the algorithm projects the current iterate onto the solution space of a subset of the constraints. This paper describes a block Kaczmarz algorithm that uses a randomized control scheme to choose the subset at each step. This algorithm is the first block Kaczmarz method with an (expected) linear rate of convergence that can be expressed in terms of the geometric properties of the matrix and its submatrices. The analysis reveals that the algorithm is most effective when it is given a good row paving of the matrix, a partition of the rows into well-conditioned blocks. The operator theory literature provides detailed information about the existence and construction of good row pavings. Together, these results yield an efficient block Kaczmarz scheme that applies to many overdetermined least-squares problem.
△ Less
Submitted 17 December, 2012; v1 submitted 18 August, 2012;
originally announced August 2012.
-
Factoring nonnegative matrices with linear programs
Authors:
Victor Bittorf,
Benjamin Recht,
Christopher Re,
Joel A. Tropp
Abstract:
This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and some linear const…
▽ More
This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.
△ Less
Submitted 2 February, 2013; v1 submitted 6 June, 2012;
originally announced June 2012.
-
Sharp recovery bounds for convex demixing, with applications
Authors:
Michael B. McCoy,
Joel A. Tropp
Abstract:
Demixing refers to the challenge of identifying two structured signals given only the sum of the two signals and prior information about their structures. Examples include the problem of separating a signal that is sparse with respect to one basis from a signal that is sparse with respect to a second basis, and the problem of decomposing an observed matrix into a low-rank matrix plus a sparse matr…
▽ More
Demixing refers to the challenge of identifying two structured signals given only the sum of the two signals and prior information about their structures. Examples include the problem of separating a signal that is sparse with respect to one basis from a signal that is sparse with respect to a second basis, and the problem of decomposing an observed matrix into a low-rank matrix plus a sparse matrix. This paper describes and analyzes a framework, based on convex optimization, for solving these demixing problems, and many others. This work introduces a randomized signal model which ensures that the two structures are incoherent, i.e., generically oriented. For an observation from this model, this approach identifies a summary statistic that reflects the complexity of a particular signal. The difficulty of separating two structured, incoherent signals depends only on the total complexity of the two structures. Some applications include (i) demixing two signals that are sparse in mutually incoherent bases; (ii) decoding spread-spectrum transmissions in the presence of impulsive errors; and (iii) removing sparse corruptions from a low-rank matrix. In each case, the theoretical analysis of the convex demixing method closely matches its empirical behavior.
△ Less
Submitted 10 January, 2014; v1 submitted 7 May, 2012;
originally announced May 2012.
-
Robust computation of linear models by convex relaxation
Authors:
Gilad Lerman,
Michael McCoy,
Joel A. Tropp,
Teng Zhang
Abstract:
Consider a dataset of vector-valued observations that consists of noisy inliers, which are explained well by a low-dimensional subspace, along with some number of outliers. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. This approach parameterizes linear subspaces using orthogonal projectors, and it uses a relax…
▽ More
Consider a dataset of vector-valued observations that consists of noisy inliers, which are explained well by a low-dimensional subspace, along with some number of outliers. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. This approach parameterizes linear subspaces using orthogonal projectors, and it uses a relaxation of the set of orthogonal projectors to reach the convex formulation. The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data. In addition, when the inliers lie near a low-dimensional subspace, there is a rigorous theory that describes when REAPER can approximate this subspace.
△ Less
Submitted 11 August, 2014; v1 submitted 17 February, 2012;
originally announced February 2012.