Let \(X\in \mathbb{R}^{p_1\times p_2}\) be an approximately low-rank matrix and \(Z\in \mathbb{R}^{p_1\times p_2}\) be a small “perturbation” matrix. The authors aim to provide separate rate-optimal perturbation bounds for the left and right singular subspaces, in the form of rate-sharp bounds for the \(\sin\Theta\) distances between the corresponding singular subspaces of \(X\) and \(X+Z\). The results are compared with Wedin’s \(\sin\Theta\) theorem.
In the sequel, these perturbation bounds are applied to low-rank matrix denoising and singular space estimation. It is shown that the new bounds are particularly powerful when the matrix dimensions differ significantly. Further possible applications discussed in the paper are high-dimensional clustering and canonical correlation analysis. Results of selected simulations illustrate the approach, in particular the advantage of separate bounds for the left and right singular subspaces over uniform bounds. The most important parts of the proofs can be found as supplementary material under doi:10.1214/17-AOS1541SUPP.
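For readers who wish to experiment with the quantities under review, the following minimal sketch computes the spectral-norm \(\sin\Theta\) distances between the leading left and right singular subspaces of \(X\) and \(X+Z\). It is not taken from the paper: the helper names, the dimensions, the rank, the singular values and the Gaussian noise level are all illustrative assumptions.

```python
import numpy as np


def sin_theta_distance(V, V_hat):
    """Spectral-norm sin-Theta distance between the column spaces of two
    matrices with orthonormal columns and the same shape."""
    # The singular values of V^T V_hat are the cosines of the principal angles.
    cosines = np.linalg.svd(V.T @ V_hat, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding above 1
    # The largest principal angle has the smallest cosine.
    return float(np.sqrt(1.0 - cosines.min() ** 2))


def top_singular_subspaces(A, r):
    """Orthonormal bases of the leading r left and right singular subspaces."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :r], Vt[:r, :].T


# Illustrative setup (assumed values): strongly unbalanced dimensions.
rng = np.random.default_rng(0)
p1, p2, r = 20, 400, 2

# Exactly rank-r signal X with hypothetical singular values 50 and 40.
U0, _ = np.linalg.qr(rng.standard_normal((p1, r)))
V0, _ = np.linalg.qr(rng.standard_normal((p2, r)))
X = U0 @ np.diag([50.0, 40.0]) @ V0.T

# Small Gaussian perturbation Z (assumed noise level 0.5).
Z = 0.5 * rng.standard_normal((p1, p2))

U, V = top_singular_subspaces(X, r)
U_hat, V_hat = top_singular_subspaces(X + Z, r)

left = sin_theta_distance(U, U_hat)
right = sin_theta_distance(V, V_hat)
print(f"left  sin-Theta distance: {left:.4f}")
print(f"right sin-Theta distance: {right:.4f}")
```

Reporting the two distances separately, rather than their maximum, mirrors the distinction between separate and uniform bounds that the review describes; the sketch makes no claim about the rates themselves.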


62H12 Estimation in multivariate analysis
62H25 Factor analysis and principal components; correspondence analysis
15B52 Random matrices (algebraic aspects)
62H30 Classification and discrimination; cluster analysis (statistical aspects)




