
Covariance estimation for matrix-valued data. (English) Zbl 07784933

Summary: Covariance estimation for matrix-valued data has received increasing interest in applications. Unlike previous works that rely heavily on the matrix normal distribution assumption and require a fixed matrix size, we propose a class of distribution-free regularized covariance estimation methods for high-dimensional matrix data under a separability condition and a bandable covariance structure. Under these conditions, the original covariance matrix decomposes into a Kronecker product of two small bandable covariance matrices representing the variability over the row and column directions. We formulate a unified framework for estimating bandable covariances and introduce an efficient algorithm based on rank-one unconstrained Kronecker product approximation. The convergence rates of the proposed estimators are established, and the derived minimax lower bound shows that the proposed estimator is rate-optimal under certain divergence regimes of the matrix size. We further introduce a class of robust covariance estimators, with theoretical guarantees, to deal with heavy-tailed data. We demonstrate the superior finite-sample performance of our methods using simulations and real applications to a gridded temperature anomaly dataset and S&P 500 stock data. Supplementary materials for this article are available online.
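The rank-one Kronecker product approximation mentioned in the summary is, presumably, in the spirit of the classical Van Loan-Pitsianis observation: the nearest Kronecker product problem min over (A, B) of ||Sigma - A kron B||_F reduces to a rank-one approximation of a rearranged matrix, after which each small factor can be banded. Below is a minimal NumPy sketch of that idea, not the authors' implementation: the function names (rearrange, nearest_kronecker, band), the column-major vectorization convention, and the fixed bandwidth k are illustrative assumptions.

```python
import numpy as np

def rearrange(S, p, q):
    """Van Loan-Pitsianis rearrangement: map the (pq x pq) matrix S to a
    (p^2 x q^2) matrix R such that S = A kron B iff R = vec(A) vec(B)^T."""
    R = np.empty((p * p, q * q))
    for j in range(p):
        for i in range(p):
            block = S[i * q:(i + 1) * q, j * q:(j + 1) * q]  # (i, j)-th q x q block
            R[j * p + i, :] = block.reshape(-1, order="F")   # vec(block), column-major
    return R

def nearest_kronecker(S, p, q):
    """Rank-one SVD of the rearranged matrix solves min ||S - A kron B||_F;
    the split of the leading singular value between A and B is arbitrary."""
    U, s, Vt = np.linalg.svd(rearrange(S, p, q), full_matrices=False)
    A = np.sqrt(s[0]) * U[:, 0].reshape(p, p, order="F")
    B = np.sqrt(s[0]) * Vt[0, :].reshape(q, q, order="F")
    if np.trace(B) < 0:   # fix the sign ambiguity; A kron B is unchanged
        A, B = -A, -B
    return A, B

def band(M, k):
    """Banding operator: zero out entries more than k positions off the diagonal."""
    idx = np.arange(M.shape[0])
    return M * (np.abs(np.subtract.outer(idx, idx)) <= k)

# Usage sketch: n vectorized q x p matrix observations, sample covariance,
# Kronecker factorization, then banding of each small factor.
rng = np.random.default_rng(0)
p, q, n = 4, 3, 500
X = rng.standard_normal((n, p * q))   # rows are vec'd matrix samples
S = np.cov(X, rowvar=False)           # pq x pq sample covariance
A_hat, B_hat = nearest_kronecker(S, p, q)
Sigma_hat = np.kron(band(A_hat, 1), band(B_hat, 1))  # regularized estimate
```

In practice the bandwidth would be chosen data-adaptively (e.g., by cross-validation), and the robust variants described in the summary would replace the sample covariance with a truncated or Huber-type pilot estimator before the rearrangement step.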

MSC:

62-XX Statistics

Software:

TKPSVD
