
Degrees of freedom in low rank matrix estimation. (English) Zbl 06712162

Summary: The objective of this paper is to quantify the complexity of rank and nuclear norm constrained methods for low rank matrix estimation problems. Specifically, we derive analytic forms of the degrees of freedom for these types of estimators in several common settings. These results provide efficient ways of comparing different estimators and selecting tuning parameters. Moreover, our analyses reveal new insights into the behavior of these low rank matrix estimators. These observations are of great theoretical and practical importance. In particular, they suggest that, contrary to conventional wisdom, for rank constrained estimators the total number of free parameters underestimates the degrees of freedom, whereas for nuclear norm penalization it overestimates the degrees of freedom. In addition, when using most model selection criteria to choose the tuning parameter for nuclear norm penalization, it often suffices to entertain a finite number of candidates rather than a continuum of choices. Numerical examples are also presented to illustrate the practical implications of our results.
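The degrees of freedom discussed in the summary can be made concrete through the standard covariance definition df = (1/σ²) Σ_i Cov(μ̂_i, y_i) (Stein's and Efron's formulation). The following is a minimal, hypothetical Python sketch, not the paper's analytic formulas: it estimates this quantity by Monte Carlo simulation for a rank-constrained estimator (the truncated SVD of the noisy matrix) and compares it with the naive parameter count r(m + n − r), which the summary argues underestimates the true degrees of freedom. All function names, matrix dimensions, and noise levels are illustrative assumptions.

```python
# Minimal sketch: Monte Carlo estimate of degrees of freedom via the
# covariance definition df = (1/sigma^2) * sum_i Cov(muhat_i, y_i),
# applied to a rank-constrained (truncated SVD) matrix estimator.
import numpy as np

def truncated_svd(Y, r):
    """Best rank-r approximation of Y (the rank-constrained estimator)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def monte_carlo_df(M, sigma, fit, n_rep=2000, seed=None):
    """Estimate sum_i Cov(fit(Y)_i, Y_i) / sigma^2 over Y = M + noise."""
    rng = np.random.default_rng(seed)
    fits, obs = [], []
    for _ in range(n_rep):
        Y = M + sigma * rng.standard_normal(M.shape)
        fits.append(fit(Y).ravel())
        obs.append(Y.ravel())
    fits, obs = np.array(fits), np.array(obs)
    # Coordinate-wise sample covariances, then sum and rescale by sigma^2.
    cov = ((fits - fits.mean(0)) * (obs - obs.mean(0))).sum(0) / (n_rep - 1)
    return cov.sum() / sigma**2

# Hypothetical example: a 20 x 15 signal matrix of true rank 3.
rng = np.random.default_rng(0)
m, n, r_true, sigma = 20, 15, 3, 1.0
M = rng.standard_normal((m, r_true)) @ rng.standard_normal((r_true, n))

r_fit = 3
df_mc = monte_carlo_df(M, sigma, lambda Y: truncated_svd(Y, r_fit), seed=1)
df_naive = r_fit * (m + n - r_fit)   # naive count of free parameters
print(f"Monte Carlo df ~ {df_mc:.1f}, naive parameter count = {df_naive}")
```

Under these assumptions the simulated covariance-based value can be compared directly with the parameter-count heuristic; an analogous experiment with soft-thresholding of the singular values would give the corresponding comparison for nuclear norm penalization.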

MSC:

62J99 Linear inference, regression
62H12 Estimation in multivariate analysis
Full Text: DOI
