
Robust signal dimension estimation via SURE. (English) Zbl 07889348

Summary: The estimation of signal dimension under heavy-tailed latent variable models is studied. As a primary contribution, robust extensions of an earlier estimator based on Gaussian Stein's unbiased risk estimation (SURE) are proposed. These novel extensions are based on the framework of elliptical distributions and robust scatter matrices. Extensive simulation studies compare the novel methods with several well-known competitors in terms of both estimation accuracy and computational speed. The novel methods are also applied to a financial asset return data set.


MSC: 62-XX Statistics


Software: changepoint; ICSNP; MNM

