
Subspace estimation with automatic dimension and variable selection in sufficient dimension reduction. (English) Zbl 07820387

Summary: Sufficient dimension reduction (SDR) methods aim to find lower-dimensional representations of a multivariate predictor that preserve all the information about the conditional distribution of the response given the predictor. The reduction is commonly achieved by projecting the predictor onto a low-dimensional subspace. The smallest such subspace is known as the Central Subspace (CS) and is the key parameter of interest for most SDR methods. In this article, we propose a unified and flexible framework for estimating the CS in high dimensions. Our approach generalizes a wide range of model-based and model-free SDR methods to high-dimensional settings, where the CS is assumed to involve only a subset of the predictors. We formulate the problem as a convex quadratic optimization problem, so that the global solution is attainable. The proposed estimation procedure simultaneously achieves structural dimension selection and coordinate-independent variable selection for the CS. Theoretically, our method achieves dimension selection, variable selection, and subspace estimation consistency at a fast convergence rate under mild conditions. We demonstrate the effectiveness and efficiency of our method with extensive simulation studies and real data examples. Supplementary materials for this article are available online.
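
The summary describes the estimator only at a high level, but the ingredients it names (a convex quadratic loss with a globally attainable solution, automatic structural dimension selection, and coordinate-independent variable selection) are commonly realized by adding a nuclear-norm penalty, which controls the rank and hence the structural dimension, and a row-wise group penalty, which zeroes out whole predictors, to a SIR-type quadratic objective and solving the resulting convex program by ADMM [2]. The sketch below is a minimal illustration of that generic construction, not the authors' implementation: the seed matrix, the penalty choices, and all tuning values are assumptions made for demonstration.

```python
# Minimal sketch (NOT the paper's exact estimator): estimate a low-rank,
# row-sparse matrix B whose column space approximates the central subspace,
# by consensus ADMM on the penalized quadratic program
#   min_B 0.5*||Sigma_hat @ B - M_hat||_F^2
#         + lam_rank * ||B||_*              (nuclear norm -> dimension selection)
#         + lam_row  * sum_j ||B[j, :]||_2  (row-group penalty -> variable selection)
import numpy as np

def svt(A, tau):
    """Singular-value soft-thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def row_soft(A, tau):
    """Row-wise soft-thresholding: prox of tau * sum of row 2-norms."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return A * scale

def sdr_admm(Sigma, M, lam_rank, lam_row, rho=1.0, n_iter=500):
    p = Sigma.shape[0]
    B = np.zeros((p, p)); Z1 = B.copy(); Z2 = B.copy()
    U1 = np.zeros_like(B); U2 = np.zeros_like(B)
    # Coefficient matrix of the B-update is fixed across iterations.
    lhs = Sigma @ Sigma + 2.0 * rho * np.eye(p)
    for _ in range(n_iter):
        rhs = Sigma @ M + rho * (Z1 - U1 + Z2 - U2)
        B = np.linalg.solve(lhs, rhs)
        Z1 = svt(B + U1, lam_rank / rho)       # low-rank copy
        Z2 = row_soft(B + U2, lam_row / rho)   # row-sparse copy
        U1 += B - Z1                           # scaled dual updates
        U2 += B - Z2
    return Z1, Z2

# Toy example: single-index model, so the true central subspace is span(beta).
rng = np.random.default_rng(0)
n, p, H = 400, 20, 10                      # H slices for the SIR-type seed matrix
beta = np.zeros(p); beta[:3] = [1.0, -1.0, 0.5]
X = rng.standard_normal((n, p))
y = np.sin(X @ beta) + 0.1 * rng.standard_normal(n)

Xc = X - X.mean(axis=0)
Sigma_hat = Xc.T @ Xc / n
# SIR-style seed: covariance of the within-slice means of X given y.
M_hat = np.zeros((p, p))
for idx in np.array_split(np.argsort(y), H):
    m = Xc[idx].mean(axis=0)
    M_hat += (len(idx) / n) * np.outer(m, m)

# Tuning values are illustrative only.
Z_rank, Z_row = sdr_admm(Sigma_hat, M_hat, lam_rank=0.05, lam_row=0.02)
d_hat = int(np.sum(np.linalg.svd(Z_rank, compute_uv=False) > 1e-6))
active = np.where(np.linalg.norm(Z_row, axis=1) > 1e-6)[0]
print("estimated structural dimension:", d_hat)
print("selected predictors:", active)
```

The sketch only shows how a single convex program can determine the structural dimension (through the rank of the low-rank copy) and the active predictors (through the nonzero rows of the row-sparse copy) at once; the loss, penalties, and tuning that deliver the consistency guarantees stated in the summary are specific to the paper.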

MSC:

62-XX Statistics

Software:

spls; msda; SOFAR
Full Text: DOI

References:

[1] Bickel, P. J., Ritov, Y., and Tsybakov, A. B. (2009), “Simultaneous Analysis of Lasso and Dantzig Selector,” The Annals of Statistics, 37, 1705-1732. · Zbl 1173.62022
[2] Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011), “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers,” Foundations and Trends® in Machine Learning, 3, 1-122. · Zbl 1229.90122
[3] Bura, E., and Yang, J. (2011), “Dimension Estimation in Sufficient Dimension Reduction: A Unifying Approach,” Journal of Multivariate Analysis, 102, 130-142. · Zbl 1206.62107
[4] Chen, X., Zou, C., and Cook, R. D. (2010), “Coordinate-Independent Sparse Sufficient Dimension Reduction and Variable Selection,” The Annals of Statistics, 38, 3696-3723. · Zbl 1204.62107
[5] Chung, D., Chun, H., and Keles, S. (2019), spls: Sparse Partial Least Squares (SPLS) Regression and Classification. R package version 2.2-3.
[6] Chung, D., and Keles, S. (2010), “Sparse Partial Least Squares Classification for High Dimensional Data,” Statistical Applications in Genetics and Molecular Biology, 9. · Zbl 1304.92041
[7] Cook, R. D. (1998), Regression Graphics: Ideas for Studying Regressions through Graphics, New York: Wiley. · Zbl 0903.62001
[8] Cook, R. D., and Forzani, L. (2008), “Principal Fitted Components for Dimension Reduction in Regression,” Statistical Science, 23, 485-501. · Zbl 1329.62274
[9] Cook, R. D., and Li, B. (2002), “Dimension Reduction for Conditional Mean in Regression,” The Annals of Statistics, 30, 455-474. · Zbl 1012.62035
[10] Cook, R. D., and Ni, L. (2006), “Using Intraslice Covariances for Improved Estimation of the Central Subspace in Regression,” Biometrika, 93, 65-74. · Zbl 1152.62019
[11] Cook, R. D., and Zhang, X. (2014), “Fused Estimators of the Central Subspace in Sufficient Dimension Reduction,” Journal of the American Statistical Association, 109, 815-827. · Zbl 1367.62112
[12] Fan, J., and Li, R. (2001), “Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties,” Journal of the American Statistical Association, 96, 1348-1360. · Zbl 1073.62547
[13] Ferré, L. (1998), “Determining the Dimension in Sliced Inverse Regression and Related Methods,” Journal of the American Statistical Association, 93, 132-140. · Zbl 0908.62049
[14] Gabay, D., and Mercier, B. (1976), “A Dual Algorithm for the Solution of Nonlinear Variational Problems via Finite Element Approximation,” Computers & Mathematics with Applications, 2, 17-40. · Zbl 0352.65034
[15] Li, B. (2018), Sufficient Dimension Reduction: Methods and Applications with R, Boca Raton, FL: Chapman and Hall/CRC. · Zbl 1408.62011
[16] Li, K.-C. (1991), “Sliced Inverse Regression for Dimension Reduction,” Journal of the American Statistical Association, 86, 316-327. · Zbl 0742.62044
[17] Li, K.-C. (1992), “On Principal Hessian Directions for Data Visualization and Dimension Reduction: Another Application of Stein’s Lemma,” Journal of the American Statistical Association, 87, 1025-1039. · Zbl 0765.62003
[18] Li, L. (2007), “Sparse Sufficient Dimension Reduction,” Biometrika, 94, 603-613. · Zbl 1135.62062
[19] Lin, Q., Li, X., Huang, D., and Liu, J. S. (2021), “On the Optimality of Sliced Inverse Regression in High Dimensions,” The Annals of Statistics, 49, 1-20. · Zbl 1464.62337
[20] Lin, Q., Zhao, Z., and Liu, J. S. (2019), “Sparse Sliced Inverse Regression via Lasso,” Journal of the American Statistical Association, 114, 1726-1739. · Zbl 1428.62320
[21] Luo, W., and Li, B. (2016), “Combining Eigenvalues and Variation of Eigenvectors for Order Determination,” Biometrika, 103, 875-887. · Zbl 1506.62304
[22] Ma, Y., and Zhu, L. (2014), “On Estimation Efficiency of the Central Mean Subspace,” Journal of the Royal Statistical Society, Series B, 76, 885-901. · Zbl 1411.62104
[23] Mai, Q., Yang, Y., and Zou, H. (2015), “Multiclass Sparse Discriminant Analysis,” Statistica Sinica, 29, 97-111. · Zbl 1412.62081
[24] Neykov, M., Liu, J. S., and Cai, T. (2016), “L1-Regularized Least Squares for Support Recovery of High Dimensional Single Index Models with Gaussian Designs,” Journal of Machine Learning Research, 17, 1-37. · Zbl 1435.62273
[25] Raskutti, G., Wainwright, M. J., and Yu, B. (2010), “Restricted Eigenvalue Properties for Correlated Gaussian Designs,” The Journal of Machine Learning Research, 11, 2241-2259. · Zbl 1242.62071
[26] Richard, E., Gaïffas, S., and Vayatis, N. (2014), “Link Prediction in Graphs with Autoregressive Features,” The Journal of Machine Learning Research, 15, 565-593. · Zbl 1318.62183
[27] Schott, J. R. (1994), “Determining the Dimensionality in Sliced Inverse Regression,” Journal of the American Statistical Association, 89, 141-148. · Zbl 0791.62069
[28] Székely, G. J., Rizzo, M. L., and Bakirov, N. K. (2007), “Measuring and Testing Dependence by Correlation of Distances,” The Annals of Statistics, 35, 2769-2794. · Zbl 1129.62059
[29] Tan, K., Shi, L., and Yu, Z. (2020), “Sparse SIR: Optimal Rates and Adaptive Estimation,” The Annals of Statistics, 48, 64-85. · Zbl 1451.62060
[30] Tan, K. M., Wang, Z., Liu, H., and Zhang, T. (2018a), “Sparse Generalized Eigenvalue Problem: Optimal Statistical Rates via Truncated Rayleigh Flow,” Journal of the Royal Statistical Society, Series B, 80, 1057-1086. · Zbl 1407.62212
[31] Tan, K. M., Wang, Z., Zhang, T., Liu, H., and Cook, R. D. (2018b), “A Convex Formulation for High-Dimensional Sparse Sliced Inverse Regression,” Biometrika, 105, 769-782. · Zbl 1506.62285
[32] Tibshirani, R. (1996), “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society, Series B, 58, 267-288. · Zbl 0850.62538
[33] Uematsu, Y., Fan, Y., Chen, K., Lv, J., and Lin, W. (2019), “SOFAR: Large-Scale Association Network Learning,” IEEE Transactions on Information Theory, 65, 4924-4939. · Zbl 1432.68402
[34] Wainwright, M. J. (2019), High-Dimensional Statistics: A Non-asymptotic Viewpoint (Vol. 48), Cambridge: Cambridge University Press. · Zbl 1457.62011
[35] Xia, Y., Tong, H., Li, W. K., and Zhu, L.-X. (2002), “An Adaptive Estimation of Dimension Reduction Space,” Journal of the Royal Statistical Society, Series B, 64, 363-410. · Zbl 1091.62028
[36] Ye, Z., and Weiss, R. E. (2003), “Using the Bootstrap to Select One of a New Class of Dimension Reduction Methods,” Journal of the American Statistical Association, 98, 968-979. · Zbl 1045.62034
[37] Yin, X., and Hilafu, H. (2015), “Sequential Sufficient Dimension Reduction for Large p, Small n Problems,” Journal of the Royal Statistical Society, Series B, 77, 879-892. · Zbl 1414.62194
[38] Yuan, M., and Lin, Y. (2006), “Model Selection and Estimation in Regression with Grouped Variables,” Journal of the Royal Statistical Society, Series B, 68, 49-67. · Zbl 1141.62030
[39] Zeng, P. (2008), “Determining the Dimension of the Central Subspace and Central Mean Subspace,” Biometrika, 95, 469-479. · Zbl 1437.62670
[40] Zhang, H., Patel, V. M., and Chellappa, R. (2017), “Low-Rank and Joint Sparse Representations for Multi-Modal Recognition,” IEEE Transactions on Image Processing, 26, 4741-4752.
[41] Zhao, J., Niu, L., and Zhan, S. (2017), “Trace Regression Model with Simultaneously Low Rank and Row (Column) Sparse Parameter,” Computational Statistics & Data Analysis, 116, 1-18. · Zbl 1466.62229
[42] Zhao, P., and Yu, B. (2006), “On Model Selection Consistency of Lasso,” The Journal of Machine Learning Research, 7, 2541-2563. · Zbl 1222.62008
[43] Zhou, H., and Li, L. (2014), “Regularized Matrix Regression,” Journal of the Royal Statistical Society, Series B, 76, 463-483. · Zbl 07555458
[44] Zhou, K., Zha, H., and Song, L. (2013), “Learning Social Infectivity in Sparse Low-Rank Networks Using Multi-Dimensional Hawkes Processes,” in Artificial Intelligence and Statistics, pp. 641-649. PMLR.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.