Sparse sufficient dimension reduction using optimal scoring. (English) Zbl 1365.62226

Summary: Sufficient dimension reduction is a body of theory and methods for reducing the dimensionality of predictors while preserving information on the regression. In this paper we propose a sparse method for interpretable dimension reduction. It is designed for situations in which the number of correlated predictors is very large relative to the sample size. The new procedure is based on the optimal scoring interpretation of the sliced inverse regression method. Because optimal scoring casts the problem in a regression framework, standard regularization techniques can be applied directly. Simulation studies demonstrate the effectiveness and efficiency of the proposed approach.
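
The summary does not state the paper's criterion or algorithm, so the following is only a minimal sketch of how an optimal-scoring view of sliced inverse regression can be combined with a lasso penalty. It assumes a single direction, equal-count slices of the response, a plain coordinate-descent lasso, and a slice-rank starting score; the function names (`slice_indicators`, `lasso_cd`, `sparse_sir_direction`) and the tuning values are hypothetical and are not taken from the paper.

```python
import numpy as np

def slice_indicators(y, n_slices):
    """Indicator matrix Z (n x H): row i marks the slice that contains y[i]."""
    order = np.argsort(y)
    Z = np.zeros((len(y), n_slices))
    for h, idx in enumerate(np.array_split(order, n_slices)):
        Z[idx, h] = 1.0
    return Z

def lasso_cd(X, r, lam, n_iter=200):
    """Coordinate descent for (1/2n)||r - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = r.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]          # put the updated coordinate back
    return beta

def sparse_sir_direction(X, y, n_slices=5, lam=0.2, n_outer=20):
    """Alternate between a lasso fit for beta and a score update for theta."""
    n = len(y)
    Xc = X - X.mean(axis=0)
    Z = slice_indicators(y, n_slices)
    d = Z.mean(axis=0)                          # slice proportions

    def normalise(theta):
        theta = theta - d @ theta               # scores have weighted mean zero
        return theta / np.sqrt(d @ theta ** 2)  # and weighted variance one

    # Assumed starting point: scores proportional to the slice rank.
    theta = normalise(np.arange(n_slices, dtype=float))
    beta = np.zeros(X.shape[1])
    for _ in range(n_outer):
        beta = lasso_cd(Xc, Z @ theta, lam)
        slice_means = (Z.T @ (Xc @ beta)) / (n * d)
        if not beta.any() or np.allclose(slice_means, 0.0):
            break                               # penalty removed every predictor
        theta = normalise(slice_means)          # best scores for the current beta
    return beta, theta

# Synthetic single-index example (illustrative data, not from the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 10))
y = X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(300)
beta, theta = sparse_sir_direction(X, y)
print("selected predictors:", np.flatnonzero(beta))
```

In this sketch the slice scores `theta` play the role of the optimal scores and `beta` is the sparse direction; swapping `lasso_cd` for another penalized regression solver is the kind of substitution the regression framework makes possible.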

MSC:

62H25 Factor analysis and principal components; correspondence analysis

Software:

boost; ElemStatLearn
