Dynamic linear discriminant analysis in high dimensional space. (English) Zbl 1466.62352

Summary: High-dimensional data that evolve dynamically are a predominant feature of the modern data era. As a partial response, recent years have seen increasing emphasis on addressing the dimensionality challenge; the non-static nature of these datasets, however, is largely ignored. This paper addresses both challenges by proposing a novel yet simple dynamic linear programming discriminant (DLPD) rule for binary classification. Unlike standard static linear discriminant analysis, the new method captures the changing distributions of the underlying populations by modeling their means and covariances as smooth functions of covariates of interest. Under an approximate sparsity condition, we show that the conditional misclassification rate of the DLPD rule converges to the Bayes risk in probability, uniformly over the range of the variables used for modeling the dynamics, when the dimensionality is allowed to grow exponentially with the sample size. A minimax lower bound for the estimation of the Bayes risk is also established, implying that the misclassification rate of the proposed rule is minimax-rate optimal. The promising performance of the DLPD rule is illustrated via extensive simulation studies and the analysis of a breast cancer dataset.
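The dynamic construction described above can be sketched in a few lines: estimate class means and a pooled covariance at a given covariate value by kernel weighting, then form a linear discriminant direction at that value. This is a minimal illustration only; the function and parameter names (`dynamic_lda_direction`, bandwidth `h`, `ridge`) are hypothetical, a Gaussian Nadaraya-Watson smoother is assumed, and a ridge-regularized solve stands in for the paper's sparse linear-programming estimator.

```python
import numpy as np

def gaussian_kernel(t, h):
    # Unnormalized Gaussian kernel weights; the normalization cancels below.
    return np.exp(-0.5 * (t / h) ** 2)

def dynamic_lda_direction(X0, U0, X1, U1, u, h=0.5, ridge=1e-2):
    """Kernel-weighted class means and pooled covariance at covariate
    value u, followed by a discriminant direction.  The ridge penalty is
    a simple surrogate for the l1-penalized linear program in the paper."""
    def kernel_stats(X, U):
        w = gaussian_kernel(U - u, h)
        w = w / w.sum()
        mu = w @ X                         # locally weighted mean
        Xc = X - mu
        sigma = (w[:, None] * Xc).T @ Xc   # locally weighted covariance
        return mu, sigma

    mu0, s0 = kernel_stats(X0, U0)
    mu1, s1 = kernel_stats(X1, U1)
    sigma = 0.5 * (s0 + s1) + ridge * np.eye(X0.shape[1])
    beta = np.linalg.solve(sigma, mu1 - mu0)  # discriminant direction at u
    midpoint = 0.5 * (mu0 + mu1)
    return beta, midpoint

def classify(x, beta, midpoint):
    # Assign to class 1 when the local linear score is positive.
    return int((x - midpoint) @ beta > 0)
```

Because the means and covariance are re-estimated at each covariate value `u`, the resulting decision boundary moves smoothly with `u`, which is the essential difference from a static LDA rule.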

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G07 Density estimation

Software:

msda; penalizedLDA
