Parsimonious tensor discriminant analysis. (English) Zbl 07796625

Summary: Discriminant analyses of multidimensional array data (i.e., tensors) are of substantial interest in numerous statistics and engineering research problems, such as signal processing, imaging, genetics, and brain-computer interfaces. In this study, we consider a multi-class discriminant analysis with a tensor-variate predictor and a categorical response. To overcome the high dimensionality and to exploit the tensor correlation structure, we propose the discriminant analysis with tensor envelope (DATE) model for simultaneous dimension reduction and classification. We extend the notion of tensor envelopes from regression to discriminant analysis and develop two complementary estimation procedures: DATE-L is a likelihood-based estimator that is shown to be asymptotically efficient when the sample size goes to infinity and the tensor dimension is fixed; DATE-D is a novel decomposition-based estimator suitable for high-dimensional problems. Interestingly, we show that DATE-D is still root-n consistent, even when the tensor dimensions on each mode grow arbitrarily fast, but at a similar rate. We demonstrate the robustness and efficiency of our estimators using extensive simulations and real-data examples.

MSC:

62-XX Statistics
Full Text: DOI

References:

[1] Baranzini, S. E., Mousavi, P., Rio, J., Caillier, S. J., Stillman, A., Villoslada, P. et al. (2004). Transcription-based prediction of response to IFNβ using supervised computational methods. PLoS Biology 3, e2.
[2] Bi, Y. and Jeske, D. R. (2010). The efficiency of logistic regression compared to normal discriminant analysis under class-conditional classification noise. Journal of Multivariate Analysis 101, 1622-1637. · Zbl 1189.62104
[3] Chi, E. C. and Kolda, T. G. (2012). On tensors, sparsity, and nonnegative factorizations. SIAM Journal on Matrix Analysis and Applications 33, 1272-1299. · Zbl 1262.15029
[4] Cichocki, A., Mandic, D., De Lathauwer, L., Zhou, G., Zhao, Q., Caiafa, C. et al. (2015). Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine 32, 145-163.
[5] Conway, J. B. (2013). A Course in Functional Analysis. Springer Science & Business Media.
[6] Cook, R., Li, B. and Chiaromonte, F. (2010). Envelope models for parsimonious and efficient multivariate linear regression. Statistica Sinica 20, 927-960. · Zbl 1259.62059
[7] Cook, R. D. (2018). An Introduction to Envelopes: Dimension Reduction for Efficient Estimation in Multivariate Statistics. John Wiley & Sons. · Zbl 1407.62014
[8] Cook, R. D. and Zhang, X. (2015). Foundations for envelope models and methods. Journal of the American Statistical Association 110, 599-611. · Zbl 1390.62131
[9] Cook, R. D. and Zhang, X. (2016). Algorithms for envelope estimation. Journal of Computational and Graphical Statistics 25, 284-300.
[10] Dudoit, S., Fridlyand, J. and Speed, T. P. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97, 77-87. · Zbl 1073.62576
[11] Efron, B. (1975). The efficiency of logistic regression compared to normal discriminant analysis. Journal of the American Statistical Association 70, 892-898. · Zbl 0319.62039
[12] Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179-188.
[13] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33, 1-22.
[14] Gahrooei, M. R., Yan, H., Paynabar, K. and Shi, J. (2020). Multiple tensor-on-tensor regression: An approach for modeling processes with heterogeneous sources of data. Technometrics 63, 1-23.
[15] Gupta, A. K. and Nagar, D. K. (2018). Matrix Variate Distributions. Chapman and Hall/CRC.
[16] Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics 6, 164-189. · JFM 53.0095.01
[17] Hoff, P. D. (2015). Multilinear tensor regression for longitudinal relational data. The Annals of Applied Statistics 9, 1169. · Zbl 1454.62481
[18] Hore, V., Viñuela, A., Buil, A., Knight, J., McCarthy, M. I., Small, K. et al. (2016). Tensor decomposition for multiple-tissue gene expression experiments. Nature Genetics 48, 1094.
[19] Kolda, T. G. and Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review 51, 455-500. · Zbl 1173.65029
[20] Li, L. and Zhang, X. (2017). Parsimonious tensor response regression. Journal of the American Statistical Association 112, 1131-1146.
[21] Li, P. and Maiti, T. (2019). Universal consistency of support tensor machine. In 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 608-609.
[22] Li, Q. and Schonfeld, D. (2014). Multilinear discriminant analysis for higher-order tensor data classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 2524-2537.
[23] Li, X., Xu, D., Zhou, H. and Li, L. (2018). Tucker tensor regression and neuroimaging analysis. Statistics in Biosciences 10, 520-545.
[24] Lock, E. F. (2018). Tensor-on-tensor regression. Journal of Computational and Graphical Statistics 27, 638-647. · Zbl 07498939
[25] Lyu, T., Lock, E. F. and Eberly, L. E. (2017). Discriminating sample groups with multi-way data. Biostatistics 18, 434-450.
[26] Manceur, A. M. and Dutilleul, P. (2013). Maximum likelihood estimation for the tensor normal distribution: Algorithm, minimum sample size, and empirical bias and dispersion. Journal of Computational and Applied Mathematics 239, 37-49. · Zbl 1255.65029
[27] Molstad, A. J. and Rothman, A. J. (2019). A penalized likelihood method for classification with matrix-valued predictors. Journal of Computational and Graphical Statistics 28, 11-22. · Zbl 07499008
[28] Pan, Y., Mai, Q. and Zhang, X. (2019). Covariate-adjusted tensor classification in high dimensions. Journal of the American Statistical Association 114, 1305-1319. · Zbl 1428.62291
[29] Raskutti, G., Yuan, M. and Chen, H. (2019). Convex regularization for high-dimensional multiresponse tensor regression. The Annals of Statistics 47, 1554-1584. · Zbl 1428.62324
[30] Sun, W. W. and Li, L. (2017). STORE: Sparse tensor response regression and neuroimaging analysis. The Journal of Machine Learning Research 18, 4908-4944. · Zbl 1442.62773
[31] Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika 31, 279-311.
[32] Wang, X., Zhu, H. and Alzheimer's Disease Neuroimaging Initiative (2017). Generalized scalar-on-image regression models via total variation. Journal of the American Statistical Association 112, 1156-1168.
[33] Wang, Y., Meng, D. and Yuan, M. (2018). Sparse recovery: From vectors to tensors. National Science Review 5, 756-767.
[34] Witten, D. M. and Tibshirani, R. (2011). Penalized classification using Fisher’s linear discriminant. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 73, 753-772. · Zbl 1228.62079
[35] Yan, H., Paynabar, K. and Pacella, M. (2019). Structured point cloud data analysis via regularized tensor regression for process modeling and optimization. Technometrics 61, 385-395.
[36] Yan, S., Xu, D., Yang, Q., Zhang, L., Tang, X. and Zhang, H.-J. (2006). Multilinear discriminant analysis for face recognition. IEEE Transactions on Image Processing 16, 212-220.
[37] Ye, J., Janardan, R. and Li, Q. (2004). Two-dimensional linear discriminant analysis. Advances in Neural Information Processing Systems 17, 1569-1576.
[38] Zhang, A. (2019). Cross: Efficient low-rank tensor completion. The Annals of Statistics 47, 936-964. · Zbl 1416.62298
[39] Zhang, A. and Xia, D. (2018). Tensor SVD: Statistical and computational limits. IEEE Transactions on Information Theory 64, 7311-7338. · Zbl 1432.62176
[40] Zhang, X., Deng, K. and Mai, Q. (2023). Envelopes and principal component regression. Electronic Journal of Statistics 17, 2447-2484. · Zbl 07784479
[41] Zhang, X. and Li, L. (2017). Tensor envelope partial least-squares regression. Technometrics 59, 426-436.
[42] Zhang, X. and Mai, Q. (2018). Model-free envelope dimension selection. Electronic Journal of Statistics 12, 2193-2216. · Zbl 1410.62086
[43] Zhang, X. and Mai, Q. (2019). Efficient integration of sufficient dimension reduction and prediction in discriminant analysis. Technometrics 61, 259-272.
[44] Zhong, W. and Suslick, K. S. (2015). Matrix discriminant analysis with application to colorimetric sensor array data. Technometrics 57, 524-534.
[45] Zhou, H. and Li, L. (2014). Regularized matrix regression. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 76, 463-483. · Zbl 07555458
[46] Zhou, H., Li, L. and Zhu, H. (2013). Tensor regression with applications in neuroimaging data analysis. Journal of the American Statistical Association 108, 540-552. · Zbl 06195959

Ning Wang, Department of Statistics, Beijing Normal University at Zhuhai, Guangdong, China. E-mail: ningwangbnu@bnu.edu.cn
Wenjing Wang, Department of Statistics, Florida State University, Tallahassee, FL 32306, USA. E-mail: wenjing.wang@stat.fsu.edu
Xin Zhang, Department of Statistics, Florida State University, Tallahassee, FL 32306, USA. E-mail: henry@stat.fsu.edu
(Received December 2020; accepted May 2022)