
Model-based tensor low-rank clustering. (English) Zbl 07862317

J. Comput. Graph. Stat. 33, No. 1, 208-218 (2024).

Summary: Tensors have become prevalent in business applications and scientific studies. It is of great interest to analyze and understand the heterogeneity in tensor-variate observations. We propose a novel tensor low-rank mixture model (TLMM) to conduct efficient estimation and clustering on tensors. The model combines the Tucker low-rank structure in mean contrasts and the separable covariance structure to achieve parsimonious and interpretable modeling. To implement efficient computation under this model, we develop a low-rank enhanced expectation-maximization (LEEM) algorithm. The pseudo E-step and the pseudo M-step are carefully designed to incorporate variable selection and efficient parameter estimation. Numerical results in extensive experiments demonstrate the encouraging performance of the proposed method compared to popular vector and tensor methods. Supplementary materials for this article are available online.

MSC:

62-XX Statistics

Keywords:

dimension reduction; EM algorithm; mixture models; sparsity; Tucker decomposition

Software:

CHIME; sparcl; msda; Cross

