
Alternating Mahalanobis distance minimization for accurate and well-conditioned CP decomposition. (English) Zbl 1527.15026

Summary: Canonical polyadic decomposition (CPD) is prevalent in chemometrics, signal processing, data mining, and many other fields. While many algorithms have been proposed to compute the CPD, alternating least squares (ALS) remains one of the most widely used algorithms for computing the decomposition. Recent works have introduced the notion of eigenvalues and singular values of a tensor and explored applications of eigenvectors and singular vectors in signal processing, data analytics, and various other fields. We introduce a new formulation for deriving singular values and vectors of a tensor by considering the critical points of a function differently from previous works. Computing these critical points in an alternating manner motivates an alternating optimization algorithm which corresponds to the ALS algorithm in the matrix case. However, for tensors of order greater than or equal to 3, it minimizes an objective function different from the commonly used least squares loss. Alternating optimization of this new objective leads to simple updates to the factor matrices with the same asymptotic computational cost as ALS. We show that a subsweep of this algorithm can achieve a superlinear convergence rate for exact CPD when the known rank is not larger than the mode lengths of the input tensor. We verify our theoretical arguments experimentally. We then view the algorithm as optimizing a Mahalanobis distance with respect to each factor, with the ground metric dependent on the other factors. This perspective allows us to generalize our approach to interpolate between updates corresponding to ALS and the new algorithm, managing the tradeoff between stability and fitness of the decomposition. Our experimental results show that for approximating synthetic and real-world tensors, this algorithm and its variants converge to a better-conditioned decomposition with fitness comparable to, and sometimes better than, that of the ALS algorithm.
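As a concrete reference point for the ALS baseline discussed in the summary, the sketch below shows one ALS sweep for a rank-R CP decomposition of an order-3 tensor in NumPy. For context, the squared Mahalanobis distance between vectors x and y under a positive definite metric M is (x-y)^T M (x-y); the paper's contribution is to replace the least-squares updates below with metric-weighted ones whose exact form is not given in the summary and is therefore not reproduced here. The function names, rank, and tensor shapes in the sketch are illustrative assumptions, not taken from the paper.

```python
# Minimal NumPy sketch of one ALS sweep for a rank-R CP decomposition of an
# order-3 tensor T with factor matrices A (I x R), B (J x R), C (K x R).
# This is the standard least-squares baseline only; the Mahalanobis-distance
# updates proposed in the paper are not shown.
import numpy as np

def khatri_rao(U, V):
    # Column-wise Khatri-Rao product: row (u, v) holds U[u, :] * V[v, :].
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def als_sweep(T, A, B, C):
    # Each factor solves the least-squares problem obtained by fixing the
    # other two; the Gram matrix is a Hadamard product of small R x R matrices.
    I, J, K = T.shape
    A = T.reshape(I, J * K) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
    B = T.transpose(1, 0, 2).reshape(J, I * K) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
    C = T.transpose(2, 0, 1).reshape(K, I * J) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Example usage on a random tensor (shapes, rank, and sweep count are arbitrary).
rng = np.random.default_rng(0)
I, J, K, R = 8, 9, 10, 4
T = rng.standard_normal((I, J, K))
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
for _ in range(20):
    A, B, C = als_sweep(T, A, B, C)
residual = T - np.einsum('ir,jr,kr->ijk', A, B, C)
print("relative error:", np.linalg.norm(residual) / np.linalg.norm(T))
```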

MSC:

15A69 Multilinear algebra, tensor calculus
15A72 Vector and tensor algebra, theory of invariants
65K10 Numerical optimization and variational techniques
65Y20 Complexity and performance of numerical algorithms
65Y04 Numerical algorithms for computer arithmetic, etc.
68W25 Approximation algorithms

Software:

PySCF
