×

A \(\mu\)-mode BLAS approach for multidimensional tensor-structured problems. (English) Zbl 1528.15001

Summary: In this manuscript, we present a common tensor framework which can be used to generalize one-dimensional numerical tasks to arbitrary dimension \(d\) by means of tensor product formulas. This is useful, for example, in the context of multivariate interpolation, multidimensional function approximation using pseudospectral expansions and solution of stiff differential equations on tensor product domains. The key point to obtain an efficient-to-implement BLAS formulation consists in the suitable usage of the \(\mu\)-mode product (also known as tensor-matrix product or mode-\(n\) product) and related operations, such as the Tucker operator. Their MathWorks MATLAB\textsuperscript®/GNU Octave implementations are discussed in the paper, and collected in the package KronPACK. We present numerical results on experiments up to dimension six from different fields of numerical analysis, which show the effectiveness of the approach.

MSC:

15-04 Software, source code, etc. for problems pertaining to linear algebra
15A69 Multilinear algebra, tensor calculus
65Y04 Numerical algorithms for computer arithmetic, etc.

References:

[1] Dongarra, JJ; Du Croz, J.; Hammarling, S.; Duff, IS, A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Softw., 16, 1, 1-17 (1990) · Zbl 0900.65115 · doi:10.1145/77626.79170
[2] Intel Corporation: Intel Math Kernel Library. https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl.html (2021). Accessed 27 Dec 2021
[3] Xianyi, Z., Qian, W., Yunquan, Z.: Model-driven level 3 BLAS performance optimization on Loongson 3A processor. In: 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pp 684-691 (2012). Accessed 27 Dec 2021
[4] NVIDIA Corporation: cuBLAS documentation. https://docs.nvidia.com/cuda/cublas/index.html (2021). Accessed 27 Dec 2021
[5] Kolda, T.G.: Multilinear operators for higher-order decompositions. Technical Report SAND2006-2081 Sandia National Laboratories (2006)
[6] Kolda, TG; Bader, BW, Tensor decompositions and applications, SIAM Rev., 51, 3, 455-500 (2009) · Zbl 1173.65029 · doi:10.1137/07070111X
[7] Li, J., Battaglino, C., Perros, I., Sun, J., Vuduc, R.: An input-adaptive and in-place approach to dense tensor-times-matrix multiply. In: SC ’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Association for Computing Machinery, New York (2015)
[8] Rogers, D.M.: Efficient primitives for standard tensor linear algebra. In: XSEDE16: Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale. Association for Computing Machinery, New York (2016)
[9] Springer, P.; Bientinesi, P., Design of a high-performance GEMM-like tensor-tensor multiplication, ACM Trans. Math. Softw., 44, 3, 1-29 (2018) · Zbl 1484.65092 · doi:10.1145/3157733
[10] Matthews, DA, High-performance tensor contraction without transposition, SIAM J. Sci. Comput., 40, 1, 1-24 (2018) · Zbl 1379.65024 · doi:10.1137/16M108968X
[11] Bader, B.W., Kolda, T.G., et al.: Tensor Toolbox for MATLAB, Version 3.2.1. https://www.tensortoolbox.org (2021). Accessed 27 Dec 2021
[12] Vervilet, N., Debals, O., Sorber, L., Van Barel, M., De Lathauwer, L.: Tensorlab 3.0. https://tensorlab.net. Available online (2016). Accessed 27 Dec 2021
[13] Boyd, JP, Chebyshev and Fourier Spectral Methods (2000), New York: DOVER Publications Inc., New York
[14] de Boor, C., A Practical Guide to Splines, Revised edn. Applied Mathematical Sciences, vol. 27 (2001), New York: Springer, New York · Zbl 0987.65015
[15] Caliari, M.; Cassini, F.; Einkemmer, L.; Ostermann, A.; Zivcovich, F., A μ-mode integrator for solving evolution equations in Kronecker form, J. Comput. Phys., 455, 110989 (2022) · Zbl 07518078 · doi:10.1016/j.jcp.2022.110989
[16] Bertolazzi, E., Falini, A., Mazzia, F.: The object oriented C++ library QIBSH++ for Hermite spline quasi interpolation. arXiv:2208.03260 (2022)
[17] Al-Mohy, AH; Higham, NJ, A New Scaling and Squaring Algorithm for the Matrix Exponential, SIAM J. Matrix Anal. Appl., 31, 3, 970-989 (2009) · Zbl 1194.15021 · doi:10.1137/09074721X
[18] Caliari, M.; Zivcovich, F., On-the-fly backward error estimate for matrix exponential approximation by Taylor algorithm, J. Comput. Appl. Math., 346, 532-548 (2019) · Zbl 1452.65082 · doi:10.1016/j.cam.2018.07.042
[19] Al-Mohy, AH; Higham, NJ, Computing the action of the matrix exponential with an application to exponential integrators, SIAM J. Sci. Comput., 33, 2, 488-511 (2011) · Zbl 1234.65028 · doi:10.1137/100788860
[20] Niesen, J.; Wright, WM, Algorithm 919: A Krylov subspace algorithm for evaluating the ϕ-functions appearing in exponential integrators, ACM Trans. Math. Softw., 38, 3, 1-19 (2012) · Zbl 1365.65185 · doi:10.1145/2168773.2168781
[21] Gaudreault, S.; Rainwater, G.; Tokman, M., KIOPS: A fast adaptive Krylov subspace solver for exponential integrators, J. Comput. Phys., 372, 236-255 (2018) · Zbl 1418.65074 · doi:10.1016/j.jcp.2018.06.026
[22] Caliari, M.; Cassini, F.; Zivcovich, F., Approximation of the matrix exponential for matrices with a skinny field of values, BIT Numer. Math., 60, 4, 1113-1131 (2020) · Zbl 1455.65069 · doi:10.1007/s10543-020-00809-0
[23] Neudecker, H., A Note on Kronecker matrix products and matrix equation systems, SIAM J. Appl. Math., 17, 3, 603-606 (1969) · Zbl 0185.08204 · doi:10.1137/0117057
[24] Arbenz, P.; Říha, L., Batched transpose-free ADI-type preconditioners for a Poisson solver on GPGPUs, J. Parallel Distrib. Comput., 137, 148-159 (2020) · doi:10.1016/j.jpdc.2019.11.004
[25] Kirsten, G., Simoncini, V.: A matrix-oriented POD-DEIM algorithm applied to nonlinear differential matrix equations. arXiv:2006.13289 (2020)
[26] Palitta, D.; Simoncini, V., Matrix-equation-based strategies for convection-diffusion equations, BIT Numer. Math., 56, 2, 751-776 (2016) · Zbl 1341.65042 · doi:10.1007/s10543-015-0575-8
[27] Chen, M.; Kressner, D., Recursive blocked algorithms for linear systems with Kronecker product structure, Numer. Algorithms, 84, 3, 1199-1216 (2020) · Zbl 1442.65063 · doi:10.1007/s11075-019-00797-5
[28] Bao, W.; Li, H.; Shen, J., A Generalized-laguerre-fourier-hermite pseudospectral method for computing the dynamics of rotating bose-einstein condensates, SIAM J. Sci. Comput., 31, 5, 3685-3711 (2009) · Zbl 1205.82096 · doi:10.1137/080739811
[29] Tang, T., The hermite spectral method for gaussian-type functions, SIAM J. Sci. Comput., 14, 3, 594-606 (1993) · Zbl 0782.65110 · doi:10.1137/0914038
[30] Szegö, G.: Orthogonal Polynomials, 4th edn., vol. 23. Colloquium Publications, American Mathematical Society, Providence (1975) · Zbl 0305.42011
[31] Driscoll, T.A., Hale, N., Trefethen, L.N. (eds.): Chebfun Guide. Pafnuty Publications, Oxford (2014)
[32] Berrut, J-P; Trefethen, LN, Barycentric lagrange interpolation, SIAM Rev., 46, 3, 501-517 (2004) · Zbl 1061.65006 · doi:10.1137/S0036144502417715
[33] Trefethen, LN, Multivariate polynomial approximation in the hypercube, Proc. Am. Math. Soc., 145, 11, 4837-4844 (2017) · Zbl 1380.41011 · doi:10.1090/proc/13623
[34] Zoppou, C.; Knight, JH, Analytical solution of a spatially variable coefficient advection-diffusion equation in up to three dimensions, Appl. Math. Model., 23, 9, 667-685 (1999) · Zbl 0956.76089 · doi:10.1016/S0307-904X(99)00005-0
[35] Hochbruck, M.; Ostermann, A., Explicit exponential runge-kutta methods for semilinear parabolic problems, SIAM J. Numer. Anal., 43, 3, 1069-1090 (2005) · Zbl 1093.65052 · doi:10.1137/040611434
[36] Van Loan, CF, The ubiquitous Kronecker product, J. Comput. Appl. Math., 123, 1-2, 85-100 (2000) · Zbl 0966.65039 · doi:10.1016/S0377-0427(00)00393-9
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.