×

Block Gram-Schmidt algorithms and their stability properties. (English) Zbl 1490.65074

Summary: Block Gram-Schmidt algorithms serve as essential kernels in many scientific computing applications, but for many commonly used variants, a rigorous treatment of their stability properties remains open. This work provides a comprehensive categorization of block Gram-Schmidt algorithms, particularly those used in Krylov subspace methods to build orthonormal bases one block vector at a time. Known stability results are assembled, and new results are summarized or conjectured for important communication-reducing variants. Additionally, new block versions of low-synchronization variants are derived, and their efficacy and stability are demonstrated for a wide range of challenging examples. Numerical examples are computed with a versatile Matlab package hosted at https://github.com/katlund/BlockStab, and scripts for reproducing all results in the paper are provided. Block Gram-Schmidt implementations in popular software packages are discussed, along with a number of open problems. An appendix containing all algorithms type-set in a uniform fashion is provided.

MSC:

65F25 Orthogonalization in numerical linear algebra
15A23 Factorization of matrices
65F10 Iterative numerical methods for linear systems

References:

[1] Golub, G.; Underwood, R., The block Lanczos method for computing eigenvalues, (Rice, J. R., Mathematical Software (1977), Academic Press), 361-377 · Zbl 0407.68040
[2] Stewart, G. W., A Krylov-Schur algorithm for large eigenproblems, SIAM J. Matrix Anal. Appl., 23, 3, 601-614 (2001) · Zbl 1003.65045
[3] O’Leary, D. P., The block conjugate gradient algorithm and related methods, Linear Algebra Appl., 29, 1980, 293-322 (1980) · Zbl 0426.65011
[4] Baker, A. H.; Dennis, J. M.; Jessup, E. R., On improving linear solver performance: a block variant of GMRES, SIAM J. Sci. Comput., 27, 5, 1608-1626 (2006) · Zbl 1099.65029
[5] Ballard, G.; Carson, E. C.; Demmel, J. W.; Hoemmen, M.; Knight, N.; Schwartz, O., Communication lower bounds and optimal algorithms for numerical linear algebra, Acta Numer., 23, 2014, 1-155 (2014) · Zbl 1396.65082
[6] Grigori, L.; Moufawad, S.; Nataf, F., Enlarged Krylov subspace conjugate gradient methods for reducing communication, SIAM J. Matrix Anal. Appl., 37, 2, 744-773 (2016) · Zbl 1382.65086
[7] Demmel, J.; Grigori, L.; Hoemmen, M.; Langou, J., Communication-optimal parallel and sequential QR and LU factorizations (2008)
[8] Björck, Å., Solving linear least squares problems by Gram-Schmidt orthogonalization, BIT, 7, 1-21 (1967) · Zbl 0183.17802
[9] Björck, Å.; Paige, C. C., Loss and recapture of orthogonality in the modified Gram-Schmidt algorithm, SIAM J. Matrix Anal. Appl., 13, 1, 176-190 (1992) · Zbl 0747.65026
[10] Higham, N. J., Accuracy and Stability of Numerical Algorithms (2002), SIAM: SIAM Philadelphia · Zbl 1011.65010
[11] Paige, C. C.; Rozložník, M.; Strakoš, Z., Modified Gram-Schmidt (MGS), least squares, and backward stability of MGS-GMRES, SIAM J. Matrix Anal. Appl., 28, 1, 264-284 (2006) · Zbl 1113.65028
[12] Yamamoto, Y.; Nakatsukasa, Y.; Yanagisawa, Y.; Fukaya, T., Roundoff error analysis of the CholeskyQR2 algorithm, Electron. Trans. Numer. Anal., 44, 306-326 (2015) · Zbl 1330.65049
[13] Fukaya, T.; Kannan, R.; Nakatsukasa, Y.; Yamamoto, Y.; Yanagisawa, Y., Shifted Cholesky QR for computing the QR factorization of ill-conditioned matrices, SIAM J. Sci. Comput., 42, 1, A477-A503 (2020) · Zbl 1434.65041
[14] Leon, S. J.; Björck, Å.; Gander, W., Gram-Schmidt orthogonalization: 100 years and more, Numer. Linear Algebra Appl., 20, 3, 492-532 (2013) · Zbl 1313.65086
[15] Barlow, J. L., Block modified Gram-Schmidt algorithms and their analysis, SIAM J. Matrix Anal. Appl., 40, 4, 1257-1290 (2019) · Zbl 1427.65054
[16] Barlow, J. L.; Smoktunowicz, A., Reorthogonalized block classical Gram-Schmidt, Numer. Math., 123, 3, 395-423 (2013) · Zbl 1269.65042
[17] Vanderstraeten, D., An accurate parallel block Gram-Schmidt algorithm without reorthogonalization, Numer. Linear Algebra Appl., 7, 4, 219-236 (2000) · Zbl 1051.65046
[18] Jalby, W.; Philippe, B., Stability analysis and improvement of the block Gram-Schmidt algorithm, SIAM J. Sci. Comput., 12, 5, 1058-1073 (1991) · Zbl 0734.65034
[19] Sadkane, M., Block-Arnoldi and Davidson methods for unsymmetric large eigenvalue problems, Numer. Math., 64, 1, 195-211 (1993) · Zbl 0791.65021
[20] Simoncini, V.; Gallopoulos, E., Convergence properties of block GMRES and matrix polynomials, Linear Algebra Appl., 247, 97-119 (1996) · Zbl 0861.65023
[21] Saad, Y., Iterative Methods for Sparse Linear Systems (2003), SIAM: SIAM Philadelphia · Zbl 1002.65042
[22] Morgan, R. B., Restarted block-GMRES with deflation of eigenvalues, Appl. Numer. Math., 54, 2, 222-236 (2005) · Zbl 1074.65043
[23] Gutknecht, M. H., Block Krylov space methods for linear systems with multiple right-hand sides: an introduction, (Siddiqi, A. H.; Duff, I. S.; Christensen, O., Modern Mathematical Models, Methods and Algorithms for Real World Systems (2007), Anamaya Publishers), 420-447
[24] Hoemmen, M., Communication-avoiding Krylov subspace methods (2010), Department of Electrical Engineering and Computer Science, University of California at Berkeley, PhD thesis
[25] Świrydowicz, K.; Langou, J.; Ananthan, S.; Yang, U.; Thomas, S., Low synchronization Gram-Schmidt and generalized minimal residual algorithms, Numer. Linear Algebra Appl., 28, 2, Article e2343 pp. (2021) · Zbl 1474.65103
[26] Demmel, J. W.; Grigori, L.; Hoemmen, M.; Langou, J., Communication-optimal parallel and sequential QR and LU factorizations, SIAM J. Sci. Comput., 34, 1, A206-A239 (2012) · Zbl 1241.65028
[27] Bienz, A.; Gropp, W.; Olson, L., Node-aware improvements to allreduce, (Proceedings of the 2019 IEEE/ACM Workshop on Exascale MPI (ExaMPI) (2019), ACM)
[28] Smoktunowicz, A.; Barlow, J. L.; Langou, J., A note on the error analysis of classical Gram-Schmidt, Numer. Math., 105, 2, 299-313 (2006) · Zbl 1108.65021
[29] Carson, E.; Lund, K.; Rozložník, M., The stability of block variants of classical Gram-Schmidt, SIAM J. Matrix Anal. Appl., 42, 3, 1365-1380 (2021) · Zbl 1476.65060
[30] Yamazaki, I.; Thomas, S.; Hoemmen, M.; Boman, E. G.; Świrydowicz, K.; Eilliot, J. J., Low-synchronization orthogonalization schemes for s-step and pipelined Krylov solvers in Trilinos, (Proceedings of the SIAM Conference on Parallel Processing 2020. Proceedings of the SIAM Conference on Parallel Processing 2020, Seattle WA (2020))
[31] Abdelmalek, N. N., Round off error analysis for Gram-Schmidt method and solution of linear least squares problems, BIT, 11, 345-367 (1971) · Zbl 0236.65031
[32] Parlett, B. N., The Symmetric Eigenvalue Problem (1998), SIAM: SIAM Philadelphia · Zbl 0885.65039
[33] Daniel, J. W.; Gragg, W. B.; Kaufman, L.; Stewart, G. W., Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization, Math. Comput., 30, 772-795 (1976) · Zbl 0345.65021
[34] Hoffmann, W., Iterative algorithms for Gram-Schmidt orthogonalization, Computing, 41, 4, 335-348 (1989) · Zbl 0667.65037
[35] Stewart, G. W., Matrix Algorithms. Volume 1: Basic Decompositions (1998), SIAM: SIAM Philadelphia · Zbl 0910.65012
[36] Stewart, G. W., Block Gram-Schmidt orthogonalization, SIAM J. Sci. Comput., 31, 1, 761-775 (2008) · Zbl 1185.65069
[37] Ruhe, A., Numerical aspects of Gram-Schmidt orthogonalization of vectors, Linear Algebra Appl., 52, 591-601 (1983) · Zbl 0515.65036
[38] Vital, B., Étude de quelques méthodes de résolution de problèmes linéaires de grande taille sur multiprocesseur (1990), Université de Rennes, PhD thesis
[39] Kim, S. K.; Chronopoulos, A. T., An efficient parallel algorithm for extreme eigenvalues of sparse nonsymmetric matrices, Int. J. Supercomput. Appl., 6, 1, 98-111 (1992)
[40] Schreiber, R.; Van Loan, C., A storage-efficient WY representation for products of Householder transformations, SIAM J. Sci. Stat. Comput., 10, 1, 53-57 (1989) · Zbl 0664.65025
[41] Walker, H. F., Implementation of the GMRES method using Householder transformations, SIAM J. Sci. Stat. Comput., 9, 1, 152-163 (1988) · Zbl 0698.65021
[42] Sun, X., Aggregations of elementary transformations (1996), Duke University, Tech. Rep. DUKE-TR-1996-03
[43] Golub, G. H.; Van Loan, C. F., Matrix Computations (2013), Johns Hopkins University Press: Johns Hopkins University Press Baltimore · Zbl 1268.65037
[44] Carson, E. C., An adaptive s-step conjugate gradient algorithm with dynamic basis updating, Appl. Math., 65, 2, 123-151 (2020) · Zbl 1538.65096
[45] Kiełbasiński, A., Analiza numeryczna algorytmu ortogonałizacji Grama-Schmidta, Ser. III Mat. Stosow., II, 15-35 (1974)
[46] Giraud, L.; Langou, J.; Rozložník, M.; Van Den Eshof, J., Rounding error analysis of the classical Gram-Schmidt orthogonalization process, Numer. Math., 101, 87-100 (2005) · Zbl 1075.65060
[47] Giraud, L.; Langou, J., When modified Gram-Schmidt generates a well-conditioned set of vectors, IMA J. Numer. Anal., 22, 4, 521-528 (2002) · Zbl 1027.65050
[48] Gander, W., Algorithms for the QR-decomposition, (Seminar für Angewandte Mathematik (1980), Eidgenössische Technische Hochschule) (April 1980), Tech. Rep. 80-02
[49] Mori, D.; Yamamoto, Y.; Zhang, S. L., Backward error analysis of the AllReduce algorithm for Householder QR decomposition, Jpn. J. Ind. Appl. Math., 29, 1, 111-130 (2012) · Zbl 1260.65036
[50] Carson, E. C., Communication-avoiding Krylov subspace methods in theory and practice (2015), Department of Electrical Engineering and Computer Science, University of California: Department of Electrical Engineering and Computer Science, University of California Berkeley, PhD thesis
[51] Yamazaki, I.; Tomov, S.; Dongarra, J., Mixed-precision Cholesky QR factorization and its case studies on multicore CPU with multiple GPUs, SIAM J. Sci. Comput., 37, 3, C307-C330 (2015) · Zbl 1320.65046
[52] Yamazaki, I.; Tomov, S.; Kurzak, J.; Dongarra, J.; Barlow, J. L., Mixed-precision block Gram Schmidt orthogonalization, (Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA ’15 (2015), ACM)
[53] Yang, L. M.; Fox, A.; Sanders, G., Rounding error analysis of mixed precision block Householder QR algorithms (2021) · Zbl 1512.65066
[54] Yamazaki, I.; Hoemmen, M., Communication-avoiding & pipelined Krylov solvers in Trilinos, (SIAM Conference on Computational Science and Engineering. SIAM Conference on Computational Science and Engineering, Spokane, Washington (2019)), presentation
[55] Jolivet, P.; Roman, J. E.; Zampini, S., KSPHPDDM and PCHPDDM: extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners, Comput. Math. Appl., 84, 277-295 (2021) · Zbl 1524.65135
[56] Knyazev, A. V., Toward the optimal preconditioned eigensolver: locally optimal block preconditioned conjugate gradient method, SIAM J. Sci. Comput., 23, 2, 517-541 (2001) · Zbl 0992.65028
[57] Sakurai, T.; Sugiura, H., A projection method for generalized eigenvalue problems using numerical integration, J. Comput. Appl. Math., 159, 1, 119-128 (2003) · Zbl 1037.65040
[58] Eiermann, M.; Ernst, O. G.; Güttel, S., Deflated restarting for matrix functions, SIAM J. Matrix Anal. Appl., 32, 2, 621-641 (2011) · Zbl 1264.65070
[59] Frommer, A.; Güttel, S.; Schweitzer, M., Efficient and stable Arnoldi restarts for matrix functions based on quadrature, SIAM J. Matrix Anal. Appl., 35, 2, 661-683 (2014) · Zbl 1309.65050
[60] Frommer, A.; Güttel, S.; Schweitzer, M., Convergence of restarted Krylov subspace methods for Stieltjes functions of matrices, SIAM J. Matrix Anal. Appl., 35, 4, 1602-1624 (2014) · Zbl 1316.65040
[61] Frommer, A.; Lund, K.; Szyld, D. B., Block Krylov subspace methods for functions of matrices II: modified block FOM, SIAM J. Matrix Anal. Appl., 41, 2, 804-837 (2020) · Zbl 1441.65051
[62] Güttel, S., Rational Krylov approximation of matrix functions: numerical methods and optimal pole selection, GAMM-Mitt., 36, 1, 8-31 (2013) · Zbl 1292.65043
[63] Parks, M. L.; De Sturler, E.; Mackey, G.; Johnson, D. D.; Maiti, S., Recycling Krylov subspaces for sequences of linear systems, SIAM J. Sci. Comput., 28, 5, 1651-1674 (2006) · Zbl 1123.65022
[64] Parks, M. L.; Soodhalter, K. M.; Szyld, D. B., A block recycled GMRES method with investigations into aspects of solver performance (2016)
[65] Gutknecht, M. H.; Schmelzer, T., Updating the QR decomposition of block tridiagonal and block Hessenberg matrices, Appl. Numer. Math., 58, 6, 871-883 (2008) · Zbl 1143.65032
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.