
Mixed precision incomplete and factorized sparse approximate inverse preconditioning on GPUs. (English) Zbl 1514.65032

Sousa, Leonel (ed.) et al., Euro-Par 2021: parallel processing. 27th international conference on parallel and distributed computing, Lisbon, Portugal, September 1–3, 2021. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 12820, 550-564 (2021).
Summary: In this work, we present highly efficient mixed precision GPU implementations of an Incomplete Sparse Approximate Inverse (ISAI) preconditioner for general non-symmetric matrices and a Factorized Sparse Approximate Inverse (FSPAI) preconditioner for symmetric positive definite matrices. While working with full double precision in all arithmetic operations, we demonstrate the benefit of decoupling the memory precision and storing the preconditioner in a more compact low precision floating point format to reduce the memory access volume and therefore the preconditioner application time.
For the entire collection see [Zbl 1483.68013].
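
The idea in the summary of decoupling memory precision from arithmetic precision can be illustrated with a small CPU-only sketch. It is not the authors' GPU code and does not use the Ginkgo API: the function `csr_spmv`, the 3x3 matrix `M`, and all variable names are hypothetical, chosen only for illustration. The sketch assumes the preconditioner is applied as a sparse matrix-vector product in CSR format. The entries are stored once in double precision and once in single precision, while every multiplication and addition is carried out in double precision.

```python
from array import array


def csr_spmv(row_ptr, col_idx, values, x):
    """Return y = M @ x for a CSR matrix M.

    Entries of `values` are converted to Python floats (double precision)
    when read, so the arithmetic is always double precision regardless of
    the storage format of `values`.
    """
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0  # double precision accumulator
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Hypothetical 3x3 sparse approximate inverse M in CSR format.
row_ptr = [0, 2, 4, 5]
col_idx = [0, 1, 1, 2, 2]
entries = [0.5, 0.1, 1.0 / 3.0, -0.25, 0.2]

vals64 = array("d", entries)  # stored in double precision
vals32 = array("f", entries)  # same entries, stored in single precision

x = [1.0, 2.0, 3.0]
y64 = csr_spmv(row_ptr, col_idx, vals64, x)
y32 = csr_spmv(row_ptr, col_idx, vals32, x)

# Bytes read for the matrix values in each variant.
bytes64 = len(vals64) * vals64.itemsize
bytes32 = len(vals32) * vals32.itemsize
max_diff = max(abs(a - b) for a, b in zip(y64, y32))
print(bytes64, bytes32, max_diff)
```

On typical platforms the single precision copy needs half as many bytes for the values, and the two results differ only by the rounding introduced when the entries were stored. Because a preconditioner is only an approximation of the inverse, a perturbation of this size is often acceptable, which is the trade-off the summary describes.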

MSC:

65F08 Preconditioners for iterative methods
65Y05 Parallel numerical computation

Software:

Ginkgo
Full Text: DOI

References:

[1] Suitesparse matrix collection. https://sparse.tamu.edu
[2] Anzt, H., et al.: Ginkgo: a high performance numerical linear algebra library. J. Open Source Softw. 5(52), 2260 (2020). doi:10.21105/joss.02260
[3] Anzt, H., Cojean, T., Grützmacher, T.: Technical report: Design of the accessor. LLNL Report LLNL-SR-818775, January 2021
[4] Anzt, H., Dongarra, J., Flegar, G., Higham, N.J., Quintana-Ortí, E.S.: Adaptive precision in block-Jacobi preconditioning for iterative sparse linear system solvers. Concurrency Comput. Practice Exp. 31(6), e4460 (2019)
[5] Anzt, H., Dongarra, J., Flegar, G., Quintana-Ortí, E.S.: Batched Gauss-Jordan elimination for block-Jacobi preconditioner generation on GPUs. In: Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017, pp. 1-10. Association for Computing Machinery, New York (2017). doi:10.1145/3026937.3026940
[6] Anzt, H.; Flegar, G.; Grützmacher, T.; Quintana-Ortí, ES, Toward a modular precision ecosystem for high-performance computing, Int. J. High Performance Comput. Appl., 33, 6, 1069-1078 (2019) · doi:10.1177/1094342019846547
[7] Anzt, H.; Huckle, TK; Bräckle, J.; Dongarra, J., Incomplete sparse approximate inverses for parallel preconditioning, Parallel Comput., 71, 1-22 (2018) · doi:10.1016/j.parco.2017.10.003
[8] Bell, N., Garland, M.: Implementing sparse matrix-vector multiplication on throughput-oriented processors. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. SC 2009. Association for Computing Machinery, New York (2009). doi:10.1145/1654059.1654078
[9] Flegar, G., Anzt, H., Cojean, T., Quintana-Ortí, E.S.: Customized-precision Block-Jacobi preconditioning for Krylov iterative solvers on data-parallel manycore processors. ACM Trans. Math. Softw. (2020). under review. Available from the authors · Zbl 07467974
[10] Green, O., McColl, R., Bader, D.A.: GPU merge path: A GPU merging algorithm. In: Proceedings of the 26th ACM International Conference on Supercomputing, ICS 2012, pp. 331-340. ACM. doi:10.1145/2304576.2304621
[11] Grote, MJ; Huckle, T., Parallel preconditioning with sparse approximate inverses, SIAM J. Sci. Comput., 18, 3, 838-853 (1997) · Zbl 0872.65031 · doi:10.1137/S1064827594276552
[12] Kolotilina, L.Y., Yeremin, A.Y.: Factorized sparse approximate inverse preconditionings I. Theory. SIAM J. Matrix Anal. Appl. 14(1), 45-58 (1993). doi:10.1137/0614004 · Zbl 0767.65037
[13] NVIDIA Corp.: Whitepaper: NVIDIA TESLA V100 GPU ARCHITECTURE (2017)
[14] Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM (2003) · Zbl 1031.65046
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible without claiming completeness or a perfect matching.