×

Parallel tridiagonal matrix inversion with a hybrid multigrid-Thomas algorithm method. (English) Zbl 1491.65088

Summary: Tridiagonal matrix inversion is an important operation with many applications. It arises frequently in solving discretized one-dimensional elliptic partial differential equations, and forms the basis for many algorithms for block tridiagonal matrix inversion for discretized PDEs in higher-dimensions. In such systems, this operation is often the scaling bottleneck in parallel computation. In this paper, we derive a hybrid multigrid-Thomas algorithm designed to efficiently invert tridiagonal matrix equations in a highly-scalable fashion in the context of time evolving partial differential equation systems. We decompose the domain between processors, using multigrid to solve on a grid consisting of the boundary points of each processor’s local domain. We then reconstruct the solution on each processor using a direct solve with the Thomas algorithm. This algorithm has the same theoretical optimal scaling as cyclic reduction and recursive doubling. We use our algorithm to solve Poisson’s equation as part of the spatial discretization of a time-evolving PDE system. Our algorithm is faster than cyclic reduction per inversion and retains good scaling efficiency to twice as many cores.

MSC:

65M55 Multigrid methods; domain decomposition for initial value and initial-boundary value problems involving PDEs
65Y05 Parallel numerical computation
65F05 Direct numerical methods for linear systems and matrix inversion
35J05 Laplace operator, Helmholtz equation (reduced wave equation), Poisson equation
65M06 Finite difference methods for initial value and initial-boundary value problems involving PDEs

References:

[1] Wendt, J., (Computational Fluid Dynamics: An Introduction. Computational Fluid Dynamics: An Introduction, A von Karman Institute book (2008), Springer Berlin Heidelberg), URL https://books.google.co.uk/books?id=IIUkqI-HNbQC
[2] Jardin, S., (Computational Methods in Plasma Physics. Computational Methods in Plasma Physics, Chapman & Hall/CRC Computational Science (2010), CRC Press), URL https://books.google.co.uk/books?id=gZzf_B56FDcC · Zbl 1198.76002
[3] Hockney, R. W., A fast direct solution of Poisson’s equation using Fourier analysis, J. ACM, 12, 1, 95-113 (1965) · Zbl 0139.10902
[4] Bertaccini, D.; Durastante, F., (Iterative Methods and Preconditioning for Large and Sparse Linear Systems with Applications. Iterative Methods and Preconditioning for Large and Sparse Linear Systems with Applications, Chapman & Hall/CRC Monographs and Research Notes in Mathematics (2018), CRC Press), URL https://books.google.co.uk/books?id=YmpQDwAAQBAJ · Zbl 1386.65001
[5] Knott, G. D., (Interpolating Cubic Splines. Interpolating Cubic Splines, Progress in Computer Science and Applied Logic (2012), Birkhäuser Boston), URL https://books.google.co.uk/books?id=7SPUBwAAQBAJ · Zbl 1057.41001
[6] Ferguson, R., Practical Algorithms for 3D Computer Graphics, Second Edition (2013), Taylor & Francis, URL https://books.google.co.uk/books?id=NKONAgAAQBAJ
[7] Løiten, M., Technical University of Denmark (2017), Technical University of Denmark, https://github.com/CELMA-project/dissertation/releases/download/v1.1.x/17_PhD_Loeiten.pdf
[8] Gallopoulos, E.; Philippe, B.; Sameh, A., (Parallelism in Matrix Computations. Parallelism in Matrix Computations, Scientific Computation (2015), Springer Netherlands), URL https://books.google.co.uk/books?id=q9xECgAAQBAJ
[9] Sun, X.-H., Application and accuracy of the parallel diagonal dominant algorithm, Parallel Comput., 21, 8, 1241-1267 (1995) · Zbl 0873.65016
[10] Sun, X.-H.; Moitra, S., A Fast Parallel Tridiagonal Algorithm for a Class of CFD Applications, Vol. 3585 (1996), Citeseer
[11] Stone, H. S., An efficient parallel algorithm for the solution of a tridiagonal linear system of equations, J. ACM, 20, 1, 27-38 (1973) · Zbl 0269.65018
[12] Stone, H. S., Parallel tridiagonal equation solvers, ACM Trans. Math. Softw., 1, 4, 289-307 (1975) · Zbl 0315.65018
[13] Polizzi, E.; Sameh, A. H., A parallel hybrid banded system solver: the SPIKE algorithm, Parallel Comput., 32, 2, 177-194 (2006)
[14] Spring, B. S.; Polizzi, E.; Sameh, A. H., A feature complete SPIKE banded algorithm and solver (2018), ArXiv Preprint arXiv:1811.03559
[15] Dudson, B. D.; Hill, P. A.; Dickinson, D.; Parker, J. T.; Allen, A.; Breyiannia, G.; Brown, J.; Easy, L.; Farley, S.; Friedman, B.; Grinaker, E.; Izacard, O.; Joseph, I.; Kim, M.; Leconte, M.; Leddy, J.; Liten, M.; Ma, C.; Madsen, J.; Meyerson, D.; Naylor, P.; Myers, S.; Omotani, J.; Rhee, T.; Sauppe, J.; Savage, K.; Seto, H.; Schwörer, D.; Shanahan, B.; Thomas, M.; Tiwari, S.; Umansky, M.; Walkden, N.; Wang, L.; Wang, Z.; Xi, P.; Xia, T.; Xu, X.; Zhang, H.; Bokshi, A.; Muhammed, H.; Estarellas, M.; Riva, F., BOUT++ v4.3.1 (2020), Zenodo
[16] Fedorenko, R. P., The speed of convergence of one iterative process, USSR Comput. Math. Math. Phys., 4, 3, 227-235 (1964) · Zbl 0148.39501
[17] Bakhvalov, N. S., On the convergence of a relaxation method with natural constraints on the elliptic operator, USSR Comput. Math. Math. Phys., 6, 5, 101-135 (1966) · Zbl 0154.41002
[18] Hackbusch, W., Ein Iteratives Verfahren Zur Schnellen AuflÖSung Elliptischer Randwertprobleme (1976), Math. Inst., Univ.
[19] Brandt, A., Multi-level adaptive technique (MLAT) for fast numerical solution to boundary value problems, (Proceedings of the Third International Conference on Numerical Methods in Fluid Mechanics (1973), Springer), 82-89 · Zbl 0259.76013
[20] Brandt, A., Multi-level adaptive solutions to boundary-value problems, Math. Comp., 31, 138, 333-390 (1977) · Zbl 0373.65054
[21] Wesseling, P., Introduction to Multigrid MethodsNASA Technical Report (1995)
[22] Briggs, W.; Henson, V.; McCormick, S., A Multigrid Tutorial, 2nd Edition (2000), SIAM · Zbl 0958.65128
[23] Hackbusch, W., (Multi-Grid Methods and Applications. Multi-Grid Methods and Applications, Springer Series in Computational Mathematics (2013), Springer Berlin Heidelberg), URL https://books.google.co.uk/books?id=jJ36CAAAQBAJ · Zbl 0595.65106
[24] Trottenberg, U.; Ulrich Trottenberg, C.; Oosterlee, C.; Schuller, A.; Brandt, A.; Oswald, P.; Stüben, K., Multigrid (2001), Elsevier Science, URL https://books.google.co.uk/books?id=-og1wD-Nx_wC · Zbl 0976.65106
[25] Byrne, G. D.; Hindmarsh, A. C., PVODE, an ODE solver for parallel computers, Int. J. High Perform. Comput. Appl., 13, 4, 354-365 (1999), URL arXiv:https://doi.org/10.1177/109434209901300405
[26] Kang, J.-H., Parallel tri-diagonal matrix solver using cyclic reduction (CR), parallel CR (PCR), and Thomas+PCR hybrid algorithm (2019), URL https://github.com/jihoonakang/parallel_tdma_cpp
[27] Austin, T. M.; Berndt, M.; Moulton, J. D., A memory efficient parallel tridiagonal solver (2004), Preprint la-VR-03-4149
[28] Parker, J. T.; Hill, P. A.; Dickinson, D.; Dudson, B. D., Files and Plotting Scripts for “Parallel Tridiagonal Matrix Inversion with a Hybrid Multigrid-Thomas Algorithm Method” (2020), Zenodo
[29] Hoefler, T.; Dinan, J.; Buntinas, D.; Balaji, P.; Barrett, B.; Brightwell, R.; Gropp, W.; Kale, V.; Thakur, R., MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory, Computing, 95, 12, 1121-1136 (2013)
[30] Barnes, M.; Dickinson, D.; Dorland, W.; Hill, P. A.; Parker, J. T.; Roach, C. M.; Biggs-Fox, S.; Christen, N.; Numata, R.; Wilkie, G.; Anton, L.; Ball, J.; Baumgaertel, J.; Colyer, G.; Hardman, M.; Hein, J.; Highcock, E.; Howes, G.; Jackson, A.; Kotschenreuther, M. T.; Lee, J.; Leggate, H.; Mandell, N.; Mauriya, A.; Tatsuno, T.; Van Wyk, F., GS2 Gyrokinetics Software (2020), Zenodo
[31] Anton, L.; van Wyk, F.; Highcock, E.; Roach, C.; Parker, J. T., Enhancing scalability of the gyrokinetic code GS2 by using MPI shared memory for FFTs, Proc. Cray User Group (2016), URL https://cug.org/proceedings/cug2016_proceedings/includes/files/pap124s2-file1.pdf
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.