
Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines. (English) Zbl 1425.65186

Summary: This paper derives theoretical estimates of the computational cost of an isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that, for \(C^{p - 1}\) global continuity of the isogeometric solution, both the computational cost and the communication cost of the direct solver are of order \(\mathcal{O}(\log(N) p^2)\) for the one-dimensional (1D) case, \(\mathcal{O}(N p^2)\) for the two-dimensional (2D) case, and \(\mathcal{O}(N^{4 / 3} p^2)\) for the three-dimensional (3D) case, where \(N\) is the number of degrees of freedom and \(p\) is the polynomial order of the B-spline basis functions. The theoretical estimates are verified by numerical experiments performed with three parallel multi-frontal direct solvers: MUMPS, PaStiX, and SuperLU, available through the PetIGA toolkit built on top of PETSc. The numerical results confirm the theoretical estimates in terms of both \(p\) and \(N\). For a given problem size, the strong-scaling efficiency decreases rapidly as the number of processors increases, dropping to about 20% on 256 processors for a 3D example with \(128^3\) unknowns and linear B-splines with \(C^0\) global continuity, and to 15% for a 3D example with \(64^3\) unknowns and quartic B-splines with \(C^3\) global continuity. At the same time, one cannot arbitrarily increase the problem size, since the memory required by higher-order continuity spaces is large and quickly consumes all available memory resources, even in the parallel distributed memory version. The numerical results also suggest that distributed parallel machines are highly beneficial when solving higher-order continuity spaces, although the number of processors that can be employed efficiently is somewhat limited.
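To make the scaling concrete, here is a minimal Python sketch (not part of the paper; the function direct_solver_cost and its arguments are illustrative assumptions) that evaluates the leading-order estimates above, with all constant factors omitted:

import math

def direct_solver_cost(N, p, dim):
    # Leading-order cost of the parallel multi-frontal direct solver for
    # C^{p-1} continuous isogeometric discretizations; constant factors
    # are omitted, so only ratios between two calls are meaningful.
    if dim == 1:
        return math.log(N) * p**2
    if dim == 2:
        return N * p**2
    if dim == 3:
        return N**(4.0 / 3.0) * p**2
    raise ValueError("dim must be 1, 2 or 3")

# One uniform refinement of a 3D grid (N -> 8N) multiplies the predicted
# cost by 8^{4/3} = 16, independently of the polynomial order p.
N, p = 64**3, 4
print(direct_solver_cost(8 * N, p, 3) / direct_solver_cost(N, p, 3))  # -> 16.0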

MSC:

65N30 Finite element, Rayleigh-Ritz and Galerkin methods for boundary value problems involving PDEs
65D17 Computer-aided design (modeling of curves and surfaces)
65Y05 Parallel numerical computation
74S05 Finite element methods applied to problems in solid mechanics

References:

[1] Cottrell, J. A.; Hughes, T. J. R.; Bazilevs, Y., Isogeometric analysis: toward integration of CAD and FEA, (2009), John Wiley and Sons · Zbl 1378.65009
[2] El Maliki, A.; Fortin, M.; Tardieu, N.; Fortin, A., Iterative solvers for 3D linear and nonlinear elasticity problems: displacement and mixed formulations, Internat. J. Numer. Methods Engrg., 83, 1780-1802, (2010) · Zbl 1202.74005
[3] Hiptmair, R., Multigrid method for Maxwell’s equations, SIAM J. Numer. Anal., 36, 1, 204-225, (1998) · Zbl 0922.65081
[4] Arnold, D. N.; Falk, R. S.; Winther, R., Multigrid in \(H(\operatorname{div})\) and \(H(\operatorname{curl})\), Numer. Math., 85, 2, 197-217, (2000) · Zbl 0974.65113
[5] Bazilevs, Y.; Michler, C.; Calo, V. M.; Hughes, T. J. R., Isogeometric variational multiscale modeling of wall-bounded turbulent flows with weakly enforced boundary conditions on unstretched meshes, Comput. Methods Appl. Mech. Engrg., 199, 13-16, 780-790, (2010) · Zbl 1406.76023
[6] V.M. Calo, H. Gómez, Y. Bazilevs, G.P. Johnson, T.J.R. Hughes, Simulation of engineering applications using isogeometric analysis, in: Proceedings of Tera Grid, 2008.
[7] Collier, N.; Dalcin, L.; Pardo, D.; Calo, V. M., The cost of continuity: performance of iterative solvers on isogeometric finite elements, SIAM J. Sci. Comput., 35, 2, A767-A784, (2013) · Zbl 1266.65221
[8] Buffa, A.; Harbrecht, H.; Kunoth, A.; Sangalli, G., BPX-preconditioning for isogeometric analysis, Comput. Methods Appl. Mech. Engrg., 265, 63-70, (2013) · Zbl 1286.65151
[9] Gao, L.; Calo, V. M., Preconditioners based on the alternating-direction-implicit algorithm for the 2D steady-state diffusion equation with orthotropic heterogeneous coefficients, J. Comput. Appl. Math., 273, 274-295, (2015) · Zbl 1295.65111
[10] Paszyńska, A.; Paszyński, M.; Jopek, K.; Woźniak, M.; Goik, D.; Gurgul, P.; AbouEisha, H.; Moshkov, M.; Calo, V. M.; Lenharth, A.; Nguyen, D.; Pingali, K., Quasi-optimal elimination trees for 2D grids with singularities, Sci. Program., (2014), in press
[11] Goik, D.; Jopek, K.; Paszyński, M.; Lenharth, A.; Nguyen, D.; Pingali, K., Graph grammar based multi-thread multi-frontal direct solver with Galois scheduler, Procedia Comput. Sci., 29, 960-969, (2014)
[12] AbouEisha, H.; Moshkov, M.; Calo, V.; Paszynski, M.; Goik, D.; Jopek, K., Dynamic programming algorithm for generation of optimal elimination trees for multi-frontal direct solver over H-refined grids, Procedia Comput. Sci., 29, 947-959, (2014)
[13] Golub, G. H.; Van Loan, C. F., Matrix computations, (1996), Johns Hopkins University Press · Zbl 0865.65009
[14] Irons, B., A frontal solution program for finite-element analysis, Internat. J. Numer. Methods Engrg., 2, 5-32, (1970) · Zbl 0252.73050
[15] Duff, I. S.; Reid, J. K., The multifrontal solution of indefinite sparse symmetric linear equations, ACM Trans. Math. Software, 9, 3, 302-325, (1983) · Zbl 0515.65022
[16] Geng, P.; Oden, J. T.; van de Geijn, R. A., A parallel multifrontal algorithm and its implementation, Comput. Methods Appl. Mech. Engrg., 149, 289-301, (1997) · Zbl 0925.65077
[17] Duff, I. S.; Reid, J. K., The multifrontal solution of unsymmetric sets of linear equations, SIAM J. Sci. Stat. Comput., 5, 633-641, (1984) · Zbl 0557.65017
[18] Amestoy, P. R.; Guermouche, A.; L’Excellent, J.-Y.; Pralet, S., Hybrid scheduling for the parallel solution of linear systems, Parallel Comput., 32, 2, 136-156, (2006)
[19] Lin, L.; Yang, C.; Lu, J.; Ying, L.; E, W., A fast parallel algorithm for selected inversion of structured sparse matrices with application to 2D electronic structure calculations, SIAM J. Sci. Comput., 33, 3, 1329-1351, (2011) · Zbl 1230.65039
[20] Bientinesi, P.; Eijkhout, V.; Kim, K.; Kurtz, J.; van de Geijn, R., Sparse direct factorizations through unassembled hyper-matrices, Comput. Methods Appl. Mech. Engrg., 199, 430-438, (2010) · Zbl 1227.65111
[21] Woźniak, M.; Kuźnik, K.; Paszyński, M.; Calo, V. M.; Pardo, D., Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers, Comput. Math. Appl., 67, 10, 1864-1883, (2014) · Zbl 1367.65257
[22] Paszyński, M.; Pardo, D.; Paszyńska, A., Parallel multi-frontal solver for \(p\) adaptive finite element modeling of multi-physics computational problems, J. Comput. Sci., 1, 48-54, (2010)
[23] Paszyński, M.; Pardo, D.; Torres-Verdin, C.; Demkowicz, L.; Calo, V., A parallel direct solver for the self-adaptive \(hp\) finite element method, J. Parallel Distrib. Comput., 70, 270-281, (2010) · Zbl 1233.68079
[24] Paszyński, M.; Schaefer, R., Graph grammar driven partial differential equations solver, Concurr. Comput.: Pract. Exp., 22, 9, 1063-1097, (2010)
[25] Collier, N. O.; Pardo, D.; Paszyński, M.; Dalcín, L.; Calo, V. M., The cost of continuity: a study of the performance of isogeometric finite elements using direct solvers, Comput. Methods Appl. Mech. Engrg., 213-216, 353-361, (2012) · Zbl 1243.65137
[26] STAMPEDE Linux cluster user guide, Texas Advanced Computing Center, https://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide, 2014.
[27] N. Collier, L. Dalcin, V.M. Calo, PetIGA: high-performance isogeometric analysis, http://arxiv.org/abs/1305.4452, 2013.
[28] S. Balay, S. Abhyankar, M.F. Adams, J. Brown, P. Brune, K. Buschelman, V. Eijkhout, W.D. Gropp, D. Kaushik, M.G. Knepley, L. Curfman McInnes, K. Rupp, B.F. Smith, H. Zhang, PETSc Web Page, http://www.mcs.anl.gov/petsc, 2014.
[29] S. Balay, S. Abhyankar, M.F. Adams, J. Brown, P. Brune, K. Buschelman, V. Eijkhout, W.D. Gropp, D. Kaushik, M.G. Knepley, L. Curfman McInnes, K. Rupp, B.F. Smith, H. Zhang, PETSc User Manual, Argonne National Laboratory ANL-95/11—Revision 3.4, 2013.
[30] Balay, S.; Gropp, W. D.; Curfman McInnes, L.; Smith, B. F., Efficient management of parallelism in object oriented numerical software libraries, (Arge, E.; Bruaset, A. M.; Langtangen, H. P., Modern Software Tools in Scientific Computing, (1997), Birkhäuser Press) · Zbl 0882.65154
[31] Amestoy, P. R.; Duff, I. S., Multifrontal parallel distributed symmetric and unsymmetric solvers, Comput. Methods Appl. Mech. Engrg., 184, 501-520, (2000) · Zbl 0956.65017
[32] Amestoy, P. R.; Duff, I. S.; Koster, J.; L’Excellent, J.-Y., A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM J. Matrix Anal. Appl., 23, 1, 15-41, (2001) · Zbl 0992.65018
[33] Blackford, L. S.; Choi, J.; Cleary, A.; D’Azevedo, E.; Demmel, J.; Dhillon, I.; Dongarra, J.; Hammarling, S.; Henry, G.; Petitet, A.; Stanley, K.; Walker, D.; Whaley, R. C., ScaLAPACK users’ guide, (1999), Society for Industrial and Applied Mathematics Philadelphia, PA · Zbl 0886.65022
[34] Li, Xiaoye S., An overview of SuperLU: algorithms, implementation, and user interface, ACM Trans. Math. Software, 31, 3, 302-325, (2005) · Zbl 1136.65312
[35] X.S. Li, J.W. Demmel, J.R. Gilbert, L. Grigori, M. Shao, I. Yamazaki, SuperLU Users’ Guide, Lawrence Berkeley National Laboratory, LBNL-44289, http://crd.lbl.gov/~xiaoye/SuperLU/, 1999.
[36] Hénon, P.; Ramet, P.; Roman, J., PaStiX: a high-performance parallel direct solver for sparse symmetric positive definite systems, Parallel Comput., 28, 2, 301-321, (2002) · Zbl 0984.68208
[37] Paszyński, M., Minimizing the memory usage by out-of-core multi-frontal direct solver, Comput. Assist. Mech. Eng. Sci., 20, 15-41, (2013)