Leveraging HPC accelerator architectures with modern techniques – hydrologic modeling on GPUs with ParFlow. (English) Zbl 1473.65363

Comput. Geosci. 25, No. 5, 1579-1590 (2021); correction ibid. 25, No. 5, 1591 (2021).
Summary: Rapidly changing heterogeneous supercomputer architectures pose a great challenge to many scientific communities trying to leverage the latest technology in high-performance computing. Many existing projects with a long development history have accumulated a large amount of code that is not directly compatible with the latest accelerator architectures. Furthermore, due to the limited resources of scientific institutions, developing and maintaining architecture-specific ports is generally unsustainable. In order to adapt to modern accelerator architectures, many projects rely on directive-based programming models or build the codebase tightly around a third-party domain-specific language or library. This introduces external dependencies outside the project's control. The presented paper tackles the issue by proposing a lightweight application-side adaptor layer for compute kernels and memory management, resulting in a versatile and inexpensive adaptation to new accelerator architectures with few drawbacks. A widely used hydrologic model demonstrates that such an approach, pursued more than 20 years ago, is still paying off with modern accelerator architectures, as shown by a very significant performance gain from NVIDIA A100 GPUs, high developer productivity, and a minimally invasive implementation, all while the codebase remains maintainable in the long term.
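
Illustration (not part of the original record): the adaptor-layer idea in the summary can be sketched as a thin interface that owns memory allocation and loop execution, so that application kernels are written once and the backend is swapped in a single place. The sketch below is in Python for brevity and is not ParFlow's actual implementation; the names `SerialBackend`, `ThreadedBackend`, `box_loop`, and `scale_field` are hypothetical, and a thread pool merely stands in for an accelerator backend.

```python
from concurrent.futures import ThreadPoolExecutor


class SerialBackend:
    """Reference backend: plain host memory and a sequential triple loop."""

    def alloc(self, n):
        # Memory management goes through the adaptor, not the application.
        return [0.0] * n

    def box_loop(self, nx, ny, nz, body):
        # Visit every cell (i, j, k) of an nx * ny * nz box in order.
        for k in range(nz):
            for j in range(ny):
                for i in range(nx):
                    body(i, j, k)


class ThreadedBackend(SerialBackend):
    """Stand-in for an accelerator backend (assumption: same interface,
    different execution strategy). Allocation is inherited unchanged."""

    def __init__(self, workers=4):
        self.workers = workers

    def box_loop(self, nx, ny, nz, body):
        def plane(k):
            for j in range(ny):
                for i in range(nx):
                    body(i, j, k)

        # Each k-plane is processed as an independent task.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            list(pool.map(plane, range(nz)))


def scale_field(backend, nx, ny, nz, factor):
    """Application kernel written once against the adaptor interface."""
    field = backend.alloc(nx * ny * nz)

    def body(i, j, k):
        idx = i + nx * (j + ny * k)  # flatten (i, j, k) to a linear index
        field[idx] = factor * idx    # each cell is written exactly once

    backend.box_loop(nx, ny, nz, body)
    return field
```

Calling `scale_field(SerialBackend(), 3, 2, 2, 2.0)` and `scale_field(ThreadedBackend(), 3, 2, 2, 2.0)` runs the same kernel through either backend; the kernel code itself does not change when the backend does, which is the property the summary attributes to the adaptor layer.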

MSC:

65Y10 Numerical algorithms for specific classes of architectures
86-04 Software, source code, etc. for problems pertaining to geophysics
