
High-performance generation of the Hamiltonian and overlap matrices in FLAPW methods. (English) Zbl 1376.65067

Summary: One of the greatest efforts of computational scientists is to translate the mathematical model describing a class of physical phenomena into large and complex codes. Many of these codes face the difficulty of implementing the mathematical operations in the model in terms of low-level optimized kernels offering both performance and portability. Legacy codes suffer from the additional curse of rigid design choices based on outdated performance metrics (e.g., minimization of memory footprint). Using a representative code from the materials science community, we propose a methodology to restructure the most expensive operations in terms of an optimized combination of dense linear algebra (BLAS3) kernels. The resulting algorithm guarantees increased performance and an extended life span of this code, enabling larger-scale simulations.

MSC:

65F30 Other matrix algorithms (MSC2010)
81V70 Many-body theory; quantum Hall effect
65Y20 Complexity and performance of numerical algorithms
65Z05 Applications to the sciences
81-08 Computational methods for problems pertaining to quantum theory

References:

[2] Peise, E.; Fabregat-Traver, D.; Bientinesi, P., On the performance prediction of BLAS-based tensor contractions, (High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. Lecture Notes in Computer Science, vol. 8966 (2015), Springer), 193-212
[3] Bischof, C.; Van Loan, C., The WY representation for products of Householder matrices, SIAM J. Sci. Stat. Comput., 8, 1, s2-s13 (1987) · Zbl 0628.65033
[4] Joffrain, T.; Low, T. M.; Quintana-Ortí, E. S.; van de Geijn, R.; Van Zee, F. G., Accumulating Householder transformations, revisited, ACM Trans. Math. Software, 32, 2, 169-179 (2006), URL http://doi.acm.org/10.1145/1141885.1141886 · Zbl 1365.65106
[5] Canning, A.; Mannstadt, W.; Freeman, A. J., Parallelization of the FLAPW method, Comput. Phys. Comm., 130, 3, 233-243 (2000) · Zbl 0956.82505
[6] Petersen, M.; Wagner, F.; Hufnagel, L.; Scheffler, M.; Blaha, P.; Schwarz, K., Improving the efficiency of FP-LAPW calculations, Comput. Phys. Comm., 126, 3, 294-309 (2000) · Zbl 1040.82500
[7] Nogueira, F.; Marques, M. A.L.; Fiolhais, C., A primer in density functional theory, (Lecture Notes in Physics (2003), Springer: Springer Berlin) · Zbl 1030.00046
[8] Sholl, D.; Steckel, J. A., (Density Functional Theory, A Practical Introduction (2011), John Wiley & Sons)
[9] Wimmer, E.; Krakauer, H.; Weinert, M.; Freeman, A. J., Full-potential self-consistent linearized-augmented-plane-wave method for calculating the electronic-structure of molecules and surfaces - O2 molecule, Phys. Rev. B, 24, 2, 864-875 (1981)
[10] Jansen, H. J.F.; Freeman, A. J., Total-energy full-potential linearized augmented-plane-wave method for bulk solids - electronic and structural-properties of tungsten, Phys. Rev. B, 30, 2, 561-569 (1984)
[11] Burke, K., Perspective on density functional theory, J. Chem. Phys., 136, 15, 150901 (2012)
[12] Hohenberg, P.; Kohn, W., Inhomogeneous electron gas, Phys. Rev., 136, 3B, B864-B871 (1964)
[13] Kohn, W.; Sham, L. J., Self-consistent equations including exchange and correlation effects, Phys. Rev., 140, A1133-A1138 (1965)
[14] Weinert, M., Solution of Poisson’s equation: Beyond Ewald-type methods, J. Math. Phys., 22, 11 (1981)
[15] Ashcroft, N.; Mermin, N., Solid State Physics, HRW international editions (1976), Holt, Rinehart and Winston · Zbl 1118.82001
[16] Kurz, P., Non-collinear magnetism at surfaces and in ultrathin films (2000), RWTH Aachen, URL http://www.fz-juelich.de/pgi/pgi-1/DE/Leistungen/MasterDiplomDr/_node.html
[17] Singh, D. J.; Nordström, L., Planewaves, Pseudopotentials, and the LAPW Method (2006), Springer: Springer US
[19] Di Napoli, E.; Berljafa, M., Block iterative eigensolvers for sequences of correlated eigenvalue problems, Comput. Phys. Comm., 184, 11, 2478-2488 (2013) · Zbl 1349.81016
[20] Berljafa, M.; Wortmann, D.; Di Napoli, E., An optimized and scalable eigensolver for sequences of eigenvalue problems, Concurr. Comput.: Pract. Exp., 27, 4, 905-922 (2014)
[21] Blaha, P.; Hofstätter, H.; Koch, O.; Laskowski, R.; Schwarz, K., Iterative diagonalization in augmented plane wave based methods in electronic structure calculations, J. Comput. Phys., 229, 2, 453-460 (2010) · Zbl 1183.82003
[22] Knuth, D. E., Structured programming with go to statements, ACM Comput. Surv., 6, 4, 261-301 (1974) · Zbl 0301.68014
[23] Poulson, J.; Marker, B.; van de Geijn, R. A.; Hammond, J. R.; Romero, N. A., Elemental: A new framework for distributed memory dense matrix computations, ACM Trans. Math. Software, 39, 2, 13:1-13:24 (2013), URL http://doi.acm.org/10.1145/2427023.2427030 · Zbl 1295.65137