×

Task-based parallel computation of the density matrix in quantum-based molecular dynamics using graph partitioning. (English) Zbl 1383.82005

The paper under review presents a way to improve the performance of electronic structure calculations that rely on accurate density matrix calculations. In this paper, the authors are interested in studying how runtime systems can be used for automatic load-balancing in G-SP2 (graph-based second-order spectral projection) calculations to fully exploit computing resources. Some previous methods are extended which were designed to minimize the total computational cost of density matrix calculations via G-SP2 but without any considerations about the details of the parallel framework. The effectiveness of different ways to generate subproblems is also investigated. A simple locality-based scheme can be advantageous in certain cases, but packages like the standard graph partitioning software package METIS are more general and take distance into account indirectly. In the final part of this paper, some numerical approach illustrate the results.

MSC:

82-08 Computational methods (statistical mechanics) (MSC2010)
82-04 Software, source code, etc. for problems pertaining to statistical mechanics
81V55 Molecular physics
82C80 Numerical methods of time-dependent statistical mechanics (MSC2010)
82B10 Quantum equilibrium statistical mechanics (general)
15-04 Software, source code, etc. for problems pertaining to linear algebra
92E10 Molecular structure (graph-theoretic methods, methods of differential topology, etc.)
65Y05 Parallel numerical computation
82D60 Statistical mechanics of polymers

References:

[1] A. S. Banerjee, L. Lin, W. Hu, C. Yang, and J. E. Pask, {\it Chebyshev polynomial filtered subspace iteration in the discontinuous Galerkin method for large-scale electronic structure calculations}, J. Chem. Phys., 145 (2016), 154101.
[2] N. Bock and M. Challacombe, {\it An optimized sparse approximate matrix multiply for matrices with decay}, SIAM J. Sci. Comput., 35 (2013), pp. C72-C98. · Zbl 1264.65062
[3] U. Borštnik, J. VandeVondele, V. Weber, and J. Hutter, {\it Sparse matrix multiplication: The distributed block-compressed sparse row library}, Parallel Comput., 40 (2014), pp. 47-58.
[4] D. Bowler and T. Miyazaki, {\it Methods in electronic structure calculations}, Rep. Progr. Phys., 75 (2012), 036503.
[5] Z. Budimlić, M. Burke, V. Cavé, K. Knobe, G. Lowney, R. Newton, J. Palsberg, D. Peixotto, V. Sarkar, F. Schlimbach, and S. Taş\irlar, {\it Concurrent collections}, Sci. Program., 18 (2010), pp. 203-217.
[6] M. G. Burke, K. Knobe, R. Newton, and V. Sarkar, {\it Concurrent Collections Programming Model}, in Encyclopedia of Parallel Computing, Springer, New York, 2011, pp. 364-371.
[7] M. Cawkwell and A. Niklasson, {\it Energy conserving, linear scaling Born-Oppenheimer molecular dynamics}, J. Chem. Phys., 137 (2012), 134105.
[8] M. Cawkwell, E. Sanville, S. Mniszewski, and A. Niklasson, {\it Computing the density matrix in electronic structure theory on graphics processing units}, J. Chem. Theory Comput., 8 (2012), pp. 4094-4101.
[9] M. J. Cawkwell, M. A. Wood, A. Niklasson, and S. M. Mniszewski, {\it Computation of the density matrix in electronic structure theory in parallel on multiple graphics processing units}, J. Chem. Theory Comput., 10 (2014), pp. 5391-5396.
[10] H. N. Djidjev, G. Hahn, S. Mniszewski, A. Niklasson, and V. B. Sardeshmukh, {\it Graph partitioning methods for fast parallel quantum molecular dynamics}, in Proceedings of the SIAM Workshop on Combinatorial Scientific Computing, 2016.
[11] J.-L. Fattebert, D. Osei-Kuffuor, E. W. Draeger, T. Ogitsu, and W. D. Krauss, {\it Modeling dilute solutions using first-principles molecular dynamics: Computing more than a million atoms with over a million cores}, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE, 2016, pp. 12-22.
[12] D. Frenkel and B. Smit, {\it Understanding Molecular Simulation: From Algorithms to Applications}, Vol. 1, Academic Press, New York, 2001. · Zbl 0889.65132
[13] L. Genovese, A. Neelov, S. Goedecker, T. Deutsch, S. A. Ghasemi, A. Willand, D. Caliste, O. Zilberberg, M. Rayson, A. Bergman, et al., {\it Daubechies wavelets as a basis set for density functional pseudopotential calculations}, J. Chem. Phys., 129 (2008), 014109.
[14] S. Gerschgorin, {\it Über die Abgrenzung der Eigenwerte einer Matrix}, Izv. Akad. Nauk. USSR Otd. Fiz.-Mat. Nauk., 6 (1931), pp. 749-754. · Zbl 0003.00102
[15] S. Goedecker, {\it Decay properties of the finite-temperature density matrix in metals}, Phys. Rev. B, 58 (1998), p. 3501. · Zbl 0920.35002
[16] S. Goedecker, {\it Linear scaling electronic structure methods}, Rev. Modern Phys., 71 (1999), pp. 1085-1123.
[17] J. L. Gustafson, {\it Fixed time, tiered memory, and superlinear speedup}, in Proceedings of the 5th Distributed Memory Computing Conference, 1990, pp. 1255-1260.
[18] P. D. Haynes, A. Mostof, C. Skylaris, and M. C. Payne, {\it ONETEP: Linear-scaling density-functional theory with plane-waves}, in J. Phys. Conf. Ser., 26 (2006), pp. 143-148.
[19] B. Hess, C. Kutzner, D. Van Der Spoel, and E. Lindahl, {\it GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation}, J. Chem. Theory Comput., 4 (2008), pp. 435-447.
[20] S. Ismail-Beigi and T. Arias, {\it Locality of the density matrix in metals, semiconductors, and insulators}, Phys. Rev. Lett., 82 (1999), pp. 2127-2130.
[21] L. V. Kale, K. Pattabiraman, and C. W. Lee, {\it Charm++ tutorial}, .
[22] G. Karypis and V. Kumar, {\it A fast and high quality multilevel scheme for partitioning irregular graphs}, SIAM J. Sci. Comput., 20 (1999), pp. 359-392. · Zbl 0915.68129
[23] G. Karypis and V. Kumar, {\it Multilevel k-way hypergraph partitioning}, VLSI Des., 11 (2000), pp. 285-300.
[24] M. Lange, G. Gorman, M. Weiland, L. Mitchell, and J. Southern, {\it Achieving efficient strong scaling with PETSc using hybrid MPI/OpenMP optimisation}, in International Supercomputing Conference, Lecture Notes in Comput. Sci. 7905, Springer, New York, 2013, pp. 97-108.
[25] S. Mniszewski, M. Cawkwell, M. Wall, J. Mohd-Yusof, N. Bock, T. Germann, and A. Niklasson, {\it Efficient parallel linear scaling construction of the density matrix for Born Oppenheimer molecular dynamics}, J. Chem. Theory Comput., 11 (2015), pp. 4644-4654.
[26] A. Niklasson, S. M. Mniszewski, C. F. Negre, M. J. Cawkwell, P. J. Swart, J. Mohd-Yusof, T. C. Germann, M. E. Wall, N. Bock, and H. Djidjev, {\it Graph-based linear scaling electronic structure theory}, J. Chem. Phys., 144 (2016), 234101.
[27] A. Niklasson, C. Tymczak, and M. Challacombe, {\it Trace resetting density matrix purification in o(n) self-consistent-field theory}, J. Chem. Phys., 118 (2003), pp. 8611-8620.
[28] A. M. Niklasson and M. Challacombe, {\it Density matrix perturbation theory}, Phys. Rev. Lett., 92 (2004), 193001.
[29] L. Pilla, T. Bozzetti, M. Castro, P. Navaux, and J.-F. Méhaut, {\it ComprehensiveBench: A benchmark for the extensive evaluation of global scheduling algorithms}, J. Phys. Conf. Ser., 649 (2015), pp. 1-12, .
[30] E. H. Rubensson and E. Rudberg, {\it Chunks and Tasks: A programming model for parallelization of dynamic algorithms}, Parallel Comput., 40 (2014), pp. 328-343.
[31] P. Sanders and C. Schulz, {\it Think locally, act globally: Highly balanced graph partitioning}, in Proceedings of the 12th International Symposium on Experimental Algorithms (SEA’13), Lecture Notes in Comput. Sci. 7933, Springer, New York, 2013, pp. 164-175.
[32] E. Sanville, N. Bock, J. Coe, E. Martinez, S. Mniszewski, C. Negre, A. Niklasson, and M. Cawkwell, {\it Los Alamos Transferable Tight-Binding for Energetics (LATTE), Los Alamos National Laboratory}, LA-UR-10-004, 2010.
[33] A. M. Sena, T. Miyazaki, and D. R. Bowler, {\it Linear scaling constrained density functional theory in conquest}, J. Chem. Theory Comput., 7 (2011), pp. 884-889.
[34] C. Xavier and S. S. Iyengar, {\it Introduction to Parallel Algorithms}, Vol. 1, John Wiley & Sons, New York, 1998. · Zbl 0948.68220
[35] Y. Zhou, Y. Saad, M. L. Tiago, and J. R. Chelikowsky, {\it Self-consistent-field calculations using Chebyshev-filtered subspace iteration}, J. Comput. Phys., 219 (2006), pp. 172-184. · Zbl 1105.65111
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.