×

Optimal broadcast for fully connected processor-node networks. (English) Zbl 1243.68033

Summary: We develop and implement an optimal broadcast algorithm for fully connected processor networks under a bidirectional communication model in which each processor can simultaneously send a message to one processor and receive a message from another, possibly different processor. For any number of processors \(p\) the algorithm requires \(N - 1+\lceil \log p\rceil\) communication rounds to broadcast \(N\) blocks of data from a root processor to the remaining processors, meeting the lower bound in the model. For data of size \(m\), assuming that sending and receiving data of size \(m^{\prime}\) takes time \(\alpha +\beta m^{\prime}\), the best running time that can be achieved by the division of \(m\) into equal-sized blocks is \(\sqrt{(\lceil \log p\rceil -1)\alpha} + \sqrt{\beta m})^2\). The algorithm uses a regular, circulant graph communication pattern, and degenerates into a binomial tree broadcast when the number of blocks to be broadcast is one. The algorithm is furthermore well suited to fully connected clusters of SMP (symmetric multi-processor) nodes. The algorithm is implemented as part of an MPI (message passing interface) library. We demonstrate significant practical bandwidth improvements of up to a factor 1.5 over several other, commonly used broadcast algorithms on both a small SMP cluster and a 72 node NEC SX vector supercomputer.

MSC:

68M10 Network design and communication in computer systems
68W05 Nonnumerical algorithms
68W40 Analysis of algorithms

Software:

MPI
Full Text: DOI

References:

[1] Alexandrov, A.; Ionescu, M. F.; Schauser, K. E.; Scheiman, C. J.: Loggp: incorporating long messages into the logp model for parallel computation, Journal of parallel and distributed computing 44, No. 1, 71-79 (1997)
[2] Bar-Noy, A.; Bruck, J.; Ho, C. -T.; Kipnis, S.; Schieber, B.: Computing global combine operations in the multiport postal model, IEEE transactions on parallel and distributed systems 6, No. 8, 896-900 (1995)
[3] Bar-Noy, A.; Kipnis, S.: Broadcasting multiple messages in simultaneous send/receive systems, Discrete applied mathematics 55, No. 2, 95-105 (1994) · Zbl 0815.68016 · doi:10.1016/0166-218X(94)90001-9
[4] Bar-Noy, A.; Kipnis, S.: Designing broadcasting algorithms in the postal model for message-passing systems, Mathematical systems theory 27, No. 5, 431-452 (1994) · Zbl 0812.68079 · doi:10.1007/BF01184933
[5] Bar-Noy, A.; Kipnis, S.; Schieber, B.: Optimal multiple message broadcasting in telephone-like communication systems, Discrete applied mathematics 100, No. 1–2, 1-15 (2000) · Zbl 0986.90008 · doi:10.1016/S0166-218X(99)00155-9
[6] Barnett, M.; Gupta, S.; Payne, D. G.; Schuler, L.; Van De Geijn, R.; Watts, J.: Building a high-performance collective communication library, (1994)
[7] Barnett, M.; Payne, D. G.; Van De Geijn, R. A.; Watts, J.: Broadcasting on meshes with wormhole routing, Journal of parallel and distributed computing 35, No. 2, 111-122 (1996)
[8] Bruck, J.; Coster, L. D.; Dewulf, N.; Ho, C. -T.; Lauwereins, R.: On the design and implementation of broadcast and global combine operations using the postal model, IEEE transactions on parallel and distributed systems 7, No. 3, 256-265 (1996)
[9] J. Bruck, R. Cypher, C.-T. Ho, Multiple message broadcasting with generalized fibonacci trees, in: Symposium on Parallel and Distributed Processing (SPDP), 1992, pp. 424–431
[10] Buntinas, D.; Panda, D. K.; Duato, J.; Sadayappan, P.: Broadcast/multicast over myrinet using nic-assisted multidestination messages, (2000)
[11] E. Chan, R.A. van de Geijn, W. Gropp, R. Thakur, Collective communication on architectures that support simultaneous communication over multiple link, in: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2006, pp. 2–11
[12] E.W. Chan, M.F. Heimlich, A. Purkayastha, R.A. van de Geijn, On optimizing collective communication, in: IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004
[13] Culler, D. E.; Karp, R. M.; Patterson, D.; Sahay, A.; Santos, E. E.; Schauser, K. E.; Subramonian, R.; Von Eicken, T.: Logp: A practical model of parallel computation, Communications of the ACM 39, No. 11, 78-85 (1996)
[14] Fraigniaud, P.; Lazard, E.: Methods and problems of communication in usual networks, Discrete applied mathematics 53, No. 1–3, 79-133 (1994) · Zbl 0818.94029 · doi:10.1016/0166-218X(94)90180-5
[15] Gołebiewski, M.; Ritzdorf, H.; Träff, J. L.; Zimmermann, F.: The MPI/SX implementation of MPI for nec’s SX-6 and other NEC platforms, NEC research development 44, No. 1, 69-74 (2003)
[16] Gropp, W.; Lusk, E.: Reproducible measurements of MPI performance characteristics, Lecture notes in computer science 1697, 11-18 (1999)
[17] Hedetniemi, S. M.; Hedetniemi, T.; Liestman, A. L.: A survey of gossiping and broadcasting in communication networks, Networks 18, 319-349 (1988) · Zbl 0649.90047 · doi:10.1002/net.3230180406
[18] T. Hoefler, C. Siebert, W. Rehm, A practically constant-time MPI broadcast algorithm for large-scale InfiniBand clusters with multicast, in: Workshop on Communication Architecture for Clusters (CAC) held in conjunction with International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
[19] Jia, B.: Process cooperation in multiple message broadcast, Lecture notes in computer science 4757, 27-35 (2007)
[20] Johnsson, S. L.; Ho, C. -T.: Optimum broadcasting and personalized communication in hypercubes, IEEE transactions on computers 38, No. 9, 1249-1268 (1989) · Zbl 1395.68034
[21] Juhász, S.; Kovács, F.: Asynchronous distributed broadcasting in cluster environment, Lecture notes in computer science 3241, 164-172 (2004)
[22] Kwon, O. -H.; Chwa, K. -Y.: Multiple message broadcasting in communication networks, Networks 26, 253-261 (1995) · Zbl 0856.90048 · doi:10.1002/net.3230260409
[23] J. Liu, A.R. Mamidala, D.K. Panda, Fast and scalable MPI-level broadcast using InfiniBand’s hardware multicast support, in: 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
[24] P. Patarasu, A. Faraj, X. Yuan, Pipelined broadcast on ethernet switched clusters, in: International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006, p. 128
[25] J. Pješivac-Grbović, T. Angskun, G. Bosilca, G.E. Fagg, E. Gabriel, J. Dongarra, Performance analysis of MPI collective operations, in: International Parallel and Distributed Processing Symposium (IPDPS 2005), Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO), 2005
[26] H. Ritzdorf, J.L. Träff, Collective operations in NEC’s high-performance MPI libraries, in: International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006, p. 100
[27] Sanders, P.; Sibeyn, J. F.: A bandwidth latency tradeoff for broadcast and reduction, Information processing letters 86, No. 1, 33-38 (2003) · Zbl 1173.68847 · doi:10.1016/S0020-0190(02)00473-8
[28] Santos, E. E.: Optimal and near-optimal algorithms for k-item broadcast, Journal of parallel and distributed computing 57, No. 2, 121-139 (1999) · Zbl 0929.68124 · doi:10.1006/jpdc.1999.1529
[29] Snir, M.; Otto, S.; Huss-Lederman, S.; Walker, D.; Dongarra, J.: MPI–the complete reference, The MPI core 1 (1998)
[30] Thakur, R.; Gropp, W. D.; Rabenseifner, R.: Improving the performance of collective operations in MPICH, International journal on high performance computing applications 19, 49-66 (2004)
[31] V. Tipparaju, J. Nieplocha, D.K. Panda, Fast collective operations using shared and remote memory access protocols on clusters, in: 17th International Parallel and Distributed Processing Symposium (IPDPS03), 2003, p. 84
[32] Träff, J. L.: A simple work-optimal broadcast algorithm for message-passing parallel systems, Lecture notes in computer science 3241, 173-180 (2004)
[33] Träff, J. L.; Ripke, A.: An optimal broadcast algorithm adapted to SMP-clusters, Lecture notes in computer science 666, 48-56 (2005)
[34] Träff, J. L.; Ripke, A.: Optimal broadcast for fully connected networks, Lecture notes in computer science 3726, 45-56 (2005)
[35] Watts, J.; Van De Geijn, R. A.: A pipelined broadcast for multidimensional meshes, Parallel processing letters 5, 281-292 (1995)
[36] M.-S. Wu, R.A. Kendall, K. Wright, Optimizing collective communications on SMP clusters, in: 34th International Conference on Parallel Processing (ICPP 2005), 2005, pp. 399–407
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.