×

Round compression for parallel matching algorithms. (English) Zbl 1445.68331

Summary: For over a decade now we have been witnessing the success of massive parallel computation frameworks, such as MapReduce, Hadoop, Dryad, or Spark. Compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises however in this context is can we leverage this additional power to obtain even faster parallel algorithms? A prominent example here is the maximum matching problem. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in \(O(\log{n})\) rounds. S. Lattanzi et al. [“Filtering: a method for solving graph problems in MapReduce”, in: Proceedings of the 23rd annual ACM symposium on parallelism in algorithms and architectures, SPAA’11. New York, NY: Association for Computing Machinery (ACM). 85–94 (2011; doi:10.1145/1989493.1989505)] showed that if each machine has \(n^{1+\Omega(1)}\) memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow-up work, seem though to get stuck in a fundamental way at roughly \(O(\log{n})\) rounds once we enter the (at most) near-linear memory regime. In this paper, we break the above \(O(\log n)\) round complexity bound even in the case of slightly sublinear memory per machine. In fact, our improvement here is almost exponential: we are able to deliver a \((1+\epsilon)\)-approximate maximum matching for any fixed constant \(\epsilon>0\) in \(O((\log \log n)^2)\) rounds. To establish our result we need to deviate from the previous work in two important ways. First, we use vertex-based graph partitioning, instead of the edge-based approaches that were utilized so far. Second, we develop a technique of round compression.

MSC:

68W10 Parallel algorithms in computer science
05C70 Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.)
68W25 Approximation algorithms
68W40 Analysis of algorithms
Full Text: DOI

References:

[1] K. J. Ahn and S. Guha, Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints, in Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2015, Portland, OR, 2015, ACM, New York, 2015, pp. 202-211, https://doi.org/10.1145/2755573.2755586.
[2] N. Alon, L. Babai, and A. Itai, A fast and simple randomized parallel algorithm for the maximal independent set problem, J. Algorithms, 7 (1986), pp. 567-583, https://doi.org/10.1016/0196-6774(86)90019-2. · Zbl 0631.68063
[3] A. Andoni, A. Nikolov, K. Onak, and G. Yaroslavtsev, Parallel algorithms for geometric graph problems, in Proceedings of the 46th ACM Symposium on Theory of Computing, STOC 2014, New York, 2014, ACM, New York, 2014, pp. 574-583, https://doi.org/10.1145/2591796.2591805. · Zbl 1315.05133
[4] S. Assadi, Simple Round Compression for Parallel Vertex Cover, preprint, https://arxiv.org/abs/1709.04599, (2017)
[5] S. Assadi, M. Bateni, A. Bernstein, V. Mirrokni, and C. Stein, Coresets meet EDCS: Algorithms for matching and vertex cover on massive graphs, in Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia, 2019, pp. 1616-1635. · Zbl 1431.68143
[6] S. Assadi and S. Khanna, Randomized composable coresets for matching and vertex cover, in Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, ACM, New York, 2017, pp. 3-12, https://doi.org/10.1145/3087556.3087581.
[7] P. Beame and J. Hastad, Optimal bounds for decision problems on the CRCW PRAM, in Proceedings of the 19th Annual ACM Symposium on Theory of Computing, STOC 1987, New York, ACM, New York, 1987, pp. 83-93, https://doi.org/10.1145/28395.28405. · Zbl 0825.68378
[8] P. Beame, P. Koutris, and D. Suciu, Communication steps for parallel query processing, in Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, ACM, New York, 2013, pp. 273-284, https://doi.org/10.1145/2463664.2465224. · Zbl 1426.68074
[9] P. Beame, P. Koutris, and D. Suciu, Skew in parallel query processing, in Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS’14, Snowbird, UT, ACM, New York, 2014, pp. 212-223, https://doi.org/10.1145/2594538.2594558.
[10] S. Behnezhad, M. Derakhshan, and M. Hajiaghayi, Brief Announcement: Semi-mapreduce Meets Congested Clique, preprint, https://arxiv.org/abs/1802.10297, 2018.
[11] S. Behnezhad, M. Hajiaghayi, and D. G. Harris, Exponentially Faster Massively Parallel Maximal Matching, , http://arxiv.org/abs/1901.03744 (2019).
[12] A. Bernstein and C. Stein, Fully dynamic matching in bipartite graphs, in Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming, ICALP 2015, Kyoto, Japan, 2015, Proceedings, Part I, Springer, Berlin, 2015, pp. 167-179, https://doi.org/10.1007/978-3-662-47672-7_14. · Zbl 1372.68203
[13] A. Bernstein and C. Stein, Faster fully dynamic matchings with small approximation ratios, in Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, SIAM, Philadelphia, 2016, pp. 692-711, https://doi.org/10.1137/1.9781611974331.ch50. · Zbl 1409.68203
[14] S. Bhattacharya, D. Chakrabarty, and M. Henzinger, Deterministic fully dynamic approximate vertex cover and fractional matching in \(O(1)\) amortized update time, in Proceedings of the 19th International Conference on Integer Programming and Combinatorial Optimization, IPCO 2017, Waterloo, ON, Canada, Springer, Cham, Switzerland, 2017, pp. 86-98, https://doi.org/10.1007/978-3-319-59250-3_8. · Zbl 1418.90269
[15] S. Bhattacharya, M. Henzinger, and G. F. Italiano, Deterministic fully dynamic data structures for vertex cover and matching, in Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, SIAM, Philadelphia, 2015, pp. 785-804, https://doi.org/10.1137/1.9781611973730.54. · Zbl 1372.68071
[16] S. Bhattacharya, M. Henzinger, and D. Nanongkai, New deterministic approximation algorithms for fully dynamic matching, in Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, ACM, New York, 2016, pp. 398-411, https://doi.org/10.1145/2897518.2897568. · Zbl 1376.68169
[17] S. Bhattacharya, M. Henzinger, and D. Nanongkai, Fully dynamic approximate maximum matching and minimum vertex cover in \(O(\log^3 n)\) worst case update time, in Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, SIAM, Philadelphia, 2017, pp. 470-489, https://doi.org/10.1137/1.9781611974782.30. · Zbl 1409.68204
[18] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, in Proceedings of the 6th Conference on Symposium on Opearting Systems Design & Implementation, Volume 6, OSDI’04, Berkeley, CA, 2004, USENIX Association, Berkeley, CA, pp. 10-10, http://dl.acm.org/citation.cfm?id=1251254.1251264.
[19] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, Comm. ACM, 51 (2008), pp. 107-113, https://doi.org/10.1145/1327452.1327492.
[20] J. Edmonds, Paths, trees and flowers, Canadian J. Math., 17 (1965), pp. 449-467. · Zbl 0132.20903
[21] J. Feldman, S. Muthukrishnan, A. Sidiropoulos, C. Stein, and Z. Svitkina, On distributing symmetric streaming computations, ACM Trans. Algorithms, 6 (2010), 66, https://doi.org/10.1145/1824777.1824786. · Zbl 1300.68030
[22] M. Fischer, Improved deterministic distributed matching via rounding, Distrib. Comput., (2017), pp.1-13.
[23] M. Ghaffari, T. Gouleakis, C. Konrad, S. Mitrovic, and R. Rubinfeld, Improved massively parallel computation algorithms for MIS, matching, and vertex cover, in Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC 2018, Egham, UK, ACM, New York, 2018, pp. 129-138, https://doi.org/10.1145/3212734.3212743. · Zbl 1428.68376
[24] M. Ghaffari and J. Uitto, Sparsifying distributed algorithms with ramifications in massively parallel computation and centralized local computation, in Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, CA, SIAM, Philadelphia, 2019, pp. 1636-1653, https://doi.org/10.1137/1.9781611975482.99. · Zbl 1431.68133
[25] M. T. Goodrich, N. Sitchinava, and Q. Zhang, Sorting, searching, and simulation in the MapReduce framework, in International Symposium on Algorithms and Computation, Springer, Berlin, 2011, pp. 374-383. · Zbl 1350.68085
[26] M. Hanckowiak, M. Karonski, and A. Panconesi, On the distributed complexity of computing maximal matchings, SIAM J. Discrete Math., 15 (2001), pp. 41-57, https://doi.org/10.1137/S0895480100373121. · Zbl 0987.05079
[27] M. Hańćkowiak, M. Karoński, and A. Panconesi, A faster distributed algorithm for computing maximal matchings deterministically, in Proceedings of the Eighteenth Annual ACM Symposium on Principles of Distributed Computing, PODC ’99, New York, ACM, New York, 1999, pp. 219-228, https://doi.org/10.1145/301308.301360. · Zbl 1321.68469
[28] Z. Huang, B. Radunović, M. Vojnović, and Q. Zhang, Communication complexity of approximate matching in distributed graphs, in Proceedings of the 32nd International Symposium on Theoretical Aspects of Computer Science, STACS 2015, Garching, Germany, Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 2015, pp. 460-473, https://doi.org/10.4230/LIPIcs.STACS.2015.460. · Zbl 1355.68100
[29] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, Dryad: Distributed data-parallel programs from sequential building blocks, ACM SIGOPS Oper. Syst. Rev., 41 (2007), pp. 59-72, https://doi.org/10.1145/1272998.1273005.
[30] A. Israeli and A. Itai, A fast and simple randomized parallel algorithm for maximal matching, Inform. Process. Lett., 22 (1986), pp. 77-80, https://doi.org/10.1016/0020-0190(86)90144-4. · Zbl 0588.68036
[31] A. Israeli and Y. Shiloach, An improved parallel algorithm for maximal matching, Inform. Process. Lett., 22 (1986), pp. 57-60, https://doi.org/10.1016/0020-0190(86)90141-9. · Zbl 0588.68035
[32] M. Kapralov, S. Khanna, and M. Sudan, Approximating matching size from random streams, in Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, OR, SIAM, Philadelphia, 2014, pp. 734-751, https://doi.org/10.1137/1.9781611973402.55. · Zbl 1423.68344
[33] H. Karloff, S. Suri, and S. Vassilvitskii, A model of computation for MapReduce, in Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, TX, SIAM, Philadelphia, 2010, pp. 938-948, https://doi.org/10.1137/1.9781611973075.76. · Zbl 1288.68247
[34] F. Kuhn, T. Moscibroda, and R. Wattenhofer, The price of being near-sighted, in Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithm, SODA 2006, SIAM, Philadelphia, 2006, pp. 980-989, http://dl.acm.org/citation.cfm?id=1109557.1109666. · Zbl 1192.68044
[35] S. Lattanzi, B. Moseley, S. Suri, and S. Vassilvitskii, Filtering: A method for solving graph problems in MapReduce, in Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2011, San Jose, CA, ACM, New York, 2011, pp. 85-94, https://doi.org/10.1145/1989493.1989505.
[36] C. Lenzen, Optimal deterministic routing and sorting on the congested clique, in Proceedings of the 2013 ACM Symposium on Principles of Distributed Computing, ACM, New York, 2013, pp. 42-50. · Zbl 1323.68034
[37] Z. Lotker, B. Patt-Shamir, and S. Pettie, Improved distributed approximate matching, J. ACM, 62 (2015), 38, https://doi.org/10.1145/2786753. · Zbl 1426.68292
[38] M. Luby, A simple parallel algorithm for the maximal independent set problem, SIAM J. Comput., 15 (1986), pp. 1036-1053, https://doi.org/10.1137/0215074. · Zbl 0619.68058
[39] A. McGregor, Finding graph matchings in data streams, Approximation, Randomization and Combinatorial Optimization, Algorithms and Techniques, in Proceedings of the 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2005 and 9th InternationalWorkshop on Randomization and Computation, RANDOM 2005, Berkeley, CA, Springer, Berlin, 2005, pp. 170-181, https://doi.org/10.1007/11538462_15. · Zbl 1142.68460
[40] K. Onak, Round Compression for Parallel Graph Algorithms in Strongly Sublinear Space, http://arxiv.org/abs/1807.08745 (2018).
[41] K. Onak and R. Rubinfeld, Maintaining a large matching and a small vertex cover, in Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, MA, ACM, New York, 2010, pp. 457-464, https://doi.org/10.1145/1806689.1806753. · Zbl 1293.05305
[42] M. Parnas and D. Ron, Approximating the minimum vertex cover in sublinear time and a connection to distributed algorithms, Theoret. Comput. Sci., 381 (2007), pp. 183-196, https://doi.org/10.1016/j.tcs.2007.04.040. · Zbl 1188.68358
[43] T. Roughgarden, S. Vassilvitskii, and J. R. Wang, Shuffles and circuits: On lower bounds for modern parallel computation, in Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2016, Asilomar State Beach/Pacific Grove, CA, ACM, New York, 2016, pp. 1-12, https://doi.org/10.1145/2935764.2935799. · Zbl 1426.68105
[44] T. White, Hadoop: The Definitive Guide, O’Reilly, Sebastopol, CA, 2012.
[45] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster computing with working sets, in 2nd USENIX Workshop on Hot Topics in Cloud Computing, HotCloud’10, Boston, MA, USENIX Association, Berkeley, CA, 2010, https://www.usenix.org/conference/hotcloud-10/spark-cluster-computing-working-sets.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.