×

Parallel community detection for massive graphs. (English) Zbl 1271.68201

Bader, David A. (ed.) et al., Graph partitioning and graph clustering. Proceedings of the 10th DIMACS implementation challenge workshop, Atlanta, GA, USA, February 13–14, 2012. Providence, RI: American Mathematical Society (AMS) (ISBN 978-0-8218-9038-7/pbk; 978-0-8218-9869-7/ebook). Contemporary Mathematics 588, 207-221 (2013).
Summary: Tackling the current volume of graph-structured data requires parallel tools. We extend our work on analyzing such massive graph data with a massively parallel algorithm for community detection that scales to current data sizes, clustering a real-world graph of over 100 million vertices and over 3 billion edges in under 500 seconds on a four-processor Intel E7-8870-based server.
Our algorithm achieves moderate parallel scalability without sacrificing sequential operational complexity. Community detection partitions a graph into subgraphs more densely connected within the subgraph than to the rest of the graph.
We take an agglomerative approach similar to Clauset, Newman, and Moore’s sequential algorithm, merging pairs of connected intermediate subgraphs to optimize different graph properties. Working in parallel opens new approaches to high performance.
We improve performance of our parallel community detection algorithm on both the Cray XMT2 and OpenMP platforms and adapt our algorithm to the DIMACS implementation challenge data set.
For the entire collection see [Zbl 1262.05001].

MSC:

68R10 Graph theory (including graph drawing) in computer science
68P05 Data structures
68W10 Parallel algorithms in computer science
05C85 Graph algorithms (graph-theoretic aspects)
68Q10 Modes of computation (nondeterministic, parallel, interactive, probabilistic, etc.)
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems

Software:

Pregel