×

Spectral methods for graph clustering - a survey. (English) Zbl 1250.68228

Summary: Graph clustering is an area in cluster analysis that looks for groups of related vertices in a graph. Due to its large applicability, several graph clustering algorithms have been proposed in the last years. A particular class of graph clustering algorithms is known as spectral clustering algorithms. These algorithms are mostly based on the eigen-decomposition of Laplacian matrices of either weighted or unweighted graphs. This survey presents different graph clustering formulations, most of which based on graph cut and partitioning problems, and describes the main spectral clustering algorithms found in literature that solve these problems.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
05C85 Graph algorithms (graph-theoretic aspects)
90C35 Programming involving graphs or networks

Software:

LAPACK; SpectralNET
Full Text: DOI

References:

[1] Alpert, C.J., Kahng, A.B., 1994. Multi-way partitioning via spacefilling curves and dynamic programming. In: Proceeding of the 31st ACM/IEEE Design Automation Conference, pp. 652-657.; Alpert, C.J., Kahng, A.B., 1994. Multi-way partitioning via spacefilling curves and dynamic programming. In: Proceeding of the 31st ACM/IEEE Design Automation Conference, pp. 652-657.
[2] Alpert, C.; Kahng, A. B.; Yao, S. Z., Spectral partitioning with multiple eigenvectors, Discrete Applied Mathematics, 90, 3-26 (1999) · Zbl 0918.68076
[3] Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Croz, J.D., Greenbaum, A., Hammarling, S., Mckenney, A., Sorensen, D., 1990. LAPACK: A Portable Linear Algebra Library for High-Performance Computers, Technical Report CS-90-105, Computer Science Department, University of Tennessee, Knoxville, TN.; Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Croz, J.D., Greenbaum, A., Hammarling, S., Mckenney, A., Sorensen, D., 1990. LAPACK: A Portable Linear Algebra Library for High-Performance Computers, Technical Report CS-90-105, Computer Science Department, University of Tennessee, Knoxville, TN. · Zbl 0843.65018
[4] Bach, F. R.; Jordan, M. I., Learning spectral clustering, (Thrun, S.; Saul, L.; Schölkopf, B., Advances in Neural Information Processing Systems, vol. 16 (2004), MIT Press: MIT Press Cambridge), 305-312
[5] Boccaletti, S.; Latora, V.; Moreno, Y.; Chavez, M.; Hwang, D.-U., Complex networks: Structure and dynamics, Physics Reports, 424, 175-308 (2006) · Zbl 1371.82002
[6] Brand, M., Huang, K., 2003. A unifying theorem for spectral embedding and clustering. In: Bishop, C., Frey, B. (Eds.), Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics.; Brand, M., Huang, K., 2003. A unifying theorem for spectral embedding and clustering. In: Bishop, C., Frey, B. (Eds.), Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics.
[7] Chan, P. K.; Schlag, M. D.F.; Zien, J. Y., Spectral k-way ratio-cut partitioning and clustering, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 13, 1088-1096 (1994)
[8] Chung, F. R.K., Spectral graph theory, (No. 92 in CBMS Regional Conference Series in Mathematics (1994), American Mathematical Society) · Zbl 0872.05052
[9] Cormack, R. M., A review of classification, Journal of the Royal Statistical Society, 134, 321-367 (1971)
[10] Cvetković, D.; Doob, M.; Sachs, H., Spectra of Graphs: Theory and Application (1979), Academic Press: Academic Press New York
[11] De Bie, T.; Cristianini, N., Fast SDP relaxations of graph cut clustering, transduction, and other combinatorial problems, Journal of Machine Learning Research, 7, 1409-1436 (2006) · Zbl 1222.68179
[12] Diestel, R., Graph theory, Graduate Texts in Mathematics, vol. 173 (2005), Springer-Verlag: Springer-Verlag Heidelberg · Zbl 1074.05001
[13] Ding, C., He, X., Zha, H., Gu, M., Simon, H., 2001. A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of the IEEE International Conference on Data Mining, ICDM 2001, pp. 107-114.; Ding, C., He, X., Zha, H., Gu, M., Simon, H., 2001. A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of the IEEE International Conference on Data Mining, ICDM 2001, pp. 107-114.
[14] Ding, C.; He, X.; Meraz, R. F.; Holbrook, S. R., A unified representation of multiprotein complex data for modeling interaction networks, Proteins: Structure, Function, and Bioinformatics, 57, 99-108 (2004)
[15] Ding, C.; He, X.; Simon, H. D., Nonnegative Lagrangian relaxation of \(k\)-means and spectral clustering, (Machine Learning: ECML 2005. Machine Learning: ECML 2005, Lecture Notes in Computer Science (2005), Springer: Springer Berlin, Heidelberg), 530-538
[16] Donath, W.; Hoffman, A., Lower bounds for the partitioning of graphs, IBM Journal Research and Development, 17, 5, 420-425 (1973) · Zbl 0259.05112
[17] Fiedler, M., A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory, Czechoslovak Mathematical Journal, 25, 619-633 (1975) · Zbl 0437.15004
[18] Filippone, M.; Camastra, F.; Masulli, F.; Rovetta, S., A survey of kernel and spectral methods for clustering, Pattern Recognition, 41, 176-190 (2008) · Zbl 1122.68530
[19] Ford, L. R.; Fulkerson, D. R., Flows in Networks (1962), Princeton University Press: Princeton University Press Princeton · Zbl 0139.13701
[20] Forman, J. J.; Clemons, P. A.; Schreiber, S. L.; Haggarty, S. J., SpectralNET - An application for spectral graph analysis and visualization, BMC Bioinformatics, 6, 260 (2005)
[21] Fowlkes, C.; Belongie, S.; Chung, F.; Malik, J., Spectral grouping using the Nyström method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 214-225 (2004)
[22] Gleiser, P.; Danon, L., Community structure in jazz, Advances in Complex Systems, 6, 565-573 (2003)
[23] Gu, M., Zha, H., Ding, C., He, X., Simon, H., 2001. Spectral Relaxation Models and Structure Analysis for \(k\); Gu, M., Zha, H., Ding, C., He, X., Simon, H., 2001. Spectral Relaxation Models and Structure Analysis for \(k\)
[24] Hagen, L.; Kahng, A., New spectral methods for ratio cut partitioning and clustering, IEEE Transactions on Computer-Aided Design, 11, 1074-1088 (1992)
[25] Hall, K. M., An r-dimensional quadratic placement algorithm, Management Science, 17, 219-229 (1970) · Zbl 0203.52503
[26] Hansen, P.; Jaumard, B., Cluster analysis and mathematical programming, Mathematical Programming, 79, 191-215 (1997) · Zbl 0887.90182
[27] Hückel, E., Quantentheoretische Beiträge zum Benzolproblem, Zeitschrift füur Physik a Hadrons and Nuclei, 72, 310-337 (1931) · Zbl 0002.43001
[28] Huttenhower, C.; Flamholz, A. I.; Landis, J. N.; Sahi, S.; Myers, C. L.; Olszewski, K. L.; Hibbs, M. A.; Siemers, N. O.; Troyanskaya, O. G.; Coller, H. A., Nearest neighbor networks: Clustering expression data based on gene neighborhoods, BMC Bioinformatics, 8, 250 (2007)
[29] Kannan, N.; Vishveshwara, S., Identification of side-chain clusters in protein structures by a graph spectral method, Journal of Molecular Biology, 292, 441-464 (1999)
[30] Kim, J.; Choi, S., Semidefinite spectral clustering, Pattern Recognition, 39, 11, 2025-2035 (2006) · Zbl 1102.68626
[31] Kulis, B., Surendran, A.C., Platt, J.C., 2007. Fast low-rank semidefinite programming for embedding and clustering. In: Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics.; Kulis, B., Surendran, A.C., Platt, J.C., 2007. Fast low-rank semidefinite programming for embedding and clustering. In: Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics.
[32] Leighton, T., Rao, S., 1988. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In: Annual IEEE Symposium on Foundations of Computer Science, pp. 422-431.; Leighton, T., Rao, S., 1988. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In: Annual IEEE Symposium on Foundations of Computer Science, pp. 422-431.
[33] Luo, B.; Wilson, R. C.; Hancock, E. R., Spectral embedding of graphs, Pattern Recognition, 36, 2213-2230 (2003) · Zbl 1054.68105
[34] Lusseau, D.; Schneider, K.; Boisseau, O. J.; Haase, P.; Slooten, E.; Dawson, S. M., The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations, Behavioral Ecology and Sociobiology, 54, 396-405 (2003)
[35] Maier, M., von Luxburg, U., Hein, M., 2009. Influence of graph construction on graph-based clustering measures. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L., Curran, Hook, R. (Eds.), Advances in Neural Information Processing Systems, vol. 21, pp. 1025-1032.; Maier, M., von Luxburg, U., Hein, M., 2009. Influence of graph construction on graph-based clustering measures. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L., Curran, Hook, R. (Eds.), Advances in Neural Information Processing Systems, vol. 21, pp. 1025-1032.
[36] Meilă, M., Shi, J., 2001. A random walks view of spectral segmentation. In: Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics.; Meilă, M., Shi, J., 2001. A random walks view of spectral segmentation. In: Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics.
[37] Meilă, M., Xu, L., 2004. Multiway Cuts and Spectral Clustering, Technical Report. 442, University of Washington.; Meilă, M., Xu, L., 2004. Multiway Cuts and Spectral Clustering, Technical Report. 442, University of Washington.
[38] Mohar, B., The Laplacian spectrum of graphs, Graph Theory, Combinatorics, and Applications, 2, 871-898 (1991) · Zbl 0840.05059
[39] Newman, M. E.J., Finding community structure in networks using the eigenvectors of matrices, Physical Review E, 74, 036-104 (2006)
[40] Newman, M. E.J.; Girvan, M., Finding and evaluating community structure in networks, Physical Review E, 69, 026113 (2004)
[41] Ng, A.; Jordan, M.; Weiss, Y., On spectral clustering: Analysis and an algorithm, (Dietterich, T. G.; Becker, S.; Ghahramani, Z., Advances in Neural Information Processing Systems, vol. 14 (2002), MIT Press: MIT Press Cambridge, MA), 849-856
[42] Saerens, M., Fouss, F., Yen, L., Dupont, P., 2004. The principal components analysis of a graph, and its relationships to spectral clustering. In: Machine Learning: ECML 2004, Lecture Notes in Computer Science, vol. 3201, pp. 371-383.; Saerens, M., Fouss, F., Yen, L., Dupont, P., 2004. The principal components analysis of a graph, and its relationships to spectral clustering. In: Machine Learning: ECML 2004, Lecture Notes in Computer Science, vol. 3201, pp. 371-383. · Zbl 1132.68589
[43] Schaeffer, S. E., Graph clustering, Computer Science Review, 1, 27-64 (2007) · Zbl 1302.68237
[44] Schrijver, A., Combinatorial Optimization, Polyhedra and Efficiency, vol. B (2003), Springer-Verlag: Springer-Verlag Berlin, Heidelberg · Zbl 0542.90067
[45] Shi, J., Malik, J., 1997. Normalized cuts and image segmentation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 731-737.; Shi, J., Malik, J., 1997. Normalized cuts and image segmentation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 731-737.
[46] Shi, J.; Malik, J., Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 888-905 (2000)
[47] Singh, V.; Mukherjee, L.; Peng, J.; Xu, J., Ensemble clustering using semidefinite programming, (Platt, J.; Koller, D.; Singer, Y.; Roweis, S., Advances in Neural Information Processing Systems, vol. 20 (2008), MIT Press: MIT Press Cambridge, MA), 1353-1360
[48] Stewart, G. W.; Sun, J., Matrix Perturbation Theory, Computer Science and Scientific Computing (1990), Academic Press · Zbl 0706.65013
[49] Ushioda, A., Kawasaki, J., 1996. Hierarchical clustering of words and application to nlp tasks. In: Ejerhed, E., Dagan, I. (Eds.), Fourth Workshop on Very Large Corpora. Association for Computational Linguistics, Somerset, New Jersey. pp. 28-41.; Ushioda, A., Kawasaki, J., 1996. Hierarchical clustering of words and application to nlp tasks. In: Ejerhed, E., Dagan, I. (Eds.), Fourth Workshop on Very Large Corpora. Association for Computational Linguistics, Somerset, New Jersey. pp. 28-41.
[50] Venables, W.N., Smith, D.M., 2010. An Introduction to R. R Development Core Team, The R Foundation for Statistical Computing, version 2.11.1.; Venables, W.N., Smith, D.M., 2010. An Introduction to R. R Development Core Team, The R Foundation for Statistical Computing, version 2.11.1.
[51] Von Luxburg, U., A tutorial on spectral clustering, Statistics and Computing, 17, 4, 395-416 (2007)
[52] Von Luxburg, U.; Belkin, M.; Bousquet, O., Consistency of spectral clustering, The Annals of Statistics, 36, 555-586 (2008) · Zbl 1133.62045
[53] Wei, Y. C., Cheng, C. K., 1989. Towards efficient hierarchical designs by ratio cut partitioning. In: Proceedings of the IEEE International Conference on Computer-Aided Design. pp. 298-301.; Wei, Y. C., Cheng, C. K., 1989. Towards efficient hierarchical designs by ratio cut partitioning. In: Proceedings of the IEEE International Conference on Computer-Aided Design. pp. 298-301.
[54] White, S. D.M.; Frenk, C. S., Galaxy formation through hierarchical clustering, Astrophysical Journal, 379, Part 1, 52-79 (1991)
[55] Wu, Z.; Leahy, R., An optimal graph theoretic approach to data clustering: Theory and its application to image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 15, 11, 1101-1113 (1993)
[56] Xing, E., Jordan, M., 2003. On semidefinite relaxation for normalized \(k\); Xing, E., Jordan, M., 2003. On semidefinite relaxation for normalized \(k\)
[57] Yu, S.X., Shi, J., 2003. Multiclass spectral clustering. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, pp. 313-319.; Yu, S.X., Shi, J., 2003. Multiclass spectral clustering. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, pp. 313-319.
[58] Zachary, W., An information flow model for conflict and fission in small groups, Journal of Anthropological Research, 33, 452-473 (1977)
[59] Zelnik-Manor, L.; Perona, P., Self-tuning spectral clustering, (Advances in Neural Information Processing Systems, vol. 17 (2004), MIT Press), 1601-1608
[60] Zha, H., Ding, C., Gu, M., He, X., Simon, H., 2002. Spectral relaxation for \(k\); Zha, H., Ding, C., Gu, M., He, X., Simon, H., 2002. Spectral relaxation for \(k\)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.