×

GraphLSHC: towards large scale spectral hypergraph clustering. (English) Zbl 1475.68306

Summary: Hypergraph is popularly used for describing multi-relationships among objects in a unified manner, and spectral clustering is regarded as one of the most effective algorithms for partitioning those objects (vertices) into different communities. However, the traditional spectral clustering for hypergraph (HC) incurs expensive costs in terms of both time and space. In this paper, we propose a framework called GraphLSHC to tackle the scalability problem faced by the large scale hypergraph spectral clustering. In our solution, the hypergraph used in GraphLSHC is expanded into a general format to capture complicated higher-order relationships. Moreover, GraphLSHC is capable to simultaneously partition both vertices and hyperedges according to the “eigen-trick”, which provides an approach for reducing the computational complexity of the clustering. To improve the performance further, several hyperedge-based sampling techniques are proposed, which can supplement the sampled matrix with the whole graph information. We also give a theoretical guarantee for the error boundary of the supplement. Several experiments show the superiority of the proposed framework over the state-of-the-art algorithms.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
05C65 Hypergraphs
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

RCV1; DeepWalk; GraphLSH; LINE
Full Text: DOI

References:

[1] Wang, S.; Siskind, J. M., Image segmentation with ratio cut, PAMI, 25, 6, 675-690 (2003)
[2] Zhou, D.; Huang, J.; Schölkopf, B., Learning with hypergraphs: Clustering, classification, and embedding, NIPS, 1601-1608 (2006)
[3] Shi, J.; Malik, J., Normalized cuts and image segmentation, PAMI, 22, 8, 888-905 (2000)
[4] M. Meila, J. Shi, A random walks view of spectral segmentation. aistats, Ai and Statistics.
[5] Ng, A. Y.; Jordan, M. I.; Weiss, Y., On spectral clustering: Analysis and an algorithm, NIPS, 849-856 (2001)
[6] Purkait, P.; Chin, T.; Sadri, A.; Suter, D., Clustering with hypergraphs: The case for large hyperedges, PAMI, 39, 9, 1697-1711 (2017)
[7] J. Bu, S. Tan, C. Chen, C. Wang, H. Wu, L. Zhang, X. He, Music recommendation by unified hypergraph: combining social media information and music content, in: MM, 2010, pp. 391-400. doi:10.1145/1873951.1874005.
[8] Scabini, L. F.S.; Ribas, L. C.; Bruno, O. M., Spatio-spectral networks for color-texture analysis, Inf. Sci., 515, 64-79 (2020)
[9] Liu, Q.; Sun, Y.; Wang, C.; Liu, T.; Tao, D., Elastic net hypergraph learning for image clustering and semi-supervised classification, IEEE Trans. Image Processing, 26, 1, 452-463 (2017) · Zbl 1409.94387
[10] Zheng, X.; Luo, Y.; Sun, L.; Ding, X.; Zhang, J., A novel social network hybrid recommender system based on hypergraph topologic structure, World Wide Web, 21, 4, 985-1013 (2018)
[11] X. Zhao, N. Wang, H. Shi, H. Wan, J. Huang, Y. Gao, Hypergraph learning with cost interval optimization, in: Proceedings of the Thirty-Second AAAI 2-7, 2018, 2018, pp. 4522-4529.
[12] Huang, Y.; Liu, Q.; Lv, F.; Gong, Y.; Metaxas, D. N., Unsupervised image categorization by hypergraph partition, PAMI, 33, 6, 1266-1273 (2011)
[13] Li, X.; Hu, W.; Shen, C.; Dick, A. R.; Zhang, Z. M., Context-aware hypergraph construction for robust spectral clustering, TKDE, 26, 10, 2588-2597 (2014)
[14] Huang, S.; Elgammal, A. M.; Yang, D., On the effect of hyperedge weights on hypergraph learning, Image Vision Comput., 57, 89-101 (2017)
[15] Dong, Z.; Wang, S.; Liu, Q., Spectral based hypothesis testing for community detection in complex networks, Inf. Sci., 512, 1360-1371 (2020) · Zbl 1457.91305
[16] Zhang, K.; Kwok, J. T., Clustered nyström method for large scale manifold learning and dimension reduction, TNN, 21, 10, 1576-1587 (2010)
[17] Li, M.; Bi, W.; Kwok, J. T.; Lu, B., Large-scale nyström kernel matrix approximation using randomized SVD, TNNLS, 26, 1, 152-164 (2015)
[18] Jia, H.; Ding, S.; Du, M.; Xue, Y., Approximate normalized cuts without eigen-decomposition, Inf. Sci., 374, 135-150 (2016) · Zbl 1428.68241
[19] Cai, D.; Chen, X., Large scale spectral clustering via landmark-based sparse representation, TCYB, 45, 8, 1669-1680 (2015)
[20] Yu, J.; Zhu, C.; Zhang, J.; Huang, Q.; Tao, D., Spatial pyramid-enhanced netvlad with weighted triplet loss for place recognition, IEEE Trans. Neural Networks Learn. Syst., 31, 2, 661-674 (2020)
[21] Mohan, M.; Monteleoni, C., Beyond the nystrom approximation: Speeding up spectral clustering using uniform sampling and weighted kernel k-means, IJCAI, 2494-2500 (2017)
[22] M. Vladymyrov, M. Á. Carreira-Perpiñán, Locally linear landmarks for large-scale manifold learning, in: ECML PKDD, 2013, pp. 256-271. doi:10.1007/978-3-642-40994-3_17.
[23] Vladymyrov, M.; Carreira-Perpiñán, M.Á., The variational nystrom method for large-scale spectral problems, ICML, 211-220 (2016)
[24] Hong, C.; Yu, J.; Tao, D.; Wang, M., Image-based three-dimensional human pose recovery by multiview locality-sensitive sparse retrieval, IEEE Trans. Industrial Electronics, 62, 6, 3742-3751 (2015)
[25] Yu, J.; Tan, M.; Zhang, H.; Tao, D.; Rui, Y., Hierarchical deep click feature prediction for fine-grained image recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (2019), 1-1
[26] F. Tian, B. Gao, Q. Cui, E. Chen, T. Liu, Learning deep representations for graph clustering, in: C.E. Brodley, P. Stone (Eds.), Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27-31, 2014, Québec City, Québec, Canada, AAAI Press, 2014, pp. 1293-1299. URL: http://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/view/8527.
[27] Zhang, J.; Yu, J.; Tao, D., Local deep-feature alignment for unsupervised dimension reduction, IEEE Trans. Image Process., 27, 5, 2420-2432 (2018) · Zbl 1409.94749
[28] B. Perozzi, R. Al-Rfou, S. Skiena, Deepwalk: online learning of social representations, in: S.A. Macskassy, C. Perlich, J. Leskovec, W. Wang, R. Ghani (Eds.), The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA - August 24-27, 2014, ACM, 2014, pp. 701-710. doi:10.1145/2623330.2623732. URL: doi: 10.1145/2623330.2623732.
[29] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei, LINE: large-scale information network embedding, in: A. Gangemi, S. Leonardi, A. Panconesi (Eds.), Proceedings of the 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, May 18-22, 2015, ACM, 2015, pp. 1067-1077. doi:10.1145/2736277.2741093. URL: doi: 10.1145/2736277.2741093.
[30] Xu, R.; Che, Y.; Wang, X.; Hu, J.; Xie, Y., Stacked autoencoder-based community detection method via an ensemble clustering framework, Inf. Sci., 526, 151-165 (2020)
[31] Sunderrajan, S.; Manjunath, B. S., Context-aware hypergraph modeling for re-identification and summarization, TMM, 18, 1, 51-63 (2015)
[32] Dhillon, I. S., Co-clustering documents and words using bipartite spectral graph partitioning, KDD, 269-274 (2001)
[33] Fowlkes, C. C.; Belongie, S. J.; Chung, F. R.K.; Malik, J., Spectral grouping using the nyström method, PAMI, 26, 2, 214-225 (2004)
[34] M. Li, J.T. Kwok, B. Lu, Making large-scale nyström approximation possible, in: ICML, 2010, pp. 631-638.
[35] X. Chen, D. Cai, Large scale spectral clustering with landmark-based representation, in: AAAI, Vol. 5, 2011, p. 14.
[36] Drineas, P.; Kannan, R.; Mahoney, M. W., Fast monte carlo algorithms for matrices II: computing a low-rank approximation to a matrix, SIAM J. Comput., 36, 1, 158-183 (2006) · Zbl 1111.68148
[37] Achlioptas, D.; McSherry, F., Fast computation of low-rank matrix approximations, J. ACM, 54, 2, 9 (2007) · Zbl 1311.94031
[38] B. Ghojogh, F. Karray, M. Crowley, Eigenvalue and generalized eigenvalue problems: Tutorial, arXiv preprint arXiv:1903.11240.
[39] Wang, S.; Zhang, Z., Improving CUR matrix decomposition and the nyström approximation via adaptive sampling, Journal of Machine Learning Research, 14, 1, 2729-2769 (2013), URL: http://dl.acm.org/citation.cfm?id=2567748 · Zbl 1318.65023
[40] Lewis, D. D.; Yang, Y.; Rose, T. G.; Li, F., Rcv1: A new benchmark collection for text categorization research, JMLR, 5, 2, 361-397 (2004)
[41] Chen, W.; Song, Y.; Bai, H.; Lin, C.; Chang, E. Y., Parallel spectral clustering in distributed systems, PAMI, 33, 3, 568-586 (2011)
[42] Cai, D.; He, X.; Han, J., Document clustering using locality preserving indexing, TKDE, 17, 12, 1624-1637 (2005)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.