×

On maximum degree-based \(\gamma\)-quasi-clique problem: complexity and exact approaches. (English) Zbl 1388.05140

Summary: We consider the problem of finding a degree-based \(\gamma\)-quasi-clique of maximum cardinality in a given graph for some fixed \(\gamma\in(0,1]\). A degree-based \(\gamma\)-quasi-clique (often referred to as simply a quasi-clique) is a subgraph, where the degree of each vertex is at least \(\gamma\) times the maximum possible degree of a vertex in the subgraph. A degree-based \(\gamma\)-quasi-clique is a relative clique relaxation model, where the case of \(\gamma=1\) corresponds to the well-known concept of a clique. In this article, we first prove that the problem is NP-hard for any fixed \(\gamma\in(0,1]\), which addresses one of the open questions in the literature. More importantly, we also develop new exact solution methods for solving the problem and demonstrate their advantages and limitations in extensive computational experiments with both random and real-world networks. Finally, we outline promising directions of future research including possible functional generalizations of the considered clique relaxation model.

MSC:

05C69 Vertex subsets with special properties (dominating sets, independent sets, cliques, etc.)
05C07 Vertex degrees
90C10 Integer programming
Full Text: DOI

References:

[1] Human protein reference database, webpage. Available at http://www.hprd.org/, last accessed May 20, 2016.
[2] Network collection from M. Newman, webpage. Available at http://www-personal.umich.edu/mejn/netdata/, last accessed May 20, 2016.
[3] Pajek version 1.26, webpage. Available at http://vlado.fmf.uni-lj.si/pub/networks/pajek/, last accessed May 20, 2016.
[4] UCINET software datasets, webpage. Available at https://sites.google.com/site/ucinetsoftware/datasets/, last accessed December 22, 2014.
[5] J.Abello, P.M.Pardalos, and M.G.C.Resende, “On maximum clique problems in very large graphs,” External memory algorithms, J.Abello (ed.) and J.S.Vitter (ed.) (Editors), American Mathematical Society, Boston, MA, 1999, pp. 119-130. · Zbl 0947.68119
[6] J.Abello, M.G.C.Resende, and S.Sudarsky, “Massive quasi‐clique detection,” LATIN 2002: Theoretical informatics, S. Rajsbaum (Editor), Vol. 2286 of Lecture Notes in Computer Science, Springer, Berlin, 2002, pp. 598-612. · Zbl 1059.68597
[7] C.Aggarwal (ed.) and H.Wang (ed.) (Editors), Managing and mining graph data, Springer, Berlin, 2010. · Zbl 1185.68458
[8] R.Ahuja, T.Magnanti, and J.Orlin, Network flows: Theory, algorithms, and applications, Prentice Hall, Upper Saddle River, NJ, 1993. · Zbl 1201.90001
[9] R.Albert and A.L.Barabási, Statistical mechanics of complex networks, Rev Mod Phys74 (2002), 47-97. · Zbl 1205.82086
[10] O.Amini, D.Peleg, S.Pérennes, I.Sau, and S.Saurabh, On the approximability of some degree‐constrained subgraph problems, Discr Appl Math160 (2012), 1661-1679. · Zbl 1246.05040
[11] C.Balasubramaniam and S.Butenko, On robust clusters of minimum cardinality in networks, Ann Oper Res249 (2017), 17-37. · Zbl 1357.90163
[12] B.Balasundaram, S.Butenko, and I.V.Hicks, Clique relaxations in social network analysis: The maximum \(k\)‐plex problem, Oper Res59 (2011), 133-142. · Zbl 1218.90228
[13] B.Balasundaram, S.Butenko, and S.Trukhanov, Novel approaches for analyzing biological networks, J Comb Optim10 (2005), 23-39. · Zbl 1080.90010
[14] V.Boginski, S.Butenko, and P.Pardalos, Mining market data: A network approach, Comput Oper Res33 (2006), 3171-3184. · Zbl 1113.90079
[15] V.Boginski, S.Butenko, and P.M.Pardalos, Statistical analysis of financial networks, Comput Stat Data Anal48 (2005), 431-443. · Zbl 1429.62460
[16] I.M.Bomze, M.Budinich, P.M.Pardalos, and M.Pelillo, “The maximum clique problem,” Handbook of combinatorial optimization, D.Z.Du (ed.) and P.Pardalos (ed.) (Editors), Kluwer Academic Publishers, Dordrecht, 1999, pp. 1-74. · Zbl 1253.90188
[17] S.Borgatti, M.Everett, and J.Johnson, Analyzing social networks, SAGE Publications Limited, Thousand Oaks, CA, 2013.
[18] I.Boukhris, Z.Elouedi, T.Fober, M.Mernberger, and E.Hüllermeier, Similarity analysis of protein binding sites: A generalization of the maximum common subgraph measure based on quasi‐clique detection, Proc Ninth Int Conference Intelligent Syst Design Appl, IEEE, Piscataway, NJ, 2009, pp. 1245-1250.
[19] M.Brunato, H.H.Hoos, and R.Battiti, “On effectively finding maximal quasi‐cliques in graphs,” Learning and intelligent optimization, V.Maniezzo (ed.), R.Battiti (ed.), and J.P.Watson (ed.) (Editors), Springer‐Verlag, Berlin, 2008, pp. 41-55.
[20] A.Buchanan, J.Walteros, S.Butenko, and P.Pardalos, Solving maximum clique in sparse graphs: An \(O ( n m + n 2^{d / 4} )\) algorithm for \(d\)‐degenerate graphs, Optim Lett8 (2014), 1611-1617. · Zbl 1303.05184
[21] S.Butenko and W.Wilhelm, Clique‐detection models in computational biochemistry and genomics, Eur J Oper Res173 (2006), 1-17. · Zbl 1125.92025
[22] R.Carraghan and P.Pardalos, An exact algorithm for the maximum clique problem, Oper Res Lett9 (1990), 375-382. · Zbl 0711.90080
[23] F.Chung, L.Lu, and V.Vu, The spectra of random graphs with given expected degrees, Internet Math1 (2004), 257-275. · Zbl 1080.05021
[24] ILOG CPLEX, webpage. Available at http://www.ilog.com/, last accessed October 26, 2017.
[25] T.Davis and Y.Hu, The University of Florida Sparse Matrix Collection, ACM Trans Math Software38 (2011), 1-25. · Zbl 1365.65123
[26] DIMACS, 10th DIMACS Implementation Challenge, 2011, webpage. Available at http://www.cc.gatech.edu/dimacs10/index.shtml, last accessed September 9, 2013. · Zbl 1262.05001
[27] P.Erdős and A.Rényi, On random graphs, Publicationes Mathematicae Debrecen 6 (1959), 290-297. · Zbl 0092.15705
[28] M.R.Garey and D.S.Johnson, Computers and intractability: A guide to the theory of NP‐completeness, W. H. Freeman, New York, 1979. · Zbl 0411.68039
[29] M.Haraguchi and Y.Okubo, “A method for pinpoint clustering of web pages with pseudo‐clique search,” Federation over the web, K.Jantke (ed.), A.Lunzer (ed.), N.Spyratos (ed.), and Y.Tanaka (ed.) (Editors), Vol. 3847 of Lecture Notes in Computer Science, Springer Berlin, Heidelberg, 2006, pp. 59-78.
[30] M.Jackson, Social and economic networks, Princeton University Press, Princeton, NJ, 2010. · Zbl 1203.91001
[31] D.Jiang and J.Pei, Mining frequent cross‐graph quasi‐cliques, ACM Trans Knowl Discov Data2 (2009), 16:1-16:42.
[32] S.Kahruman‐Anderoglu, A.Buchanan, S.Butenko, and O.Prokopyev, On provably best construction heuristics for hard combinatorial optimization problems, Networks67 (2016), 238-245. · Zbl 1390.90467
[33] C.Komusiewicz, Multivariate algorithmics for finding cohesive subnetworks, Algorithms9 (2016), 21. · Zbl 1461.05213
[34] S.Kosub, “Local density,”Network analysis, U.Brandes (ed.) and T.Erlebach (ed.) (Editors), Springer, Berlin, 2005, pp. 112-142. · Zbl 1118.68332
[35] V.Lee, N.Ruan, R.Jin, and C.Aggarwal, “A survey of algorithms for dense subgraph discovery,” Managing and mining graph data, C.Aggarwal (ed.) and H.Wang (ed.) (Editors), Springer, Berlin, 2010, pp. 303-336.
[36] J.Leskovec and A.Krevl, SNAP Datasets: Stanford large network dataset collection, webpage. Available at http://snap.stanford.edu/data, last accessed May 20, 2016.
[37] G.Liu and L.Wong, Effective pruning techniques for mining quasi‐cliques, Proc Eur Conference Machine Learning Knowledge Discovery in Databases ‐ Part II (ECML PKDD ’08), Springer‐Verlag, Berlin, Heidelberg, 2008, pp. 33-49.
[38] R.Luce and A.Perry, A method of matrix analysis of group structure, Psychometrika14 (1949), 95-116.
[39] H.Matsuda, T.Ishihara, and A.Hashimoto, Classifying molecular sequences using a linkage graph with their pairwise similarities, Theor Comput Sci210 (1999), 305-325. · Zbl 0912.68218
[40] D.Matula and L.Beck, Smallest‐last ordering and clustering and graph coloring algorithms, J ACM30 (1983), 417-427. · Zbl 0628.68054
[41] M.Newman, Networks: An introduction, Oxford University Press, New York, 2010. · Zbl 1195.94003
[42] F.M.Pajouh, Z.Miao, and B.Balasundaram, A branch‐and‐bound approach for maximum quasi‐cliques, Ann Oper Res216 (2014), 145-161. · Zbl 1296.90130
[43] P.Pardalos (ed.), D.Z.Du (ed.), and R.Graham (ed.) (Editors), Handbook of combinatorial optimization, Springer, New York, 2013. · Zbl 1496.90001
[44] P.Pardalos and J.Xue, The maximum clique problem, J Global Optim4 (1994), 301-328. · Zbl 0797.90108
[45] J.Pattillo, A.Veremyev, S.Butenko, and V.Boginski, On the maximum quasi‐clique problem, Discr Appl Math161 (2013), 244-257. · Zbl 1254.05140
[46] J.Pattillo, N.Youssef, and S.Butenko, On clique relaxation models in network analysis, Eur J Oper Res226 (2013), 9-18. · Zbl 1292.05208
[47] J.Pei, D.Jiang, and A.Zhang, On mining cross‐graph quasi‐cliques, Proc Eleventh ACM SIGKDD Int Conference Knowledge Discovery in Data Mining (KDD ’05), ACM, New York, NY, 2005, pp. 228-238.
[48] Quick algorithm, webpage. Available at https://www.comp.nus.edu.sg/wongls/projects/pattern-spaces/quick- v1/, last accessed May 20, 2016.
[49] M.Resende (ed.) and P.Pardalos (ed.) (Editors), Handbook of optimization in telecommunications, Springer, New York, 2006. · Zbl 1100.90001
[50] S.B.Seidman and B.L.Foster, A graph‐theoretic generalization of the clique concept, J Math Sociol6 (1978), 139-154. · Zbl 0386.92015
[51] K.Sim, J.Li, V.Gopalkrishnan, and G.Liu, Mining maximal quasi‐bicliques to co‐cluster stocks and financial ratios for value investment, Proc Sixth Int Conference Data Mining (ICDM ’06), IEEE Computer Society, Washington, DC, 2006, pp. 1059-1063.
[52] A.Veremyev, V.Boginski, P.A.Krokhmal, and D.E.Jeffcoat, Dense percolation in large‐scale mean‐field random networks is provably “explosive”, PloS One7 (2012), e51883.
[53] A.Veremyev, O.Prokopyev, V.Boginski, and E.Pasiliao, Finding maximum subgraphs with relatively large vertex connectivity, Eur J Oper Res239 (2014), 349-362. · Zbl 1339.05407
[54] A.Veremyev, O.Prokopyev, S.Butenko, and E.Pasiliao, Exact MIP‐based approaches for finding maximum quasi‐cliques and dense subgraphs, Comput Optim Appl64 (2016), 171-214. · Zbl 1368.90138
[55] A.Veremyev, O.Prokopyev, and E.Pasiliao, Critical nodes for distance‐based connectivity and related problems in graphs, Networks66 (2015), 170-195.
[56] A.Verma, A.Buchanan, and S.Butenko, Solving the maximum clique and vertex coloring problems on very large sparse networks, INFORMS J Comput27 (2015), 164-177. · Zbl 1327.90356
[57] C.Vogiatzis, A.Veremyev, E.Pasiliao, and P.Pardalos, An integer programming approach for finding the most and the least central cliques, Optim Lett9 (2014), 615-633. · Zbl 1320.90050
[58] Q.Wu and J.K.Hao, A review on algorithms for maximum clique problems, Eur J Oper Res242 (2015), 693-709. · Zbl 1341.05241
[59] Z.Zeng, J.Wang, L.Zhou, and G.Karypis, Coherent closed quasi‐clique discovery from large dense graph databases, Proc 12th ACM SIGKDD Int Conference Knowledge Discovery Data Mining, ACM, Philadelphia, PA, 2006, pp. 797-802.
[60] Z.Zeng, J.Wang, L.Zhou, and G.Karypis, Out‐of‐core coherent closed quasi‐clique mining from large dense graph databases, ACM Trans Database Syst32 (2007), 13.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.