Abstract
The k-truss of a graph is the largest edge-induced subgraph such that every edge is contained in at least k triangles within the subgraph, where a triangle is a cycle consisting of three vertices. As a new notion of cohesive subgraphs, truss has recently attracted a lot of research attentions in the database and data mining fields. At the same time, uncertainty is an intrinsic property of massive graph data, and truss decomposition (i.e., finding all k-trusses of a graph) has become a key primitive on uncertain graphs. In this paper, we study the truss decomposition problem on uncertain graphs, that is, finding all highly probable k-trusses of an uncertain graph. We first give an formal statement of the truss decomposition problem on uncertain graphs. Then, we prove that the truss decomposition of an uncertain graph attains two elegant properties, namely uniqueness and hierarchy. We show that the truss decomposition of an uncertain graph can be found in \(O(m^{1.5}Q)\) time by proposing an in-memory algorithm called \(\mathtt {TD_{mem}}\), where m is the number of edges of the uncertain graph, and Q is at most the maximum number of common neighbors of the endpoints of an edge. When an uncertain graph is too large to fit into main memory, we propose an external-memory algorithm \(\mathtt {TD_{I/O}}\) to find the truss decomposition of the uncertain graph. Extensive experiments have been carried out to evaluate the practical performance of the proposed algorithms. The experimental results verify that both \(\mathtt {TD_{mem}}\) and \(\mathtt {TD_{I/O}}\) are efficient when an uncertain graph is small enough to fit into main memory, and that \(\mathtt {TD_{I/O}}\) is much faster than \(\mathtt {TD_{mem}}\) when the graph is too large to fit into main memory.
Similar content being viewed by others
References
Boldi P, Bonchi F, Gionis A, Tassa T (2012) Injecting uncertainty in graphs for identity obfuscation. Proc VLDB Endow (PVLDB) 5(11):1376–1387
Bonchi F, Gullo F, Kaltenbrunner A, Vokovich Y (2014) Core decomposition of uncertain graphs. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’14), pp 1316–1325
Chakrabarti D, Zhan Y, Faloutsos C (2004) R-MAT: A recursive model for graph mining. In: Proceedings of the 4th SIAM international conference on data mining (SDM’04), pp 442–446
Cheng Y, Yuan Y, Wang G, Qiao B, Wang Z (2014) Efficient sampling methods for shortest path query over uncertain graphs. In: Proceedings of the 19th international conference on database systems for advanced applications (DASFAA’14), pp 124–140
Chu S, Cheng J (2012) Triangle listing in massive networks. ACM Trans Knowl Discov Data (TKDD) 6(4):17
Cohen J (2008) Trusses: cohesive subgraphs for social network analysis. Technical Report
Cohen J (2009) Graph twiddling in a mapreduce world. Comput Sci Eng 11(4):29–41
Huang X, Cheng H, Qin L, Tian W, Yu JX (2014) Querying \(k\)-truss community in large and dynamic graphs. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD’14), pp 1311–1322
Jin R, Liu L, Aggarwal CC (2011) Discovering highly reliable subgraphs in uncertain graphs. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’11), pp 992–1000
Jin R, Liu L, Ding B, Wang H (2011) Distance-constraint reachability computation in uncertain graphs. Proc VLDB Endow (PVLDB) 4(9):551–562
Khan A, Bonchi F, Gionis A, Gullo F (2014) Fast reliability search in uncertain graphs. In: Proceedings of the 17th international conference on extending database technology (EDBT’14), pp 535–546
Kollios G, Potamias M, Terzi E (2013) Clustering large probabilistic graphs. IEEE Trans Knowl Data Eng (TKDE) 25(2):325–336
Kong X, Ragin AB, Wang X, Yu PS (2013) Discriminative feature selection for uncertain graph classification. In: Proceedings of the 13th SIAM international conference on data mining (SDM’13), pp 82–93
Leskovec J (2013) Stanford large network dataset collection. http://snap.stanford.edu/data/
Li J, Zou Z, Gao H (2012) Mining frequent subgraphs over uncertain graph databases under probabilistic semantics. VLDB J 21(6):753–777
Li R, Yu JX, Mao R, Jin T (2014) Efficient and accurate query evaluation on uncertain graphs via recursive stratified sampling. In: Proceedings of the 30th IEEE international conference on data engineering (ICDE’14), pp 892–903
Lian X, Chen L (2011) Efficient query answering in probabilistic RDF graphs. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD’11), pp 157–168
Liu L, Jin R, Aggarwal CC, Shen Y (2012) Reliable clustering on uncertain graphs. In: Proceedings of the 12th IEEE international conference on data mining (ICDM’12), pp 459–468
Papapetrou O, Ioannou E, Skoutas D (2011) Efficient discovery of frequent subgraph patterns in uncertain graph databases. In: Proceedings of the 14th international conference on extending database technology (EDBT’11), pp 355–366
Parchas P, Gullo F, Papadias D, Bonchi F (2014) The pursuit of a good possible world: extracting representative instances of uncertain graphs. In: Proceedings of the ACM SIGMOD international conference on managementof data (SIGMOD’14), pp 967–978
Potamias M, Bonchi F, Gionis A, Kollios G (2010) k-nearest neighbors in uncertain graphs. Proc VLDB Endow (PVLDB) 3(1):997–1008
Schank T (2007) Algorithmic aspects of trianlge-based network analysis. Ph.D. thesis, Universität Karlsruhe, Fakultätfür Informatik
Song S, Zou Z, Liu K (2016) Triangle-based representative possible worlds of uncertain graphs. In: Proceedings of the 21st international conference on database systems for advanced applications (DASFAA’16)
Wang J, Cheng J (2012) Truss decomposition in massive networks. Proc VLDB Endow (PVLDB) 5(9):812–823
Wang N, Parthasarathy S, Tan K, Tung AKH (2008) CSV: visualizing and mining cohesive subgraphs. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD’08), pp 445–458
Yuan Y, Chen L, Wang G (2010) Efficiently answering probability threshold-based shortest path queries over uncertain graphs. In: Proceedings of the 15th international conference on database systems for advanced applications (DASFAA’10), pp 155–170
Yuan Y, Wang G, Chen L, Wang H (2012) Efficient subgraph similarity search on large probabilistic graph databases. Proc VLDB Endow (PVLDB) 5(9):800–811
Yuan Y, Wang G, Chen L, Wang H (2013) Efficient keyword search on uncertain graph data. IEEE Trans Knowl Data Eng (TKDE) 25(12):2767–2779
Yuan Y, Wang G, Wang H, Chen L (2011) Efficient subgraph search over large uncertain graphs. Proc VLDB Endow (PVLDB) 4(11):876–886
Zhang Y, Parthasarathy S (2012) Extracting analyzing and visualizing triangle k-core motifs within networks. In: Proceedings of the 28th IEEE international conference on data engineering (ICDE’12), pp 1049–1060
Zhao F, Tung AKH (2012) Large scale cohesive subgraphs discovery for social network visual analysis. Proc VLDB Endow (PVLDB) 6(2):85–96
Zhu R, Zou Z, Li J (2015) Top-\(k\) reliability search on uncertain graphs. In: Proceedings of the 15th IEEE international conference on data mining (ICDM’15), pp 659–668
Zhu R, Zou Z, Li J (2016) SimRank computation on uncertain graphs. In: Proceedings of the 32nd IEEE international conference on data engineering (ICDE’16)
Zou Z (2013) Polynomial-time algorithm for finding denest subgraphs in uncertain graphs. In: Proceedings of the 11th workshop on mining and learning with graphs (MLG’13)
Zou Z, Gao H, Li J (2010) Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’10), pp 633–642
Zou Z, Li J (2013) Structural-context similarities for uncertain graphs. In: Proceedings of the 13th IEEE international conference on data mining (ICDM’13), pp 1325–1330
Zou Z, Li J, Gao H, Zhang S (2009) Frequent subgraph pattern mining on uncertain graph data. Proceedings of the 18th ACM conference on information and knowledge management (CIKM’09), pp 583–592
Zou Z, Li J, Gao H, Zhang S (2010) Finding top-k maximal cliques in an uncertain graph. In: Proceedings of the 26th IEEE international conference on data engineering (ICDE’10), pp 649–652
Zou Z, Li J, Gao H, Zhang S (2010) Mining frequent subgraph patterns from uncertain graph data. IEEE Trans Knowl Data Eng (TKDE) 22(9):1203–1218
Acknowledgments
This work was partially supported by the 973 Program of China (No. 2011CB036202), the NSF of China (No. 61532015 and No. 61173023) and the HIT-Tencent Open Research Fund. We also thank Dr. Francesco Bochi for sharing their datasets.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Zou, Z., Zhu, R. Truss decomposition of uncertain graphs. Knowl Inf Syst 50, 197–230 (2017). https://doi.org/10.1007/s10115-016-0943-y
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10115-016-0943-y