NODAR: mining globally distributed substructures from a single labeled graph

Aya Hellal¹ & Lotfi Ben Romdhane¹,²


Abstract

Data mining in structured and semi-structured data focuses on frequent data values, whereas in graph data mining the focus is on common specific topologies. Graph mining, despite its ubiquity, is a difficult task because it requires subgraph isomorphism testing, which is known to be NP-complete. To prune the search space effectively and thereby save computational time, a graph mining algorithm requires that the support measure of a pattern be no greater than that of its subpatterns. This property of the support measure is referred to in the literature as down-closure, anti-monotonicity, or admissibility. Unfortunately, when mining a single labeled graph, simply counting the occurrences of a graph pattern may not satisfy the down-closure property. For this reason, most existing approaches mine frequent substructures in a set of labeled graphs (also called the transactional setting), and few efforts have been devoted to mining frequent, globally distributed substructures in a single labeled graph. In this paper, we propose a graph mining algorithm, called NODAR (Non-Overlapping embeDding based grAph mineR), for computing common and globally distributed substructures in a single labeled graph. NODAR adopts the Depth-First Search (DFS) strategy and uses the SMNOES (Size of Maximum Non Overlapping Embedding Set) as its support measure. The core idea of NODAR is to extract frequent subpatterns automatically, without computing their frequencies, thanks to the down-closure property of SMNOES. By adopting this strategy, NODAR reduces the number of subgraph isomorphism tests needed to compute pattern frequencies. Experimental results on monograph and transactional graph databases, together with a comparison against well-known probabilistic and exact algorithms, demonstrate the efficacy of NODAR.
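
The failure of plain occurrence counting described in the abstract can be illustrated on a toy graph. The following is a minimal sketch, not the NODAR algorithm: it enumerates embeddings by brute force and assumes that two embeddings overlap when they share at least one vertex (the abstract does not state the exact overlap definition). The function names `embeddings` and `smnoes` and the star-shaped example graph are hypothetical, chosen for illustration only.

```python
from itertools import combinations, permutations


def embeddings(g_labels, g_edges, p_labels, p_edges):
    """Brute-force occurrences of a pattern in an undirected labeled graph.

    Each occurrence is identified by its set of graph vertices, so mappings
    that differ only by a pattern automorphism are counted once.
    """
    g_adj = {frozenset(e) for e in g_edges}
    p_nodes = list(p_labels)
    found = set()
    # Try every injective assignment of pattern nodes to graph vertices.
    for image in permutations(g_labels, len(p_nodes)):
        m = dict(zip(p_nodes, image))
        labels_ok = all(p_labels[u] == g_labels[m[u]] for u in p_nodes)
        edges_ok = all(frozenset((m[u], m[v])) in g_adj for u, v in p_edges)
        if labels_ok and edges_ok:
            found.add(frozenset(image))
    return found


def smnoes(occurrences):
    """Size of a maximum set of pairwise non-overlapping occurrences.

    Assumption: "overlap" means sharing at least one vertex. The search is
    exhaustive and only suitable for tiny examples.
    """
    occ = list(occurrences)
    for k in range(len(occ), 0, -1):
        for subset in combinations(occ, k):
            if all(a.isdisjoint(b) for a, b in combinations(subset, 2)):
                return k
    return 0


# Hypothetical star graph: one center labeled A with three neighbors labeled B.
g_labels = {0: "A", 1: "B", 2: "B", 3: "B"}
g_edges = [(0, 1), (0, 2), (0, 3)]

# Subpattern: a single A vertex. Super-pattern: an A-B edge.
vertex_a = embeddings(g_labels, g_edges, {"x": "A"}, [])
edge_ab = embeddings(g_labels, g_edges, {"x": "A", "y": "B"}, [("x", "y")])

print("occurrences:", len(vertex_a), len(edge_ab))
print("SMNOES:", smnoes(vertex_a), smnoes(edge_ab))
```

On this graph the single-vertex pattern A occurs once while its super-pattern A-B occurs three times, so raw occurrence counts violate down-closure. Under the assumed overlap definition, all three A-B embeddings share the center vertex, so the largest non-overlapping set has size 1 for both patterns and the support of the super-pattern no longer exceeds that of its subpattern.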

Acknowledgements

The implementations of the EDGAR and SUBDUE algorithms were kindly provided by Mr. Marc Wörlein of the Department of Computer Science, University of Erlangen-Nuremberg.

Author information

Authors and Affiliations

MARS (Modeling of Automated Reasoning Systems) Research Group, Faculty of Sciences, University of Monastir, Monastir, Tunisia
Aya Hellal & Lotfi Ben Romdhane
High School of Sciences and Technology, University of Sousse, Hammam Sousse, Tunisia
Lotfi Ben Romdhane

Corresponding author

Correspondence to Lotfi Ben Romdhane.

About this article

Cite this article

Hellal, A., Romdhane, L.B. NODAR: mining globally distributed substructures from a single labeled graph. J Intell Inf Syst 40, 1–15 (2013). https://doi.org/10.1007/s10844-012-0213-8

Received: 21 October 2011
Revised: 08 June 2012
Accepted: 08 June 2012
Published: 04 July 2012
Issue Date: February 2013
DOI: https://doi.org/10.1007/s10844-012-0213-8
