×

Grammatical inference of directed acyclic graph languages with polynomial time complexity. (English) Zbl 1390.68362

Summary: In this paper we study the learning of graph languages. We extend the well-known classes of k-testability and k-testability in the strict sense languages to directed graph languages. We propose a grammatical inference algorithm to learn the class of directed acyclic k-testable in the strict sense graph languages. The algorithm runs in polynomial time and identifies this class of languages from positive data. We study its efficiency under several criteria, and perform a comprehensive experimentation with four datasets to show the validity of the method. Many fields, from pattern recognition to data compression, can take advantage of these results.

MSC:

68Q32 Computational learning theory
68Q45 Formal languages and automata

Software:

IgTM

References:

[1] Angluin, D.; Smith, C. H., Inductive inference: theory and methods, Comput. Surv., 15, 3, 237-269 (1983)
[2] Berwanger, D.; Janin, D., Automata on directed graphs: edge versus vertex marking, (Proceedings of 3rd International Workshop on Software Evolution Through Transformations. Proceedings of 3rd International Workshop on Software Evolution Through Transformations, LNCS, vol. 4178 (2006)), 46-60 · Zbl 1156.68444
[3] Blockeel, H.; Nijssen, S., Induction of node label controlled graph grammar rules, (Proceedings of 6th International Workshop on Mining and Learning with Graphs (2008))
[4] Bodlaender, H., Polynomial algorithms for graph isomorphism and chromatic index on partial \(k\)-trees, J. Algorithms, 11, 4, 631-643 (1990) · Zbl 0716.68042
[5] Branderburg, F. J.; Skodinis, K., Finite graph automata for linear and boundary graph languages, Theor. Comput. Sci., 332, 199-232 (2005) · Zbl 1070.68064
[6] Carrasco, R. C.; Oncina, J.; Calera-Rubio, J., Stochastic inference of regular tree languages, Mach. Learn., 44, 1-2, 185-197 (2001) · Zbl 0983.68106
[7] Clark, A.; Eyraud, R., Polynomial identification in the limit of substitutable context-free languages, J. Mach. Learn. Res., 8, 1725-1745 (2007) · Zbl 1222.68093
[8] Clark, A.; Eyraud, R.; Habrard, A., A polynomial algorithm for the inference of context free languages, (Proceedings of ICGI-08. Proceedings of ICGI-08, LNAI, vol. 5278 (2008)), 29-42 · Zbl 1177.68109
[9] Dosch, Ph.; Valveny, E., Report on the second symbol recognition contest, (Proc. 6th Int. Workshop on Graphics Recognition. Proc. 6th Int. Workshop on Graphics Recognition, GREC’05. Proc. 6th Int. Workshop on Graphics Recognition. Proc. 6th Int. Workshop on Graphics Recognition, GREC’05, LNCS, vol. 3926 (2005)), 381-397
[10] Florêncio, C. Costa, Learning node controlled graph grammars, (Proceedings of ICGI-08. Proceedings of ICGI-08, LNAI, vol. 5278 (2008)), 286-288 · Zbl 1177.68110
[11] Florêncio, C. Costa, Identification of hyperedge-replacement graph grammars, (Proceedings of 7th International Workshop on Mining and Learning with Graphs (2009))
[12] Javier Gallego-Sánchez, A.; Calera-Rubio, J.; López, Damián, Structural graph extraction from images, (Int. Symp. on Distributed Computing and Artificial Intelligence. Int. Symp. on Distributed Computing and Artificial Intelligence, DCAI 2012, Salamanca, Spain (2012))
[13] García, P.; Vidal, E., Inference of k-testable languages in the strict sense and application to syntactic pattern recognition, IEEE Trans. Pattern Anal. Mach. Intell., 12, 920-925 (1990)
[14] García, P., Learning \(k\)-Testable Tree Sets from Positive Data (1993), Universidad Politécnica de Valencia, Available on:
[15] Gold, E. M., Language identification in the limit, Inf. Control, 10, 447-474 (1967) · Zbl 0259.68032
[16] de la Higuera, C., Grammatical Inference. Learning Automata and Grammars (2010), Cambridge University Press · Zbl 1227.68112
[17] Jeltsch, E.; Kreowski, H. J., Grammatical inference based on hyperedge replacement, (4th International Workshop on Graph Grammars and Their Application to Computer Science. 4th International Workshop on Graph Grammars and Their Application to Computer Science, LNCS, vol. 532 (1991)), 461-474 · Zbl 0768.68082
[18] Jonyer, I.; Holder, L. B.; Cook, D. J., Concept formation using graph grammars, (Proceedings of the KDD Workshop on Multi-Relational Data Mining (2002)), 71-79
[19] Kazius, J.; McGuire, R.; Bursi, R., Derivation and validation of toxicophores for mutagenicity prediction, J. Med. Chem., 48, 1, 312-320 (2005)
[20] Kosala, R.; Blockeel, H.; Bruynooghe, M.; Van den Bussche, J., Information extraction from documents using \(k\)-testable tree automaton inference, Data Knowl. Eng., 58, 129-158 (2006)
[21] Kukluk, J. P.; Holder, L. B.; Cook, D. J., Inference of node replacement recursive graph grammars, (Proceedings of the Sixth SIAM International Conference on Data Mining (2006))
[22] López, D.; Calera-Rubio, J.; Gallego-Sánchez, A. Javier, Inference of k-testable directed acyclic graph languages, Workshop and Conference Proceedings. Workshop and Conference Proceedings, J. Mach. Learn. Res., 21, 149-163 (2012)
[23] López, D.; Sempere, J. M.; García, P., Inference of reversible tree languages, IEEE Trans. Syst. Man Cybern., Part B, Cybern., 34, 4, 1658-1665 (2004)
[24] Lucks, E. M., Isomorphism of graphs of bounded valence can be tested in polynomial time, J. Comput. Syst. Sci., 25, 42-65 (1982) · Zbl 0493.68064
[25] Matsui, H.; Sato, K.; Sakakibara, Y., Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures, Bioinformatics, 21, 11, 2611-2617 (2005)
[26] McNaughton, R.; Papert, S., Counter-Free Automata (1971), The MIT Press · Zbl 0232.94024
[27] McNaughton, R., Algebraic decision procedures for local testability, Math. Sysr. Theory, 8, 1, 60-76 (1974) · Zbl 0287.02022
[28] Miller, G. L., Isomorphism testing for graphs of bounded genus, (Proceedings of the 12th Annual ACM Symposium on Theory of Computing (1980)), 225-235
[29] Mukherjee, Sourav; Oates, Tim, Estimating graph parameters using graph grammars, (9th Intl. Colloquium. 9th Intl. Colloquium, ICGI’08. 9th Intl. Colloquium. 9th Intl. Colloquium, ICGI’08, LNAI, vol. 5278 (2008)), 292-294 · Zbl 1177.68120
[30] Nakamura, K., Incremental learning of context free grammars by bridging rule generation and search for semi-optimum rule sets, (Proceedings of ICGI-06. Proceedings of ICGI-06, LNAI, vol. 4201 (2006)), 72-83 · Zbl 1158.68413
[31] Perea, I.; López, D., Syntactic modelling and recognition of document images, (Proceedings of S+SSPR 2004. Proceedings of S+SSPR 2004, LNCS, vol. 3138 (2004)), 416-424 · Zbl 1104.68676
[32] Peris, P.; López, D.; Campos, M.; Sempere, J. M., Protein motif prediction by grammatical inference, (8th Intl. Colloquium. 8th Intl. Colloquium, ICGI’06. 8th Intl. Colloquium. 8th Intl. Colloquium, ICGI’06, LNAI, vol. 4201 (2006)), 175-187 · Zbl 1158.68416
[33] Peris, P.; López, D.; Campos, M., IgTM: an algorithm to predict transmembrane domains and topology in proteins, BMC - Bioinformatics, 9, 367 (2008)
[34] Pitt, L., Inductive inference, DFAs, and computational complexity, (Proc. 2nd Workshop on Analogical and Inductive Inference. Proc. 2nd Workshop on Analogical and Inductive Inference, LNAI, vol. 397 (1989)), 18-44
[35] Potthoff, A.; Seibert, S.; Thomas, W., Nondeterminism versus determinism on finite automata over directed acyclic graphs, Bull. Belg. Math. Soc., 1, 285-298 (1994) · Zbl 0803.68032
[36] Rico-Juan, J. R.; Calera-Rubio, J.; Carrasco, R. C., Smoothing and compression with stochastic \(k\)-testable tree languages, Pattern Recognit., 38, 1420-1430 (2005) · Zbl 1101.68637
[37] Riesen, K.; Bunke, H., IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning (2008), SSPR
[38] Ron, D.; Singer, Y.; Tishby, N., On the learnability and usage of acyclic probabilistic finite automata, J. Comput. Syst. Sci., 56, 133-152 (1998) · Zbl 0915.68124
[39] (Rozemberg, G., Handbook of Graph Grammars and Computing by Graph Transformation, vol. 1 (1997), Word Scientific) · Zbl 0908.68095
[40] Sakakibara, Y., Learning context-free grammars from structural data in polynomial time, Theor. Comput. Sci., 76, 223-242 (1990) · Zbl 0704.68067
[41] Sakakibara, Y., Efficient learning of context-free grammars from positive structural examples, Inf. Comput., 97, 23-60 (1992) · Zbl 0764.68146
[42] Sakakibara, Y., Grammatical inference in bioinformatics, IEEE Trans. Pattern Anal. Mach. Intell., 27, 7, 1051-1062 (2005)
[43] Sakakibara, Y., Learning context-free grammars using tabular representations, Pattern Recognit., 38, 1372-1383 (2005) · Zbl 1101.68638
[44] Sempere, J. M., Learning context-sensitive languages from linear structural information, (Proceedings of ICGI-08. Proceedings of ICGI-08, LNAI, vol. 5278 (2008)), 175-186 · Zbl 1177.68122
[45] Syropoulos, A., Mathematics of multisets, (Multiset Processing, Workshop on Membrane Computing 2000. Multiset Processing, Workshop on Membrane Computing 2000, LNCS, vol. 2235 (2001)), 347-358 · Zbl 1052.03030
[46] Verdú-Mas, J. L.; Carrasco, R. C.; Calera-Rubio, J., Parsing with probabilistic strictly locally testable tree languages, IEEE Trans. Pattern Anal. Mach. Intell., 27, 7, 1040-1047 (2005)
[47] Wilkinson, R.-A.; Geist, J.; Janet, S.; Grother, P.-J., The First Census Optical Character Recognition System Conference (1992), US Department of Commerce
[48] Yokomori, T.; Kobayashi, S., Learning local languages and their application to DNA sequence analysis, IEEE Trans. Pattern Anal. Mach. Intell., 20, 10, 1067-1079 (1998)
[49] Yokomori, T., Polynomial-time identification of very simple grammars from positive data, Theor. Comput. Sci., 298, 179-206 (2003) · Zbl 1038.68074
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.