×

Enumerating regular expressions and their languages. (English) Zbl 1517.68190

Pin, Jean-Éric (ed.), Handbook of automata theory. Volume I. Theoretical foundations. Berlin: European Mathematical Society (EMS). 459-491 (2021).
This chapter of the Handbook guides the reader through the process of enumerating rational expressions of a given length and, subsequently, obtaining lower and upper bounds for the number of rational languages determined by rational expressions of given length. The authors consider three different notions of length in this context and summarise the basic relations among them.
The enumeration of rational expressions of given length can be performed using the standard methodology of analytic combinatorics – see, e.g., [P. Flajolet and R. Sedgewick, Analytic combinatorics. Cambridge: Cambridge University Press (2009; Zbl 1165.05001)]. The key is to observe that the language of all rational expressions can be generated by an unambiguous context-free grammar. This grammar may differ depending on precisely which rational expressions one considers to be valid (for instance, it is debatable whether one should allow the empty expression or expressions containing superfluous parentheses). The authors explicitly describe an unambiguous context-free grammar that they use as a specification of valid rational expressions in their considerations.
Given an unambiguous context-free grammar generating some subset \(S\) of all rational expressions, it follows by the Chomsky-Schützenberger enumeration theorem that if \(a_n\) denotes the number of rational expressions of (ordinary) length \(n\) in \(S\), then the generating function of the sequence of all such \(a_n\) is algebraic. In fact, the grammar can be directly transformed to an \(\mathbb{N}\)-algebraic system of equations such that the said generating function is the first component of its unique solution. Using Gröbner bases, one can further transform this system to a single algebraic equation satisfied by the generating function. This process is informally explained by the authors.
An exact asymptotic estimate for \(a_n\) as \(n\) tends to infinity can be obtained from such an algebraic equation via singularity analysis. Nevertheless, this is not strictly necessary for the enumeration of rational languages as presented by the authors, so the reader is only referred to [loc. cit.] regarding these matters. Instead, the authors describe in detail how to obtain simple lower and upper bounds for \(a_n\) by making use of Pringsheim’s theorem.
Lower bounds for the number of rational languages determined by rational expressions of given length are then obtained by applying these observations to unambiguous grammars for certain particular families of rational expressions such that any two distinct expressions in this family determine different rational languages. Any lower bound for the number of such expressions thus also gives a lower bound for the number of rational languages. On the other hand, upper bounds are obtained by considering certain normal forms of rational expressions, i.e., families of expressions that still can be used to determine every rational language. Any upper bound for the number of such expressions yields an upper bound for the number of rational languages.
For the entire collection see [Zbl 1470.68001].

MSC:

68Q45 Formal languages and automata
05A16 Asymptotic enumeration

Citations:

Zbl 1165.05001

Software:

Grail

References:

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The design and analysis of computer algo-rithms. Addison-Wesley Series in Computer Science and Information Processing. Addison-Wesley, Reading, MA, 1974. MR 0413592 Zbl 0326.68005 q.v. 461 · Zbl 0326.68005
[2] M. Almeida, N. Moreira, and R. Reis, Enumeration and generation with a string automata representation. Theoret. Comput. Sci. 387 (2007), no. 2, 93-102. MR 2362181 Zbl 1143.68031 q.v. 461 · Zbl 1143.68031
[3] F. Bassino, J. David, and C. Nicaud, Enumeration and random generation of possibly in-complete deterministic automata. Pure Math. Appl. (PU.M.A.) 19 (2008), no. 2-3, 1-15. MR 2566168 Zbl 1224.68043 q.v. 461 · Zbl 1224.68043
[4] F. Bassino, J. David, and A. Sportiello, Asymptotic enumeration of minimal automata. In 29th International Symposium on Theoretical Aspects of Computer Science (C. Dürr and T. Wilke, eds.). Proceedings of the symposium (STACS ’12) held in Paris, Febru-ary 29-March 3, 2012. LIPIcs. Leibniz International Proceedings in Informatics, 14. Schloss Dagstuhl. Leibniz-Zentrum für Informatik, Wadern, 2012, 88-99. MR 3005521 Zbl 1245.68118 q.v. 461 · Zbl 1245.68118
[5] F. Bassino and C. Nicaud, Enumeration and random generation of accessible automata. Theoret. Comput. Sci. 381 (2007), no. 1-3, 86-104. MR 2347395 Zbl 1188.68168 q.v. 460 · Zbl 1188.68168
[6] D. Callan, A determinant of Stirling cycle numbers counts unlabeled acyclic single-source automata. Discrete Math. Theor. Comput. Sci. 10 (2008), no. 2, 77-86. MR 2419277 Zbl 1196.05008 q.v. 461 · Zbl 1196.05008
[7] N. Chomsky and M.-P. Schützenberger. The algebraic theory of context-free languages. In Computer programming and formal systems (P. Brattfort and D. Hirschberg, eds.). North-Holland Publishing Co., Amsterdam, 1963, 118-161. MR 0152391 Zbl 0148.00804 q.v. 463 · Zbl 0148.00804
[8] D. A. Cox, J. Little, and D. O’Shea, Ideals, varieties, and algorithms. An introduction to computational algebraic geometry and commutative algebra. Third edition. Undergraduate Texts in Mathematics. Springer, New York, 2007. MR 2290010 Zbl 1118.13001 q.v. 466 · Zbl 1118.13001
[9] M. Domaratzki, Combinatorial interpretations of a generalization of the Genocchi numbers. J. Integer Seq. 7 (2004), no. 3, Article 04.3.6, 11 pp. https://www.cs.uwaterloo.ca/journals/JIS/VOL7/Domaratzki/doma23.html MR 2110777 Zbl 1092.11010 q.v. 460 · Zbl 1092.11010
[10] M. Domaratzki, Improved bounds on the number of automata accepting finite languages. Internat. J. Found. Comput. Sci. 15 (2004), no. 1, 143-161. Computing and Combinatorics Conference -COCOON ’02. MR 2056728 Zbl 1101.68650 q.v. 460 · Zbl 1101.68650
[11] M. Domaratzki, Enumeration of formal languages. Bull. Eur. Assoc. Theor. Comput. Sci. 89 (2006), 117-133. MR 2267837 Zbl 1169.68466 q.v. 461 · Zbl 1169.68466
[12] M. Domaratzki, D. Kisman, and J. Shallit, On the number of distinct languages accepted by finite automata with n states. J. Autom. Lang. Comb. 7 (2002), no. 4, 469-486. Descriptional complexity of automata, grammars and related structures (Vienna, 2001). MR 1990452 Zbl 1137.68421 q.v. 460 · Zbl 1137.68421
[13] A. Ehrenfeucht and P. Zeiger, Complexity measures for regular expressions. J. Comput. System Sci. 12 (1976), no. 2, 134-146. MR 0418509 Zbl 0329.94024 q.v. 461 · Zbl 0329.94024
[14] K. Ellul, B. Krawetz, J. Shallit, and M.-W. Wang, Regular expressions: new results and open problems. J. Autom. Lang. Comb. 10 (2005), no. 4, 407-437. MR 2376649 Zbl 1143.68434 q.v. 461, 462 · Zbl 1143.68434
[15] P. Flajolet and R. Sedgewick, Analytic combinatorics. Cambridge University Press, Cam-bridge, 2009. MR 2483235 Zbl 1165.05001 q.v. 469, 472, 476 · Zbl 1165.05001
[16] S. Ginsburg, An introduction to mathematical machine theory. Addison-Wesley, Reading, MA, 1962. MR 0145693 Zbl 0102.33804 q.v. 460 · Zbl 0102.33804
[17] I. P. Goulden and D. M. Jackson, Combinatorial enumeration. With a foreword by G.-C. Rota. Wiley-Interscience Series in Discrete Mathematics. John Wiley & Sons, New York, 1983. MR 0702512 Zbl 0519.05001 q.v. 475 · Zbl 0519.05001
[18] H. Gruber and S. Gulan, Simplifying regular expressions. A quantitative perspective. In Language and automata theory and applications (A.-H. Dediu, H. Fernau, and C. Martín-Vide, eds.). Proceedings of the 4 th International Conference on Language and Automata Theory and Applications. Lecture Notes in Computer Science, 6031. Springer, Berlin, 2010, 285-296. MR 2753917 Zbl 1284.68351 q.v. 482 · Zbl 1284.68351
[19] F. Harary, The number of functional digraphs. Math. Ann. 138 (1959), 203-210. MR 0109130 Zbl 0087.38703 q.v. 460 · Zbl 0087.38703
[20] F. Harary, Unsolved problems in the enumeration of graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 (1960), 63-95. MR 0146820 Zbl 0095.16902 q.v. 460 · Zbl 0095.16902
[21] F. Harary, Combinatorial problems in graphical enumeration. In Applied combinatorial mathematics (E. Beckenbach, ed.). University of California Engineering and Physical Sci-ences Extension Series. John Wiley & Sons, New York etc., 1964, 185-217. Zbl 0158.20801 q.v. 460
[22] F. Harary and E. Palmer, Enumeration of finite automata. Information and Control 10 (1967), 499-508. MR 0215652 Zbl 0168.25903 q.v. 460 · Zbl 0168.25903
[23] M. A. Harrison, A census of finite automata. In Proceedings of the 5 th Annual Symposium on Switching Circuit Theory and Logical Design. Institute of Electrical and Electronics Engineers, 1964, 44-46. q.v. 460
[24] M. A. Harrison, A census of finite automata. Canadian J. Math. 17 (1965), 100-113. MR 0170772 Zbl 0156.01703 q.v. 460 · Zbl 0156.01703
[25] R. Hartshorne, Algebraic geometry. Graduate Texts in Mathematics, 52. Springer, Berlin, 1977. MR 0463157 Zbl 0367.14001 q.v. 466 · Zbl 0367.14001
[26] J. E. Hopcroft and J. D. Ullman, Introduction to automata theory, languages, and computa-tion. Addison-Wesley Series in Computer Science. Addison-Wesley, Reading, MA, 1979. MR 0645539 Zbl 0426.68001 q.v. 464 · Zbl 0426.68001
[27] L. Ilie and S. Yu, Follow automata. Inform. and Comput. 186 (2003), no. 1, 140-162. MR 2001743 Zbl 1059.68063 q.v. 461, 462 · Zbl 1059.68063
[28] D. E. Knuth, The art of computer programming. Vol. 2. Seminumerical algorithms. Third edition. Addison-Wesley, Reading, MA, 1998. MR 3077153 Zbl 0895.65001 q.v. 474 · Zbl 0895.65001
[29] A. D. Korshunov, Об асимптотических оценках числа приведенных автоматов (On asymptotic estimates of the number of finite automata). Diskret. Analiz 6 (1966), 35-50. In Russian. MR 0197225 Zbl 0171.27502 q.v. 460 · Zbl 0171.27502
[30] A. D. Korshunov, Asymptotic estimates of the number of finite automata. Kibernetika (Kiev) 1967, no. 2, 12-19. In Russian. English translation, Cybernetics 3 (1967), no. 2, 9-14. MR 0267983 Zbl 0226.94045 q.v. 460
[31] A. D. Korshunov, Обзор некоторых направлений теории автоматов (A survey of certain trends in automata theory). Diskret. Analiz 25 Sintez Shem i Avtomaty (1974), 19-55, 62. In Russian. MR 0368971 Zbl 0306.94034 q.v. 460
[32] A. D. Korshunov, The number of automata and boundedly determined functions. Hereditary properties of automata. Dokl. Akad. Nauk SSSR 221 (1975), no. 6, 1264-1267. In Russian. English translation, Soviet Math. Dokl. 16 (1975), 515-518. MR 0386893 Zbl 0325.68024 q.v. 460 · Zbl 0325.68024
[33] A. D. Korshunov, О перечислении конечных автоматов (Enumeration of finite automata).
[34] Problemy Kibernet. 34 (1978), 5-82, 272. In Russian. MR 0517814 Zbl 0415.03027 q.v. 460
[35] W. Kuich and A. Salomaa, Semirings, automata, languages. EATCS Monographs on The-oretical Computer Science, 5. Springer, Berlin, 1986. MR 0817983 Zbl 0582.68002 q.v. 465 · Zbl 0582.68002
[36] E. Lebensztayn, On the asymptotic enumeration of accessible automata. Discrete Math. Theor. Comput. Sci. 12 (2010), no. 3, 75-79. MR 2786470 Zbl 1280.68117 q.v. 460 · Zbl 1280.68117
[37] J. Lee and J. Shallit, Enumerating regular expressions and their languages. In Implemen-tation and application of automata (M. Domaratzki, A. Okhotin, K. Salomaa, and S. Yu, eds.). Papers from the 9 th International Conference (CIAA 2004) held at Queen’s Univer-sity, Kingston, ON, July 22-24, 2004. Lecture Notes in Computer Science, 3317. Springer, Berlin, 2005, 2-22. MR 2143391 Zbl 2144483 q.v. 461 · Zbl 1115.68444
[38] E. L. Leiss, Constructing a finite automaton for a given regular expression. SIGACT News 12 (1980), no. 3, 81-87. Zbl 0453.68025 q.v. 461 · Zbl 0453.68025
[39] V. A. Liskovets, The number of connected initial automata. Kibernetika (Kiev) 1969, no. 3, 16-19. In Russian. English translation, Cybernetics 5 (1969), 259-262. MR 0302356 Zbl 0204.32102 q.v. 460
[40] V. A. Liskovets, Exact enumeration of acyclic deterministic automata. Discrete Appl. Math. 154 (2006), no. 3, 537-551. MR 2203203 Zbl 1090.68060 q.v. 461 · Zbl 1090.68060
[41] E. M. Livshits, Асимптотическая формула для числа классов изоморфных автоном-ных автоматов с n состояниями (Asymptotic formula for the number of classes of isomor-phic autonomous automata with n states). Ukrain. Mat. Ž. 16 (1964), 245-246. In Russian. MR 0164848 Zbl 0124.25005 q.v. 460
[42] J. C. Martin, Introduction to languages and the theory of computation. Third edition. McGraw-Hill, New York, N.Y., 2003. q.v. 463
[43] R. McNaughton and H. Yamada, Regular expressions and state graphs for automata. IRE Trans. Electronic Computers 9 (1960), 39-47. Zbl 0156.25501 q.v. 461 · Zbl 0156.25501
[44] C. Nicaud, Average state complexity of operations on unary automata. In Mathematical foundations of computer science 1999. (M. Kutylowski, L. Pacholski, and T. Wierzbicki, eds.). Proceedings of the 24 th International Symposium (MFCS ’99) held in Szklarska Poręba, September 6-10, 1999. Lecture Notes in Computer Science, 1672. Springer, Berlin, 1999, 231-240. MR 1731238 Zbl 0955.68068 q.v. 460 · Zbl 0955.68068
[45] C. Nicaud, On the average size of Glushkov’s automata. In Language and automata theory and applications (A. H. Dediu, A. M. Ionescu, and C. Martín-Vide, eds.). Third Interna-tional Conference on Language and Automata Theory and Applications. LATA 2009. Held in Tarragona, Spain, April 2-8, 2009. Lecture Notes in Computer Science, 5457. Springer, Berlin, 2009, 626-637. MR 2544451 Zbl 1234.68232 q.v. 463 · Zbl 1234.68232
[46] A. Panholzer, Gröbner bases and the defining polynomial of a context-free grammar gen-erating function. J. Autom. Lang. Comb. 10 (2005), no. 1, 79-97. MR 2192586 Zbl 1087.68046 q.v. 465 · Zbl 1087.68046
[47] C. Pomerance, J. M. Robson, and J. Shallit, Automaticity. II. Descriptional complexity in the unary case. Theoret. Comput. Sci. 180 (1997), no. 1-2, 181-201. MR 1453865 Zbl 0959.11015 q.v. 460 · Zbl 0959.11015
[48] C. E. Radke, Enumeration of strongly connected sequential machines. Information and Control 8 (1965), 377-389. MR 0180461 Zbl 0127.01102 q.v. 460 · Zbl 0127.01102
[49] D. Raymond and D. Wood, Grail: a C CC library for automata and expressions. J. Symbolic Comput. 17 (1994), 341-350. Zbl 0942.68803 q.v. 463 · Zbl 0942.68803
[50] R. C. Read, A note on the number of functional digraphs. Math. Ann. 143 (1961), 109-110. MR 0120162 Zbl 0096.38201 q.v. 460 · Zbl 0096.38201
[51] R. W. Robinson, Counting strongly connected finite automata. In Graph theory with ap-plications to algorithms and computer science (Y. Alavi, G. Chartrand, L. Lesniak, D. R. Lick, and C. E. Wall, eds.). Proceedings of the 5 th international conference. Held at West-ern Michigan University, Kalamazoo, MI, June 4-8, 1984. John Wiley & Sons, New York, 1985, 671-685. MR 0812700 Zbl 0572.68042 q.v. 460 · Zbl 0572.68042
[52] W. Rudin, Real and complex analysis. McGraw-Hill Series in Higher Mathematics. McGraw-Hill, New York etc., 1966. MR 0210528 Zbl 0142.01701 q.v. 477 · Zbl 0613.26001
[53] J. Shallit and Y. Breitbart, Automaticity. I. Properties of a measure of descriptional com-plexity. J. Comput. System Sci. 53 (1996), no. 1, 10-25. MR 1409007 Zbl 0859.68059 q.v. 460 · Zbl 0859.68059
[54] R. P. Stanley, Enumerative combinatorics. Vol. 2. With a foreword by G.-C. Rota and Appendix 1 by S. Fomin. Cambridge Studies in Advanced Mathematics, 62. Cambridge University Press, Cambridge, 1999. MR 1676282 Zbl 0928.05001 q.v. 464 · Zbl 0928.05001
[55] V. A. Vyssotsky, A counting problem for finite automata. Technical report. Bell Telephone Laboratories, 1959. q.v. 460
[56] H. Wilf, Generatingfunctionology. Third edition. A. K. Peters, Wellesley, MA, 2006. MR 2172781 Zbl 1092.05001 q.v. 464 · Zbl 1092.05001
[57] D. Ziadi, Regular expression for a language without empty word. Theoret. Comput. Sci. 163 (1996), no. 1-2, 309-315. MR 1407031 Zbl 0878.68080 q.v. 461 · Zbl 0878.68080
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.