×

On suffix tree detection. (English) Zbl 07902294

Summary: A suffix tree is a fundamental data structure for string processing and information retrieval, however, its structure is still not well understood. The suffix trees reverse engineering problem, which its research aims at reducing this gap, is the following. Given an ordered rooted tree \(T\) with unlabeled edges, determine whether there exists a string \(w\) such that the unlabeled-edges suffix tree of \(w\) is isomorphic to \(T\). Previous studies on this problem consider the relaxation of having the suffix links as well as assume a binary alphabet. This paper is the first to consider the suffix tree detection problem, in which the relaxation of having suffix links as input is removed. We study suffix tree detection on two scenarios that are interesting per se. We provide a suffix tree detection algorithm for general alphabet periodic strings. Given an ordered tree \(T\) with \(n\) leaves, our detection algorithm takes \(O(n + | {\Sigma} |^p)\)-time, where p is the unknown in advance length of a period that repeats at least 3 times in a string \(S\) having a suffix tree structure identical to \(T\), if such \(S\) exists. Therefore, it is a polynomial time algorithm if \(p\) is a constant and a linear time algorithm if, in addition, the alphabet has a sub-linear size. We also show some necessary (but insufficient) conditions for binary alphabet general strings suffix tree detection. By this we take another step towards understanding suffix trees structure.

MSC:

68Qxx Theory of computing
Full Text: DOI

References:

[1] Amir, A.; Eisenberg, E.; Levy, A.; Porat, E.; Shapira, N., Cycle detection and correction, ACM Trans. Algorithms, 9, 1, 2012 · Zbl 1301.68282
[2] Amir, A.; Levy, A.; Lewenstein, M.; Lubin, R.; Porat, B., Can we recover the cover?, Algorithmica, 81, 7, 2019 · Zbl 1423.68618
[3] Apostolico, A., The myriad virtues of subword trees, (Combinatorial Algorithms on Words, 1985), 85-96 · Zbl 0572.68067
[4] Bannai, H.; Inenaga, S.; Shinohara, A.; Takeda, M., Inferring strings from graphs and arrays, (International Symposium on Mathematical Foundations of Computer Science, 2003, Springer), 208-217 · Zbl 1124.68348
[5] Breslauer, D.; Italiano, G. F., On suffix extensions in suffix trees, Theor. Comput. Sci., 457, 27-34, 2012 · Zbl 1251.68081
[6] Cazaux, B.; Rivals, E., Reverse engineering of compact suffix trees and links: a novel algorithm, J. Discret. Algorithms, 28, 9-22, 2014 · Zbl 1305.68379
[7] Clément, J.; Crochemore, M.; Rindone, G., Reverse engineering prefix tables, (Proc. 26th Intl. Symp. on Theoretical Aspects of Computer Science STACS. Proc. 26th Intl. Symp. on Theoretical Aspects of Computer Science STACS, LIPIcs, vol. 3, 2009, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik: Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik Germany), 289-300 · Zbl 1236.68306
[8] Crochemore, M.; Iliopoulos, C. S.; Pissis, S. P.; Tischler, G., Cover array string reconstruction, (Proc. 21st Annual Symp. on Combinatorial Pattern Matching (CPM). Proc. 21st Annual Symp. on Combinatorial Pattern Matching (CPM), LNCS, vol. 6129, 2010, Springer), 251-259 · Zbl 1286.68524
[9] Duval, J.; Lecroq, T.; Lefebvre, A., Efficient validation and construction of border arrays and validation of string matching automata, RAIRO Theor. Inform. Appl., 43, 2, 281-297, 2009 · Zbl 1166.68033
[10] Farach, M., Optimal suffix tree construction with large alphabets, (Proc. 38th IEEE Symposium on Foundations of Computer Science, 1997), 137-143
[11] Franek, F.; Gao, S.; Lu, W.; Ryan, P. J.; Smyth, W. F.; Sun, Y.; Yang, L., Verifying a border array in linear time, J. Comb. Math. Comb. Comput., 42, 223-236, 2000 · Zbl 1009.68106
[12] Gawrychowski, P.; Jez, A.; Jez, L., Validating the Knuth-Morris-Pratt failure function, fast and online, Theory Comput. Syst., 54, 2, 337-372, 2014 · Zbl 1380.68473
[13] Gawrychowski, P.; Kociumaka, T.; Radoszewski, J.; Rytter, W.; Walen, T., Universal reconstruction of a string, Theor. Comput. Sci., 812, 174-186, 2020 · Zbl 1435.68398
[14] Gelle, K.; Iván, S., Recognizing union-find trees is NP-complete, even without rank info, Int. J. Found. Comput. Sci., 30, 6-7, 1029-1045, 2019 · Zbl 1427.68059
[15] Gusfield, D., Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, 1997, Cambridge University Press · Zbl 0934.68103
[16] He, M.; Munro, J. I.; Rao, S. S., A Categorization Theorem on Suffix Arrays with Applications to Space Efficient Text Indexes, SODA, vol. 5, 23-32, 2005, Citeseer · Zbl 1297.68064
[17] I, T.; Inenaga, S.; Bannai, H.; Takeda, M., Verifying and enumerating parameterized border arrays, Theor. Comput. Sci., 412, 50, 6959-6981, 2011 · Zbl 1228.68067
[18] I, T.; Inenaga, S.; Bannai, H.; Takeda, M., Inferring strings from suffix trees and links on a binary alphabet, Discrete Appl. Math., 163, 316-325, 2014 · Zbl 1329.68314
[19] Kärkkäinen, J.; Piatkowski, M.; Puglisi, S. J., String inference from longest-common-prefix array, (Proc. 44th Intl. Coll. on Automata, Languages, and Programming, ICALP. Proc. 44th Intl. Coll. on Automata, Languages, and Programming, ICALP, LIPIcs, vol. 80, 2017, Schloss Dagstuhl - Leibniz-Zentrum für Informatik), 62:1-62:14 · Zbl 1441.68302
[20] Kucherov, G.; Tóthmérész, L.; Vialette, S., On the combinatorics of suffix arrays, Inf. Process. Lett., 113, 22, 915-920, 2013 · Zbl 1284.68486
[21] McCreight, E. M., A space-economical suffix tree construction algorithm, J. ACM, 23, 262-272, 1976 · Zbl 0329.68042
[22] Nakashima, Y.; Okabe, T.; I, T.; Inenaga, S.; Bannai, H.; Takeda, M., Inferring strings from Lyndon factorization, Theor. Comput. Sci., 689, 147-156, 2017 · Zbl 1372.68218
[23] Schürmann, K. B.; Stoye, J., Counting suffix arrays and strings, Theor. Comput. Sci., 395, 2-3, 220-234, 2008 · Zbl 1142.68026
[24] Starikovskaya, T.; Vildhøj, H. W., A suffix tree or not a suffix tree?, J. Discret. Algorithms, 32, 14-23, 2015 · Zbl 1328.68330
[25] Ukkonen, E., On-line construction of suffix trees, Algorithmica, 14, 249-260, 1995 · Zbl 0831.68027
[26] Weiner, P., Linear pattern matching algorithm, (Proc. 14 IEEE Symposium on Switching and Automata Theory, 1973), 1-11
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.