×

Semi-local longest common subsequences in subquadratic time. (English) Zbl 1154.68543

Summary: For two strings \(a, b\) of lengths \(m, n\), respectively, the longest common subsequence (LCS) problem consists in comparing \(a\) and \(b\) by computing the length of their LCS. In this paper, we define a generalisation, called “the all semi-local LCS problem”, where each string is compared against all substrings of the other string, and all prefixes of each string are compared against all suffixes of the other string. An explicit representation of the output lengths is of size \(\Theta ((m+n)^{2})\). We show that the output can be represented implicitly by a geometric data structure of size \(O(m+n)\), allowing efficient queries of the individual output lengths. The currently best all string-substring LCS algorithm by Alves et al., based on previous work by Schmidt, can be adapted to produce the output in this form. We also develop the first all semi-local LCS algorithm, running in time \(o(mn)\) when \(m\) and \(n\) are reasonably close. Compared to a number of previous results, our approach presents an improvement in algorithm functionality, output representation efficiency, and/or running time.

MSC:

68U05 Computer graphics; computational geometry (digital and algorithmic aspects)
68W40 Analysis of algorithms
Full Text: DOI

References:

[1] Aho, A. V.; Hirschberg, D. S.; Ullman, J. D., Bounds on the complexity of the longest common subsequence problem, Journal of the ACM, 23, 1-12 (1976) · Zbl 0316.68027
[2] Alves, C. E.R.; Cáceres, E. N.; Song, S. W., An all-substrings common subsequence algorithm, Electronic Notes in Discrete Mathematics, 19, 133-139 (2005) · Zbl 1203.68321
[3] Apostolico, A., String editing and longest common subsequences, (Handbook of Formal Languages, vol. 2 (1997), Springer-Verlag), 361-398
[4] Arlazarov, V. L.; Dinic, E. A.; Kronrod, M. A.; Faradzev, I. A., On economical construction of the transitive closure of an oriented graph, Soviet Mathematical Doklady, 11, 1209-1210 (1970) · Zbl 0214.23601
[5] Bentley, J. L., Multidimensional divide-and-conquer, Communications of the ACM, 23, 4, 214-229 (1980) · Zbl 0434.68049
[6] Burkard, R. E.; Klinz, B.; Rudolf, R., Perspectives of Monge properties in optimization, Discrete Applied Mathematics, 70, 95-161 (1996) · Zbl 0856.90091
[7] Crochemore, M.; Landau, G. M.; Schieber, B.; Ziv-Ukelson, M., Re-use dynamic programming for sequence alignment: An algorithmic toolkit, (String Algorithmics. String Algorithmics, Texts in Algorithmics, vol. 2 (2004), King’s College Publications)
[8] Crochemore, M.; Landau, G. M.; Ziv-Ukelson, M., A subquadratic sequence alignment algorithm for unrestricted score matrices, SIAM Journal on Computing, 32, 6, 1654-1673 (2003) · Zbl 1253.74047
[9] Crochemore, M.; Rytter, W., Text Algorithms (1994), Oxford University Press · Zbl 0844.68101
[10] Gusfield, D., Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology (1997), Cambridge University Press · Zbl 0934.68103
[11] JáJá, J.; Mortensen, C.; Shi, Q., Space-efficient and fast algorithms for multidimensional dominance reporting and counting, (Proceedings of the 15th ISAAC. Proceedings of the 15th ISAAC, Lecture Notes in Computer Science, vol. 3341 (2004), Springer), 558-568 · Zbl 1116.68629
[12] Jones, N. C.; Pevzner, P. A., An introduction to bioinformatics algorithms, (Computational Molecular Biology (2004), The MIT Press)
[13] Kim, S.-R.; Park, K., A dynamic edit distance table, Journal of Discrete Algorithms, 2, 303-312 (2004) · Zbl 1118.68757
[14] Landau, G. M.; Myers, E.; Ziv-Ukelson, M., Two algorithms for LCS consecutive suffix alignment, (Proceedings of the 15th CPM. Proceedings of the 15th CPM, Lecture Notes in Computer Science, vol. 3109 (2004), Springer), 173-193 · Zbl 1103.68134
[15] Landau, G. M.; Myers, E. W.; Schmidt, J. P., Incremental string comparison, SIAM Journal on Computing, 27, 2, 557-582 (1998) · Zbl 0907.68075
[16] Landau, G. M.; Ziv-Ukelson, M., On the common substring alignment problem, Journal of Algorithms, 41, 338-359 (2001) · Zbl 1017.68039
[17] Masek, W. J.; Paterson, M. S., A faster algorithm computing string edit distances, Journal of Computer and System Sciences, 20, 18-31 (1980) · Zbl 0436.68044
[18] Navarro, G., A guided tour to approximate string matching, ACM Computing Surveys, 33, 1, 31-88 (2001)
[19] Preparata, F. P.; Shamos, M. I., Computational Geometry: An Introduction, Texts and Monographs in Computer Science (1985), Springer · Zbl 0759.68037
[20] Schmidt, J. P., All highest scoring paths in weighted grid graphs and their application to finding all approximate repeats in strings, SIAM Journal on Computing, 27, 4, 972-992 (1998) · Zbl 0907.68076
[21] Tiskin, A., All semi-local longest common subsequences in subquadratic time, (Proceedings of CSR. Proceedings of CSR, Lecture Notes in Computer Science, vol. 3967 (2006), Springer), 352-363 · Zbl 1185.68878
[22] Tiskin, A., Longest common subsequences in permutations and maximum cliques in circle graphs, (Proceedings of CPM. Proceedings of CPM, Lecture Notes in Computer Science, vol. 4009 (2006), Springer), 271-282
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.