×

Bit-parallel string matching under Hamming distance in \(O(n\lceil m/w\rceil)\) worst case time. (English) Zbl 1184.68209

Summary: Given two strings, a pattern \(P\) of length \(m\) and a text \(T\) of length \(n\) over some alphabet \(\Sigma \), we consider the string matching problem under \(k\) mismatches. The well-known Shift-Add algorithm [R. A. Baeza-Yates and G. H. Gonnet, Comm. ACM 35, No. 10, 74–82 (1992)] solves the problem in \(O(n\lceil m\log (k)/w\rceil)\) worst case time, where \(w\) is the number of bits in a computer word. We present two algorithms that improve this result to \(O(n\lceil m\log \log (k)/w\rceil)\) and \(O(n\lceil m/w \rceil\)), respectively. The algorithms make use of nested varying length bit-strings, that represent the search state. We call these Matryoshka counters. The techniques we developed are of more general use for string matching problems.

MSC:

68P10 Searching and sorting
68W05 Nonnumerical algorithms
Full Text: DOI

References:

[1] A. Amir, M. Lewenstein, E. Porat, Faster algorithms for string matching with \(k\) mismatches, in: Proceedings of the 11th ACM-SIAM Annual Symposium on Discrete Algorithms, San Francisco, CA, 2000, pp. 794-803; A. Amir, M. Lewenstein, E. Porat, Faster algorithms for string matching with \(k\) mismatches, in: Proceedings of the 11th ACM-SIAM Annual Symposium on Discrete Algorithms, San Francisco, CA, 2000, pp. 794-803 · Zbl 0957.68125
[2] R.A. Baeza-Yates, Efficient text searching, PhD thesis, Department of Computer Science, University of Waterloo, Ontario, 1989; R.A. Baeza-Yates, Efficient text searching, PhD thesis, Department of Computer Science, University of Waterloo, Ontario, 1989 · Zbl 0794.68030
[3] Baeza-Yates, R. A.; Gonnet, G. H., A new approach to text searching, Comm. ACM, 35, 10, 74-82 (1992)
[4] Crochemore, M.; Iliopoulos, C.; Navarro, G.; Pinzon, Y.; Salinger, A., Bit-parallel \((\delta, \gamma)\)-matching suffix automata, J. Discrete Algorithms (JDA), 3, 2-4, 198-214 (2005) · Zbl 1080.68565
[5] M.J. Fischer, M. Paterson, String matching and other products, in: R.M. Karp (Ed.), Proceedings SIAM-AMS Complexity of Computation, Providence, RI, 1974, pp. 113-125; M.J. Fischer, M. Paterson, String matching and other products, in: R.M. Karp (Ed.), Proceedings SIAM-AMS Complexity of Computation, Providence, RI, 1974, pp. 113-125 · Zbl 0301.68027
[6] Galil, Z.; Giancarlo, R., Improved string matching with \(k\) mismatches, SIGACT News, 17, 4, 52-54 (1986) · Zbl 0608.68058
[7] Myers, G., A fast bit-vector algorithm for approximate string matching based on dynamic programming, J. ACM, 46, 3, 395-415 (1999) · Zbl 1065.68663
[8] Navarro, G., A guided tour to approximate string matching, ACM Comput. Surv., 33, 1, 31-88 (2001)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.