×

The factors analysis and algorithm implementation of single-pattern matching. (English) Zbl 1163.68333

Summary: By studying the algorithms of single pattern matching, five factors that have effect on time complexity of the algorithm are analyzed. The five factors are: sorting the characters of pattern string in an increasing order of using frequency, utilizing already-matched pattern suffix information, utilizing already-matched pattern prefix information, utilizing the position factor which is absorbed from quick search algorithm, and utilizing the continue-skip idea which is originally proposed by this paper. Combining all the five factors, a new single pattern matching algorithm is implemented. It’s proven by the experiment that the efficiency of new algorithm is the best of all algorithms.

MSC:

68T10 Pattern recognition, speech recognition
Full Text: DOI

References:

[1] Navarro G, Raffinot M. Flexible pattern matching in strings: Practical on-line search algorithms for texts and biological sequences [M]. Cambridge: Cambridge University Press, 2007: 41–76. · Zbl 1141.92317
[2] Liu G, Li J H, Li S H. Multi-pattern matching algorithm [J]. Journal of Systems Engineering and Electronics, 2006, 17(2): 437–442. · Zbl 1173.68677 · doi:10.1016/S1004-4132(06)60074-1
[3] Knuth D E, Morris J H, Pratt V R. Fast pattern matching in strings [J]. SIAM Journal on Computing, 1977, 6(1): 323–350. · Zbl 0372.68005 · doi:10.1137/0206024
[4] Sunday D M. A very fast substring search algorithm [J]. Communications of the ACM, 1990, 33(8): 132–142. · doi:10.1145/79173.79184
[5] Boyer R S, Moore J S. A fast string searching algorithm [J]. Communications of the ACM, 1977, 20(10): 762–772. · Zbl 1219.68165 · doi:10.1145/359842.359859
[6] Crochemore M, Perrin D. Two-way string matching [J]. Journal of the ACM, 1991, 38(3): 651–675. · Zbl 0808.68063 · doi:10.1145/116825.116845
[7] Liu Q. Character frequency of modern Chinese [EB/OL]. (2007-08-17) [2009-03-24]. http://www.nlp.org.cn/docs/download.php?doc_id=1217 (in Chinese).
[8] Liu G S. Researches on some key problems of high performance Chinese automatic summarization system [D]. Shanghai: Shanghai Jiao Tong University, 2004 (in Chinese).
[9] Yin L H, Zhang D Y, Fang B X. The performance analysis of single pattern matching algorithms for intrusion detection [J]. Computer Engineering and Applications, 2004, 24(13): 1–4 (in Chinese).
[10] Richard C. Tight bounds on the complexity of the Boyer-Moore string matching algorithm [J]. SIAM Journal on Computing, 1994, 23(5): 1075–1091. · doi:10.1137/S0097539791195543
[11] Donald E K. The art of computer programming: Volume 3 [M]. 3rd ed. Beijing: Tsinghua University Press, 2004: 323–402.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.