×

A comparative study of dictionary matching with gaps: limitations, techniques and challenges. (English) Zbl 1518.68433

Summary: Cyber security is a critical modern concern. Network intrusion detection systems (NIDS) perform protocol analysis, content searching and content matching, in order to detect harmful software. The high performance demands of NIDS applications in both speed and space consumption, as well as the diverse characteristics of the scenario conditions, have motivated a recent line of research on several formal problems all within the broad scope of dictionary matching with gaps. The goal of this paper is to supply a comparative survey of this line of research in terms of the problems, the formally proven limitations of any solution suggested, the techniques developed to deal with the limitations and different problems, to supply complementary techniques, and finally, to point out existing challenges still to be handled by future work.

MSC:

68W32 Algorithms on strings
Full Text: DOI

References:

[1] Krishnamurthy, M., Seagren, E.S., Alder, R., Bayles, A.W., Burke, J., Carter, S., Faskha, E.: How to Cheat at Securing Linux. Syngress Publishing Inc., Elsevier Inc (2008)
[2] Aho, AV; Corasick, MJ, Efficient string matching: an aid to bibliographic search, Commun. ACM, 18, 6, 333-340 (1975) · Zbl 0301.68048 · doi:10.1145/360825.360855
[3] Amir, A.; Călinescu, G., Alphabet-independent and scaled dictionary matching, J. Algorithms, 36, 1, 34-62 (2000) · Zbl 0956.68157 · doi:10.1006/jagm.2000.1081
[4] Amir, A.; Farach, M.; Galil, Z.; Giancarlo, R.; Park, K., Dynamic dictionary matching, J. Comput. Syst. Sci., 49, 2, 208-222 (1994) · Zbl 0942.68783 · doi:10.1016/S0022-0000(05)80047-9
[5] Amir, A.; Farach, M.; Idury, RM; Poutré, JAL; Schäffer, AA, Improved dynamic dictionary matching, Inf. Comput., 119, 2, 258-282 (1995) · Zbl 0832.68033 · doi:10.1006/inco.1995.1090
[6] Amir, A.; Keselman, D.; Landau, GM; Lewenstein, M.; Lewenstein, N.; Rodeh, M., Text indexing and dictionary matching with one error, J. Algorithms, 37, 2, 309-325 (2000) · Zbl 0966.68062 · doi:10.1006/jagm.2000.1104
[7] Brodal, G.S., Gasieniec, L.: Approximate dictionary queries. In: 7th Annual Symposium on Combinatorial Pattern Matching CPM (Laguna Beach, California, USA), June 10-12, pp. 65-74 (1996)
[8] Cole, R., Gottlieb, L., Lewenstein, M., Dictionary matching and indexing with errors and don’t cares. In: Proceedings 36th Annual ACM Symposium on the Theory of Computing (STOC), ACM Press, pp. 91-100 (2004) · Zbl 1192.68818
[9] Idury, RM; Schäffer, AA, Multiple matching of parameterized patterns, Theor. Comput. Sci., 154, 2, 203-224 (1996) · Zbl 0873.68074 · doi:10.1016/0304-3975(94)00270-3
[10] Kleene, S.C.: Representation of events in nerve nets and finite automata
[11] Bille, P.; Farach-Colton, M., Fast and compact regular expression matching, Theor. Comput. Sci., 409, 3, 486-496 (2008) · Zbl 1155.68022 · doi:10.1016/j.tcs.2008.08.042
[12] Bille, P., Thorup, M.: Regular expression matching with multi-strings and intervals. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010 (Moses Charikar, ed.), SIAM, pp. 1297-1308 (2010) · Zbl 1288.68299
[13] Myers, EW, A four russians algorithm for regular expression pattern matching, J. ACM, 39, 2, 430-448 (1992) · Zbl 0799.68104 · doi:10.1145/128749.128755
[14] Amir, A.; Kopelowitz, T.; Levy, A.; Pettie, S.; Porat, E.; Shalom, BR, Mind the gap! online dictionary matching with one gap, Algorithmica, 81, 6, 2123-2157 (2019) · Zbl 1412.68075 · doi:10.1007/s00453-018-0526-2
[15] Amir, A., Kopelowitz, T., Levy, A., Pettie, S., Porat, E., Shalom, B.R.: Mind the gap: essentially optimal algorithms for online dictionary matching with one gap. In: 27th International Symposium on Algorithms and Computation, ISAAC (Sydney, Australia), December 12-14, pp. 12:1-12:12 (2016) · Zbl 1398.68207
[16] Amir, A., Levy, A., Porat, E., Shalom, B.R.: Online recognition of dictionary with one gap. In: Proceedings of the Prague Stringology Conference (PSC) (Prague, Czech Republic), August 28-30, pp. 3-17 (2017) · Zbl 1496.68382
[17] Verint Systems, Personal communication. 2013, Addres: Maskit St. 33 Herzliya Israel
[18] Amir, A.; Levy, A.; Porat, E.; Shalom, BR, Dictionary matching with a few gaps, Theor. Comput. Sci., 589, 34-46 (2015) · Zbl 1319.68106 · doi:10.1016/j.tcs.2015.04.011
[19] Chase, M.; Shen, E., Substring-searchable symmetric encryption, PoPETs, 2015, 2, 263-281 (2015)
[20] Curtmola, R., Garay, J.A., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. ACM CCS 06 (Ari Juels Rebecca N. Wright and Sabrina De Capitani di Vimercati, eds.), ACM Press, pp. 9-88 (2006)
[21] Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: IEEE Symposium on Security and Privacy IEEE Computer Society Press, pp. 44-55 (2000)
[22] Boneh, D., Crescenzo, G.D., Ostrovsky, R., Persiano, G.: Public key encryption with keyword search. EUROCRYPT 2004 (Christian Cachin and Jan Camenisch, eds.), LNCS, vol. 3027, Springer, pp. 506-522 (2004) · Zbl 1122.68424
[23] Desmoulins, N., Fouque, P.A., Onete, C., Sanders, O.: Pattern matching on encrypted streams, ASIACRYPT (2018) · Zbl 1446.94125
[24] Baker, B.S.: A theory of parameterized pattern matching: algorithms and applications. In: Proceedings of the 25th Annual ACM Symposium on Theory of Computing (San Diego, CA, USA), May 16-18, pp. 71-80 (1993) · Zbl 1310.68098
[25] Shalom, B.R.: Parameterized dictionary matching with one gap. In: Proceedings of the Prague Stringology Conference (PSC) (Prague, Czech Republic), pp. 103-116 (2018)
[26] Shalom, BR, Parameterized dictionary matching and recognition with one gap, Theor. Comput. Sci., 854, 1-16 (2021) · Zbl 1477.68557 · doi:10.1016/j.tcs.2020.11.017
[27] Levy, A., Shalom, B.R.: Online parameterized dictionary matching with one gap. In: Stringology (Proceedings of the Prague Stringology Conference), pp. 41-55 (2019) · Zbl 1494.68317
[28] Levy, A., Shalom, B.R.: Online parameterized dictionary matching with one gap. Submitted for publication (2020) · Zbl 1494.68317
[29] Burr, S.A., Erdős, P.: On the magnitude of generalized ramsey numbers for graphs. Infinite and finite sets (Colloq., Keszthely, 1973; dedicated to P. Erdős on his 60th birthday) (János Bolyai, ed.), Colloq. Math. Soc., vol. 1, 1975, p. 214-240
[30] Kucherov, G.; Rusinowitch, M., Matching a set of strings with variable length don’t cares, Theor. Comput. Sci., 178, 12, 129-154 (1997) · Zbl 0901.68037 · doi:10.1016/S0304-3975(97)88195-9
[31] Zhang, M.; Zhang, Y.; Hu, L., A faster algorithm for matching a set of patterns with variable length don’t cares, Inf. Process. Lett., 110, 6, 216-220 (2010) · Zbl 1211.68141 · doi:10.1016/j.ipl.2009.12.007
[32] Haapasalo, T., Silvasti, P., Sippu, S., Soisalon-Soininen, E.: Online dictionary matching with variable-length gaps. In: 10th International Symposium on Experimental Algorithms SEA (Kolimpari, Chania, Crete, Greece), May 5-7, pp. 76-87 (2011)
[33] Hon, W.; Lam, TW; Shah, R.; Thankachan, SV; Ting, H.; Yang, Y., Dictionary matching with a bounded gap in pattern or in text, Algorithmica, 80, 2, 698-713 (2018) · Zbl 1391.68129 · doi:10.1007/s00453-017-0288-2
[34] Amir, A., Levy, A., Porat, E., Shalom, B.R.: Dictionary matching with one gap. In: The Proceedings of the 25th Symposium on Combinatorial Pattern Matching (CPM), pp. 11-20 (2014) · Zbl 1390.68781
[35] Abboud, A., Williams, V.V.: Popular conjectures imply strong lower bounds for dynamic problems. In: 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS. pp. 434-443 (2014)
[36] Grønlund, A., Pettie, S.: Threesomes, degenerates, and love triangles. FOCS 621-630 (2014) · Zbl 1422.68112
[37] Kopelowitz, T., Pettie, S., Porat, E.: Higher lower bounds from the 3sum conjecture. In: Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA (2016) · Zbl 1410.68147
[38] Pǎtraşcu, M.: Towards polynomial lower bounds for dynamic problems. STOC 603-610 (2010) · Zbl 1293.68153
[39] Bjørklund, A., Pagh, R., Vassilevska Williams, V., Zwick, U.: Listing triangles. In: Proceedings 41st Int’l Colloquium on Automata, Languages, and Programming (ICALP (I)), pp. 223-234 (2014) · Zbl 1410.05191
[40] Itai, A., Rodeh, M.: Finding a minimum circuit in a graph 7(4), 413-423 (1978) · Zbl 0386.68064
[41] Chiba, N.; Nishizeki, T., Arboricity and subgraph listing algorithms, SIAM J. Comput. (SICOMP), 14, 1, 210-223 (1985) · Zbl 0572.68053 · doi:10.1137/0214017
[42] Pettie, S., Kopelowitz, T., Porat, E.: Dynamic set intersection. WADS (2015) · Zbl 1451.68081
[43] Cohen, H.; Porat, E., Fast set intersection and two-patterns matching, Theor. Comput. Sci., 411, 40-42, 3795-3800 (2010) · Zbl 1207.68270 · doi:10.1016/j.tcs.2010.06.002
[44] Bansal, N.; Williams, R., Regularity lemmas and combinatorial algorithms, Theory Comput., 8, 1, 69-94 (2012) · Zbl 1253.68162 · doi:10.4086/toc.2012.v008a004
[45] Amir, A.; Levy, A.; Porat, E.; Shalom, BR, Online recognition of dictionary with one gap, Inf. Comput., 275, 104633 (2020) · Zbl 1496.68382 · doi:10.1016/j.ic.2020.104633
[46] Mortensen, CW, Fully dynamic orthogonal range reporting on RAM, SIAM J. Comput., 35, 6, 1494-1525 (2006) · Zbl 1115.68059 · doi:10.1137/S0097539703436722
[47] Amir, A.; Farach, M.; Muthukrishnan, S., Alphabet dependence in parameterized matching, Inf. Process. Lett., 49, 3, 111-115 (1994) · Zbl 0795.68077 · doi:10.1016/0020-0190(94)90086-8
[48] Baker, BS, Parameterized duplication in strings: algorithms and an application to software maintenance, SIAM J. Comput., 26, 5, 1343-1362 (1997) · Zbl 0885.68085 · doi:10.1137/S0097539793246707
[49] Deguchi, S., Higashijima, F., Bannai, H., Inenaga, S., Takeda, M.: Parameterized suffix arrays for binary strings. In: Proceedings of the Prague Stringology Conference (PSC) (Prague, Czech Republic), September 1-3, pp. 84-94 (2008)
[50] Ferragina, P.; Grossi, R., The string b-tree: a new data structure for string search in external memory and its applications, J. ACM, 46, 2, 236-280 (1999) · Zbl 1065.68518 · doi:10.1145/301970.301973
[51] Ganguly, A., Hon, W., Shah, R.: A framework for dynamic parameterized dictionary matching. In: 15th Scandinavian Symposium and Workshops on Algorithm Theory SWAT (Reykjavik, Iceland), June 22-24, pp. 10:1-10:14 (2016) · Zbl 1378.68204
[52] Keller, O.; Kopelowitz, T.; Lewenstein, M., On the longest common parameterized subsequence, Theor. Comput. Sci., 410, 51, 5347-5353 (2009) · Zbl 1186.68148 · doi:10.1016/j.tcs.2009.09.011
[53] Lewenstein, M.: Parameterized pattern matching. Encyclopedia of Algorithms 1525-1530,(2016)
[54] Mendivelso, J., Pinzón, Y.: Parameterized matching: Solutions and extensions. In: Proceedings of the Prague Stringology Conference (PSC) (Prague, Czech Republic), August 24-26, pp. 118-131 (2015)
[55] Willard, DE, Log-logarithmic worst-case range queries are possible in space \(\theta (n)\), Inf. Process. Lett., 17, 2, 81-84 (1983) · Zbl 0509.68106 · doi:10.1016/0020-0190(83)90075-3
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.