×

Truncated suffix trees and their application to data compression. (English) Zbl 1044.68031

Summary: The suffix tree is a fundamental data structure in the area of string algorithms and it has been used in many applications including data compression. In this paper we propose a data structure called the truncated suffix tree, which is a truncated version of the suffix tree. We also present two linear-time construction algorithms for truncated suffix trees and two algorithms that delete suffixes from truncated suffix trees.
The truncated suffix tree is particularly a useful data structure for LZ77 that compresses using a sliding window of a fixed size. Our algorithms lead to two implementations of LZ77 that maintain sliding windows by truncated suffix trees. We also present a technique of finding the longest match in a sliding window, which is a crucial step in LZ77.

MSC:

68P05 Data structures
68P30 Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science)
Full Text: DOI

References:

[1] Apostolico, A., The myriad virtues of subword trees, (Apostolico, A.; Galil, Z., Combinatorial Algorithms on Words (1985), Springer: Springer Berlin), 85-95 · Zbl 0572.68067
[2] Bell, T. C., Better OPM/L text compression, IEEE Trans. Comm., COM-34, 12, 1176-1182 (1986)
[3] Bunton, S., Semantically motivated improvements for PPM variants, Comput. J., 40, 2-3, 76-93 (1997)
[4] Crochemore, M.; Rytter, W., Text Algorithms (1994), Oxford University Press: Oxford University Press Oxford · Zbl 0844.68101
[5] M. Farach, Optimal suffix tree construction with large alphabets, Proc. of the 38th Annual Symp. on Foundations of Computer Science, IEEE Computer Society, Los Alamitos, 1997, pp. 137-143.; M. Farach, Optimal suffix tree construction with large alphabets, Proc. of the 38th Annual Symp. on Foundations of Computer Science, IEEE Computer Society, Los Alamitos, 1997, pp. 137-143.
[6] Fiala, E. R.; Greene, D. H., Data compression with finite windows, Comm. ACM, 32, 4, 490-505 (1989)
[7] Giegerich, R.; Kurtz, S., From Ukkonen to McCreight and Weinera unifying view of linear-time suffix tree construction, Algorithmica, 19, 3, 331-353 (1997) · Zbl 0895.68056
[8] Gusfield, D., Algorithms on Strings, Trees, and Sequences (1997), Cambridge University Press: Cambridge University Press Cambridge · Zbl 0934.68103
[9] D.S. Hirschberg, L.M. Stauffer, Parsing algorithms for dictionary compression on the PRAM, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1994, pp. 136-145.; D.S. Hirschberg, L.M. Stauffer, Parsing algorithms for dictionary compression on the PRAM, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1994, pp. 136-145.
[10] Hirschberg, D. S.; Stauffer, L. M., Dictionary compression on the PRAM, Parallel Process. Lett., 7, 3, 297-308 (1997)
[11] J. Jiang, Implementing the PATRICIA data structure for compression algorithms with finite size dictionaries, Internat. Conf. on Data Transmission—Advances in Modem and ISDN Technology and Applications, IEE. 1992, London, pp. 123-127.; J. Jiang, Implementing the PATRICIA data structure for compression algorithms with finite size dictionaries, Internat. Conf. on Data Transmission—Advances in Modem and ISDN Technology and Applications, IEE. 1992, London, pp. 123-127.
[12] N.J. Larsson, Extended application of suffix trees to data compression, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1996, pp. 190-199.; N.J. Larsson, Extended application of suffix trees to data compression, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1996, pp. 190-199.
[13] N.J. Larsson, The context trees of block sorting compression, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1998, pp. 189-198.; N.J. Larsson, The context trees of block sorting compression, Proc. Data Compression Conf., IEEE Computer Society, Los Alamitos, 1998, pp. 189-198.
[14] Y. Matias, S. Muthukrishnan, S. Sahinalp, J. Ziv, Augmenting suffix trees with applications, Algorithms—ESA ’98 6th Annual European Symp. Proc., Springer, Berlin, 1998, pp. 67-78.; Y. Matias, S. Muthukrishnan, S. Sahinalp, J. Ziv, Augmenting suffix trees with applications, Algorithms—ESA ’98 6th Annual European Symp. Proc., Springer, Berlin, 1998, pp. 67-78.
[15] McCreight, E. N., A space-economical suffix tree construction algorithm, J. ACM, 23, 2, 262-272 (1976) · Zbl 0329.68042
[16] V.S. Miller, M.N. Wegman, Variations on a theme by Ziv and Lempel, IEEE Inter, Conf. on Comm. ’88: Digital Technology-Spanning the Universe, IEEE, New York, 1988, pp. 390-394.; V.S. Miller, M.N. Wegman, Variations on a theme by Ziv and Lempel, IEEE Inter, Conf. on Comm. ’88: Digital Technology-Spanning the Universe, IEEE, New York, 1988, pp. 390-394.
[17] Rodeh, M.; Pratt, V. R.; Even, S., Linear algorithm for data compression via string matching, J. ACM, 28, 1, 16-24 (1981) · Zbl 0462.94015
[18] K. Sadakane, Text compression using recency rank with context and relation to context sorting, block sorting and \(PPM^*\); K. Sadakane, Text compression using recency rank with context and relation to context sorting, block sorting and \(PPM^*\)
[19] L.M. Stauffer, D.S. Hirschberg, PRAM algorithms for static dictionary compression, Proc. 8th Inter. Parallel Processing Symp., IEEE Computer Society, Los Alamitos, 1994, pp. 344-348.; L.M. Stauffer, D.S. Hirschberg, PRAM algorithms for static dictionary compression, Proc. 8th Inter. Parallel Processing Symp., IEEE Computer Society, Los Alamitos, 1994, pp. 344-348.
[20] Storer, J. A.; Szymanski, T. G., Data compression via textual substitution, J. ACM, 29, 928-951 (1982) · Zbl 0489.68041
[21] Ukkonen, E., On-line construction of suffix trees, Algorithmica, 14, 3, 249-260 (1995) · Zbl 0831.68027
[22] P. Weiner, Linear pattern matching algorithms, 14th Annual Symp. on Switching Automata Theory, IEEE, New York, 1973, pp. 1-11.; P. Weiner, Linear pattern matching algorithms, 14th Annual Symp. on Switching Automata Theory, IEEE, New York, 1973, pp. 1-11.
[23] Welch, T. A., A technique for high-performance data compression, IEEE Comput., 17, 6, 8-19 (1984)
[24] Yu, T. L., Data compression for PC software distribution, Software-Practice Experience, 26, 11, 1181-1195 (1996)
[25] Ziv, J.; Lempel, A., A universal algorithm for sequential data compression, IEEE Trans. Inform. Theory IEEE, 23, 3, 337-343 (1977) · Zbl 0379.94010
[26] Ziv, J.; Lempel, A., Compression of individual sequences via variable-rate coding, IEEE Trans. Inform. Theory, IEEE, 24, 5, 530-536 (1978) · Zbl 0392.94004
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.