×

Pattern correlation matrices and their properties. (English) Zbl 0979.15021

In analyzing sequences of symbols such as text, the pattern correlation polynomials of L. J. Guibas and A. M. Odlyzko [J. Comb. Theory, Ser. A 30, 183-208 (1981; Zbl 0454.68109)] are used to analyze the probabilities of counts of occurrences of strings given probabilities of individual symbols. These involve counts of overlaps between the given set of strings. M. Régnier and W. Szpankowski [Algorithmica 22, No. 4, 631-649 (1998; Zbl 0918.68108)] formed a pattern correlation matrix from these. This paper gives a Jordan block representation, a structure for the row and column spaces, and derives results on the covariance matrix of the joint distribution of frequencies of these strings, and derives a particular statistic with a chi-square distribution.

MSC:

15B52 Random matrices (algebraic aspects)
15A21 Canonical forms, reductions, classification
15A09 Theory of matrix inversion and generalized inverses
68T10 Pattern recognition, speech recognition
62F03 Parametric hypothesis testing
Full Text: DOI

References:

[1] Billingsley, P., Asymptotic distributions of two goodness of fit criteria, Ann. Math. Statist., 27, 1123-1129 (1956) · Zbl 0073.35605
[2] Good, I. J., The serial test for sampling numbers and other tests for randomness, Proc. Cambridge Philos. Soc., 47, 276-284 (1953) · Zbl 0051.36203
[3] Guibas, L.; Odlyzko, A., String overlaps pattern matching and nontransitive games, J. Combin. Theory A, 30, 183-208 (1981) · Zbl 0454.68109
[4] Horn, R. A.; Johnson, C. R., Matrix Analysis (1985), Cambridge University Press: Cambridge University Press Cambridge, UK · Zbl 0576.15001
[5] G. Marsaglia, A current view of random number generation, in: Proceedings of the 16th Symposium on the Interface of Computer Science and Statistics, Elsevier, New York, 1985, p. 3-10; G. Marsaglia, A current view of random number generation, in: Proceedings of the 16th Symposium on the Interface of Computer Science and Statistics, Elsevier, New York, 1985, p. 3-10
[6] Rao, C. R.; Mitra, C., Generalized Inverse of Matrices and its Applications (1971), Wiley: Wiley New York · Zbl 0236.15004
[7] M. Regnier, W. Szpankowski, On the approximate pattern occurrence in a text, in: Proceedings of the Compression and Complexity of SEQUENCE’97, IEEE Computer Society, Positano, 1997, pp. 253-264; M. Regnier, W. Szpankowski, On the approximate pattern occurrence in a text, in: Proceedings of the Compression and Complexity of SEQUENCE’97, IEEE Computer Society, Positano, 1997, pp. 253-264
[8] Regnier, M.; Szpankowski, W., On pattern frequency occurrences in a Markovian sequence, Algorithmica, 22, 631-649 (1998) · Zbl 0918.68108
[9] Rukhin, A. L., Approximate entropy for testing randomness, J. Appl. Probab., 37, 88-100 (2000) · Zbl 1095.65500
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.