The statistics of words on rings. (English) Zbl 1083.60061

The authors are concerned with \(N\)-letter sequences \(\left( a_{1},\dots ,a_{N}\right) \) whose letters \(a_{i},\;1\leq i\leq N\), are selected from an \(\alpha \)-letter alphabet \(\left\{ 1,2,\dots ,\alpha \right\}\), with the convention that \(a_{j-N}=a_{j}\). First, for any \(r\geq 2\), they construct the generating function for the joint distribution of the numbers of times that specified distinct \(r\)-letter words appear in a sequence of letters. The sequence is defined on a ring, and its distribution is taken to be \(\left( r-1\right)\)th-order Markov. Second, the Poisson limit of the resulting distribution of words is obtained explicitly. Special attention is then paid to the case of i.i.d. sequences. Correction terms to the Poisson limit are given and, for \(r=2\), the exact distribution is derived. Finally, it is shown how a hidden Markov model fits into the scheme considered.
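
To make the ring convention \(a_{j-N}=a_{j}\) concrete, the following is a minimal sketch of how \(r\)-letter words might be counted on a ring. It is not taken from the paper: the function name `ring_word_counts` and the sample sequence are assumptions chosen for illustration only.

```python
from collections import Counter


def ring_word_counts(seq, r):
    """Count every r-letter word read cyclically around a ring.

    Hypothetical helper (not from the reviewed paper): position j is
    identified with position j - N, so a word may wrap past the end
    of the sequence back to its start.
    """
    n = len(seq)
    if not 1 <= r <= n:
        raise ValueError("word length r must satisfy 1 <= r <= N")
    # Append the first r - 1 letters so that wrapped words can be sliced directly.
    extended = list(seq) + list(seq[:r - 1])
    # A ring of N letters contains exactly N word positions.
    return Counter(tuple(extended[i:i + r]) for i in range(n))


# Illustrative sequence over the alphabet {1, 2} with N = 5 and r = 2.
counts = ring_word_counts([1, 2, 1, 2, 1], 2)
print(counts)
```

Because the sequence is closed into a ring, the total number of counted words equals \(N\) for every \(r\), which would not hold for a linear sequence.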

MSC:

60J20 Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
Full Text: DOI
