The mean, variance and limiting distribution of two statistics sensitive to phylogenetic tree balance. (English) Zbl 1124.05025

In practical biological applications, evaluating the resulting phylogenetic tree is always an important (and often underappreciated) task. Several statistical measures are available for this purpose; Sackin’s and Colless’ indices are among them. This rather technical paper analyses the mean, (co)variance and limiting joint distribution of these indices for large phylogenetic trees under the Yule and uniform models. The study is concerned mainly with the topology of the resulting trees; branch lengths are therefore ignored.
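
For readers unfamiliar with the two indices, the following is a minimal illustrative sketch and is not taken from the paper. It assumes the usual definitions: Sackin's index is the sum of the depths of all leaves, and Colless' index is the sum, over internal nodes, of the absolute difference between the leaf counts of the two subtrees. It also assumes the standard description of the Yule model in which the root split of an n-leaf tree is uniform on 1, ..., n-1. The tree encoding (a leaf is `None`, an internal node is a pair) and the function names `indices` and `yule_tree` are hypothetical choices made for this example.

```python
import random


def indices(tree):
    """Return (leaves, sackin, colless) for a rooted binary tree.

    Assumed encoding: a leaf is None, an internal node is (left, right).
    """
    if tree is None:
        return 1, 0, 0
    left, right = tree
    nl, sl, cl = indices(left)
    nr, sr, cr = indices(right)
    n = nl + nr
    # Every leaf below this node is one step deeper, so Sackin grows by n.
    sackin = sl + sr + n
    # Colless adds the imbalance |nl - nr| at this node.
    colless = cl + cr + abs(nl - nr)
    return n, sackin, colless


def yule_tree(n, rng):
    """Sample a tree shape with n leaves under the Yule model.

    Assumption: the left subtree size is uniform on 1..n-1,
    applied recursively to both subtrees.
    """
    if n == 1:
        return None
    k = rng.randint(1, n - 1)  # randint is inclusive on both ends
    return (yule_tree(k, rng), yule_tree(n - k, rng))


if __name__ == "__main__":
    caterpillar = (((None, None), None), None)  # maximally unbalanced, 4 leaves
    balanced = ((None, None), (None, None))     # perfectly balanced, 4 leaves
    print(indices(caterpillar))
    print(indices(balanced))

    rng = random.Random(0)
    samples = [indices(yule_tree(50, rng)) for _ in range(1000)]
    mean_sackin = sum(s for _, s, _ in samples) / len(samples)
    mean_colless = sum(c for _, _, c in samples) / len(samples)
    print(mean_sackin, mean_colless)
```

Both indices are zero or small for balanced shapes and large for caterpillar-like shapes, which is why their joint behaviour under a null model such as Yule is of interest. The recursion ignores branch lengths, in line with the review's remark that only topology is considered.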

MSC:

05C05 Trees
60F05 Central limit and other weak theorems
60C05 Combinatorial probability
92D15 Problems related to evolution
62P10 Applications of statistics to biology and medical sciences; meta analysis

Software:

Quicksort
