×

An adjacent-swap Markov chain on coalescent trees. (English) Zbl 1501.60038

Summary: The standard coalescent is widely used in evolutionary biology and population genetics to model the ancestral history of a sample of molecular sequences as a rooted and ranked binary tree. In this paper we present a representation of the space of ranked trees as a space of constrained ordered matched pairs. We use this representation to define ergodic Markov chains on labeled and unlabeled ranked tree shapes analogously to transposition chains on the space of permutations. We show that an adjacent-swap chain on labeled and unlabeled ranked tree shapes has a mixing time at least of order \(n^3\), and at most of order \(n^4\). Bayesian inference methods rely on Markov chain Monte Carlo methods on the space of trees. Thus it is important to define good Markov chains which are easy to simulate and for which rates of convergence can be studied.

MSC:

60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
60J90 Coalescent processes
62F15 Bayesian inference

Software:

BEAUti; BEAST

References:

[1] Aldous, D. (1983). Random walks on finite groups and rapidly mixing Markov chains. In Séminaire de Probabilités XVII 1981/82, pp. 243-297. Springer. · Zbl 0514.60067
[2] Aldous, D. J. (2000). Mixing time for a Markov chain on cladograms. Combinatorics Prob. Comput.9, 191-204. · Zbl 0961.60077
[3] Cappello, L. and Palacios, J. A. (2020). Sequential importance sampling for multiresolution Kingman-Tajima coalescent counting. Ann. Appl. Statist.14, 727-751. · Zbl 1446.62269
[4] Cappello, L., Veber, A. and Palacios, J. A. (2020). The Tajima heterochronous n-coalescent: Inference from heterochronously sampled molecular data. Available at arXiv:2004.06826.
[5] Diaconis, P. W. and Holmes, S. P. (1998). Matchings and phylogenetic trees. Proc. Nat. Acad. Sci. USA95, 14600-14602. · Zbl 0908.92023
[6] Diaconis, P. and Holmes, S. (2002). Random walks on trees and matchings. Electron. J. Prob.7, 1-17. · Zbl 1007.60071
[7] Diaconis, P. and Wood, P. M. (2013). Random doubly stochastic tridiagonal matrices. Random Structures Algorithms42, 403-437. · Zbl 1278.15035
[8] Dinh, V., Darling, A. E. and Matsen, Iv, F. A. (2018). Online Bayesian phylogenetic inference: Theoretical foundations via sequential Monte Carlo. Syst. Biol.67, 503-517.
[9] Donaghey, R. (1975). Alternating permutations and binary increasing trees. J. Combinatorial Theory A18, 141-148. · Zbl 0304.05101
[10] Drummond, A., Suchard, M., Xie, D. and Rambaut, A. (2012). Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol.29, 1969-1973.
[11] Durrett, R. (2003). Shuffling chromosomes. J. Theoret. Prob.16, 725-750. · Zbl 1047.92020
[12] Felsenstein, J. (2004). Inferring Phylogenies, Vol. 2. Sinauer Associates, Sunderland, MA.
[13] Friedman, N., Ninio, M., Pe’Er, I. and Pupko, T. (2001). A structural EM algorithm for phylogenetic inference. In Proceedings of the Fifth Annual International Conference on Computational Biology, pp. 132-140.
[14] Frost, S. D. W. and Volz, E. M. (2013). Modelling tree shape and structure in viral phylodynamics. Phil. Trans. R. Soc. B368, 20120208.
[15] Janson, S. and Kersting, G. (2011). On the total external length of the Kingman coalescent. Electron. J. Prob.16, 2203-2218. · Zbl 1245.60094
[16] Kingman, J. (1982). The coalescent. Stoch. Process. Appl.13, 235-248. · Zbl 0491.60076
[17] Kirkpatrick, M. and Slatkin, M. (1993). Searching for evolutionary patterns in the shape of a phylogenetic tree. Evolution47, 1171-1181.
[18] Kuznetsov, A. G., Pak, I. M. and Postnikov, A. E. (1994). Increasing trees and alternating permutations. Russian Math. Surveys49, 79. · Zbl 0842.05025
[19] Lacoin, H. (2016). Mixing time and cutoff for the adjacent transposition shuffle and the simple exclusion. Ann. Prob.44, 1426-1487. · Zbl 1408.60061
[20] Levin, D. A. and Peres, Y. (2017). Markov Chains and Mixing Times, American Mathematical Society. · Zbl 1390.60001
[21] Liu, S., Westbury, M. V., Dussex, N., Mitchell, K. J., Sinding, M.-H. S., Heintzman, P. D., Duchêne, D. A., Kapp, J. D., Von Seth, J., Heiniger, H. et al. (2021). Ancient and modern genomes unravel the evolutionary history of the rhinoceros family. Cell184, 4874-4885.
[22] Maliet, O., Gascuel, F. and Lambert, A. (2018). Ranked tree shapes, nonrandom extinctions, and the loss of phylogenetic diversity. Syst. Biol.67, 1025-1040.
[23] Misra, N., Blelloch, G., Ravi, R. and Schwartz, R. (2011). An optimization-based sampling scheme for phylogenetic trees. In International Conference on Research in Computational Molecular Biology (Lecture Notes in Computer Science 6577), pp. 252-266. Springer, Berlin and Heidelberg.
[24] Mossel, E. and Vigoda, E. (2006). Limitations of Markov chain Monte Carlo algorithms for Bayesian inference of phylogeny. Ann. Appl. Prob.16, 2215-2234. · Zbl 1121.60078
[25] Palacios, J. A., Véber, A., Cappello, L., Wang, Z., Wakeley, J. and Ramachandran, S. (2019). Bayesian estimation of population size changes by sampling Tajima’s trees. Genetics213, 967-986.
[26] Rajanala, S. and Palacios, J. A. (2021). Statistical summaries of unlabelled evolutionary trees and ranked hierarchical clustering trees. Available at arXiv:2106.02724.
[27] Sainudiin, R., Stadler, T. and Véber, A. (2015). Finding the best resolution for the Kingman-Tajima coalescent: Theory and applications. J. Math. Biology70, 1207-1247. · Zbl 1342.92144
[28] Schweinsberg, J. (2002). An \({O}(n^2)\) bound for the relaxation time of a Markov chain on cladograms. Random Structures Algorithms20, 59-70. · Zbl 0995.60071
[29] Spade, D., Herbei, R. and Kubatko, L. (2014). A note on the relaxation time of two Markov chains on rooted phylogenetic tree spaces. Statist. Prob. Lett.84, 247-252. · Zbl 1321.60148
[30] Stanley, R. P. (1999). Enumerative Combinatorics, Vol. I. Wadsworth & Brooks/Cole. · Zbl 0928.05001
[31] Štefankovič, D. and Vigoda, E. (2011). Fast convergence of Markov chain Monte Carlo algorithms for phylogenetic reconstruction with homogeneous data on closely related species. SIAM J. Discrete Math.25, 1194-1211. · Zbl 1233.92064
[32] Volz, E. M., Koelle, K. and Bedford, T. (2013). Viral phylodynamics. PLoS Comput. Biol.9, e1002947.
[33] Wakeley, J. (2008). Coalescent Theory: An Introduction. Roberts.
[34] Wilson, D. B. (2004). Mixing times of lozenge tiling and card shuffling Markov chains. Ann. Appl. Prob.14, 274-325. · Zbl 1040.60063
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.