Consistency of the maximum likelihood estimator of population tree in a coalescent framework. (English) Zbl 07901366

Summary: We present a proof of consistency of the maximum likelihood estimator (MLE) of the population tree in a previously proposed coalescent model. Because the model includes the tree topology as a parameter, the standard proof of consistency for continuous parameters does not apply directly. In addition to proving that a consistent sequence of MLEs exists, we prove that the overall MLE, computed by maximizing the likelihood over all tree topologies, is consistent. Thus, the MLE of the tree topology is consistent as well. The last result is important because local maxima occur in the likelihood of population trees, especially when the likelihood is maximized separately for each tree topology. Although the MLE is already regarded as a dependable estimator under this model, our work establishes its consistency rigorously.

MSC:

62-XX Statistics

References:

[1] Bryant, D.; Bouckaert, R.; Felsenstein, J.; Rosenberg, N. A.; RoyChoudhury, A., Inferring species trees directly from biallelic genetic markers: Bypassing gene trees in a full coalescent analysis, Mol. Biol. Evol., 29, 1917-1932, 2012
[2] Chang, J. T., Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency, Math. Biosci., 137, 51-73, 1996 · Zbl 1059.92504
[3] Degnan, J. H.; Rosenberg, N. A., Gene tree discordance, phylogenetic inference, and the multispecies coalescent, Trends Ecol. Evol., 24, 332-340, 2009
[4] Felsenstein, J., Inferring Phylogenies, Sinauer Associates: Sunderland, Massachusetts
[5] Felsenstein, J., Cases in which parsimony or compatibility methods will be positively misleading, Syst. Zool., 27, 401-410, 1978
[6] Lehmann, E. L.; Casella, G., Theory of Point Estimation, 1998, Springer: Springer New York · Zbl 0916.62017
[7] Matsen, F. A.; Steel, M., Phylogenetic mixtures on a single tree can mimic a tree of another topology, Syst. Biol., 56, 767-775, 2007
[8] Nielsen, R.; Mountain, J. L.; Huelsenbeck, J. P.; Slatkin, M., Maximum likelihood estimation of population divergence times and population phylogeny in models without mutation, Evolution, 52, 669-677, 1998
[9] Nielsen, R.; Slatkin, M., Likelihood analysis of ongoing gene flow and historical association, Evolution, 54, 44-50, 2000
[10] Peng, J.; Rajeevan, H.; Kubatko, L.; RoyChoudhury, A., A fast likelihood approach for estimation of large phylogenies from continuous trait data, Mol. Phylogenet. Evol., 161, 2021
[11] Rogers, J. S., On the consistency of maximum likelihood estimation of phylogenetic trees from nucleotide sequences, Syst. Biol., 46, 354-357, 1997
[12] Rogers, J. S., Maximum likelihood estimation of phylogenetic trees is consistent when substitution rates vary according to the invariable sites plus gamma distribution, Syst. Biol., 50, 713-722, 2001
[13] RoyChoudhury, A., Composite likelihood-based inferences on genetic data from dependent loci, J. Math. Biol., 62, 65-80, 2011 · Zbl 1232.62153
[14] RoyChoudhury, A., Approximate likelihood estimation of divergence time range using a coalescent-based model, Evolut. Bioinform., 499-509, 2013
[15] RoyChoudhury, A., Identifiability of a coalescent-based population tree model, J. Appl. Probab., 51, 921-929, 2014 · Zbl 1333.92053
[16] RoyChoudhury, A.; Felsenstein, J.; Thompson, E. A., A two-stage pruning algorithm for likelihood computation for a population tree, Genetics, 180, 1095-1105, 2008
[17] RoyChoudhury, A.; Thompson, E. A., Ascertainment correction for a population tree via a Pruning algorithm for likelihood computation, Theor. Popul. Biol., 82, 59-65, 2012 · Zbl 1404.92130
[18] RoyChoudhury, A.; Willis, A.; Bunge, J., Consistency of a phylogenetic tree maximum likelihood estimator, J. Statist. Plann. Inference, 161, 73-80, 2015 · Zbl 1311.62185
[19] Steel, M. A.; Székely, L. A.; Hendy, M. D., Reconstructing trees when sequence sites evolve at variable rates, J. Comp. Biol., 1, 153-163, 1994
[20] Stoica, P.; Söderström, T., On non-singular information matrices and local identifiability, Int. J. Control, 36, 323-329, 1982 · Zbl 0482.93073
[21] Stoltz, M.; Baeumer, B.; Bouckaert, R.; Fox, C.; Hiscott, G.; Bryant, D., Bayesian inference of species trees using diffusion models, Syst. Biol., 70, 145-161, 2021
[22] Takahata, N.; Nei, M., Gene genealogy and variance of interpopulational nucleotide differences, Genetics, 110, 325-344, 1985
[23] Yang, Z., Statistical properties of the maximum likelihood method of phylogenetic estimation and comparison with distance matrix methods, Syst. Biol., 43, 329-342, 1994