Bridging trees for posterior inference on ancestral recombination graphs. (English) Zbl 1425.92002

Summary: We present a new Markov chain Monte Carlo algorithm, implemented in the software Arbores, for inferring the history of a sample of DNA sequences. Our principal innovation is a bridging procedure, previously applied only to simple stochastic processes, in which the local computations within a bridge can proceed independently of the rest of the DNA sequence, facilitating large-scale parallelization.

MSC:

92-04 Software, source code, etc. for problems pertaining to biology
92D15 Problems related to evolution
92D20 Protein sequences, DNA sequences

Software:

Arbores
