
Genetic demographic networks: mathematical model and applications. (English) Zbl 1366.92082

Summary: Recent improvements in the quality of genetic data obtained from extinct human populations and their ancestors encourage the search for answers to basic questions regarding human population history. The most common and successful approaches are model-based: genetic data are compared with data generated under an assumed demography model. Using such an approach, it is possible to either validate or adjust the assumed demography. Model fit to data can be obtained from reverse-time coalescent simulations or forward-time simulations. In this paper we introduce a computational method, based on a mathematical equation, that yields joint distributions of pairs of individuals under a specified demography model, each individual being characterized by a genetic variant at a chosen locus. The two individuals are randomly sampled from either the same or two different populations. The model assumes three types of demographic events (split, merge and migration). Populations evolve according to the time-continuous Moran model with drift and Markov-process mutation. This latter process is described by the Lyapunov-type equation introduced by O’Brien and generalized in our previous works. Application of this equation constitutes an original contribution. In the results section of the paper we present sample applications of our model to both simulated and literature-based demographies. Among others, we include a study of the Slavs-Balts-Finns genetic relationship, in which we model the split and migrations between the Balts and Slavs. We also include another example that involves the migration rates between farmers and hunter-gatherers, based on modern and ancient DNA samples. This latter process was previously studied using coalescent simulations. Our results are in general agreement with the previous method, which provides validation of our approach.
Although our model is not an alternative to simulation methods in the practical sense, it provides an algorithm to compute pairwise distributions of alleles in the case of haploid non-recombining loci, such as mitochondrial and Y-chromosome loci in humans.
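
The summary does not state the Lyapunov-type equation itself, so the following is only an illustrative sketch for a single population of constant size, not the authors' implementation. It assumes the commonly used form dR/dt = QᵀR + RQ − cR + c·diag(π), where R(t) is the joint distribution of the allele types of two sampled individuals, Q is the mutation intensity matrix, π is the marginal allele distribution (the row sums of R), and c is a drift (coalescence) rate. The names `rhs` and `evolve_pair_distribution`, the two-allele mutation model, and all parameter values are hypothetical.

```python
import numpy as np

def rhs(R, Q, c):
    """Right-hand side of the assumed Lyapunov-type equation."""
    pi = R.sum(axis=1)  # marginal allele distribution of one individual
    return Q.T @ R + R @ Q - c * R + c * np.diag(pi)

def evolve_pair_distribution(R0, Q, c, t_end, steps=2000):
    """Integrate R(t) from 0 to t_end with the classical fixed-step RK4 scheme."""
    R = np.array(R0, dtype=float)
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(R, Q, c)
        k2 = rhs(R + 0.5 * h * k1, Q, c)
        k3 = rhs(R + 0.5 * h * k2, Q, c)
        k4 = rhs(R + h * k3, Q, c)
        R = R + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return R

# Hypothetical example: two alleles with symmetric mutation at rate mu.
mu = 1e-3
Q = np.array([[-mu, mu],
              [mu, -mu]])
c = 1.0 / 100                 # assumed drift rate (illustrative value only)
R0 = np.full((2, 2), 0.25)    # two independent individuals, uniform alleles

R = evolve_pair_distribution(R0, Q, c, t_end=5000.0)
homozygosity = np.trace(R)    # probability that both individuals share an allele
```

Under these assumptions the probability that the two individuals carry the same allele relaxes to (2·mu + c) / (4·mu + c), which gives a quick sanity check on the integrator. Demographic events such as splits, merges and migration would require coupling several such matrices across populations, which this sketch does not attempt.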

MSC:

92D10 Genetics and epigenetics
92D15 Problems related to evolution
91D20 Mathematical geography and demography
