Sequential Markov coalescent algorithms for population models with demographic structure. (English) Zbl 1213.92037

Summary: We analyse sequential Markov coalescent algorithms for populations with demographic structure: a bottleneck model, a population-divergence model, and a two-island model with migration. The sequential Markov coalescent method is an approximation to the coalescent suggested by G. A. T. McVean and N. J. Cardin [Philos. Transact. R. Soc., Ser. B 360, 1387–1393 (2005)], and by P. Marjoram and J. Wall [BMC Genetics 7, 16 ff. (2006)]. Within this algorithm we compute, for two individuals randomly sampled from the population, the correlation between the times to the most recent common ancestor at two different loci, and the linkage probability for these loci, given recombination rate R between them. These quantities characterise the linkage between the two loci in question. We find that the sequential Markov coalescent method generally approximates the coalescent well in models with demographic structure. An exception is the case where individuals are sampled from populations separated by reduced gene flow. In this situation, the correlations may be significantly underestimated. We explain why this is the case.
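The kind of quantity described in the summary can be illustrated with a minimal Monte Carlo sketch. It is not the authors' method: it assumes the simplest setting, a single well-mixed population of constant size, rather than the bottleneck, divergence, or island models analysed in the paper. It further assumes time is measured in units of 2N generations and that R is the scaled rate 4Nr, which may differ from the paper's convention. The function names are hypothetical. The closed-form expression used for comparison, (R + 18)/(R^2 + 13R + 18), is the standard result for the full coalescent in a constant-size population under the same scaling.

```python
import math
import random


def smc_pair_tmrca(R, rng):
    """Sample (T1, T2), the TMRCAs at two loci for a sample of two.

    Assumptions (not from the paper): constant population size, time in
    units of 2N generations, R = 4Nr between the loci.
    """
    t1 = rng.expovariate(1.0)      # TMRCA at the first locus
    t, x = t1, 0.0                 # current tree height, position in [0, R]
    while t > 0.0:
        # Two branches of length t, each recombining at rate 1/2 per unit
        # of scaled distance, give a total rate of t.
        x += rng.expovariate(t)
        if x >= R:
            break                  # no further recombination before locus 2
        u = rng.uniform(0.0, t)    # height of the recombination event
        # SMC rule: the old branch above u is erased, so the detached
        # lineage coalesces with the other lineage at rate 1 from time u.
        t = u + rng.expovariate(1.0)
    return t1, t


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def smc_correlation(R, n_pairs=20000, seed=1):
    """Monte Carlo estimate of corr(T1, T2) under the SMC sketch."""
    rng = random.Random(seed)
    pairs = [smc_pair_tmrca(R, rng) for _ in range(n_pairs)]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


def coalescent_correlation(R):
    """Correlation for the full coalescent, constant population size."""
    return (R + 18.0) / (R * R + 13.0 * R + 18.0)


if __name__ == "__main__":
    for R in (0.0, 0.5, 1.0, 5.0, 20.0):
        print(R, smc_correlation(R), coalescent_correlation(R))
```

Running the script prints the estimated SMC correlation next to the coalescent value for several values of R, so the two can be compared directly. Extending the sketch to structured populations would require time-dependent or deme-dependent coalescence rates, which are deliberately left out here.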

MSC:

92D15 Problems related to evolution
92D10 Genetics and epigenetics
60J22 Computational methods in Markov chains

Software:

Genomepop; Recodon

References:

[1] Altshuler, D.; Brooks, L.; Chakravarti, A.; Collins, F.; Daly, M.; Donnelly, P., A haplotype map of the human genome, Nature, 437, 7063, 1299-1320 (2005)
[2] Arenas, M.; Posada, D., Recodon: Coalescent simulation of coding DNA sequences with recombination, migration and demography, BMC Bioinformatics, 8, 458 (2007)
[3] Carvajal-Rodriguez, A., Genomepop: A program to simulate genomes in populations, BMC Bioinformatics, 9, 223 (2008)
[4] Chen, G. K.; Marjoram, P.; Wall, J. D., Fast and flexible simulation of DNA sequence data, Genome Research, 19, 1, 136-142 (2009)
[5] Eriksson, A.; Mehlig, B., Gene-history correlation and population structure, Physical Biology, 1, 220-228 (2004)
[6] Griffiths, R. C., Neutral 2-locus multiple allele models with recombination, Theoretical Population Biology, 19, 169-186 (1981) · Zbl 0512.92012
[7] Hoggart, C. J.; Chadeau-Hyam, M.; Clark, T. G.; Lampariello, R.; Whittaker, J. C.; DeIorio, M.; Balding, D. J., Sequence-level population simulations over large genomic regions, Genetics, 177, 3, 1725-1731 (2007)
[8] Hudson, R. R., Properties of a neutral allele model with intragenic recombination, Theoretical Population Biology, 23, 183-201 (1983) · Zbl 0505.62090
[9] Hudson, R. R., Gene genealogies and the coalescent process, (Futuyma, D.; Antonovics, J., Oxford Surveys in Evolutionary Biology (1990), Oxford University Press: Oxford), 1-43
[10] Hudson, R. R., Two-locus sampling distributions and their application, Genetics, 159, 1805-1817 (2001)
[11] Kaplan, N.; Hudson, R. R., The use of sample genealogies for studying a selectively neutral \(m\)-loci model with recombination, Theoretical Population Biology, 28, 382-396 (1985) · Zbl 0571.92013
[12] Kingman, J. F.C., On the genealogy of large populations, Journal of Applied Probability, 19A, 27-43 (1982) · Zbl 0516.92011
[13] Liang, L.; Zollner, S.; Abecasis, G. R., Genome: A rapid coalescent-based whole genome simulator, Bioinformatics, 23, 12, 1565-1567 (2007)
[14] Liang, Y.; Kelemen, A., Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases, Statistics Surveys, 2, 43-60 (2008) · Zbl 1196.62144
[15] Lindblad-Toh, K.; Winchester, E.; Daly, M. J.; Wang, D. G.; Hirschhorn, J. N.; Laviolette, J.-P.; Ardlie, K.; Reich, D. E.; Robinson, E.; Sklar, P.; Shah, N.; Thomas, D.; Fan, J.-B.; Gingeras, T.; Warrington, J.; Patil, N.; Hudson, T. J.; Lander, E. S., Large-scale discovery and genotyping of single-nucleotide polymorphisms in the mouse, Nature Genetics, 24, 381-386 (2000)
[16] Marjoram, P.; Tavare, S., Modern computational approaches for analysing molecular genetic variation data, Nature Reviews Genetics, 7, 10, 759-770 (2006)
[17] Marjoram, P.; Wall, J., Fast coalescent simulation, BMC Genetics, 7, 16 (2006)
[18] McVean, G. A.T., A genealogical interpretation of linkage disequilibrium, Genetics, 162, 2, 987-991 (2002)
[19] McVean, G. A.T.; Cardin, N. J., Approximating the coalescent with recombination, Philosophical Transactions of the Royal Society B, 360, 1387-1393 (2005)
[20] Nordborg, M., Coalescent theory, (Balding, D. J.; Bishop, M.; Cannings, C., Handbook of Statistical Genetics (2001), John Wiley & Sons), 179-212, (Chapter 7)
[21] Nordborg, M.; Hu, T.; Ishino, Y.; Jhaveri, J.; Toomajian, C.; Zheng, H.; Bakker, E.; Calabrese, P.; Gladstone, J.; Goyal, R.; Jakobsson, M.; Kim, S.; Morozov, Y.; Padhukasahasram, B.; Plagnol, V.; Rosenberg, N.; Shah, C.; Wall, J.; Wang, J.; Zhao, K.; Kalbfleisch, T.; Schulz, V.; Kreitman, M.; Bergelson, J., The pattern of polymorphism in Arabidopsis thaliana, PLoS Biology, 3, 7, 1289-1299 (2005)
[22] Schaffner, S.; Foo, C.; Gabriel, S.; Reich, D.; Daly, M.; Altshuler, D., Calibrating a coalescent simulation of human genome sequence variation, Genome Research, 15, 11, 1576-1583 (2005)
[23] Tajima, F., Statistical method for testing the neutral mutation hypothesis by DNA polymorphism, Genetics, 123, 585-595 (1989)
[24] The International HapMap Consortium, The international HapMap project, Nature, 426, 789-796 (2003)
[25] Wiuf, C.; Hein, J., The ancestry of a sample of sequences subject to recombination, Genetics, 151, 1217-1228 (1999)
[26] Wiuf, C.; Hein, J., Recombination as a point process along sequences, Theoretical Population Biology, 55, 248-259 (1999) · Zbl 0923.92015
[27] Wright, S., Evolution in Mendelian populations, Genetics, 16, 2, 97-159 (1931)
[28] Yu, J.; Buckler, E., Genetic association mapping and genome organization of maize, Current Opinion in Biotechnology, 17, 2, 155-160 (2006)