A coalescent dual process for a Wright-Fisher diffusion with recombination and its application to haplotype partitioning. (English) Zbl 1367.92071

Theor. Popul. Biol. 112, 126-138 (2016); corrigendum ibid. 130, 203 (2019).
Summary: Duality plays an important role in population genetics. It can relate results from forwards-in-time models of allele frequency evolution with those of backwards-in-time genealogical models; a well known example is the duality between the Wright-Fisher diffusion for genetic drift and its genealogical counterpart, the coalescent. There have been a number of articles extending this relationship to include other evolutionary processes such as mutation and selection, but little has been explored for models also incorporating crossover recombination. Here, we derive from first principles a new genealogical process which is dual to a Wright-Fisher diffusion model of drift, mutation, and recombination. The process is reminiscent of the ancestral recombination graph, a widely-used multilocus genealogical model, but here ancestral lineages are typed and transition rates are regarded as being conditioned on an observed configuration at the leaves of the genealogy. Our approach is based on expressing a putative duality relationship between two models via their infinitesimal generators, and then seeking an appropriate test function to ensure the validity of the duality equation. This approach is quite general, and we use it to find dualities for several important variants, including both a discrete \(L\)-locus model of a gene and a continuous model in which mutation and recombination events are scattered along the gene according to continuous distributions. As an application of our results, we derive a series expansion for the transition function of the diffusion. Finally, we study in further detail the case in which mutation is absent. Then the dual process describes the dispersal of ancestral genetic material across the ancestors of a sample. The stationary distribution of this process is of particular interest; we show how duality relates this distribution to haplotype fixation probabilities. We develop an efficient method for computing such probabilities in multilocus models.
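
As a minimal illustration of the generator approach mentioned in the summary, consider the classical single-locus, drift-only case (not the recombination dual constructed in the paper). The symbols \(\mathcal{L}\), \(\mathcal{G}\), \(F\), \(X_t\) and \(N_t\) are notation assumed here for the sketch and are not taken from the paper. With \(\mathcal{L}\) the generator of the Wright-Fisher diffusion \(X_t\) on \([0,1]\) and test function \(F(x,n) = x^n\):

\[
\mathcal{L}F(\cdot,n)(x) = \tfrac{1}{2}\,x(1-x)\,\frac{\partial^2}{\partial x^2}x^n = \binom{n}{2}\left(x^{n-1} - x^n\right) = \binom{n}{2}\left[F(x,n-1) - F(x,n)\right] = \mathcal{G}F(x,\cdot)(n).
\]

Here \(\mathcal{G}\) acts on the variable \(n\) and is the generator of the block-counting process \(N_t\) of Kingman's coalescent, which jumps from \(n\) to \(n-1\) at rate \(\binom{n}{2}\). Matching the two generators on this test function gives the moment duality \(\mathbb{E}_x[X_t^n] = \mathbb{E}_n[x^{N_t}]\), the well known example referred to in the summary.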

MSC:

92D10 Genetics and epigenetics
92D15 Problems related to evolution
60J20 Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)

Software:

OEIS
