Comparative genomics on artificial life. (English) Zbl 1478.92129

Beckmann, Arnold (ed.) et al., Pursuit of the universal. 12th conference on computability in Europe, CiE 2016, Paris, France, June 27 – July 1, 2016. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 9709, 35-44 (2016).
Summary: Molecular evolutionary methods and tools are difficult to validate because we have almost no direct access to ancient molecules. Inference methods may be tested on simulated data, which provide complete scenarios against which their results can be compared. Often, however, the simulation is designed alongside a particular method, by the same team and on the same assumptions, whereas the two should be blind to each other. In silico experimental evolution consists in evolving digital organisms with the aim of testing or discovering complex evolutionary processes. Its models are not designed with a particular inference method in mind, only with basic biological principles. As such, they provide a unique opportunity to blind-test the behavior of inference methods. We give a proof of this concept on a comparative genomics problem: inferring the number of inversions separating two genomes. We use Aevol, an in silico experimental evolution platform, to produce benchmarks, and show that most combinatorial or statistical estimators of the number of inversions fail on this dataset, although they behaved perfectly on ad hoc simulations. We argue that biological data are probably closer to the difficult situation.
For the entire collection see [Zbl 1337.68005].
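
Illustration (not from the paper): the inversion-counting problem named in the summary can be sketched on a toy model. The Python sketch below assumes a linear genome encoded as a signed permutation and inversions drawn uniformly at random, which is the kind of ad hoc simulation the summary contrasts with Aevol. It does not use Aevol and does not reproduce any estimator evaluated in the paper; all function names are hypothetical. It only shows the standard fact that each inversion creates at most two breakpoints, so half the breakpoint count is a lower bound on the number of inversions, and that this bound saturates once many inversions have accumulated.

```python
import math
import random


def apply_inversion(genome, i, j):
    """Return a copy of genome with positions i..j reversed and signs flipped."""
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def count_breakpoints(genome):
    """Count adjacencies that differ from the identity 1, 2, ..., n.

    The genome is framed by 0 and n + 1. An adjacency (a, b) is conserved
    exactly when b - a == 1, which also covers inverted pairs such as (-3, -2).
    """
    n = len(genome)
    framed = [0] + list(genome) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def simulate(n_genes, n_inversions, rng):
    """Apply uniformly random inversions to the identity genome (toy model)."""
    genome = list(range(1, n_genes + 1))
    for _ in range(n_inversions):
        i, j = sorted(rng.sample(range(n_genes), 2))
        genome = apply_inversion(genome, i, j)
    return genome


def breakpoint_lower_bound(genome):
    """Lower bound on the number of inversions: ceil(breakpoints / 2)."""
    return math.ceil(count_breakpoints(genome) / 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for repeatability
    for true_count in (5, 20, 80, 320):
        genome = simulate(100, true_count, rng)
        print(true_count, breakpoint_lower_bound(genome))
```

With 100 genes the bound cannot exceed 51, so it necessarily underestimates a history of several hundred inversions; statistical estimators try to correct for this under an assumed model, and the summary's point is that such corrections can fail when the data were not generated under that model.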

MSC:

92D15 Problems related to evolution
92D10 Genetics and epigenetics
92B20 Neural networks for/in biological studies, artificial life and related topics
92-08 Computational methods for problems pertaining to biology

Software:

SimPhy; Aevol; INDELible; ALF

References:

[1] Alexeev, N., Aidagulov, R., Alekseyev, M.A.: A computational method for the rate estimation of evolutionary transpositions. In: Ortuño, F., Rojas, I. (eds.) IWBBIO 2015, Part I. LNCS, vol. 9043, pp. 471–480. Springer, Heidelberg (2015) · doi:10.1007/978-3-319-16483-0_46
[2] Alexeev, N., Alekseyev, M.A.: Estimation of the true evolutionary distance under the fragile breakage model. arXiv (2015). http://arxiv.org/abs/1510.08002
[3] Batut, B., Parsons, D.P., Fischer, S., Beslon, G., Knibbe, C.: In silico experimental evolution: a tool to test evolutionary scenarios. BMC Bioinformatics 14(S15), S11 (2013) · doi:10.1186/1471-2105-14-S15-S11
[4] Beiko, R.G., Charlebois, R.L.: A simulation test bed for hypotheses of genome evolution. Bioinformatics 23(7), 825–831 (2007) · doi:10.1093/bioinformatics/btm024
[5] Berestycki, N., Durrett, R.: A phase transition in the random transposition random walk. Probab. Theory Relat. Fields 136, 203–233 (2006) · Zbl 1102.60005 · doi:10.1007/s00440-005-0479-7
[6] Berthelot, C., Muffato, M., Abecassis, J., Crollius, H.R.: The 3D organization of chromatin explains evolutionary fragile genomic regions. Cell Rep. 10(11), 1913–1924 (2015) · doi:10.1016/j.celrep.2015.02.046
[7] Biller, P., Guéguen, L., Tannier, E.: Moments of genome evolution by double cut-and-join. BMC Bioinform. 16(Suppl 14), S7 (2015) · doi:10.1186/1471-2105-16-S14-S7
[8] Biller, P., Knibbe, C., Guéguen, L., Tannier, E.: Breaking good: accounting for the diversity of fragile regions for estimating rearrangement distances. Genome Biol. Evol. (2016, in press) · doi:10.1093/gbe/evw083
[9] Caprara, A., Lancia, G.: Experimental and statistical analysis of sorting by reversals. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics, pp. 171–183. Springer, Amsterdam (2000) · Zbl 1137.92309 · doi:10.1007/978-94-011-4309-7_16
[10] Dalquen, D.A., Anisimova, M., Gonnet, G.H., Dessimoz, C.: ALF - a simulation framework for genome evolution. Mol. Biol. Evol. 29(4), 1115–1123 (2012) · doi:10.1093/molbev/msr268
[11] Duchemin, W., Daubin, V., Tannier, E.: Reconstruction of an ancestral Yersinia pestis genome and comparison with an ancient sequence. BMC Genom. 16(Suppl 10), S9 (2015) · doi:10.1186/1471-2164-16-S10-S9
[12] Eriksen, N., Hultman, A.: Estimating the expected reversal distance after a fixed number of reversals. Adv. Appl. Math. 32, 439–453 (2004) · Zbl 1051.92029 · doi:10.1016/S0196-8858(03)00054-X
[13] Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press, London (2009) · Zbl 1170.92022 · doi:10.7551/mitpress/9780262062824.001.0001
[14] Fletcher, W., Yang, Z.: INDELible: a flexible simulator of biological sequence evolution. Mol. Biol. Evol. 26(8), 1879–1888 (2009) · doi:10.1093/molbev/msp098
[15] Hall, B.G.: Simulating DNA coding sequence evolution with EvolveAGene 3. Mol. Biol. Evol. 25(4), 688–695 (2008) · doi:10.1093/molbev/msn008
[16] Hannenhalli, S., Pevzner, P.A.: Transforming men into mice (polynomial algorithm for genomic distance problem). In: Proceedings of 36th Annual Symposium on Foundations of Computer Science (1995) · Zbl 0938.68939 · doi:10.1109/SFCS.1995.492588
[17] Hillis, D.M., Bull, J.J., White, M.E., Badgett, M.R., Molineux, I.J.: Experimental phylogenetics: generation of a known phylogeny. Science 255(5044), 589–592 (1992) · doi:10.1126/science.1736360
[18] Hindré, T., Knibbe, C., Beslon, G., Schneider, D.: New insights into bacterial adaptation through in vivo and in silico experimental evolution. Nat. Rev. Microbiol. 10, 352–365 (2012)
[19] Knibbe, C., Coulon, A., Mazet, O., Fayard, J.-M., Beslon, G.: A long-term evolutionary pressure on the amount of noncoding DNA. Mol. Biol. Evol. 24(10), 2344–2353 (2007) · doi:10.1093/molbev/msm165
[20] Larget, B., Simon, D.L., Kadane, J.B.: On a Bayesian approach to phylogenetic inference from animal mitochondrial genome arrangements (with discussion). J. Roy. Stat. Soc. B 64, 681–693 (2002) · Zbl 1067.62115 · doi:10.1111/1467-9868.00356
[21] Lemaitre, C., Zaghloul, L., Sagot, M.-F., Gautier, C., Arneodo, A., Tannier, E., Audit, B.: Analysis of fine-scale mammalian evolutionary breakpoints provides new insight into their relation to genome organisation. BMC Genom. 10, 335 (2009) · doi:10.1186/1471-2164-10-335
[22] Lin, Y., Moret, B.M.E.: Estimating true evolutionary distances under the DCJ model. Bioinformatics 24(13), i114–i122 (2008) · doi:10.1093/bioinformatics/btn148
[23] Mallo, D., De Oliveira Martins, L., Posada, D.: SimPhy: phylogenomic simulation of gene, locus, and species trees. Syst. Biol. 65, 334–344 (2016) · doi:10.1093/sysbio/syv082
[24] Steel, M., Penny, D.: Parsimony, likelihood, and the role of models in molecular phylogenetics. Mol. Biol. Evol. 17(6), 839–850 (2000) · doi:10.1093/oxfordjournals.molbev.a026364
[25] Swenson, K.M., Marron, M., Earnest-DeYoung, J.V., Moret, B.M.E.: Approximating the true evolutionary distance between two genomes. J. Exp. Algorithmics 12, 3.5 (2008) · Zbl 1365.92078 · doi:10.1145/1227161.1402297
[26] Szöllősi, G.J., Boussau, B., Abby, S.S., Tannier, E., Daubin, V.: Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations. Proc. Natl. Acad. Sci. U. S. A. 109(43), 17513–17518 (2012) · doi:10.1073/pnas.1202997109
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.