×

The fine structure of galls in phylogenetic networks. (English) Zbl 1402.92313

Summary: A phylogenetic network is a generalization of a phylogenetic tree, allowing properties that are not tree-like. With the growth of genomic data, much of which does not fit ideal tree models, there is greater need to understand the algorithmics and combinatorics of phylogenetic networks.
L. Wang, K. Zhang and L. Zhang [“Perfect phylogenetic networks with recombination”, J. Comput. Biol. 8, No. 1, 69–78 (2001; doi:10.1089/106652701300099119)] studied the problem of constructing a phylogenetic network for a set of n binary sequences derived from the all-zero ancestral sequence, when each site in the sequence can mutate from zero to one at most once in the network, and recombination between sequences is allowed. They showed that the problem of minimizing the number of recombinations in such networks is NP-hard, but introduced a special case of the problem, i.e., to determine whether the sequences could be derived on a phylogenetic network where the recombination cycles are node-disjoint. Wang, Zhang and Zhang [loc. cit.] provide a sufficient, but not a necessary test, for such solutions. We [“Efficient reconstruction of phylogenetic networks (of SNPs) with constrained recombination”, in: Proceedings of the IEEE computer society conference on bioinformatics, CSB ’03. Los Alamitos, CA: IEEE Computer Society. 363 (2003); “Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination”, J. Bioinform. Comput. Biol. 2, No. 1, 173–213 (2004; doi:10.1142/S0219720004000521)] gave a polynomial-time algorithm that is both a necessary and sufficient test. In this paper, we study in much more detail the fine combinatorial structure of node-disjoint cycles in phylogenetic networks, both for purposes of insight into phylogenetic networks and to speed up parts of the previous algorithm. We explicitly characterize all the ways in which mutations can be arranged on a disjoint cycle, and prove a strong necessary condition for a set of mutations to be on a disjoint cycle.
The main contribution here is to show how structure in the phylogenetic network is reflected in the structure of an efficiently-computable graph, called the conflict graph. The success of this approach suggests that additional insight into the structure of phylogenetic networks can be obtained by exploring structural properties of the conflict graph.

MSC:

92D15 Problems related to evolution
05C90 Applications of graph theory
92C42 Systems biology, networks
92D10 Genetics and epigenetics