×

Generalized \(k\)-ary tanglegrams on level graphs: a satisfiability-based approach and its evaluation. (English) Zbl 1252.05152

Summary: A tanglegram is a pair of (not necessarily binary) trees on the same set of leaves with matching leaves in the two trees joined by an edge. Tanglegrams are widely used in computational biology to compare evolutionary histories of species. In this work we present a formulation of two related combinatorial embedding problems concerning tanglegrams in terms of CNF-formulas. The first problem is known as the planar embedding and the second as the crossing minimization problem. We show that our satisfiability-based encoding of these problems can handle a much more general case with more than two, not necessarily binary or complete, trees defined on arbitrary sets of leaves and allowed to vary their layouts. Furthermore, we present an experimental comparison of our technique and several known heuristics for solving generalized binary tanglegrams, showing its competitive performance and efficiency and thus proving its practical usability.

MSC:

05C62 Graph representations (geometric and intersection representations, etc.)
05C10 Planar graphs; geometric and topological aspects of graph theory
05C05 Trees
05C60 Isomorphism problems in graph theory (reconstruction conjecture, etc.) and homomorphisms (subgraph embedding, etc.)

Software:

CPLEX; Sat4j
Full Text: DOI

References:

[1] X. An, M. Koshimura, H. Fujita, R. Hasegawa, QMaxSAT version 0.3 & 0.4, in: Proceedings of the International Workshop on First-Order Theorem Proving, FTP, 2011, pp. 7-15.; X. An, M. Koshimura, H. Fujita, R. Hasegawa, QMaxSAT version 0.3 & 0.4, in: Proceedings of the International Workshop on First-Order Theorem Proving, FTP, 2011, pp. 7-15.
[2] L. Arvestad, A.-C. Berglund, J. Lagergren, B. Sennblad, Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution, in: Proceedings of the 8th Annual International Conference on Computational Molecular Biology, RECOMB, 2004, pp. 326-335.; L. Arvestad, A.-C. Berglund, J. Lagergren, B. Sennblad, Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution, in: Proceedings of the 8th Annual International Conference on Computational Molecular Biology, RECOMB, 2004, pp. 326-335.
[3] Aspvall, B.; Plass, M. F.; Tarjan, R. E., A linear-time algorithm for testing the truth of certain quantified Boolean formulas, Information Processing Letters, 8, 3, 121-123 (1979) · Zbl 0398.68042
[4] Bansal, M. S.; Chang, W.; Eulenstein, O.; Fernández-Baca, D., Generalized binary tanglegrams: algorithms and applications, (Proceedings of the 1st International Conference on Bioinformatics and Computational Biology. Proceedings of the 1st International Conference on Bioinformatics and Computational Biology, Lecture Notes in Computer Science, vol. 5462 (2009)), 114-125
[5] Baumann, F.; Buchheim, C.; Liers, F., Exact bipartite crossing minimization under tree constraints, (Proceedings of the 9th International Symposium on Experimental Algorithms. Proceedings of the 9th International Symposium on Experimental Algorithms, SEA. Proceedings of the 9th International Symposium on Experimental Algorithms. Proceedings of the 9th International Symposium on Experimental Algorithms, SEA, Lecture Notes in Computer Science, vol. 6049 (2010)), 118-128
[6] Berre, D. L.; Parrain, A., The Sat4j library, release 2.2, Journal on Satisfiability, Boolean Modeling and Computation, 7, 59-64 (2010)
[7] Buchin, K.; Buchin, M.; Byrka, J.; Nöllenburg, M.; Okamoto, Y.; Silveira, R. I.; Wolff, A., Drawing (complete) binary tanglegrams: hardness, approximation, fixed-parameter tractability, (Proceedings of the 16th International Symposium on Graph Drawing. Proceedings of the 16th International Symposium on Graph Drawing, Lecture Notes in Computer Science, vol. 5417 (2008)), 324-335 · Zbl 1213.68431
[8] B. Cha, K. Iwama, Y. Kambayashi, S. Miyazaki, Local search algorithms for partial MAXSAT, in: Proceedings of the 14th National Conference on Artificial Intelligence (AAAI/IAAI), 1997, pp. 263-268.; B. Cha, K. Iwama, Y. Kambayashi, S. Miyazaki, Local search algorithms for partial MAXSAT, in: Proceedings of the 14th National Conference on Artificial Intelligence (AAAI/IAAI), 1997, pp. 263-268.
[9] Charleston, M. A.; Parkins, S. L., Lizards, malaria, and jungles in the Caribbean, (Tangled Trees: Phylogeny, Cospeciation and Coevolution (2003), University of Chicago Press), 65-92
[10] DasGupta, B.; He, X.; Jiang, T.; Li, M.; Tromp, J., On the linear-cost subtree-transfer distance between phylogenetic trees, Algorithmica, 25, 2-3, 176-195 (1999) · Zbl 0952.68113
[11] Di Battista, G.; Eades, P.; Tamassia, R.; Tollis, I. G., Graph Drawing: Algorithms for Geometric Representations of Graphs (1998), Prentice Hall
[12] Dufayard, J.; Duret, L.; Penel, S.; Gouy, M.; Rechenmann, F.; Perriere, G., Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases, Bioinformatics, 21, 2596-2603 (2005)
[13] Eades, P.; Wormald, N. C., Edge crossings in drawings of bipartite graphs, Algorithmica, 11, 4, 379-403 (1994) · Zbl 0804.68107
[14] Fernau, H.; Kaufmann, M.; Poths, M., Comparing trees via crossing minimization, Journal of Computer and System Sciences, 76, 7, 593-608 (2010) · Zbl 1210.05023
[15] Garey, M. R.; Johnson, D. S., Computers and Intractability: A Guide to the Theory of NP-Completeness (1979), W.H. Freeman & Co.: W.H. Freeman & Co. New York, NY, USA · Zbl 0411.68039
[16] Garey, M. R.; Johnson, D. S., Crossing number is NP-complete, SIAM Journal on Algebraic and Discrete Methods, 4, 3, 312-316 (1983) · Zbl 0536.05016
[17] M. Gebser, B. Kaufmann, A. Neumann, T. Schaub, Conflict-driven answer set solving, in: Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI-07, 2007, pp. 386-392.; M. Gebser, B. Kaufmann, A. Neumann, T. Schaub, Conflict-driven answer set solving, in: Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI-07, 2007, pp. 386-392. · Zbl 1149.68332
[18] Hafner, M. S.; Sudman, P. D.; Villablanca, F. X.; Spradling, T. A.; Demastes, J. W.; Nadler, S. A., Disparate rates of molecular evolution in cospeciating hosts and parasites, Science, 265, 5175, 1087-1090 (1994)
[19] Holten, D.; Van Wijk, J. J., Visual comparison of hierarchically organized data, Computer Graphics Forum, 27, 3, 759-766 (2008)
[20] Hopcroft, J.; Tarjan, R. E., Efficient planarity testing, Journal of the ACM, 21, 4, 549-568 (1974) · Zbl 0307.68025
[21] IBM ILOG CPLEX Optimization Studio. URL: http://www.ibm.com/software/integration/optimization/cplex-optimization-studio/; IBM ILOG CPLEX Optimization Studio. URL: http://www.ibm.com/software/integration/optimization/cplex-optimization-studio/
[22] K. Kawarabayashi, B.A. Reed, Computing crossing number in linear time, in: Proceedings of the 39th Annual ACM Symposium on Theory of Computing, STOC, 2007, pp. 382-390.; K. Kawarabayashi, B.A. Reed, Computing crossing number in linear time, in: Proceedings of the 39th Annual ACM Symposium on Theory of Computing, STOC, 2007, pp. 382-390. · Zbl 1232.90339
[23] S. Khot, N.K. Vishnoi, The unique games conjecture, integrality gap for cut problems and embeddability of negative type metrics into \(l_1\); S. Khot, N.K. Vishnoi, The unique games conjecture, integrality gap for cut problems and embeddability of negative type metrics into \(l_1\)
[24] A. Kügel, Improved exact solver for the weighted Max-SAT problem, in: Proceedings of the Workshop Pragmatics of SAT, PoS10, 2010, pp. 1-12.; A. Kügel, Improved exact solver for the weighted Max-SAT problem, in: Proceedings of the Workshop Pragmatics of SAT, PoS10, 2010, pp. 1-12.
[25] S. Leipert, Level planarity testing and embedding in linear time, Ph.D. Thesis, Institut für Informatik, Universität zu Köln, 1998.; S. Leipert, Level planarity testing and embedding in linear time, Ph.D. Thesis, Institut für Informatik, Universität zu Köln, 1998. · Zbl 0942.68087
[26] H. Lin, K. Su, C.-M. Li, Within-problem learning for efficient lower bound computation in Max-SAT solving, in: Proceedings of the 23rd National Conference on Artificial intelligence, AAAI’08, 2008, pp. 351-356.; H. Lin, K. Su, C.-M. Li, Within-problem learning for efficient lower bound computation in Max-SAT solving, in: Proceedings of the 23rd National Conference on Artificial intelligence, AAAI’08, 2008, pp. 351-356.
[27] Maddison, W. P., Gene trees in species trees, Systematic Biology, 46, 3, 523-536 (1997)
[28] Nöllenburg, M.; Holten, D.; Völker, M.; Wolff, A., Drawing binary tanglegrams: an experimental evaluation, (Proceedings of the Eleventh Workshop on Algorithm Engineering and Experiments (ALENEX) (2009)), 106-119 · Zbl 1430.68231
[29] Page, R. D.M., Tangled Trees: Phylogeny, Cospeciation, and Coevolution (2002), University of Chicago Press
[30] Porschen, S.; Speckenmeyer, E., Satisfiability of mixed Horn formulas, Discrete Applied Mathematics, 155, 11, 1408-1419 (2007) · Zbl 1122.68115
[31] Randerath, B.; Speckenmeyer, E.; Boros, E.; Hammer, P. L.; Kogan, A.; Makino, K.; Simeone, B.; Cepek, O., A satisfiability formulation of problems on level graphs, Electronic Notes in Discrete Mathematics, 9, 269-277 (2001)
[32] Sanderson, M. J.; McMahon, M. M., Inferring angiosperm phylogeny from EST data with widespread gene duplication, BMC Evolutionary Biology, 7, Suppl. 1, S3+ (2007)
[33] Sixth Max-Sat evaluation. URL: http://maxsat.ia.udl.cat/introduction/; Sixth Max-Sat evaluation. URL: http://maxsat.ia.udl.cat/introduction/
[34] Speckenmeyer, E.; Porschen, S., PARTIAL MAX-SAT of level graph (mixed-Horn)formulas, Studies in Logic, 3, 3, 24-43 (2010)
[35] Speckenmeyer, E.; Wotzlaw, A.; Porschen, S., A satisfiability-based approach for embedding generalized tanglegrams on level graphs, (Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing. Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing, SAT’11. Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing. Proceedings of the 14th International Conference on Theory and Applications of Satisfiability Testing, SAT’11, Lecture Notes in Computer Science, vol. 6695 (2011)), 134-144 · Zbl 1330.68225
[36] Syvanen, M., Cross-species gene transfer; implications for a new theory of evolution, Journal of Theoretical Biology, 112, 333-343 (1985)
[37] Venkatachalam, B.; Apple, J.; John, K. St.; Gusfield, D., Untangling tanglegrams: Comparing trees by their drawings, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 7, 4, 588-597 (2010)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.