×

Approximation algorithms for constrained generalized tree alignment problem. (English) Zbl 1186.68593

Summary: In the generalized tree alignment problem, we are given a set \(S\) of \(k\) biologically related sequences and we are interested in a minimum cost evolutionary tree for \(S\). In many instances of this problem partial phylogenetic tree for \(S\) is known. In such instances, we would like to make use of this knowledge to restrict the tree topologies that we consider and construct a biologically relevant minimum cost evolutionary tree. So, we propose the following natural generalization of the generalized tree alignment problem, a problem known to be MAX-SNP Hard, stated as follows:
Constrained Generalized Tree Alignment Problem [the author, Algorithms and heuristics for constrained generalized alignment problem, DIMACS Technical Report 2007-21 (2007)]: Given a set \(S\) of \(k\) related sequences and a phylogenetic forest comprising of node-disjoint phylogenetic trees that specify the topological constraints that an evolutionary tree of \(S\) needs to satisfy, construct a minimum cost evolutionary tree for \(S\).
In this paper, we present constant approximation algorithms for the constrained generalized tree alignment problem. For the generalized tree alignment problem, a special case of this problem, our algorithms provide a guaranteed error bound of \(2 - 2/k\).

MSC:

68W40 Analysis of algorithms
05C85 Graph algorithms (graph-theoretic aspects)
05C05 Trees
68W25 Approximation algorithms
Full Text: DOI

References:

[1] Altschul, S.; Lipman, D., Trees, stars, and multiple sequence alignment, SIAM Journal on Applied Math, 49, 197-209 (1989) · Zbl 0695.92006
[2] Berman, P.; Ramaiyer, V., Improved approximations for the steiner tree problem, Journal of Algorithms, 17, 3, 381-408 (1994) · Zbl 0820.68049
[3] S. Divakaran, Algorithms and heuristics for constrained generalized alignment problem, DIMACS Technical Report. 2007-21, 2007; S. Divakaran, Algorithms and heuristics for constrained generalized alignment problem, DIMACS Technical Report. 2007-21, 2007
[4] D.Z. Du, Y. Zhang, Q. Zeng, On better heuristic for euclidean steiner minimum trees, in: Proc. of the 32nd Symposium on the Foundations of Computer Science, 1991 pp. 431-439; D.Z. Du, Y. Zhang, Q. Zeng, On better heuristic for euclidean steiner minimum trees, in: Proc. of the 32nd Symposium on the Foundations of Computer Science, 1991 pp. 431-439
[5] Gusfield, D., Efficient methods for multiple sequence alignment with guaranteed error bounds, Bulletin of Mathematical Biology, 55, 1, 141-154 (1993) · Zbl 0756.92020
[6] Gusfield, D., Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology (1997), Cambridge University Press · Zbl 0934.68103
[7] Hein, J., A new method that simultaneously aligns and reconstructs ancestral sequences for any number of homologous sequences, when the phylogeny is given, Molecular Biology and Evolution., 6, 649-668 (1989)
[8] Hein, J., Unified to alignment and phylogenies, Methods in Enzymology, 183, 626-645 (1990)
[9] T. Jiang, E.L. Lawler, L. Wang, Aligning sequences via an evolutionary tree: Complexity and approximation, in: Symposium on Theory of Computing, 1994, pp. 760-769; T. Jiang, E.L. Lawler, L. Wang, Aligning sequences via an evolutionary tree: Complexity and approximation, in: Symposium on Theory of Computing, 1994, pp. 760-769 · Zbl 1345.92106
[10] Kapoor, S.; Ramesh, H., Algorithms for enumerating all spanning trees of undirected and weighted graphs, SIAM Journal of Computing (1995) · Zbl 0827.68053
[11] Kruskal, J. B.; Sankoff, D., An anthology of algorithms and concepts for sequence comparison, (Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison (1983), Addison Wesley)
[12] Sankoff, D., Minimal mutation trees of sequences, SIAM Journal on Applied Math, 28, 35-42 (1975) · Zbl 0315.05101
[13] Schwikowski, B.; Vingron, M., The deferred path heuristic for the generalized tree alignment problem, (In Proceedings of the first annual International Conference on Computational Molecular Biology. In Proceedings of the first annual International Conference on Computational Molecular Biology, (RECOMB’97) (1997), ACM Press), 257-266
[14] Wang, L.; Gusfield, D., Improved approximation algorithms for tree alignment, Journal of Algorithms, 25, 255-273 (1997) · Zbl 0895.68061
[15] Wang, L.; Jiang, T., On the complexity of multiple sequence alignment, Journal of Computational Biology, 1, 337-348 (1994)
[16] Wang, L.; Jiang, T., Algorithmic methods for multiple sequence alignment, (Current Topics in Computational Molecular Biology (2002), MIT Press), 72-110
[17] Wang, L.; Jiang, T.; Gusfield, D., A more efficient approximation scheme for tree alignment, SIAM Journal of Computing, 30, 283-299 (2001) · Zbl 0965.05034
[18] Wang, L.; Jiang, T.; Lawler, E. L., Approximation algorithms for tree alignment with a given phylogeny, Algorithmica, 16, 302-315 (1996) · Zbl 0862.68119
[19] A. Zelikovsky, Better approximation bounds for the network and Euclidean Steiner tree problems, Technical Report 96-06, Department of Computer Science, U. of Virginia, 1996; A. Zelikovsky, Better approximation bounds for the network and Euclidean Steiner tree problems, Technical Report 96-06, Department of Computer Science, U. of Virginia, 1996
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.