×

A linear-time algorithm for the perfect phylogeny haplotype problem. (English) Zbl 1121.92050

Summary: The inference of evolutionary trees from binary species-character matrices is a fundamental computational problem in classical phylogenetic studies. Several problems arising in this field lead to different variants of the inference problem; some of these concern input data with missing values or incomplete matrices. A model of inference from incomplete data that has recently gained a remarkable interest is the Perfect Phylogeny Haplotype problem (PPH) introduced by V. Bafna, D. Gusfield, G. Lancia and S. Yooseph [Haplotyping as perfect phylogeny: a direct approach. J. Comput. Biol. 10, 323–340 (2003)] and successfully applied to infer haplotypes from genotype data. A stated open issue in this research field is the linear-time solution of this inference problem. We solve this problem and give an \(O(nm)\)-time algorithm to complete matrices of \(n\) rows and \(m\) columns to represent PPH solutions; we show that solving the problem requires recognizing special posets of width \(2\).

MSC:

92D15 Problems related to evolution
68U99 Computing methodologies and applications
68W05 Nonnumerical algorithms
Full Text: DOI